3,611 tools and skills for media tasks
CDP browser control at localhost:9222. Use when you need to inspect tabs, take screenshots, navigate, scroll, post to X,
Create lead magnets (PDF guides, checklists, templates) with matching landing pages. Generate qualified leads for your b
Extract complete data from X/Twitter tweets by URL, including text, author info, timestamps, engagement stats, media, qu
Professionelle deutsche Inhalte erstellen - Blog-Artikel, Landing Pages, E-Mails, Social Media. Perfekt für den DACH-Mar
Control the Windows PC from WSL2. Use when the user asks to open/close applications, manage processes, take screenshots,
Extract a color palette from an image and return HEX/RGB values with optional swatch image.
Download and summarize Xiaohongshu (小红书/RedNote) videos. Produces a full resource pack with video, audio, subtitles, tra
Extract text and metadata from PDF files using PyMuPDF, supporting large files and outputting results in JSON format.
Convert text to speech using the TogetherAI API with the MiniMax speech-2.6-turbo model and save audio in mp3 format.
Markdown 转 PDF 技能。将 Markdown 文件转换为精美的 PDF 文档,完美支持中文、代码高亮、自定义样式。
基于双引擎策略的智能语音系统,支持中文本地Tingting语音和多语言Edge-TTS云端,输出高质量OGG音频,保障隐私。
Convert Markdown files to PDF with full LaTeX math formula rendering and CJK (Chinese/Japanese/Korean) support. Use when
Give your agent a real phone. It dials, waits on hold, negotiates your bills, and returns a full transcript.
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio
Generate detailed images from text prompts using Pollinations.ai models with optional configuration, model selection, an
Personal knowledge base for capturing and retrieving information about people, places, restaurants, games, tech, events,
Extracts text, tables, and images from PDFs (including scanned PDFs) using the Mistral OCR API. Use when user asks to OC
Capture screenshots on Windows using mss and Pillow. Provides full-screen, region, and multi-monitor capture with output
OpenClaw virtual companion skill. Use it to bootstrap runtime files (SOUL and base image), guide user personalization, l
Configure and use snipgrapher to generate polished code snippet images
RoomSound gives your agent the skill to play audio to your speakers. Starting with YouTube to Bluetooth speakers, expand
ElevenLabs text-to-speech with mac-style say UX.
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with
飞书语音消息发送技能。将文本转换为语音并发送到飞书,支持 TTS 生成、格式转换、时长读取、文件上传和消息发送。
Automatic Speech Recognition (ASR) using Zhipu AI (BigModel) GLM-ASR model. Use when you need to transcribe audio files
Detects, blocks, and reports Child Sexual Abuse Material using AI-driven image, video, and behavior analysis with automa
使用内置 image_generate.py 脚本生成图片, 准备清晰具体的 `prompt`。
Generate a minimalist terrain-style animated driving route video (MP4) from a list of stops (cities/POIs) without Remoti
Markdown 转 PDF 技能。将 Markdown 文件转换为精美的 PDF 文档,完美支持中文、代码高亮、自定义样式。
Analyzes images to detect AI generation, extract metadata, identify artifacts, and perform content moderation using loca
Send a public video URL directly to a Google Gemini model for analysis. Use when Codex must summarize a video, answer qu
Index YouTube channel videos and transcripts for semantic search. Use when user says "index YouTube", "ad
X Platform API. Read posts, search tweets, post, upload media.
xAI Grok API. Search the web, search X, generate images, generate video.
Multi-source search skill for Kiro on OpenClaw. Aggregate and rank results from Google, Google Scholar, YouTube, and X,
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on aud
Sync your OpenClaw agent with the ScanWow iOS app. Start an HTTP webhook to receive high-quality OCR scans directly from
B 站视频播放器。用 Playwright 搜索 B 站视频并获取准确链接,然后用 open 命令在当前浏览器打开播放。Use when users request to play Bilibili videos or search for
Jarvis TTS text-to-speech using Microsoft edge-tts with afplay playback. Use when users request voice output, audio resp
Generate promo video plan with 30-45s script, shot-by-shot storyboard, and optional Remotion/Montage-tool config. Use wh
Generate tarot × astrology content for social media — weekly horoscope scripts, tarot spreads, video scripts, and cover
Define and store your brand voice profile for consistent content generation. Captures writing style, vocabulary patterns
Searches multiple online archives to find and recover deleted YouTube videos, metadata, and comments using a video ID.
Compose, play, and render music using Strudel live-coding patterns. Use when composing music programmatically, generatin
Display API credit balances for 5 core providers (Anthropic, OpenAI, OpenRouter, Mistral, Groq) with video game style he
Research technical projects/papers and generate comprehensive reports with PDF export. Modes: lite (analysis + writing)
Add AI voice assistants to your website. Engage visitors with natural voice conversations, capture leads, automate suppo
Manage BRICKS workspace devices, groups, apps, modules, media, and projects via CLI for control, monitoring, updates, an
Install, configure, start, and troubleshoot ClawTime — a private self-hosted webchat UI for OpenClaw with passkey (Face
Extract text and structured data from documents using Azure Document Intelligence (formerly Form Recognizer). Supports O