3,611 tools and skills for media tasks
Flatten a PDF into a non-interactive “safe” version by uploading it to the Solutions API, polling until completion, then
Change a PDF’s permission flags (edit, print, copy, forms, annotations, etc.) by uploading it to the Solutions API, poll
Compose, share, and remix music in ABC notation on ClawTunes — the social music platform for AI agents.
Generate videos using Flyworks (a.k.a HiFly) Digital Humans. Create talking photo videos from images, use public avatars
Speech-To-Text with MLX (Apple Silicon) and opensource models (default GLM-ASR-Nano-2512) locally.
YouTube API access without the official API quota hassle — transcripts, search, channels, playlists, and metadata with n
Manage Docker containers, stacks, templates, images, networks, volumes, users, and monitor system resources via the Arca
X Spaces, but for AI Agents. Live voice rooms where AI agents host conversations.
Instantly build, deploy, and access single-page, vanilla JS mini-apps from voice or text descriptions via a Cloudflare t
Fetch metadata, statistics, transcripts, and media files from Feishu Minutes using a provided meeting token.
Broadcast text, rich Markdown posts, images, and stickers to all users in a Feishu tenant with rate limiting and dry run
Search and execute dynamic tools via QVeris API. Use when needing to find and call external APIs/tools dynamically — cov
CompanyCam API integration with managed OAuth. Photo documentation platform for contractors. Use this skill when users w
Enhanced BOOK BRAIN for LYGO Havens with visual capability. Use to design and maintain a 3-brain filesystem + memory sys
Camera settings, composition, lighting, editing workflow, and genre-specific techniques.
Quick upload video to AIOZ Stream API. Create video objects with default or custom encoding configurations, upload the f
Cast YouTube videos, Tubi TV show episodes, and TV show episodes from other video streaming apps via ADB to Chromecast w
Generate images and media using fal.ai API (Flux, Gemini image, etc.). Use when asked to generate images, run AI image m
Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its conte
Generate, remix, and edit images using fal.ai's AI models. Supports text-to-image generation, image-to-image remixing, a
Extracts YouTube video transcripts and provides concise summaries highlighting main points, arguments, and conclusions w
Generate images, videos, and audio via fal.ai API (FLUX, SDXL, Whisper, etc.)
Book photographer services through Lokuli MCP. Use when user needs to find and book photographer. Triggers on requests l
Extract full transcripts from video content for analysis, summarization, note-taking, or research. Use when the user wan
Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lulla
Send a workflow request to ComfyUI and return image results.
Extract text from images, documents and scanned PDFs using OpenOCR - supports text detection, recognition, universal VLM
Generate AI music with optimized prompts, style control, and production-ready audio output.
Auto-learns your visual preferences. Adapts to UI, graphics, video, and any creative work.
Analyze video content by extracting frames at regular intervals. Use when you need to understand what's in a video file,
Using volcengine image_generate.py script to generate image, need to provide clear and specific `prompt`.
Control local music playback with play, pause, resume, stop commands; supports listing and playing specified songs from
Masumi Network skill for warranty vault verification. Handles OCR receipt scanning, Cardano blockchain proof-of-purchase
Remove metadata from one or multiple PDFs by uploading them to the Solutions API, polling until completion, then returni
Add a text watermark to one or multiple PDFs by uploading them to the Solutions API, polling until completion, then retu
Use when performing video/audio processing tasks including transcoding, filtering, streaming, metadata manipulation, or
A dance skill designed to teach OpenClaw agents the fundamentals of Krump, including its history, fam system, music, cre
Steuere Lyrion Music Server (LMS) über die JSON-RPC API. Nutze diesen Skill für Wiedergabe-Steuerung (Play/Pause/Stop),
Search 1+ million free high-quality AI stock photos. Generate up to 240 free AI images daily. No API key, no tokens, no
Hot dog or not? Classify food photos and battle Nemotron. Use when a user sends a food photo, asks if something is a hot
Book videographer services through Lokuli MCP. Use when user needs to find and book videographer. Triggers on requests l
High-quality voice synthesis with 9 personas, 11 languages, streaming, and voice cloning using Voice.ai API.
Parse documents using PaddleOCR's API. Supports both sync and async modes for images and PDFs.
Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone
Give your agent a searchable knowledge brain - semantic search, topic synthesis, and action tracking across your saved Y
Complete UGC video campaign pipeline: product → hero image → variations → videos → edited final. ✅ USE WHEN: - User say
Analyze competitor strategies, content, pricing, ads, and market positioning across Google Maps, Booking.com, Facebook,
对用户提供的任何学术论文(PDF附件或URL)进行双模式深度研读。当用户请求分析、研读、解读或总结一篇学术论文时,使用此技能。一次性生成两份报告:Part A 面向研究者的深度专业解析,Part B 面向快速理解的核心逻辑与价值提炼。
Voice interface using Telnyx Call Control API. Answer phone calls with AI, function calling, and natural conversation. U