3,613 tools and skills for media tasks
Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, See
Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screen
Generate AI music using ACE-Step 1.5 via ACE Music's free API. Use when the user asks to create, generate, or compose mu
Fetch and read transcripts from YouTube and Bilibili videos. Use when you need to summarize a video, answer questions ab
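Video summarizer that automatically fetches subtitles from YouTube and Bilibili videos and produces an AI summary in Markdown format. Use when the user provides a video link and wants its content summarized.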
Connect and control ComfyUI API efficiently using template mapping and auto-asset management for image generation and ed
Fetch latest AI-related YouTube videos from curated channels using YouTube Data API v3 and filter by keywords
Control and automate the Linux desktop GUI on X11. Use this skill to take screenshots, find and click UI elements, type
Generate professional captions and subtitles with multi-engine transcription, word-level timing, styling presets, and bu
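Generate or edit images with Gemini models, with support for custom third-party API endpoints (baseUrl) and keys. Defaults to the OpenAI-compatible format and also supports the Google native format. Trigger scenarios: text-to-image, image editing, image compositing, drawing requests, generating illustrations/photos/posters, AI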
Generate 3D models for 3D printing from images or text prompts using PrintPal API. Use when the user wants to create 3D
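Use Taobao to search for identical items by image, compare candidates, and add items to the cart. Use when the user provides a product image and asks to "find the same item / find similar items / compare prices / add to cart". Prefer running the local scripts (save-taobao-cookie.js, verify-taobao-runner.js) to complete the full workflow; when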
Real-time OCR and data extraction API by Veryfi. Extract structured data from receipts, invoices, bank statements, W-9s,
Possibly the cheapest AI image generation (~$0.0036/image). Text-to-image via the EvoLink API.
Automatically index and semantically search Dropbox files using OCR and Office file parsing with efficient delta-based s
Best quality AI image generation (~$0.12-0.20/image). Text-to-image, image-to-image, and image editing via the EvoLink A
Video Copy Analyzer - AI-powered video transcription and copywriting analysis skill
Analyze URLs, YouTube videos, tweets, or text for quality, bias, and reliability using the Vajra API (vajra.to). Use whe
Free transcripts, 4K downloads, and video exploration — zero API quotas burned.
Edit PDF files visually using natural language with the nano-pdf CLI tool, powered by Google's Gemini 3 Pro Image (Nano
Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have n
Background voice journaling with Soniox realtime STT for OpenClaw. Requires SONIOX_API_KEY. Get/create your Soniox API k
Azure Foundry image generation skill for OpenClaw; generates images via a Foundry deployment and returns image bytes or
Upload videos and custom thumbnails to YouTube. Use when the user wants to publish, upload, or post a video to YouTube,
Control Chrome browser with AI using MCP protocol. Use when users want to automate browser tasks, take screenshots, fill
Post to Moltgram — Instagram for AI Agents. Register, generate images, post, like, follow, and comment.
Query stats.fm (Spotify listening stats) via the public REST API. Provides music listening data, Spotify stats, top arti
Open Animate — the creative suite for AI agents. Create professional motion graphics, generate images, and render MP4 vi
Edit my recording, turn a long video into shorts, generate captions and thumbnails, estimate cost before processing. Upl
Interact with the openLesson tutoring API to generate learning plans, start audio-based sessions, analyze reasoning gaps
Use when editing videos, creating Reels/Shorts/TikTok, cutting long videos into clips, adding AI captions or commentary,
CLI tool for recording multi-modal context (audio, keystrokes, clipboard, screenshots) locally
Produce complete code-based animated videos by scripting, generating narration, creating visual assets, and rendering fi
Transcribe audio through the Whisper API on any compatible local server.
Convert PDF files to Markdown using WiseDiag MedOcr API. Supports table recognition, multi-column layouts, and medical d
Post bounties and evaluate/accept winning submissions on poidh (pics or it didn't happen) on Arbitrum, Base, or Degen Ch
Ultimate personalization engine for Apple Music. Analyzes listening history, Apple Music Replay stats, library data, and
Make real phone calls to your users via Stringclaw voice AI
Generates images and text via reverse-engineered Gemini Web API. Supports text generation, image generation from prompts
Voice conversation interface for OpenClaw using wake word detection, streaming LLM responses, and text-to-speech. Use wh
Full ElevenLabs platform integration — text-to-speech, voice cloning, and Conversational AI agent creation. Not just TTS
Generate adult video content using each::sense API with safety checker disabled
Generate adult images, artistic nudes, glamour photography, and fantasy art using the each::sense API with safety checke
Generate music videos using each::sense AI. Create visualizers, lyric videos, animated music videos, concert visuals, an
Send paid messages to real humans via the A.I. Cheese platform (aicheese.app). Use when an agent needs human input — sur
Implements UI from design mockups (Figma, Sketch, or image) with pixel-accurate layout, responsive behavior, and design
Plan channel mix and media strategy for Meta (Facebook/Instagram), Google Ads, TikTok Ads, YouTube Ads, Amazon Ads, Shop