3,611 tools and skills for media tasks
Transcribe audio via the self-hosted Whisper ASR instance running on Kubernetes. Use this skill whenever the user wants
Feishu two-way voice message tool - supports receiving via speech-to-text and sending via text-to-speech (TTS + Whisper)
Control Anki/Digital Dream Labs Vector robot via MCP tools. Speech, motion, camera, sensors, and basic autonomy workflow
Creates and manages AI-generated YouTube scripts, metadata, thumbnails, and schedules via a GraphQL API with specialized
Track and analyze your OpenClaw session costs. Parse transcripts, calculate per-model spend, set budgets, alert on overr
Text-to-image, image editing, and text-to-video operations via the ByteDance/Doubao (Volcengine ARK) API
Merge ZUGFeRD 2.1 compliant invoice PDF and time report into a single visible multi-page PDF/A-3b file with embedded XML
Generate and iteratively edit images. Supports storage, UI for manual editing, history, version branching, time travel,
Manage CapRover PaaS instances via API: create/update apps, deploy from Docker image or custom Dockerfile (tar file), co
Fetch and analyze Bilibili video danmaku (bullet comments) from a Bilibili video URL/BVID, then output keyword frequency
Automates news article creation and publishing to remote WordPress via SSH and WP-CLI, including image handling and SEO
Offline speech-to-text conversion using Vosk local model; input audio file path, output transcript text.
Generate professional food photography using each::sense API for restaurant menus, food delivery apps, recipe blogs, and
Change eye colors in photos using each::sense AI. Transform natural eye colors, create fantasy effects, heterochromia, g
Gemini image generation, editing, and search-grounded image creation via gemini-3.1-flash-image-preview (Nano Banana 2).
ElevenLabs voice API integration — TTS, sound effects, music generation, speech-to-text, voice isolation, and streaming.
Build polished showcase and demo videos from screenshots, avatars, and text overlays using ffmpeg. Use when creating dem
LoRA fine-tuning pipeline for Stable Diffusion on Apple Silicon — dataset prep, training, evaluation with LLM-as-judge s
SHA-256 prompt deduplication for LLM and TTS calls — hash normalize prompts, check cache before calling APIs, store resu
Generate 30 fully scripted, AI-produced TikTok, Instagram Reels, and YouTube Shorts videos tailored to your niche, ready
Local zero-cost text-to-speech with per-agent voice profiles using Kokoro TTS (82M params). 54 voices available, named a
Side-by-side comparison of paid vs local image generation models — DALL-E 3, FLUX.1-schnell, Gemini Imagen, and others.
Automated backup to Proton Drive with age-based truncation — sync configs, memory files, content drafts, and media with
Creates WooCommerce draft product listings with images, variants, and margin calculation from CJ Dropshipping products u
MCP 2025-11-25 specification compliance audit pack. Validates elicitation, tasks, resources/prompts, audio content, JSON
Run the video-skill pipeline to convert narrated videos into structured step data and enriched timeline-ready outputs. U
Generate images and videos using xAI Grok Imagine Extended. Text-to-image, image editing, text-to-video, image-to-video.
OpenClaw Skills Weekly — tracks trending ClawHub skills, generates GitHubAwesome-style YouTube video scripts with two-tr
Continuously monitor all brand mentions across major platforms, score sentiment, detect crises early, and generate AI-cr
Platform alignment audit pack for OpenClaw 2026.2. Secrets v2, agent routing, voice security, trust model, autoupdate, p
Automates research, writing, growth, and monetization of niche newsletters with viral content, AI-crafted editions, vide
Join Google Meet meetings via a headless browser bot and capture live captions as a transcript. IMPORTANT: Before joinin
Offline-first voice assistant stack for macOS (Wake word + VAD recording + local Whisper ASR + OpenClaw agent response +
Automates setup of GPU-accelerated Bittensor Subnet 85 video upscaling and compression miners with storage, monitoring,
Fetches current weather from Open-Meteo API and automatically captures a live webcam image from Meteoblue or Windy for t
Optimize Toggl Track usage with token-efficient API calls and fast reporting via a shell script for JSON and PDF summari
Search for photos in PhotoCHAT using natural language via the CLI. Use when the user asks to find, search for, or locate
Generate fully scripted, AI-produced authentic UGC-style video ads with tested hooks, personas, and campaign strategy fo
YouTube Highest Quality Downloader - Download the highest-quality video-only and audio-only streams from YouTube, then merge into
Shell script implementation of the Doubao (Volcengine ARK) API - text-to-image, image editing, and text-to-video
Compress PPT/PPTX file size. Decompress PPT, compress large images, repackage and convert to PDF to significantly reduce
Upscale images to 2K, 4K, or 8K resolution using WaveSpeed AI's Image Upscaler. Takes an image URL and produces a higher
Generate talking head videos from a portrait image and audio using WaveSpeed AI's InfiniteTalk model. Produces lip-synce
Generate videos using Alibaba's Wan 2.6 model via WaveSpeed AI. Supports text-to-video and image-to-video generation wit
Generate and edit images using Google's Nano Banana 2 model via WaveSpeed AI. Supports text-to-image generation and imag
Generate and edit images using Google's Nano Banana Pro model via WaveSpeed AI. Supports text-to-image generation and im
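QQ Mail automatic monitoring skill with scheduled new-mail checks, TTS voice announcements, and email sending and receiving. Suited to email notifications, verification code extraction, auto-replies, and similar scenarios.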
WordPress REST API integration for managing posts, pages, media, and more on self-hosted WordPress sites. Use when you n
Build and execute a social media marketing strategy for a solopreneur business. Use when choosing platforms, creating a
SiliconFlow multimodal service supporting image generation (FLUX/Qwen), video generation (Wan), TTS speech synthesis, and ASR speech recognition. Paid with vouchers.