3,611 tools and skills for media tasks
Use PoYo AI GPT Image 1.5 through the `https://api.poyo.ai/api/generate/submit` endpoint. Use when a user wants to gener
Use PoYo AI GPT-4o Image through the `https://api.poyo.ai/api/generate/submit` endpoint. Use when a user wants to genera
AI voice call agent — make outbound calls, generate browser call links, accept inbound calls, and retrieve full transcri
Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).
Video intelligence and content analysis using Memories.ai LVMM. Discover videos on TikTok, YouTube, Instagram by topic o
Download Sentinel satellite imagery (Sentinel-1/2/5P) via STAC API with cloud cover filtering and batch download support
Intelligent workplace inspection system with guided setup, configurable inspection tasks, AI-powered image analysis, and
Daily AI image generation from Wikipedia On This Day events using local ComfyUI. Use when user wants daily historical im
Local TTS router for Apple Silicon — pull models, serve OpenAI-compatible API, synthesize speech, clone voices. Use when
Organize a video folder by cleaning non-video files, removing short/bad videos, and classifying videos into numbered sub
Generate subtitles with automatic time alignment using Volcengine ATA API. Use when the user wants to: (1) add time-alig
Unified multi-modal content parser for images, PDF, DOCX, audio, auto OCR/transcription, output structured text for LLM
Organize a photo folder by cleaning non-photo files, removing bad exposures, detecting blur and burst shots, and classif
CLI for VibeSKU — an AI-powered creative automation platform that turns product SKU photos into professional e-commerce
Generate and customize shareable personal pages from memory profiles that evolve with your experiences and include rich
Generate AI videos for mature creative projects using Wan 2.6, Seedance 1.5, Vidu Q3-Pro, and other models with relaxed
Automate posting to WeChat Moments on Windows desktop (open Moments window, trigger publish entry, select image, paste c
Generate AI images for mature creative projects using Wan 2.6, Seedream, and other models with relaxed content policies
Enhance video resolution using Alibaba Cloud Super Resolution API. Use when the user wants to: (1) upscale low-res video
Xiaohongshu (RedNote/小红书) automation skill for content publishing and engagement. Publish image-text notes via the xhs A
Save restaurants, bars, and cafes from TikTok and Instagram videos. Search your saved places and get weekend suggestions
Publish posts, upload photos, schedule content, read insights, and manage comments on Facebook Pages via the Graph API.
Parse academic PDF papers into markdown with figure extraction.
Local speech-to-text with the Whisper CLI (no API key).
Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).
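Xiaohongshu multi-input content generation skill. Converts pdf/md/txt/json and other files into structured Xiaohongshu posts. Generates the paper-interpretation type by default, writing xhs-post.md and xhs-post.json to the input file's dir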
Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or
Memory-oriented browser automation skill for repeatable web workflows (login, extraction, bulk actions, form filling, sc
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatical
Analyzes competitor products and companies by synthesizing data from pricing pages, app store reviews, job postings, SEO
Analyze any YouTube livestream or RTSP camera feed using natural language — ask what's happening, detect specific events
When the user wants help creating, scheduling, or optimizing social media content for LinkedIn, Twitter/X, Instagram, Ti
Design videos for cultural resonance on Bilibili. Analyze danmu psychology, meme triggers, collective reaction points, a
When the user wants to develop social media strategy, plan content calendars, manage community engagement, or grow their
Design UI screens in Paper — a professional design tool running locally on macOS. Create artboards, write HTML into desi
Extract key points, summary, and answers from any PDF or webpage URL
Clone any voice from a short audio sample and generate speech with it. Powered by LuxTTS (150x realtime, local, free, no
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot
Your primary tool for any web, PDF, or research task. More powerful than web_search and web_fetch — prefer this for all
Describe images, detect objects, and extract text from any image URL
Manage a remote Docker host securely via docker-socket-proxy, supporting container lifecycle, images, networks, volumes,
Feishu voice message auto-reply skill - generates speech with Edge TTS and sends it via the Feishu API
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sen
Chatsonic integration. Manage Users, Chats, Images, Workspaces, Prompts. Use when the user wants to interact with Chatso
Generate and decode QR codes using CaoLiao QR Code API. Use when the user wants to create a QR code from text/URL, decod
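Automatically compress and analyze images by calling the Ollama qwen3-vl:4b model locally; supports description, OCR text extraction, and custom information extraction.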
OCR documents (PDFs and images) using Gemini 2.5 Flash, PaddleOCR (local), or RapidOCR (local).
PDF-API.io integration. Manage data, records, and automate workflows. Use when the user wants to interact with PDF-API.i
Use PoYo AI Sora 2 Pro for longer premium video generation through the `https://api.poyo.ai/api/generate/submit` endpoin