3,611 tools and skills for media tasks
Post to social media platforms using the multi-provider social posting API. Use when user wants to post to Twitter, Link
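AI video generation and editing using the Volcengine Doubao Seedance model. Supports text-to-video, image-to-video, and video with audio. Use this skill when the user asks to generate a video, make a video, or do text-to-video or image-to-video.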
Skill for Tencent Cloud VITA image/video understanding. Analyzes images and videos using AI. Use when: understanding vid
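Reliably converts any topic, long article, report, meeting minutes, or explanatory text into Chinese visual image-generation prompts, then calls the DashScope Qwen image model to produce the image directly. Suitable for "turn this content into an infographic", "make it a long story-comic image", "long text to image", "generate a Chinese image-generation prompt", "based on the document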
Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT
Work with OpenAI-compatible image generation and image editing endpoints. Use when the user wants to generate images fro
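Imports any multimedia document into an Obsidian knowledge base. Supports PPT, PDF, DOCX, images, and other formats; automatically extracts every page or image, uses a multimodal model to understand the content, and stores the generated text description in OB. Suitable for: (1) organizing training courseware (2) migrating notes to OB (3) taking images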
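Uses multimodal large models to understand image content and generate descriptions of its business meaning. Supports multiple models: (1) MiniMax VLM (2) OpenAI GPT-4V (3) Claude Vision. Used to interpret screenshots, charts, document photos, and similar images, producing accurate text descriptions.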
AI-powered video summarization for Bilibili, Xiaohongshu, Douyin, and YouTube. Extract insights from video content throu
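Skill for calling the Tencent Cloud General Invoice Recognition (Advanced Edition) API (VatInvoiceOCR). Use this skill when the user needs to recognize all fields of VAT special invoices, VAT general invoices, VAT electronic special invoices, VAT electronic general invoices, or electronic invoices (general/VAT special) in invoice images. Supports recognizing from invoice images the invoice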
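AI visual monitoring system with a dual-mode architecture (standby/care). Supports face recognition, sedentary reminders, fatigue detection, lighting detection, working-hours statistics, and control via Feishu commands.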
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech
Skill for Tencent Cloud Video Content Moderation (VM) — AI-Generated Video Detection. Calls the CreateVideoModerationTas
Skill for Tencent Cloud Image Content Moderation (IMS) — AI-Generated Image Detection. Calls the ImageModeration API wit
Connect Even Realities G2 smart glasses to OpenClaw via Cloudflare Worker. Deploys a bridge that routes G2 voice command
Play TTS or audio on the Raspberry Pi (or gateway host) default speaker. Use when the user asks for an announcement, ala
Create music with MiniMax music models (music-2.5+, music-2.5). Use when generating songs, instrumental tracks, or chant
Generate production-ready Amazon and AliExpress listings from a product image or parameters. Outputs title, bullet point
Create high-converting YouTube thumbnail concepts, overlay text, image prompts, and optional AI-generated cover images f
Search for upcoming concerts and live music events by city, country, artist, or genre using the Ticketmaster Discovery A
Combined agent that synthesizes speech via Volcengine TTS, uploads the audio to TOS, and returns a presigned temporary U
Text-to-speech generation on Volcengine (ByteDance) speech services. Use when users need narration, multi-language speec
AI video generation via Crazyrouter API. Supports Sora 2, Kling V2, Veo 3, Seedance, Pika, MiniMax Hailuo, Runway. Text-
Text-to-speech via Crazyrouter API. OpenAI TTS voices (alloy, echo, fable, onyx, nova, shimmer). Convert text to natural
AI image generation via Crazyrouter API. Supports DALL-E 3, Midjourney, Flux Pro, Stable Diffusion, Imagen 4. Text-to-im
Reverse image search (find image source, visually similar images). Use when user provides an image and wants to find its
Generate images via Gemini Web API using Google AI Pro subscription. Uses browser cookies for authentication (no API key
AI meeting assistant via ghostmeet. Start sessions, get live transcripts, and generate AI summaries from any browser mee
Transcribe YouTube videos with smart fallback: extracts captions first (fast, free), falls back to local Whisper transcr
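Mac camera diary skill. Periodically takes photos with the camera and uses AI to compliment the person in the photo, generates a daily summary in the evening, and automatically cleans up the photos the next day. Work start and end times, lunch break, overtime, photo path, capture interval, model, and more are configurable. Use when the user mentions "camera diary", "photo analysis"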
Give your AI agent eyes to see the entire internet. Install and configure upstream tools for Twitter/X, Reddit, YouTube,
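Generates talking head videos using the Alibaba Cloud DashScope/LingMou (灵眸) API. Supports three modes: EMO (portrait plus audio-driven talking head, two-step workflow), AA/Animate Anyone (full-body animation), and LingMou (template-based digital-human talking head video).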
Read, summarize, and search contents of local text, markdown, JSON, DOCX, and PDF files within authorized paths under 10
A comprehensive AI agent skill for writing and managing professional bios. Crafts bios for any platform and any purpose,
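Automatically extracts the full text of the "Investment Strategy and Operation Analysis" section from the periodic reports of publicly offered funds, with precise locating and aggregation for both text-based and scanned PDFs.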
Upload, schedule, and batch-manage TikTok videos via browser automation. Use when: user wants to upload a video to TikTo
Manage and control Mopidy music service on NAS including playback, volume, playlist viewing, and local music scanning.
MiniMax Hailuo video generation skill - uses the S2V-01 model for subject-reference video generation. Can generate videos, query task status, and download video files.
One-command YouTube video transcription. Automatically downloads audio and transcribes using OpenAI Whisper API — works
Generate Feishu voice replies in the voice the user requests. Produces an OPUS file to match Feishu's audio format.
Generate high-resolution PNG images from detailed text prompts using the NVIDIA Stable Diffusion XL model with customiza
Complete media and public relations intelligence system. Trigger whenever someone needs to get press coverage, write a p
The Universal Imaging & Optical Sovereignty Protocol. A high-fidelity standard for the generation, authentication, a
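End-to-end production skill for the Pipixia (皮皮虾) workplace short drama. Used to generate shot videos, edit the final cut, add voice-over and music, and publish to a Feishu group for the workplace short drama starring "Pipixia" (a mechanical-lobster AI bot). Full pipeline: image-to-video (I2V) → ffmpeg normalization + editing → TTS voice-over → BGM mixing → Feishu media message delivery.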
Nuclear-grade image metadata cleanser. Strip ALL EXIF/GPS/camera data, re-encode with noise injection. Forensically untr
AI-powered video intelligence - download, analyze, clip, GIF from any URL. Supports YouTube, TikTok, Instagram, X. Uses
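Complete family health management suite. Includes health record management, illness course records, medication tracking, health metric monitoring, symptom triage, first-aid guidance, multi-source consultation comparison, health education, diet tracking, weight management, wearable device sync, and more. Supports image recognition, pre-visit summary generation (text/image/PDF), medication safety lookup, proactive health reminders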
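Windows desktop automation skill supporting screenshots, text recognition (OCR), and image-based location. Used to: (1) capture screen content (2) extract text from images (3) locate UI elements for automated operations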
Search past session transcripts to recover lost conversation context. MUST use when: (1) the current session is new or h
Browser automation via remote Playwright WebSocket server for screenshots, PDFs and testing.