Media AI Skills - 3,611 Tools

pdf-ocr-layout

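A multimodal deep document parsing tool built on Zhipu GLM-OCR, GLM-4.7, and GLM-4.6V. Use when: - you need high-precision extraction of tables from documents (PDF/images) and conversion to Markdown format - you need to automatically crop and extract illustrations and charts from document pages as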

by clawhub · community · Quality: medium

AI Media Generation En

Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.

by clawhub · community · Quality: medium

Sprite Animator

Generate animated pixel art sprites from any image using AI. Send a photo, get a 16-frame animated GIF.

by clawhub · community · Quality: medium

Upload audio to AIOZ Stream

Quickly upload audio to the AIOZ Stream API. Create audio objects with default or custom encoding configurations, upload the f

by clawhub · community · Quality: medium

Play Music from YouTube

Play music on YouTube via browser automation with playwright-cli. Use when the user wants to: (1) play a specific song

by clawhub · community · Quality: medium

Home Assistant Assist

Control Home Assistant smart home devices using the Assist (Conversation) API. Use this skill when the user wants to con

by clawhub · community · Quality: medium

Canvas Design

Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user

by clawhub · community · Quality: medium

Suno Automation

Control Suno.com via OpenClaw browser to input lyrics, style, title, create, and play AI-generated music tracks.

by clawhub · community · Quality: medium

Browser Automation CLI

Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, na

by clawhub · community · Quality: medium

mediaproc

Process media files (video, audio, images) via a locked-down SSH container with ffmpeg, sox, and imagemagick

by clawhub · community · Quality: medium

Sora Video Generation

Generate videos using OpenAI's Sora API. Use when the user asks to generate, create, or make videos from text prompts or

by clawhub · community · Quality: medium

TubeScribe

YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS

by clawhub · community · Quality: medium

Nextbrowser

Use the Nextbrowser cloud API to spin up cloud browsers for OpenClaw to run autonomous browser tasks. Primary use is creatin

by clawhub · community · Quality: medium

Media Player

Play audio/video locally on the host

by clawhub · community · Quality: medium

Radarr+

Add and manage movies in a Radarr instance via its HTTP API (search/lookup movies, list quality profiles and root folder

by clawhub · community · Quality: medium

Yandex Zen: Publishing Posts with Photos and Videos

Publish articles and posts to Dzen.ru (Yandex Zen). Supports text, images, and videos. Requires session cookies and a CS

by clawhub · community · Quality: medium

YT-to-Blog Content Engine

Full content pipeline: YouTube URL → transcript → blog post → Substack draft → X/Twitter thread → vertical video clips v

by clawhub · community · Quality: medium

Ephemeral Media Hosting

Temporary media hosting system with automatic deletion.

by clawhub · community · Quality: medium

MoltMedia

The official visual expression layer for AI Agents. Post images to MoltMedia.lol and join the AI visual revolution.

by clawhub · community · Quality: medium

upstage-document-parse

Parse documents (PDF, images, DOCX, PPTX, XLSX, HWP) using Upstage Document Parse API. Extracts text, tables, figures, a

by clawhub · community · Quality: medium

Renderful AI

Generate images and videos via renderful.ai API (FLUX, Kling, Sora, WAN, etc.) with crypto payments. Use when the user w

by clawhub · community · Quality: medium

Render Stl Png

Render an STL file to a PNG image with a solid color using a deterministic software renderer and adjustable 3D perspecti

by clawhub · community · Quality: medium

Elevenlabs Integration with Openclaw

ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, a

by clawhub · community · Quality: medium

Youtube Data

Access YouTube video data — transcripts, metadata, channel info, search, and playlists. A lightweight alternative to Goo

by clawhub · community · Quality: medium

Supernote Cloud

Access a self-hosted Supernote Private Cloud instance to browse files and folders, upload documents (PDF, EPUB) and note

by clawhub · community · Quality: medium

Local Whisper

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with mult

by clawhub · community · Quality: medium

Songsee

Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.

by clawhub · community · Quality: medium

Camsnap

Capture frames or clips from RTSP/ONVIF cameras.

by clawhub · community · Quality: medium

Miro board

Workshop photos/notes -> an editable Miro diagram (real FRAMES as containers + stickies + connectors) with idempotent

by clawhub · community · Quality: medium

Grok Image Cli

Generate and edit images via Grok API from the command line. Secure macOS Keychain storage for xAI API key. Supports bat

by clawhub · community · Quality: medium

SaaS (Screenshot As A Service)

Give your agent the ability to instantly take screenshots of any website with just the URL. Cloud-based so your agent ha

by clawhub · community · Quality: medium

Voicenotes Official

This official skill from the Voicenotes team gives OpenClaw access to new APIs and the ability to search semantically, r

by clawhub · community · Quality: medium

Claw Desktop Pet

Give OpenClaw a body — a tiny fluid glass ball desktop pet with voice cloning, 15+ eye expressions, desktop lyrics overl

by clawhub · community · Quality: medium

Research Library

Local-first multimedia research library for hardware projects. Capture code, CAD, PDFs, images. Search with material-typ

by clawhub · community · Quality: medium

video-cog

Long-form AI video production: the frontier of multi-agent coordination. CellCog orchestrates 6-7 foundation models to p

by clawhub · community · Quality: medium

Windows Control

Full Windows desktop control. Mouse, keyboard, screenshots - interact with any Windows application like a human.

by clawhub · community · Quality: medium

SiliconFlow Video Gen

Generate videos using SiliconFlow API with Wan2.2 model. Supports both Text-to-Video and Image-to-Video.

by clawhub · community · Quality: medium

Voice Agent Builder Pro

Build and manage Voice AI agents using Vapi, Bland.ai, or Retell. Create agents, configure voices, set prompts, make out

by clawhub · community · Quality: medium

Slybroadcast Voicemail

Send voicemail drops via Slybroadcast using local CLI with options for ElevenLabs TTS or custom audio URLs and campaign

by clawhub · community · Quality: medium

Ai Media

Generate photorealistic images, videos, talking heads, and natural TTS audio using GPU-accelerated AI models and scripts

by clawhub · community · Quality: medium

Sociclaw

An autonomous social media manager agent that researches, plans, and posts content.

by clawhub · community · Quality: medium

Chief Experience Officer

Lead customer and employee experience with journey mapping, voice of customer programs, and service design excellence.

by clawhub · community · Quality: medium

Social Media Content Calendar

Generate structured social media content calendars with platform-specific posts, hashtags, and scheduling. Use when crea

by clawhub · community · Quality: medium

Alicloud Ai Video Wan R2v

Generate reference-based videos with Alibaba Cloud Model Studio Wan R2V (wan2.6-r2v-flash). Use when creating multi-shot

by clawhub · community · Quality: medium

Alicloud Ai Audio Tts Realtime

Real-time speech synthesis with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive sp

by clawhub · community · Quality: medium

Dizest Summarize

Summarize long-form content — articles, podcasts, research papers, PDFs, notes, and more — using the Dizest API. Turn wh

by clawhub · community · Quality: medium

Mistral OCR

Extract text, tables, and images from PDFs or images using Mistral OCR API and output in Markdown, JSON, or HTML formats

by clawhub · community · Quality: medium

ZenMux Image Generation

Generate images via ZenMux API (Pro/Elite). Supports Text-to-Image, Image-to-Image, and Multi-Image reference fusion.

by clawhub · community · Quality: medium

Railway Deploy

This skill should be used when the user wants to push code to Railway, says "railway up", "deploy",

by clawhub · community · Quality: medium

MarkItDown Skill

OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown librar

by clawhub · community · Quality: medium