Media AI Skills - 3,611 Tools

EvidenceOps - Forensic Evidence Management

Forensic media triage with chain of custody. Use when receiving images, videos, audio, PDFs, or documents that need evid

by msrovani · community · Quality: medium

Feishu Voice Assistant

Sends voice messages (audio) to Feishu chats using Duby TTS.

by autogame-17 · community · Quality: medium

seedance-cog

Seedance × CellCog. ByteDance's #1 video model meets the frontier of multi-agent coordination — CellCog orchestrates See

by nitishgargiitd · community · Quality: medium

MinerU PDF Extractor

Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online UR

by A-I-R · community · Quality: medium

Voice Notes

Transform chaotic voice memos into a searchable knowledge base with automatic organization, linking, and tag-based retri

by ivangdavila · community · Quality: medium

removebg-api

Remove image backgrounds using the remove.bg API with API-key auth and transparent PNG output. Use when high-quality cut

by rolandkakonyi · community · Quality: medium

amber-voice-assistant

Phone-capable AI voice agent for OpenClaw — Twilio + OpenAI Realtime SIP bridge with call log dashboard

by batthis · community · Quality: medium · 3 stars

NVIDIA Kimi Vision

Analyze images using NVIDIA Kimi K2.5 vision model via NVIDIA NIM API. Supports png, jpg, jpeg, webp.

by miladnoo · community · Quality: medium

ton

Ton namespace for Netsnek e.U. audio and media processing tools. Handles audio transcription, format conversion, wavefor

by kleberbaum · community · Quality: medium

Pywayne Visualization Rerun Utils

Static 3D visualization utilities wrapping Rerun SDK for adding point clouds, trajectories, cameras, planes, and chessbo

by wangyendt · community · Quality: medium

Pywayne Visualization Pangolin Utils

3D visualization toolkit wrapping Pangolin viewer for real-time display of point clouds, trajectories, cameras, planes,

by wangyendt · community · Quality: medium

Pywayne Vio Se3

SE(3) rigid body transformation library for 3D rotation and translation operations. Use when working with robot poses, c

by wangyendt · community · Quality: medium

Unified Invoice

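Unified quote and tax invoice generator. Korean-style quotes (business registration number, Korean VAT) plus freelancer invoices (multilingual, VAT). Client/item DB, PDF output, automatic calculation.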

by clawhub · community · Quality: medium

Mirroir

Control a real iPhone through macOS iPhone Mirroring — screenshot, tap, swipe, type, launch apps, record video, OCR, and

by clawhub · community · Quality: medium

appointment-scheduler

Automated appointment management for beauty salons, clinics, studios, and photo booths. Handles booking requests, calend

by mupengi-bot · community · Quality: medium

Video Ad Specs

Video ad creation with exact platform-specific specs for TikTok, Instagram, YouTube, Facebook, LinkedIn. Covers dimensio

by okaris · community · Quality: medium

Og Image Design

Open Graph and social sharing image design with platform specs, text placement, and branding. Covers OG meta tags, Twitt

by okaris · community · Quality: medium

Youtube Thumbnail Design

YouTube thumbnail design with specific dimensions, contrast rules, and mobile preview optimization. Covers safe zones, t

by okaris · community · Quality: medium

Speech To Text

Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capa

by okaris · community · Quality: medium

Storyboard Creation

Film and video storyboarding with shot vocabulary, continuity rules, and panel layout. Covers shot types, camera angles,

by okaris · community · Quality: medium

Ai Podcast Creation

Create AI-powered podcasts with text-to-speech, music, and audio editing. Tools: Kokoro TTS, DIA TTS, Chatterbox, AI mus

by clawhub · community · Quality: medium

Freepik API

Generate images, videos, icons, audio, and more using Freepik's AI API. Supports Mystic, Flux, Kling, Hailuo, Seedream,

by cohnen · community · Quality: medium

Prompt Engineering

Master prompt engineering for AI models: LLMs, image generators, video models. Techniques: chain-of-thought, few-shot, s

by okaris · community · Quality: medium

Ai Avatar Video

Create AI avatar and talking head videos with OmniHuman, Fabric, PixVerse via inference.sh CLI. Models: OmniHuman 1.5, O

by okaris · community · Quality: medium

Ai Voice Cloning

AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higg

by okaris · community · Quality: medium

Image To Video

Still-to-video conversion guide: model selection, motion prompting, and camera movement. Covers Wan 2.5 i2v, Seedance, F

by okaris · community · Quality: medium

Clawprompt

Launch a smart teleprompter with mobile remote control for video recording. Use when the user wants to read scripts whil

by jiafar · community · Quality: medium

Claw Mouse

Control a Linux X11 desktop by taking screenshots and moving/clicking/typing via xdotool + scrot.

by rylena · community · Quality: medium

Generate Images via GLM

Generate images using GLM-Image API. Use when the user wants to generate, create, or draw an image from a text prompt. T

by chunhualiao · community · Quality: medium

audio-broadcast

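Control the 小播鼠 broadcast system for audio playback and broadcast announcements. Use when the user needs to play audio on broadcast devices, set the volume, manage scheduled broadcast tasks, or check device status. Supports audio file playback, URL playback, volume adjustment, device management, scheduled task management, and text-to-speech (TTS) broadcasts. Control xia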

by oxiaom · community · Quality: medium

Product Hunt Launch

Product Hunt launch optimization with specific specs, timing, and gallery strategy. Covers taglines, gallery images, mak

by okaris · community · Quality: medium

Talking Head Production

Talking head video production with AI avatars, lipsync, and voiceover. Covers portrait requirements, audio quality, Omni

by okaris · community · Quality: medium

Twitter Thread Creation

Twitter/X thread writing with hook tweets, thread structure, and engagement optimization. Covers tweet formatting, chara

by okaris · community · Quality: medium

Video Prompting Guide

Best practices and techniques for writing effective AI video generation prompts. Covers: Veo, Seedance, Wan, Grok, Kling

by okaris · community · Quality: medium

Text To Speech

Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (convers

by okaris · community · Quality: medium

Character Design Sheet

Character consistency across AI-generated images with reference sheets and LoRA techniques. Covers turnaround views, exp

by okaris · community · Quality: medium

Ai Content Pipeline

Build multi-step AI content creation pipelines combining image, video, audio, and text. Workflow examples: generate imag

by okaris · community · Quality: medium

App Store Screenshots

App Store and Google Play screenshot creation with exact platform specs. Covers iOS/Android dimensions, gallery ordering

by okaris · community · Quality: medium

Book Cover Design

Book cover design with genre-specific conventions, typography rules, and AI image generation. Covers fiction and non-fic

by okaris · community · Quality: medium

Dialogue Audio

Multi-speaker dialogue audio creation with Dia TTS. Covers speaker tags, emotion control, pacing, conversation flow, and

by okaris · community · Quality: medium

Explainer Video Guide

Explainer video production guide: scripting, voiceover, visuals, and assembly. Covers script formulas, pacing rules, sce

by okaris · community · Quality: medium

Logo Design Guide

Logo design principles and AI image generation best practices for creating logos. Covers logo types, prompting technique

by okaris · community · Quality: medium

Image Upscaling

Upscale and enhance images with Real-ESRGAN, Thera, Topaz, FLUX Upscaler via inference.sh CLI. Models: Real-ESRGAN, Ther

by okaris · community · Quality: medium

Google Veo

Generate videos with Google Veo models via inference.sh CLI. Models: Veo 3.1, Veo 3.1 Fast, Veo 3, Veo 3 Fast, Veo 2. Ca

by okaris · community · Quality: medium

Background Removal

Remove backgrounds from images with BiRefNet via inference.sh CLI. Model: BiRefNet (high accuracy background removal). U

by okaris · community · Quality: medium

Ai Product Photography

Generate professional AI product photography and commercial images. Models: FLUX, Imagen 3, Grok, Seedream for product s

by okaris · community · Quality: medium

Ai Music Generation

Generate AI music and songs with DiffRhythm, Tencent Song Generation via inference.sh CLI. Models: DiffRhythm (fast song g

by okaris · community · Quality: medium

Ai Image Generation

Generate AI images with FLUX, Gemini, Grok, Seedream, Reve and 50+ models via inference.sh CLI. Models: FLUX Dev LoRA, F

by okaris · community · Quality: medium

Fathom

Access Fathom AI meeting recordings, transcripts, summaries, and action items via the Fathom API. Use when the user asks

by lauren-hayes-ai · community · Quality: medium

DeepReader

The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown — z

by astonysh · community · Quality: medium