3,611 tools and skills for media tasks
Generate music with ACE-Step via HuggingFace private Space. Supports text-to-music, lyrics, style tags, and reference-au
Monitor live streams (YouTube, Bilibili) and get notified when specific keywords are mentioned. Uses browser SpeechRecog
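Generate professional-grade A-share morning/evening reports covering major index quotes, market sentiment, candlestick (K-line) charts, industry/concept sector rankings, top individual stock gainers and losers, thematic news tracking, and overall analysis. Outputs Markdown + PNG charts + PDF. Data source: the public Eastmoney (东方财富) API. Use when aske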
Fetch and analyze YouTube video content using transcripts when available, or fall back to video descriptions with source
AI security toolkit — deepfake and AI-generated media detection. Use when verifying if an image, video, or audio is a de
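Automatically rotate and merge multiple images into a single PDF; supports renaming based on an Excel list and OCR text extraction from scanned PDFs.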
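Skill for calling the Tencent Cloud question-marking Agent API (SubmitQuestionMarkAgentJob/DescribeQuestionMarkAgentJob). When the user needs, for K12 exam papers or questions in exam-paper or question images, automatic grading, handwritten-answer recognition, knowledge-point analysis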
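Skill for calling the Tencent Cloud vehicle license recognition API (VehicleLicenseOCR). When the user needs to recognize the main page of a vehicle license image (plate number, vehicle type, owner, address, use character, brand and model, vehicle identification number, engine number, registration date, issue date) or the secondary page (plate number, file number, approved passenger capacity, gross mass,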
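Skill for calling the Tencent Cloud passport recognition (multi-country/region) API (MLIDPassportOCR). When the user needs to recognize passport information in passport images from mainland China, Hong Kong, Macao, Taiwan, or other countries/regions (passport ID, name, date of birth, sex, expiry date, issuing country, nationality, country/region code, MRZ code, etc.), use this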
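Skill for calling the Tencent Cloud table recognition v3 API (RecognizeTableAccurateOCR). Use this skill when the user needs to recognize the contents of regular tables, borderless tables, or multiple tables in table images or PDFs, extract the text of each cell, or export table image recognition results to an Excel file. Supports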
Automatically generate social media posts from articles. Supports Twitter, LinkedIn, and more. Perfect for content repur
Feishu voice message sending skill (Windows version). Uses Edge TTS (Microsoft, free) to generate speech and sends it as a Feishu voice bubble.
Extract and break down content from web documents, PDFs, images, and URLs into structured markdown notes stored locally
Generate stunning images with Flux Dev. Best quality open image model. No API keys needed. $2 FREE credits to start. Pay
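Automatic stitching tool for multi-clip short videos. Supports sorting by filename, unified audio/video parameters, fade-in/fade-out transitions, and chunked or full stitching; suited to batch stitching of short dramas and storyboard-shot videos.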
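Skill for calling the Tencent Cloud advertisement text recognition API (AdvertiseOCR). Use this skill when the user needs to recognize text content in images. Supports text recognition in images with Chinese and English, horizontal, vertical, and slanted text; supports images rotated by 90, 180, or 270 degrees; returns text box positions and text content. Supports image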
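Skill for calling the Tencent Cloud real-time document extraction Agent API (ExtractDocAgent). Use this skill when the user needs structured information extracted from images or PDFs by custom field names. Supports custom field names, field types (key-value pairs or table fields), and field prompts for flexible document information extraction.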
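Skill for calling the Tencent Cloud business license recognition API (BizLicenseOCR). Use this skill when the user needs to recognize the fields on a business license image (unified social credit code, company name, entity type, legal representative, registered capital, form of organization, date of establishment, business term, business scope, etc.). Supports image Base6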
Use the internet: search, read, and interact with 13+ platforms including Twitter/X, Reddit, YouTube, GitHub, Bilibili,
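Skill for calling the Tencent Cloud license plate recognition API (LicensePlateOCR). Use this skill when the user needs automatic localization and recognition of motor vehicle license plates from mainland China. Returns the plate number, plate color, confidence, and pixel coordinates; supports multi-plate scenes; supports two input methods: image Base64 and URL.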
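Skill for calling the Tencent Cloud ID card recognition API (IDCardOCR). Use this skill when the user needs to recognize front and back information on images of mainland China second-generation resident ID cards (name, sex, ethnicity, date of birth, address, ID number, issuing authority, validity period, etc.). Supports two input methods, image Base64 and URL, and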
Transcribe audio with free credits. Whisper-powered, 99 languages. No API keys needed. $2 FREE credits to start. Pay-as-
Generate images with free credits. Flux, DALL-E, and more. No credit card to start. No API keys needed. $2 FREE credits
Generate images in ~1 second with Flux Schnell. Fastest high-quality image model. No API keys needed. $2 FREE credits to
Generate images with Black Forest Labs Flux. Models: flux-schnell (fast), flux-dev (quality). Best open-source image mod
Image and video analysis powered by Isaac vision models. Capabilities include visual Q&A, object detection, OCR, cap
Remove visible Gemini AI watermarks from images via reverse alpha blending. Use for cleaning Gemini-generated images, re
Bidirectional LAN file sharing for AI agents. Provides a static file server (port 18801) for serving files to users, and
AI image generation with OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images, aspect
Find guitar tabs/sheet sources for a song from a title or link (especially YouTube), rank the best matches, and produce
Automate web browsing tasks like navigation, data extraction, form filling, clicking, and screenshots using the agent-br
Generate and iterate on images using Image Sprout projects. Creates consistent outputs from reference images, style guid
Generate fixed-template daily AI news posters from five news items. Use when the user asks to create a poster, social ca
Extract text from PDFs using Google Gemini OCR. Use when extracting text from PDFs, performing OCR on scanned documents,
HTML-first PDF production skill for reports, papers, and structured documents. Must be applied before generating PDF del
YouTube video search, download & subtitle extraction. 40 Stars! Supports video/audio/subtitles. Each call charges 0.
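Supports intelligent summarization and format conversion for PDF, Word, and Markdown, with batch processing and progress reports to make document processing more efficient.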
Production-grade OCR with intelligent engine selection. Tesseract (lightweight, fast) and PaddleOCR (high accuracy, Chin
Vision-driven HarmonyOS NEXT device automation using Midscene. Operates entirely from screenshots — no DOM or accessibil
Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels req
Create mobile-friendly newspaper-style long images from raw text or summaries by extracting key points and rendering str
Generate Xiaohongshu (RedNote) infographic images. 7.2K Stars! 9 styles × 6 layouts. Each call charges 0.001 USDT via Sk
Convert content from sources like YouTube, PDFs, and WeChat into podcasts, PPTs, mind maps, or quizzes with Google Noteb
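Monitor official channel updates on video platforms and quickly retrieve new videos published by specified channels in the past week (excluding Shorts). Supports video platforms such as YouTube and Vimeo. Use for: (1) getting brand content updates from competitors or industry benchmarks, (2) tracking video releases across multiple channels, (3)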
Give your AI agent eyes to see the entire internet - read Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu. 6.5K
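Search frameset.app for video reference clips and find collection pages and original video links. Use for: (1) searching for ad/film clip references by keyword, (2) getting the original video's YouTube/Vimeo link, (3) downloading videos locally.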
AI diary service - push diary entries, query diaries, get AI analysis and cover images via HTTP API.
Read, analyze, convert, trim, merge, adjust volume, and transcribe audio files in multiple formats including MP3, WAV, F
Read, analyze metadata, convert formats, resize, rotate, crop, compress, and batch process PNG, JPG, GIF, WebP, TIFF, BM
Automatically generate social media posts from articles. Supports Twitter, LinkedIn, and more. Perfect for content repur