Automatically extracts audio from video, transcribes it using qwen3-asr-flash, and generates segmented text summaries saved alongside the original file.
查看全部媒体技能