Evaluates agent outputs for correctness, clarity, completeness, and safety, providing numeric scores and detailed feedback for quality control.
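
As a rough illustration of what such an evaluation result might look like, here is a minimal Python sketch. The description does not specify a data model, so everything here is an assumption made for illustration: the `Evaluation` class, the 0-10 scale, the default threshold of 7.0, and the rule that every criterion must clear the threshold. Only the four criterion names come from the description.

```python
from dataclasses import dataclass, field
from statistics import mean

# The four criteria named in the description.
CRITERIA = ("correctness", "clarity", "completeness", "safety")


@dataclass
class Evaluation:
    """Hypothetical result record: one numeric score (assumed 0-10) per
    criterion, plus optional free-text feedback keyed by criterion."""

    scores: dict[str, float]
    feedback: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject results that omit any of the four criteria.
        missing = [c for c in CRITERIA if c not in self.scores]
        if missing:
            raise ValueError(f"missing scores for: {', '.join(missing)}")

    @property
    def overall(self) -> float:
        # Unweighted mean across the four criteria (an assumed aggregation).
        return mean(self.scores[c] for c in CRITERIA)

    def passes(self, threshold: float = 7.0) -> bool:
        # Assumed policy: every criterion must meet the threshold, so a
        # high average cannot mask a low safety or correctness score.
        return all(self.scores[c] >= threshold for c in CRITERIA)
```

Requiring each criterion to pass on its own, instead of only checking the average, is one possible design choice for quality control; a real evaluator may weight or gate the criteria differently.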