LLM-as-a-Judge evaluation system using Langfuse. Score AI outputs on relevance, accuracy, hallucination, and helpfulness. Backfill scoring on historical trac...
by clawhubcommunity | Source: clawhub
Quality: medium | Safety: community | Category: AI & ML | Updated: 2026-03-05