AgentBench
JD

Quick Benchmark

Accuracy: Test output correctness against curated Q&A pairs
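
The accuracy check above could be sketched roughly as follows. This is a minimal illustration, not AgentBench's actual implementation: the names `normalize` and `accuracy`, the exact-match scoring rule, and the representation of an agent as a plain callable are all assumptions made for the example.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences are not counted as wrong answers (assumed rule)."""
    return " ".join(text.lower().split())


def accuracy(agent: Callable[[str], str],
             qa_pairs: Iterable[Tuple[str, str]]) -> float:
    """Return the fraction of curated questions the agent answers
    correctly, using normalized exact match (hypothetical scoring)."""
    pairs = list(qa_pairs)
    if not pairs:
        raise ValueError("qa_pairs must not be empty")
    correct = sum(
        normalize(agent(question)) == normalize(expected)
        for question, expected in pairs
    )
    return correct / len(pairs)


if __name__ == "__main__":
    # Toy curated set and a toy agent that always gives the same answer.
    curated = [
        ("Capital of France?", "Paris"),
        ("Capital of Italy?", "Rome"),
    ]
    print(accuracy(lambda question: "paris", curated))
```

Exact match is the strictest choice; a real benchmark may prefer a more tolerant comparison (for example, substring or model-graded matching) when answers are free-form.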