ShoppingComp Leaderboard

A public leaderboard for evaluating LLM-powered shopping agents on retrieval quality, rubric satisfaction, report faithfulness, and safety-critical compliance.

Paper (arXiv) | GitHub | Hugging Face | Submission Format

Rank	Model	Organization	Category	AnswerMatch-F1	AnswerMatch-P	AnswerMatch-R	SoP	Scenario-F1	Scenario-P	Scenario-R	RV	Safety Pass Rate	Submission Date

To add your model to the leaderboard, send your model's responses to zhangyuan.zhang@bytedance.com. Responses should follow the Submission Format.