Awesome Open Source
Search results for benchmark chatgpt
10 search results found
Baichuan2
⭐
3,527
A series of large language models developed by Baichuan Intelligent Technology
Opencompass
⭐
2,758
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Baichuan 13b
⭐
2,579
A 13B large language model developed by Baichuan Intelligent Technology
Promptbench
⭐
1,655
A unified evaluation framework for large language models
Evalplus
⭐
605
EvalPlus for rigorous evaluation of LLM-synthesized code
Awesome Llm Eval
⭐
183
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for evaluation of LLMs.
Uhgeval
⭐
140
Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Vlmevalkit
⭐
137
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks
Lawbench
⭐
69
Benchmarking Legal Knowledge of Large Language Models
Rusentrel Leaderboard
⭐
7
The official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper arXiv:1808.08932
Copyright 2018-2024 Awesome Open Source. All rights reserved.