Awesome Open Source

Search results for benchmark chatgpt

10 search results found

Baichuan2 ⭐ 3,527

A series of large language models developed by Baichuan Intelligent Technology

Opencompass ⭐ 2,758

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Baichuan 13b ⭐ 2,579

A 13B large language model developed by Baichuan Intelligent Technology

Promptbench ⭐ 1,655

A unified evaluation framework for large language models

Evalplus ⭐ 605

EvalPlus for rigorous evaluation of LLM-synthesized code

Awesome Llm Eval ⭐ 183

Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for the evaluation of LLMs.

Uhgeval ⭐ 140

Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation

Vlmevalkit ⭐ 137

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 30+ HF models, and 15+ benchmarks

Lawbench ⭐ 69

Benchmarking Legal Knowledge of Large Language Models

Rusentrel Leaderboard ⭐ 7

The official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper arXiv:1808.08932

Related Searches

Python Benchmark (1,941)

Openai Chatgpt (1,648)

Python Chatgpt (1,618)

C Plus Plus Benchmark (1,219)

Artificial Intelligence Chatgpt (1,213)

Javascript Benchmark (1,165)

Golang Benchmark (1,080)

Benchmark Benchmarking (1,073)

Java Benchmark (993)

C Benchmark (902)
