Awesome Open Source
Search results for python evaluation framework
39 search results found
Lm Evaluation Harness
⭐
3,768
A framework for few-shot evaluation of language models.
Deepeval
⭐
1,070
The Evaluation Framework for LLMs
Recsys2019_deeplearning_evaluation
⭐
871
This is the repository of our article published in RecSys 2019 "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and of several follow-up studies.
Pydgn
⭐
211
A research library for automating experiments on Deep Graph Networks
Zeno
⭐
202
AI Data Management & Evaluation Platform
Tonic_validate
⭐
128
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
Pysodevaltoolkit
⭐
112
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Crowdflow
⭐
92
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
Continuous Eval
⭐
78
Evaluation for LLM / RAG pipelines, ready for CI/CD
Sordi Ai Evaluation Gui
⭐
68
This repository allows you to evaluate a trained computer vision model and get general information and evaluation metrics with little configuration.
Rankeval
⭐
65
Official repository of RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions.
Birl
⭐
63
BIRL: Benchmark on Image Registration methods with Landmark validations
Dialogentailment
⭐
62
The implementation of the paper "Evaluating Coherence in Dialogue Systems using Entailment"
Vectory
⭐
56
Vectory provides a collection of tools to track and compare embedding versions.
Athina Evals
⭐
45
Python SDK for running evaluations on LLM generated responses
Od Test
⭐
44
OD-test: A Less Biased Evaluation of Out-of-Distribution (Outlier) Detectors (PyTorch)
Evalify
⭐
41
Evaluate your biometric verification models literally in seconds.
Kolena
⭐
37
Python client for Kolena's machine learning testing platform
Codefuse Evaluation
⭐
37
Industrial-level evaluation benchmarks for coding LLMs across the full life cycle of AI-native software development. An enterprise-grade evaluation system for code LLMs, continuously being opened up.
Pactus
⭐
37
Framework to evaluate Trajectory Classification Algorithms
Irspack
⭐
27
Train, evaluate, and optimize implicit feedback-based recommender systems.
Repsys
⭐
26
Framework for Interactive Evaluation of Recommender Systems
Gval
⭐
18
A high-level Python framework to evaluate the skill of geospatial datasets by comparing candidate maps to benchmark maps, producing agreement maps and metrics.
Corl
⭐
15
The Core Reinforcement Learning library is intended to enable scalable deep reinforcement learning experimentation in a manner extensible to new simulations and new ways for the learning agents to interact with them. The hope is that this makes RL research easier by removing lock-in to particular simulations. The work is released under the following APRS approval. Initial release of CoRL - Part #1 - Approved on 2022-05-2024 12:08:51 - PA Approval # [AFRL-2022-2455]. Documentation https://act3-ace.g
Fast_prototype
⭐
15
This is a machine learning framework that enables developers to iterate fast over different ML architecture designs.
Quica
⭐
15
quica is a tool to run inter-coder agreement pipelines in an easy and effective way. Multiple measures are run and the results are collected in a single table that can be easily exported to LaTeX.
Tieval
⭐
14
An Evaluation Framework for Temporal Information Extraction Systems
Elevant
⭐
14
Entity linking evaluation and analysis tool
Moonshot
⭐
14
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
Galileo
⭐
10
🪐 A framework for distributed load testing experiments
Lapixdl
⭐
10
Python package with Deep Learning utilities for Computer Vision
Yeast In Microstructures Dataset
⭐
8
Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 2023].
Ggme
⭐
8
Official repository for the ICLR 2022 paper "Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions" https://openreview.net/forum?id=tBtoZYKd9n
Gan Evaluator
⭐
8
A pip-installable evaluator for GANs (IS and FID). Accepts either dataloaders or individual batches. Supports on-the-fly evaluation during training. A working DCGAN SVHN demo script provided.
Gval
⭐
7
A Python framework to evaluate geospatial datasets by comparing candidate and benchmark maps to compute agreement maps and statistics.
Orbis_eval
⭐
7
An Extendable Evaluation Pipeline for Named Entity Drill-Down Analysis
Evalytics
⭐
6
HR tool to orchestrate the Performance Review Cycle of the employees of a company.
Redeval
⭐
6
Auditing with LLM evals for LLM applications.
Xlingeval
⭐
5
Code and Resources for the paper, "Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries"
Evaluation Framework
⭐
5
A framework for evaluating and comparing graph embedding techniques
Copyright 2018-2024 Awesome Open Source. All rights reserved.