Awesome Open Source

Search results for python evaluation framework

39 search results found

Lm Evaluation Harness ⭐ 3,768

A framework for few-shot evaluation of language models.

Deepeval ⭐ 1,070

The Evaluation Framework for LLMs

Recsys2019_deeplearning_evaluation ⭐ 871

This is the repository of our article published in RecSys 2019 "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and of several follow-up studies.

A research library for automating experiments on Deep Graph Networks

AI Data Management & Evaluation Platform

Tonic_validate ⭐ 128

Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.

Pysodevaltoolkit ⭐ 112

PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection

Crowdflow ⭐ 92

Optical Flow Dataset and Benchmark for Visual Crowd Analysis

Continuous Eval ⭐ 78

Evaluation for LLM / RAG pipelines, ready for CI/CD

Sordi Ai Evaluation Gui ⭐ 68

This repository allows you to evaluate a trained computer vision model and get general information and evaluation metrics with little configuration.

Rankeval ⭐ 65

Official repository of RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions.

BIRL: Benchmark on Image Registration methods with Landmark validations

Dialogentailment ⭐ 62

The implementation of the paper "Evaluating Coherence in Dialogue Systems using Entailment"

Vectory provides a collection of tools to track and compare embedding versions.

Athina Evals ⭐ 45

Python SDK for running evaluations on LLM generated responses

OD-test: A Less Biased Evaluation of Out-of-Distribution (Outlier) Detectors (PyTorch)

Evaluate your biometric verification models literally in seconds.

Python client for Kolena's machine learning testing platform

Codefuse Evaluation ⭐ 37

Industrial-level evaluation benchmarks for Coding LLMs in the full life-cycle of AI-native software development. An enterprise-grade evaluation system for code LLMs, continuously being opened up.

Framework to evaluate Trajectory Classification Algorithms

Train, evaluate, and optimize implicit feedback-based recommender systems.

Framework for Interactive Evaluation of Recommender Systems

A high-level Python framework to evaluate the skill of geospatial datasets by comparing candidates to benchmark maps producing agreement maps and metrics.

The Core Reinforcement Learning library is intended to enable scalable deep reinforcement learning experimentation in a manner extensible to new simulations and new ways for the learning agents to interact with them. The hope is that this makes RL research easier by removing lock-in to particular simulations. The work is released under the following APRS approval. Initial release of CoRL - Part #1 - Approved on 2022-05-2024 12:08:51 - PA Approval # [AFRL-2022-2455]" Documentation https://act3-ace.g

Fast_prototype ⭐ 15

This is a machine learning framework that enables developers to iterate fast over different ML architecture designs.

quica is a tool to run inter-coder agreement pipelines in an easy and effective way. Multiple measures are run, and the results are collected in a single table that can be easily exported to LaTeX.

An Evaluation Framework for Temporal Information Extraction Systems

Entity linking evaluation and analysis tool

Moonshot ⭐ 14

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

🪐 A framework for distributed load testing experiments

Python package with Deep Learning utilities for Computer Vision

Yeast In Microstructures Dataset ⭐ 8

Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 2023].

Official repository for the ICLR 2022 paper "Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions" https://openreview.net/forum?id=tBtoZYKd9n

Gan Evaluator ⭐ 8

A pip-installable evaluator for GANs (IS and FID). Accepts either dataloaders or individual batches. Supports on-the-fly evaluation during training. A working DCGAN SVHN demo script provided.

A Python framework to evaluate geospatial datasets by comparing candidate and benchmark maps to compute agreement maps and statistics.

Orbis_eval ⭐ 7

An Extendable Evaluation Pipeline for Named Entity Drill-Down Analysis

Evalytics ⭐ 6

HR tool to orchestrate the Performance Review Cycle of the employees of a company.

Auditing with LLM evals for LLM applications.

Xlingeval ⭐ 5

Code and Resources for the paper, "Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries"

Evaluation Framework ⭐ 5

A framework for evaluating and comparing graph embedding techniques.
