Awesome Open Source

Search results for llm rlhf

27 search results found

Llama Factory ⭐ 10,715

Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)

Llmsurvey ⭐ 7,255

The official GitHub page for the survey paper "A Survey of Large Language Models".

Chinese Llama Alpaca 2 ⭐ 5,810

Phase two of the Chinese LLaMA-2 & Alpaca-2 LLM project, plus 64K long-context models

Internlm ⭐ 4,412

Official release of the InternLM2 7B and 20B base and chat models, with 200K context support

Argilla ⭐ 3,097

Argilla is a collaboration platform for AI engineers and domain experts who require high-quality outputs, full data ownership, and overall efficiency.

Alignment Handbook ⭐ 3,024

Robust recipes to align language models with human and AI preferences

Webglm ⭐ 1,198

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

Safe Rlhf ⭐ 1,040

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Openrlhf ⭐ 704

A Ray-based, high-performance RLHF framework (supports 70B+ full tuning, LoRA, and Mixtral)

pykoi: Active learning in one unified interface

Llm Rlhf Tuning ⭐ 225

LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)

Step_into_llm ⭐ 211

MindSpore online courses: Step into LLM

Awesome Llm Human Preference Datasets ⭐ 116

A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.

Open Chatgpt ⭐ 66

An open source implementation of ChatGPT, Alpaca, Vicuna, and an RLHF pipeline. Implement a ChatGPT from scratch.

Opening Up Chatgpt.github.io ⭐ 52

Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.

Alpaca Rlhf ⭐ 42

Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback), based on DeepSpeed Chat

Llm_rlhf ⭐ 24

Reinforcement learning training for GPT-2, LLaMA, BLOOM, and other LLMs

Chatglm Lora Rlhf Pytorch ⭐ 21

A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.

Vicuna Lora Rlhf Pytorch ⭐ 17

A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.

Awesome Llm ⭐ 15

Curated list of open source and openly accessible large language models

Prompt Oirl ⭐ 14

Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"

Lm Research Hub ⭐ 14

Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)

Zero Shot Reward Models ⭐ 14

ZYN: Zero-Shot Reward Models with Yes-No Questions

Beavertails ⭐ 12

BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).

Alpaca Lora Rlhf Pytorch ⭐ 10

A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.

Create Your Own Chatgpt ⭐ 7

Create your own ChatGPT with Python

Awesome Rlaif ⭐ 5

A curated and updated list of relevant articles and repositories on Reinforcement Learning from AI Feedback (RLAIF)

Related Searches

Python Llm (1,377)

Openai Llm (569)

Artificial Intelligence Llm (445)

Llm Large Language Models (359)

Typescript Llm (258)

Llm Gpt 4 (211)

Llm Generative Ai (129)

Llm Llama2 (124)

Llm Language Model (82)

Copyright 2018-2024 Awesome Open Source. All rights reserved.