Awesome Open Source
Search results for llama rlhf
17 search results found
Llama Factory
⭐
10,715
Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)
Chinese Llama Alpaca 2
⭐
5,810
Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long context models
Safe Rlhf
⭐
1,040
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Alignllmhumansurvey
⭐
368
Aligning Large Language Models with Human: A Survey
Llm Rlhf Tuning
⭐
225
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)
Step_into_llm
⭐
211
MindSpore online courses: Step into LLM
Cornucopia Llama Fin Chinese
⭐
178
Cornucopia (聚宝盆): LLaMA models fine-tuned on Chinese financial knowledge; covers SFT, RLHF, GPU training and deployment, and more
Open Chatgpt
⭐
66
An open source implementation of ChatGPT, Alpaca, Vicuna, and an RLHF pipeline. Implement a ChatGPT from scratch.
Alpaca Rlhf
⭐
42
Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat
Llama Trl
⭐
38
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Okapi
⭐
36
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Chatglm Lora Rlhf Pytorch
⭐
21
A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implements RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Essentially ChatGPT, but with ChatGLM.
Vicuna Lora Rlhf Pytorch
⭐
17
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implements RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Essentially ChatGPT, but with Vicuna.
Awesome Llm
⭐
15
Curated list of open source and openly accessible large language models
Beavertails
⭐
12
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Jax Models
⭐
10
Explore implementations of deep learning concepts such as Transformers, Attention, Llama, GPT, InstructGPT, RLHF, Gaussian Processes, Bayesian Inference, Newton-Raphson, Distributed Trainers, and more!
Alpaca Lora Rlhf Pytorch
⭐
10
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implements RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Essentially ChatGPT, but with Alpaca.
Related Searches
Llama Llama2 (99)
Large Language Models Llama (85)
Natural Language Processing Llama (56)
Python Llama (38)
Llm Llama (37)
Python Rlhf (34)
Llm Rlhf (28)
Chatgpt Llama (26)
Large Language Models Rlhf (18)
Alpaca Llama (16)
Copyright 2018-2024 Awesome Open Source. All rights reserved.