Awesome Open Source
Search results for llm safety
8 search results found
Safe Rlhf (⭐ 1,040)
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Safety Prompts (⭐ 404)
Chinese safety prompts for evaluating and improving the safety of LLMs.

Lm Ssp (⭐ 85)
A reading list for large model safety, security, and privacy.

Deepinception (⭐ 35)
Source code for the paper "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"

Figstep (⭐ 32)
Jailbreaking Large Vision-Language Models via Typographic Visual Prompts

Multilingual Safety For Llms (⭐ 20)
[ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models"

Vllm Safety Benchmark (⭐ 15)
Official PyTorch implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"

Beavertails (⭐ 12)
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Related Searches
Python Llm (1,377)
Openai Llm (569)
Chatgpt Llm (533)
Artificial Intelligence Llm (445)
Natural Language Processing Llm (285)
Llm Llama (269)
Llm Gpt (260)
Typescript Llm (258)
Machine Learning Llm (214)
Llm Large Language Models (213)
Copyright 2018-2024 Awesome Open Source. All rights reserved.