Awesome Open Source
Search results for flash attention
9 search results found
Qwen ⭐ 11,085
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
Chinese Llama Alpaca 2 ⭐ 5,810
Phase-2 project for the Chinese LLaMA-2 & Alpaca-2 large language models, plus 64K ultra-long-context models.
Internlm ⭐ 4,412
Official release of the InternLM2 7B and 20B base and chat models, with 200K context support.
Awesome Llm Inference ⭐ 715
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.
Flash_attention_inference ⭐ 72
Performance of the C++ interfaces of FlashAttention, FlashAttention-2, and self-quantized decoding attention in large language model (LLM) inference scenarios.
Gdgpt ⭐ 58
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with the DeepSpeed pipeline mode; faster than ZeRO/ZeRO++/FSDP.
Fastckpt ⭐ 18
A Python package for rematerialization-aware gradient checkpointing.
Flashperceiver ⭐ 7
A fast and memory-efficient PyTorch implementation of the Perceiver with FlashAttention.
Fastcode ⭐ 7
Utilities for efficient fine-tuning, inference, and evaluation of code generation models.
Related Searches
Python Flash Attention (4)
Natural Language Processing Flash Attention (4)
Llm Flash Attention (4)