Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Get A Weekly Email With Trending Projects For These Categories
No Spam. Unsubscribe easily at any time.
Jupyter Notebook