LMCache

Trust score: 100/100

About

Open-source project (9.2K stars) that speeds up inference for self-hosted large models and reduces GPU memory use. It has joined the PyTorch Foundation and is integrated by NVIDIA Dynamo.
