Moozonian

💻 Developer Nexus: Throughput

GitHub

vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs.

⭐ 71294 | 🍴 13728

microsoft/garnet

Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication features. Garnet can work with existing Redis clients.

⭐ 11753 | 🍴 642

FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

⭐ 9382 | 🍴 590

MystenLabs/sui

Sui is a next-generation smart contract platform with high throughput, low latency, and an asset-oriented programming model powered by the Move programming language.

⭐ 7618 | 🍴 11713