Search results for: "gpu servers Vector"
GitHub Repo
https://github.com/mosszxc/grimoire
mosszxc/grimoire
Multi-project Knowledge Base — Graph RAG with GPU transcription, vector search, MCP server
GitHub Repo
https://github.com/angelsu/sdr-memory
angelsu/sdr-memory
Long-term memory for LLM agents using Sparse Distributed Representations — no vector DB, no GPU, no external server.
GitHub Repo
https://github.com/latenceainew/colsearch
latenceainew/colsearch
High-performance late-interaction retrieval engine for on-prem AI. ColBERT/ColPali multi-vector search with Rust fused MaxSim, Triton GPU kernels, ROQ quantization, LEMUR routing, WAL-backed CRUD, and a FastAPI server — single machine, CPU or GPU.
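The MaxSim late-interaction scoring that colsearch builds on can be sketched in a few lines: each query token embedding is matched against its best-matching document token embedding, and the per-token maxima are summed. This is a toy illustration with hand-written 2-d vectors, not the repo's fused Rust/Triton implementation.

```python
# Minimal sketch of ColBERT-style late-interaction (MaxSim) scoring.
# Toy hand-written token embeddings stand in for real model output.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_embs, doc_embs):
    """Sum over query tokens of the max similarity to any doc token."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query-token embeddings
doc_a = [[0.9, 0.1], [0.2, 0.8]]   # aligns well with both query tokens
doc_b = [[0.5, 0.5], [0.4, 0.4]]   # weaker alignment

print(maxsim_score(query, doc_a) > maxsim_score(query, doc_b))  # True
```

Because the document-side maxima are independent per query token, this score parallelizes naturally, which is what GPU kernels like the Triton ones mentioned above exploit.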
GitHub Repo
https://github.com/theesfeld/VGP
theesfeld/VGP
GPU-accelerated vector display server and desktop environment for Linux
GitHub Repo
https://github.com/DigitLib/aip
DigitLib/aip
Interactive AI Infrastructure Builder. GPU + Vector Database = Workstation/Server
GitHub Repo
https://github.com/dextera-labs/ctxengine
dextera-labs/ctxengine
GPU-accelerated vector retrieval server with FAISS and Go
GitHub Repo
https://github.com/orneryd/NornicDB
orneryd/NornicDB
NornicDB is a distributed, low-latency Graph+Vector database with temporal MVCC, offering sub-ms HNSW search, graph traversal, and writes. It speaks Neo4j Bolt/Cypher and Qdrant's gRPC protocol, so you can switch with no client changes, and adds features like schemas, managed embeddings, LLM reranking and inference, GPU acceleration, Auto-TLP, memory decay, and an MCP server.
GitHub Repo
https://github.com/dhaya/ctxengine
dhaya/ctxengine
GPU-accelerated vector retrieval server with FAISS and Go
GitHub Repo
https://github.com/buckyinsfo/homelab-ai-stack
buckyinsfo/homelab-ai-stack
Self-hosted AI + GPU server homelab — local LLM inference, vector search, monitoring, and GPU mining on Debian 12, fully reproducible from bare metal via Portainer GitOps.
GitHub Repo
https://github.com/bryankthompson/mcp-server-qdrant-enhanced