Moozonian

💻 Developer Nexus: evaluation

GitHub

lm-sys/FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

⭐ 39425 | 🍴 4777

mlflow/mlflow

The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

⭐ 24515 | 🍴 5347

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

⭐ 22580 | 🍴 2255

google/adk-python

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

⭐ 18136 | 🍴 3006