Moozonian

💻 Developer Nexus: Evaluation

GitHub

lm-sys/FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

⭐ 39414 | 🍴 4779

mlflow/mlflow

The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

⭐ 24360 | 🍴 5310

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

⭐ 22151 | 🍴 2201

google/adk-python

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

⭐ 17910 | 🍴 2955