facebookresearch/ParlAI (https://github.com/facebookresearch/ParlAI): A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
confident-ai/deepeval (https://github.com/confident-ai/deepeval): The LLM Evaluation Framework.
openai/evals (https://github.com/openai/evals): A framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
ShishirPatil/gorilla (https://github.com/ShishirPatil/gorilla): Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls).
huggingface/evaluate (https://github.com/huggingface/evaluate): 🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
bargavj/EvaluatingDPML (https://github.com/bargavj/EvaluatingDPML): Evaluates the privacy leakage of differentially private machine learning models.
faber03/AndroidMalwareEvaluatingTools (https://github.com/faber03/AndroidMalwareEvaluatingTools): Evaluation tools for Android malware.
lm-sys/FastChat (https://github.com/lm-sys/FastChat): An open platform for training, serving, and evaluating large language models; release repo for Vicuna and Chatbot Arena.
google/adk-python (https://github.com/google/adk-python)