Showing results for "Evaluating"
GitHub Repo https://github.com/facebookresearch/ParlAI

facebookresearch/ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
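
A minimal sketch of how ParlAI is typically driven: its scripts expose a `.main()` entry point, so a bundled dialogue dataset can be inspected in a couple of lines. Assumes `pip install parlai`; the `babi:task10k:1` task name is one of its built-in datasets.

```python
# Minimal sketch: print a few training examples from a bundled
# ParlAI task. Assumes `pip install parlai`; ParlAI downloads the
# dataset on first use.
from parlai.scripts.display_data import DisplayData

DisplayData.main(task="babi:task10k:1", num_examples=5)
```
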
GitHub Repo https://github.com/confident-ai/deepeval

confident-ai/deepeval

The LLM Evaluation Framework
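
A minimal sketch of deepeval's test-case flow, assuming `pip install deepeval` and an `OPENAI_API_KEY` in the environment (its built-in metrics use an LLM judge); the input/output strings below are made up for illustration.

```python
# Minimal sketch: score one hand-written test case with an
# LLM-as-judge metric. Assumes `pip install deepeval` and
# OPENAI_API_KEY set, since AnswerRelevancyMetric calls a judge model.
from deepeval import evaluate
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input="What are your shipping times?",               # hypothetical query
    actual_output="We ship within 3-5 business days.",   # hypothetical answer
)
evaluate([test_case], [AnswerRelevancyMetric(threshold=0.7)])
```
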
GitHub Repo https://github.com/Arize-ai/phoenix

Arize-ai/phoenix

AI Observability & Evaluation
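
A minimal sketch of launching Phoenix locally, assuming `pip install arize-phoenix`; traces from an instrumented LLM app then appear in the browser UI at the printed URL.

```python
# Minimal sketch: start the local Phoenix UI. Assumes
# `pip install arize-phoenix`; instrumented LLM calls can then be
# traced and inspected at the printed URL.
import phoenix as px

session = px.launch_app()
print(session.url)
```
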
GitHub Repo https://github.com/openai/evals

openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
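
Evals is usually driven through its `oaieval` command line rather than a Python API; a hedged sketch of invoking the small `test-match` sample eval from Python, assuming `pip install evals` and an `OPENAI_API_KEY` in the environment.

```python
# Minimal sketch: run one of the registry's sample evals via the
# oaieval CLI. Assumes `pip install evals` and OPENAI_API_KEY set;
# "test-match" is a small example eval shipped in the registry.
import subprocess

subprocess.run(["oaieval", "gpt-3.5-turbo", "test-match"], check=True)
```
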
GitHub Repo https://github.com/ShishirPatil/gorilla

ShishirPatil/gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
GitHub Repo https://github.com/huggingface/evaluate

huggingface/evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
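
A minimal sketch of the 🤗 Evaluate API, assuming `pip install evaluate`: metrics are loaded by name and scored with `compute`.

```python
# Minimal sketch: load a named metric and score predictions against
# references. Assumes `pip install evaluate` (plus scikit-learn,
# which the accuracy metric uses under the hood).
import evaluate

accuracy = evaluate.load("accuracy")
print(accuracy.compute(references=[0, 1, 1, 0], predictions=[0, 1, 0, 0]))
# -> {'accuracy': 0.75}
```
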
GitHub Repo https://github.com/bargavj/EvaluatingDPML

bargavj/EvaluatingDPML

This project's goal is to evaluate the privacy leakage of differentially private machine learning models.
GitHub Repo https://github.com/faber03/AndroidMalwareEvaluatingTools

faber03/AndroidMalwareEvaluatingTools

Evaluation tools for Android malware
GitHub Repo https://github.com/lm-sys/FastChat

lm-sys/FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
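
FastChat is mostly driven from the command line; a hedged sketch of starting an interactive chat with a Vicuna checkpoint from Python, assuming `pip install fschat` and hardware with enough memory for the 7B model.

```python
# Minimal sketch: launch FastChat's interactive CLI against a Vicuna
# checkpoint from the Hugging Face Hub. Assumes `pip install fschat`
# and enough GPU/CPU memory to hold the model.
import subprocess

subprocess.run(
    ["python3", "-m", "fastchat.serve.cli",
     "--model-path", "lmsys/vicuna-7b-v1.5"],
    check=True,
)
```
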
GitHub Repo https://github.com/google/adk-python

google/adk-python

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
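
A minimal sketch of declaring an agent with ADK, assuming `pip install google-adk` and Gemini API credentials configured; the name and instruction here are illustrative, and the agent would typically be run with ADK's `adk run` or `adk web` tooling.

```python
# Minimal sketch: declare a simple ADK agent. Assumes
# `pip install google-adk` and Gemini API credentials configured;
# the name/instruction are made up, and `adk run` (or `adk web`)
# would normally drive this agent.
from google.adk.agents import Agent

root_agent = Agent(
    name="helper_agent",                              # hypothetical name
    model="gemini-2.0-flash",
    instruction="Answer user questions concisely.",   # hypothetical instruction
)
```
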