🔭 DeepEval - Open-Source Evals with Tracing

What is DeepEval?

DeepEval is an open-source evaluation framework for LLMs (GitHub).

What is Confident AI?

Confident AI (the DeepEval platform) offers an Observatory for teams to trace and monitor LLM applications. Think Datadog for LLM apps. The Observatory allows you to:

Detect and debug issues in your LLM applications in real-time
Search and analyze historical generation data with powerful filters
Collect human feedback on model responses
Run evaluations to measure and improve performance
Track costs and latency to optimize resource usage

Quickstart

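import os

import litellm

# Credentials: OpenAI for the completion call, Confident AI for trace export.
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"
os.environ["CONFIDENT_API_KEY"] = "<your-confident-api-key>"

# Route both successful and failed calls to the deepeval callback.
litellm.success_callback = ["deepeval"]
litellm.failure_callback = ["deepeval"]

try:
    response = litellm.completion(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": "What's the weather like in San Francisco?"}
        ],
    )
except Exception as e:
    print(e)
else:
    # Print only on success; `response` is not defined if the call raised.
    print(response)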

Info: You can obtain your CONFIDENT_API_KEY by logging in to the Confident AI platform.
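
Because the Quickstart ships with placeholder values for both keys, it can help to confirm they have been replaced before enabling the callback. The following is a minimal sketch using only the standard library; `REQUIRED_KEYS` and `missing_keys` are hypothetical names introduced here for illustration and are not part of litellm or deepeval.

```python
import os

# The two environment variables the Quickstart sets (names taken from the Quickstart).
REQUIRED_KEYS = ("OPENAI_API_KEY", "CONFIDENT_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required names that are unset, empty, or still template placeholders.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        # Treat values like '<your-confident-api-key>' as not configured.
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing
```

One way to use it is to call `missing_keys()` before setting `litellm.success_callback`, and stop with a clear message if the returned list is not empty.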
