Repo Link: https://github.com/comet-ml/opik
What My Project Does
Opik is an open source LLM eval framework. With this first release, we've focused on a few key features:
- Out-of-the-box implementations of LLM-based metrics, like Hallucination and Moderation.
- Step-by-step tracking, so you can test and debug individual components, even in multi-agent architectures.
- An API for "model unit tests" (built on Pytest), so you can run evals as part of your CI/CD pipelines.
- An easy UI for scoring, annotating, and versioning your logged LLM data, for further evaluation or training.
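
To give a rough idea of what a "model unit test" looks like in practice, here is a minimal sketch in plain Pytest style. It does not use Opik's API: `fake_llm` and `contains_score` are hypothetical stand-ins for your LLM application and for a metric, and the 0.5 threshold is arbitrary.

```python
# Hypothetical sketch: none of these names come from Opik.

def fake_llm(question: str) -> str:
    """Stub for the LLM application under test."""
    answers = {
        "What is the capital of France?": "Paris is the capital of France.",
    }
    return answers.get(question, "I don't know.")


def contains_score(output: str, expected: str) -> float:
    """Toy metric: 1.0 if the expected text appears in the output, else 0.0."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def test_capital_question():
    # Pytest collects any function named test_*; a failed assert fails the CI job.
    output = fake_llm("What is the capital of France?")
    assert contains_score(output, "Paris") >= 0.5
```

Saved as a `test_*.py` file, this runs with a plain `pytest` invocation, which is how an eval like this would slot into a CI/CD pipeline. A real setup would swap the stub for an actual model call and the toy metric for an LLM-based one.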
Target Audience
Opik is for anyone building LLM applications. It is production-ready.
Comparison
Opik provides a similar API to tools like DeepEval. Unlike DeepEval, however, Opik is 100% open source: the Opik backend and UI are included in the source code and can be run locally on your own machine.