#245 — Update LLM evaluation pipeline
Repo: Twill-AI/twill-llm-engine | State: closed | Status: done | Assignee: meliascosta
Created: 2025-01-29 · Updated: 2025-03-13
Description
AC:
- Testing datasets have been migrated to the new LangSmith account
- New metrics have been added to the twill-llm-engine in-repo tests to monitor SQL performance
- Datasets have been checked and updated so that they work with the current version of the LLM and are still relevant
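
As an illustration of the kind of SQL metric the second criterion refers to, here is a minimal sketch of an execution-match check: it runs the generated query and a reference query against the same database and compares the returned rows, ignoring row order. This is an assumption about what such a metric could look like, not the metric actually added to twill-llm-engine; the names `execution_match` and `build_demo_db` and the `payments` table are hypothetical, and SQLite stands in for whatever database the real tests use.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, reference_sql):
    """Return 1.0 if both queries yield the same multiset of rows, else 0.0.

    Hypothetical metric for illustration only; row order is ignored.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Generated SQL that fails to run is scored as a miss.
        return 0.0
    reference_rows = conn.execute(reference_sql).fetchall()
    return 1.0 if Counter(predicted_rows) == Counter(reference_rows) else 0.0


def build_demo_db():
    """Create a small in-memory database (hypothetical schema) for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (id INTEGER, amount REAL, status TEXT)")
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?, ?)",
        [(1, 10.0, "paid"), (2, 25.5, "failed"), (3, 7.25, "paid")],
    )
    return conn
```

A score like this could be averaged over a dataset's examples and asserted against a threshold in a test, so that a drop in SQL quality fails the build; the threshold and the dataset wiring are left out here because the ticket does not specify them.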