When your AI assistant calculates revenue, bonuses, VAT or financial summaries, it isn’t doing math. It’s telling a convincing story about numbers.
Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models ...
Identifying vulnerabilities is good for public safety, industry, and the scientists making these models.