Patronus AI is an artificial intelligence (AI) evaluation and security company developing an automated evaluation platform for large language models (LLMs). The platform allows development teams to score LLM performance, generate adversarial test cases, benchmark LLMs, and more. It is used to detect LLM mistakes at scale and improve the safety of AI product deployments, and it can evaluate any LLM, whether proprietary or open source, first or third party. Models are scored on real-world scenarios. The platform uses adversarial test cases to probe models for potential issues and benchmarks models against each other to help customers find the best model for their use case.
Patronus AI also offers EnterprisePII and developed FinanceBench. EnterprisePII is an LLM dataset for detecting business-sensitive information. It enables developers to test whether an LLM detects confidential information in business documents. FinanceBench is a benchmark for testing how LLMs perform on financial questions. It was developed by AI researchers at Patronus AI together with fifteen financial industry domain experts, and it consists of a large-scale set of 10,000 question-and-answer pairs based on publicly available financial documents such as SEC 10-K, 10-Q, and 8-K filings, earnings reports, and earnings call transcripts. It is intended to be a first line of evaluation for LLMs on financial questions, with more advanced tests to be released in the future.
Founded in 2023 by Anand Kannappan (CEO) and Rebecca Qian (CTO), the company came out of stealth on September 14, 2023, making its platform available and announcing $3 million in seed funding. The round was led by Lightspeed Venture Partners, with participation from Factorial Capital, Replit CEO Amjad Masad, Gokul Rajaram, and a number of other angel investors, including Fortune 500 executives and board members. Kannappan previously worked on explainable machine learning frameworks at Meta Reality Labs, and Qian led natural language processing (NLP) research at Meta AI.
The company has stated that it will initially concentrate on highly regulated industries, where the consequences of LLM mistakes are greater. Patronus AI has partnered with AI organizations including Cohere, Nomic AI, and Naologic. In January 2024, Patronus AI and MongoDB announced a partnership to bring automated LLM evaluation and testing to enterprise customers. The partnership consists of a joint offering that combines Patronus AI’s capabilities with MongoDB’s Atlas Vector Search product.