Oliver
LAST UPDATED: 10/14/2025
| | |
| --- | --- |
| Vendor | August |
| Founded | 2024 |
| VLAIR Evaluation | 5 tasks evaluated |
Oliver is August’s legal AI assistant built entirely on agentic workflows that coordinate a network of specialized AI agents. The platform leverages Meta’s Llama models, OpenAI’s o1, and Anthropic’s Claude 3.5 Sonnet to deliver autonomous problem-solving with multi-step reasoning. As the youngest company in the VLAIR study, Oliver was the only AI tool to participate in EDGAR Research, demonstrating capability for complex, open-ended investigation tasks.
Performance Summary
Oliver was evaluated across 5 tasks in the VLAIR benchmark, with scores ranging from 55.2% to 74.0%.
Oliver demonstrated competitive performance across evaluated tasks, surpassing the lawyer baseline in Document Q&A (74.0%) and Document Summarization (62.4%), while being the only AI tool to participate in the challenging EDGAR Research task (55.2%). The platform’s agentic workflow architecture enables multi-step reasoning with response times of over five minutes per query, significantly longer than other tools but still far faster than human lawyers.
Key Strengths
Based on VLAIR evaluation results, Oliver demonstrates several notable capabilities:
- Agentic workflow innovation: Unique architecture coordinating specialized AI agents enables sophisticated multi-step reasoning and autonomous decision-making for complex tasks requiring iterative investigation.
- EDGAR Research capability: Only AI tool to participate in EDGAR Research, achieving 55.2% accuracy on open-ended market research questions that demand navigation of large document spaces without specific filing type guidance.
- Reasoning transparency: Platform explains its reasoning and actions as it works, making outputs particularly trustworthy and educational for legal professionals who value understanding the AI’s methodology.
- Comprehensive responses: Most verbose output style among evaluated products, providing extensive explanations, detailed context, and structured guidance particularly beneficial for complex analysis and training purposes.
- Document Q&A performance: Achieved 74.0% accuracy, surpassing the lawyer baseline by 3.9pp, with one question achieving 100% accuracy through comprehensive capture of all required elements.
- Multi-model integration: Leverages diverse LLMs, including Meta’s Llama, OpenAI’s o1, and Anthropic’s Claude 3.5 Sonnet, to bring varied capabilities to different problem types.
- Emerging technology leadership: As the youngest company evaluated, demonstrated performance on par with or surpassing more established offerings while pioneering agentic approaches at the frontier of legal AI development.
Based on the Vals Legal AI Report (VLAIR), February 2025, the first independent benchmarking study of legal AI tools using real legal work from Am Law 100 law firms.