Oliver
LAST UPDATED: 10/14/2025
| | |
| --- | --- |
| Vendor | August |
| Founded | 2024 |
| VLAIR Evaluation | 5 tasks evaluated |
Oliver is August’s legal AI assistant built entirely on agentic workflows that coordinate a network of specialized AI agents. The platform leverages Meta’s Llama models, OpenAI’s o1, and Anthropic’s Claude 3.5 Sonnet to deliver autonomous problem-solving with multi-step reasoning. As the youngest company in the VLAIR study, Oliver was the only AI tool to participate in EDGAR Research, demonstrating capability for complex, open-ended investigation tasks.
Performance Summary
Oliver was evaluated across 5 tasks in the VLAIR benchmark, with scores ranging from 55.2% to 74.0%.
Oliver demonstrated competitive performance across evaluated tasks, surpassing the lawyer baseline in Document Q&A (74.0%) and Document Summarization (62.4%), while being the only AI tool to participate in the challenging EDGAR Research task (55.2%). The platform’s agentic workflow architecture enables multi-step reasoning with response times of over five minutes per query, significantly longer than other tools but still far faster than human lawyers.
Key Strengths
Based on VLAIR evaluation results, Oliver demonstrates several notable capabilities:
- Agentic workflow innovation: Unique architecture coordinating specialized AI agents enables sophisticated multi-step reasoning and autonomous decision-making for complex tasks requiring iterative investigation.
- EDGAR Research capability: Only AI tool to participate in EDGAR Research, achieving 55.2% accuracy on open-ended market research questions that demand navigation of large document spaces without specific filing type guidance.
- Reasoning transparency: Platform explains its reasoning and actions as it works, making outputs particularly trustworthy and educational for legal professionals who value understanding the AI’s methodology.
- Comprehensive responses: Most verbose output style among evaluated products, providing extensive explanations, detailed context, and structured guidance particularly beneficial for complex analysis and training purposes.
- Document Q&A performance: Achieved 74.0% accuracy, surpassing the lawyer baseline by 3.9pp, with one question achieving 100% accuracy through comprehensive capture of all required elements.
- Multi-model integration: Leverages diverse LLMs, including Meta’s Llama, OpenAI’s o1, and Anthropic’s Claude 3.5 Sonnet, to bring varied capabilities to different problem types.
- Emerging technology leadership: As the youngest company evaluated, demonstrated performance on par with or surpassing more established offerings while pioneering agentic approaches at the frontier of legal AI development.
Based on the Vals Legal AI Report (VLAIR), February 2025, the first independent benchmarking study of legal AI tools using real legal work from Am Law 100 law firms.