Could Our Data Beat Their Models?

The problem

Emburse is a B2B2C SaaS platform for corporate spend optimization. The most painful workflow for end users was filling out expense reports — receipt parsing didn't always work, and the inefficiency was costing the business real money. The conversation initially landed as a feature decision (which OCR vendor?). I pushed back: it was an architecture decision that would define the company's product experience for years.

My role

Led a structured build-vs-buy evaluation — well before AI became mainstream. Reported directly to the CPO and presented recommendations to the President for executive sign-off on the multi-year financial and legal commitments. Worked with our internal ML engineering team to first test whether Emburse's own years of receipt data could train models to match or beat specialized vendors. Then ran the same accuracy, latency, API completeness, and extensibility tests against Google Document AI. When the data made the choice clear, I negotiated directly with Google and Emburse legal teams on the partnership architecture — including contract structure, SLAs, escalation paths, data isolation terms, and custom training rights — to keep a multi-year ML dependency operationally honest.

What we built

I led the evaluation across the dimensions that would define a multi-year ML dependency.

Multilingual accuracy benchmarking: tested document parsing performance across languages, receipt formats, and low-quality scans against the same benchmark dataset we'd applied to an in-house ML approach.
Structured metadata: defined the data we'd need back from the API — merchant, tax, category, confidence scores, line-item detail — to power downstream reimbursement, fraud detection, and accounting workflows. Evaluated API performance: latency, reliability, timeout rates, and processing consistency to support high-volume expense workflows.
Data isolation: negotiated terms ensuring Emburse customer data would not train Google's generalized models — protecting the enterprise privacy posture our customers expected.
Confidence-based human review: designed handoff patterns where ambiguous outputs routed to human review, balancing automation with accuracy at the edges.

Outcome

$22M Strategic outcome — Google Document AI selected as foundation of Emburse's ML stack

$560K+ Immediate new revenue unlocked by capabilities built on the decision

12M Users — manual receipt workflow converted to near-zero effort

4+ Downstream capabilities unlocked: auto-submission, fraud detection, duplicate matching, PaaS

The decision freed internal engineering to invest on the application layer above the ML substrate — none of the downstream automation would have been possible while building OCR from scratch.

Detailed partnership architecture, evaluation framework, and decision artifacts available on request.

What it unlocked

A shared ML platform capability

The partnership created the foundation for a centralized service that processed expense receipts across Emburse business units — handling scale and orchestration before routing documents through Google's ML APIs.