Case Study · Emburse · 2019–2021

Could our data beat their models?

Could our own data outperform a specialized vendor's models? A data-science-led decision.

$22M Strategic outcome
$560K+ Immediate new revenue
12M Users on the ML stack

Emburse is a B2B2C SaaS platform for corporate spend optimization. The most painful workflow for end users was filling out expense reports — receipt parsing didn't always work, and the inefficiency was costing the business real money. The conversation initially landed as a feature decision (which OCR vendor?). I pushed back: it was an architecture decision that would define the company's product experience for years.

Led a structured build-vs-buy evaluation — well before AI became mainstream. Worked with our internal ML engineering team to first test whether Emburse's own years of receipt data could train models to match or beat specialized vendors. Then ran the same accuracy, latency, API completeness, and extensibility tests against Google Document AI. When the data made the choice clear, I negotiated the partnership architecture itself with legal and engineering — SLAs, penalty structures, escalation paths, and custom training rights — to keep a multi-year ML dependency operationally honest.

I led the evaluation across the dimensions that would define a multi-year ML dependency.

  • Multilingual accuracy benchmarking: tested document parsing performance across languages, receipt formats, and low-quality scans against the same benchmark dataset we'd applied to an in-house ML approach.
  • Structured metadata: defined the data we'd need back from the API — merchant, tax, category, confidence scores, line-item detail — to power downstream reimbursement, fraud detection, and accounting workflows. Evaluated API performance: latency, reliability, timeout rates, and processing consistency to support high-volume expense workflows.
  • Data isolation: negotiated terms ensuring Emburse customer data would not train Google's generalized models — protecting the enterprise privacy posture our customers expected.
  • Confidence-based human review: designed handoff patterns where ambiguous outputs routed to human review, balancing automation with accuracy at the edges.
$22M Strategic outcome — Google Document AI selected as foundation of Emburse's ML stack
$560K+ Immediate new revenue unlocked by capabilities built on the decision
12M Users — manual receipt workflow converted to near-zero effort
4+ Downstream capabilities unlocked: auto-submission, fraud detection, duplicate matching, PaaS

The decision freed internal engineering to invest on the application layer above the ML substrate — none of the downstream automation would have been possible while building OCR from scratch.

Detailed partnership architecture, evaluation framework, and decision artifacts available on request.