Opendoor

Reducing incident resolution time and on-call burden with 48% faster MTTR

How Opendoor reclaimed ~2,000 engineering hours annually by automating incident triage with Deeptrace.


About Opendoor

Opendoor is a technology-driven real estate marketplace that enables customers to buy and sell homes online with speed and certainty. Operating at large scale, the company supports tens of thousands of transactions and manages billions in real estate assets per year. Its platform spans pricing, inventory management, transaction workflows, integrations, and customer-facing applications that together power the end-to-end home buying and selling experience.

The challenge

As Opendoor's platform and engineering organization scaled, incident response became an increasingly complex challenge. Engineers spent significant time establishing basic context — correlating alerts with recent fingerprinted issues, identifying impacted services, and deciding where to begin investigating. In many cases, the first 10–40 minutes of an incident were spent on manual triage rather than resolution.

  • Alert noise from duplicate pages, cascading failures, and pattern-based alerts made it difficult to distinguish root causes from symptoms
  • Frequent escalations to senior staff engineers increased on-call fatigue
  • Fragmented runbooks and historical incident data were spread across tools, requiring effort to locate and cross-reference
  • Tribal knowledge dependencies reduced team readiness, especially for recurring issues

Together, these factors led to longer MTTR, higher stress during incidents, and growing operational overhead as the platform continued to scale.

The solution

Opendoor partnered with Deeptrace to introduce AI-powered incident triage directly into existing on-call and observability workflows. Deeptrace integrated with Opendoor's stack — including Datadog, Sentry, GitHub, and Slack — to automatically augment how engineers investigate and resolve incidents.

Automated signal correlation

When an alert is triggered, Deeptrace automatically correlates signals across logs, metrics, traces, and alerts to identify likely root causes. It surfaces relevant historical incidents, Datadog context, and service ownership, and posts investigation results and suggested actions directly in Slack.
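To make the idea concrete, here is a minimal sketch of one common correlation approach: grouping alerts that fire close together in time and dropping duplicate pages by fingerprint. It is an illustration only, not Deeptrace's actual method. The `Alert` fields, the `correlate` and `first_signal` helpers, and the five-minute window are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str       # hypothetical service name
    fingerprint: str   # hypothetical stable hash of the alert's pattern
    timestamp: float   # seconds since an arbitrary reference point


def correlate(alerts, window_seconds=300.0):
    """Group alerts into bursts and drop duplicate fingerprints per burst.

    A new group starts whenever the gap since the previous alert
    exceeds window_seconds (an assumed, tunable threshold).
    """
    groups = []
    last_seen = None
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if last_seen is None or alert.timestamp - last_seen > window_seconds:
            groups.append([])
        last_seen = alert.timestamp
        # Keep only the first alert for each fingerprint within a group.
        if all(a.fingerprint != alert.fingerprint for a in groups[-1]):
            groups[-1].append(alert)
    return groups


def first_signal(group):
    """Return the earliest alert in a group as a starting point for triage."""
    return group[0]


alerts = [
    Alert("pricing", "fp-timeout", 0.0),
    Alert("pricing", "fp-timeout", 30.0),    # duplicate page
    Alert("checkout", "fp-5xx", 60.0),       # possible downstream symptom
    Alert("inventory", "fp-queue", 1000.0),  # unrelated, much later
]
groups = correlate(alerts)
print([[a.service for a in g] for g in groups])
print(first_signal(groups[0]).service)
```

The earliest alert in a group is only a heuristic starting point. A production system would also weigh service dependencies, deploy history, and past incidents before suggesting a root cause.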

Structured rollout

Opendoor and Deeptrace began with a structured rollout focused on a subset of critical services. The team first evaluated Deeptrace in read-only mode: observing its conclusions, checking them against real incidents, building trust in its recommendations, and iterating on accuracy and context. This allowed Opendoor's engineering team to expand coverage progressively and build confidence without disrupting existing processes.
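One simple way to score a read-only evaluation like this is to compare each suggested root cause with the cause engineers later confirmed. The sketch below is hypothetical: the `agreement_rate` helper, the record format, and the sample cause labels are assumptions for illustration, not data or tooling from Opendoor or Deeptrace.

```python
def agreement_rate(records):
    """Fraction of incidents where the suggested cause matched the confirmed one.

    records is a list of (suggested_cause, confirmed_cause) pairs
    collected while the tool runs in read-only mode.
    """
    if not records:
        return 0.0
    matches = sum(1 for suggested, confirmed in records if suggested == confirmed)
    return matches / len(records)


# Illustrative, made-up records for four incidents.
sample = [
    ("bad-deploy", "bad-deploy"),
    ("db-timeout", "db-timeout"),
    ("cache-miss", "bad-deploy"),
    ("queue-backlog", "queue-backlog"),
]
rate = agreement_rate(sample)
print(f"{rate:.0%}")
```

Tracking this rate per service over time gives a concrete signal for when coverage can be expanded.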

Results

Metric                          Impact
MTTR reduction                  48% faster across covered incidents
Engineering hours reclaimed     ~2,000 hours annually
Investigations run              50,000+ and scaling
Investigation time improvement  60% faster by automating early triage and surfacing root causes

Running reliable software at Opendoor's scale is mission-critical to our operations. Deeptrace has become an integral part of our debugging workflow, helping us keep the platform stable as we grow.

Jonah Back

VP of Engineering, Opendoor

Ready to stop firefighting?

Book a demo and see how Deeptrace can cut your MTTR in half.