About Opendoor
Opendoor is a technology-driven real estate marketplace that enables customers to buy and sell homes online with speed and certainty. Operating at large scale, the company supports tens of thousands of transactions and manages billions in real estate assets per year. Its platform spans pricing, inventory management, transaction workflows, integrations, and customer-facing applications that together power the end-to-end home buying and selling experience.
The challenge
As Opendoor's platform and engineering organization scaled, incident response grew increasingly complex. Engineers spent significant time establishing basic context: correlating alerts with recent fingerprinted issues, identifying impacted services, and deciding where to begin investigating. In many cases, the first 10–40 minutes of an incident went to manual triage rather than resolution. Several factors compounded the problem:
- Alert noise from duplicate pages, cascading failures, and pattern-based alerts made it difficult to distinguish root causes from symptoms
- Frequent escalations to senior staff engineers increased on-call fatigue
- Runbooks and historical incident data were fragmented across tools, requiring effort to locate and cross-reference
- Dependence on tribal knowledge reduced team readiness, especially for recurring issues
Together, these factors led to longer MTTR, higher stress during incidents, and growing operational overhead as the platform continued to scale.
The solution
Opendoor partnered with Deeptrace to introduce AI-powered incident triage directly into existing on-call and observability workflows. Deeptrace integrated with Opendoor's stack, including Datadog, Sentry, GitHub, and Slack, to augment how engineers investigate and resolve incidents.
Automated signal correlation
When an alert is triggered, Deeptrace automatically correlates signals across logs, metrics, traces, and alerts to identify likely root causes. It surfaces relevant historical incidents, Datadog context, and service ownership, and delivers investigation results and actions directly in Slack.
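To make the idea of signal correlation concrete, here is a minimal sketch of one simple approach: collecting signals that occurred shortly before an alert and ranking services by how many of those signals they produced. This is an illustration only and does not describe Deeptrace's actual method; the `Signal` type, the `correlate` function, the 10-minute window, and the service names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Signal:
    """Hypothetical normalized record for a log, metric, trace, or alert."""
    source: str          # "log", "metric", "trace", or "alert"
    service: str         # service that emitted the signal
    timestamp: datetime
    summary: str


def correlate(alert, signals, window=timedelta(minutes=10)):
    """Group signals seen within `window` before the alert by service.

    Returns (service, signals) pairs, ordered by signal count (descending),
    with ties broken by the earliest signal. The window length is an
    assumed default, not a documented setting.
    """
    start = alert.timestamp - window
    grouped = defaultdict(list)
    for s in signals:
        if start <= s.timestamp <= alert.timestamp:
            grouped[s.service].append(s)
    return sorted(
        grouped.items(),
        key=lambda kv: (-len(kv[1]), min(x.timestamp for x in kv[1])),
    )


# Illustrative data only; services and messages are made up.
t0 = datetime(2024, 1, 1, 12, 0)
alert = Signal("alert", "checkout", t0, "error rate above threshold")
signals = [
    Signal("log", "pricing", t0 - timedelta(minutes=3), "timeout calling database"),
    Signal("metric", "pricing", t0 - timedelta(minutes=2), "p99 latency spike"),
    Signal("trace", "checkout", t0 - timedelta(minutes=1), "slow span: pricing call"),
    Signal("log", "inventory", t0 - timedelta(hours=2), "routine cache refresh"),
]

ranked = correlate(alert, signals)
for service, related in ranked:
    print(service, [s.source for s in related])
```

In this toy example the alert fires on one service, but most of the nearby signals come from an upstream dependency, which is the kind of symptom-versus-cause distinction that time-window grouping can help surface. A production system would likely weigh far more context, such as deploys, ownership, and past incidents.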
Structured rollout
Opendoor and Deeptrace began with a structured rollout focused on a subset of critical services. The team first evaluated Deeptrace in a read-only mode: observing its conclusions, checking how it correlated incidents, building trust in its recommendations, and iterating on accuracy and context. This allowed Opendoor's engineering team to expand coverage progressively and build confidence without disrupting existing processes.
Results
| Metric | Impact |
|---|---|
| MTTR reduction | 48% faster across covered incidents |
| Engineering hours reclaimed | ~2,000 hours annually |
| Investigations run | 50,000+ and scaling |
| Investigation time improvement | 60% faster by automating early triage and surfacing root causes |

“Running reliable software at Opendoor's scale is mission-critical to our operations. Deeptrace has become an integral part of our debugging workflow, helping us keep the platform stable as we grow.”
Jonah Back
VP of Engineering, Opendoor
