# Findings from DX's 2025 report: AI won't save you from your engineering culture

The DX _AI-assisted engineering: Q4 (2025) impact report_ offers one of the most substantial empirical views yet of how AI coding assistants are affecting software development, and largely corroborates the key findings from the 2025 DORA State of AI-assisted Software Development Report: quality outcomes vary dramatically based on existing engineering practices, and both the biggest limitation and the biggest benefit come from adopting modern software engineering best practices, which remain rare even in 2025. AI accelerates whatever culture you already have.
## Who are DX and why the report matters
DX is probably the leading and most respected developer intelligence platform. They sell productivity measurement tools to engineering organisations including Dropbox, Block, Pinterest, and BNY Mellon. They combine telemetry from development tools with periodic developer surveys to help engineering leaders track and improve productivity.
This creates potential bias: DX's business depends on organisations believing productivity can be measured. But it also means they have access to data most researchers don't.
### **Data collection**
The report examines data collected between July and October 2025. Drawing on data from 135,000 developers across 435 companies, the data set is substantially larger than most productivity research, and the methodology is transparent. It combines:
* **System telemetry** from AI coding assistants (GitHub Copilot, Cursor, Claude Code), Git (PR throughput, time to 10th PR), and quality metrics (Change Failure Rate).
* **Self-reported surveys** asking about time savings, AI-authored code percentage, maintainability perception, and enablement quality.
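To make the shape of the data concrete, here is a minimal, purely illustrative sketch of how the two sources might be joined per developer. The field names and example values are my own assumptions for illustration, not DX's actual schema.

```python
# Purely illustrative: joining system telemetry with survey responses per
# developer. Field names are assumptions, not DX's actual schema.
telemetry = [
    {"developer_id": "d1", "tool": "GitHub Copilot", "prs_per_week": 2.3,
     "days_to_10th_pr": 49, "change_failure_rate": 0.04},
]
survey = [
    {"developer_id": "d1", "hours_saved_per_week": 4.1,
     "pct_code_ai_authored": 30, "enablement_rating": 4},
]

def join_per_developer(telemetry, survey):
    """Merge the two sources on developer_id so each row carries both
    measured (telemetry) and self-reported (survey) fields."""
    by_id = {row["developer_id"]: row for row in survey}
    return [{**t, **by_id[t["developer_id"]]}
            for t in telemetry if t["developer_id"] in by_id]

print(join_per_developer(telemetry, survey))
```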
## Key Findings
### Quality impact varies dramatically
The report tracks Change Failure Rate (CFR), the percentage of changes causing production issues. Results split sharply: some organisations see CFR improvements, others see degradation. The report calls this "varied," but I'd argue it's the most important signal in the entire dataset.
What differentiates organisations seeing improvement from those seeing degradation? The report doesn't fully unpack this.
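For concreteness, CFR itself is a simple ratio. Here is a minimal sketch of how it might be computed from a change log; the data shape is an assumption, not the report's schema.

```python
# Minimal sketch: Change Failure Rate as the share of changes that caused a
# production issue. The input shape is illustrative, not the report's schema.
def change_failure_rate(changes):
    """changes: e.g. [{"id": "pr-123", "caused_incident": True}, ...]"""
    if not changes:
        return 0.0
    failed = sum(1 for c in changes if c["caused_incident"])
    return failed / len(changes)

# Example: 3 failures out of 40 deployed changes -> 0.075 (7.5% CFR)
sample = [{"id": f"pr-{i}", "caused_incident": i < 3} for i in range(40)]
print(change_failure_rate(sample))
```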
### Existing bottlenecks dwarf AI time savings
This should be the headline: **meetings, interruptions, review delays, and CI wait times cost developers more time than AI saves.** Meeting-heavy days are reported as the single biggest obstacle to productivity, followed by interruption frequency (context switching). Individual task-level gains from AI are being swamped by organisational dysfunction. This corroborates the 2025 DORA State of AI-assisted Software Development Report's finding that systemic constraints limit AI impact.
You can save 4 hours writing code faster, but if you lose 6 hours to slow builds, context switching, and poorly-run meetings, the net effect is negative.
### Modest time savings claimed, but seem to have hit a wall
Developers _report_ saving 3.6 hours per week on average, with daily users reporting 4.1 hours. But this is self-reported, not measured (see limitations).
More interesting: **time savings have plateaued around 4 hours even as adoption climbed from ~50% to 91%**. The report initially presents this as a puzzle, but the data actually explains it. The biggest finding, buried on page 20, is, as above, that **non-AI bottlenecks dwarf AI gains**.
### Throughput gains measured, but problematic
Daily AI users merge 60% more PRs per week than non-users (2.3 vs 1.4). That's a measurable difference in activity. Whether it represents productivity is another matter entirely. (More on this in the limitations section.)
### Traditional enterprises show higher adoption
Non-tech companies in regulated industries show higher adoption rates than big tech. The report attributes this to deliberate, structured rollouts with strong governance.
There's likely a more pragmatic explanation: traditional enterprises are aggressively rolling out AI tools in hopes of compensating for weak underlying engineering practices. The question is whether this works. If the goal is to shortcut or leapfrog organisational dysfunction without fixing the root causes, the quality degradation data suggests it won't. AI can't substitute for modern engineering practices; it can only accelerate whatever practices already exist.
### Other findings
* **Adoption is near-universal:** 91% of developers now use AI coding assistants, matching DORA's 2025 findings. The report also reveals significant "shadow AI" usage: developers using tools they pay for themselves, even when their organisation provides approved alternatives.
* **Onboarding acceleration:** Time to 10th PR dropped from 91 days to 49 days for daily AI users (see the sketch after this list for how such a metric can be derived from Git history). The report cites Microsoft research showing early output patterns predict long-term performance.
* **Junior devs use AI most, senior devs save most time**: Junior developers have the highest adoption, but Staff+ engineers report the biggest time savings (4.4 hours/week). Staff+ engineers also have the _lowest_ adoption rates. Why aren't senior engineers adopting as readily? Scepticism about quality? Lack of compelling use cases for complex architectural work?
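As referenced above, "time to 10th PR" is straightforward to derive from Git history: take a developer's start date and count the days until their tenth merged pull request. A minimal sketch under assumed data shapes (not DX's implementation):

```python
# Hedged sketch: days from a developer's start date to their nth merged PR.
# The data shapes are assumptions for illustration, not DX's implementation.
from datetime import date, timedelta

def days_to_nth_pr(start, merged_pr_dates, n=10):
    """Return days from `start` to the nth merged PR, or None if fewer than n."""
    merged = sorted(merged_pr_dates)
    if len(merged) < n:
        return None
    return (merged[n - 1] - start).days

# Example: ten PRs merged at five-day intervals after a 1 July start date.
start = date(2025, 7, 1)
merged = [start + timedelta(days=5 * i) for i in range(1, 11)]
print(days_to_nth_pr(start, merged))  # 50
```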
## Limitations and Flaws
### Pull requests as a productivity metric
The report treats "60% more PRs merged" as evidence of productivity gains. This is where I need to call out a significant problem; interestingly, DX themselves have previously written about why this is flawed.
PRs are a poor productivity metric because:
* **They measure motion, not progress.** Counting PRs shows how many code changes occurred, not whether they improved product quality, reliability, or customer value.
* **They're highly workflow-dependent.** Some teams merge once per feature, others many times daily. Comparing PR counts between teams or over time is meaningless unless workflows are identical.
* **They're easily gamed and inflated.** Developers (or AI) can create more, smaller, or trivial PRs without increasing real output. "More PRs" often just means more noise.
* **They're actively misleading in mature Continuous Delivery environments.** Teams practising trunk-based development integrate continuously with few or no PRs. Low PR counts in that model actually indicate _higher_ productivity.
### Self-reported time savings can't be trusted
The "3.6 hours saved per week" is self-reported, not measured. People overestimate time savings. As an example, the METR study _Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity_ found developers predicted a 24% speedup from AI but were actually 19% slower.
### Quality findings under-explored
The varied CFR results are the most important finding, but they're presented briefly and then the report moves on. What differentiates organisations seeing improvement from those seeing degradation? Code review practices? Testing infrastructure? Team maturity?
The enablement data hints at answers but doesnât fully investigate. This is a missed opportunity to identify the practices that make AI a quality accelerator rather than a debt accelerator.
### Missing DORA metrics
The report covers Lead Time (poorly, approximated via PR throughput) and Change Failure Rate. But it doesn't measure deployment frequency or Mean Time to Recovery.
That means we're missing the end-to-end delivery picture. We know code is written and merged faster, but we don't know if it's deployed faster or if failures are resolved more quickly. Without deployment frequency and MTTR, we can't assess full delivery-cycle productivity.
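Both missing metrics are cheap to compute once deployment and incident data are available, which makes their absence more noticeable. A hedged sketch with illustrative data shapes, not anything drawn from the report:

```python
# Hedged sketch of the two missing DORA metrics, computed from deployment and
# incident logs. The data shapes are illustrative assumptions.
from datetime import datetime

def deployment_frequency(deploy_times, period_days):
    """Average deployments per day across the observation window."""
    return len(deploy_times) / period_days if period_days else 0.0

def mean_time_to_recovery(incidents):
    """Mean hours from incident start to resolution.
    `incidents` is a list of (started, resolved) datetime pairs."""
    if not incidents:
        return 0.0
    hours = [(resolved - started).total_seconds() / 3600
             for started, resolved in incidents]
    return sum(hours) / len(hours)

# Example: 12 deploys in a 30-day window, one 3-hour incident.
deploys = [datetime(2025, 10, d) for d in range(1, 13)]
incidents = [(datetime(2025, 10, 5, 9, 0), datetime(2025, 10, 5, 12, 0))]
print(deployment_frequency(deploys, 30))   # 0.4 deploys per day
print(mean_time_to_recovery(incidents))    # 3.0 hours
```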
## Conclusion
This is one of the better empirical datasets on AI's impact, corroborating DORA 2025's key findings. But the real story isn't in the headline numbers about time saved or PRs merged. It's in two findings:
### **Non-AI bottlenecks still dominate**
Meetings, interruptions, review delays, and slow CI pipelines cost more than AI saves. Individual productivity tools can't fix organisational dysfunction.
As with DORA's findings, the biggest limitation and the biggest opportunity both come from adopting modern engineering practices: small batch sizes, trunk-based development, automated testing, fast feedback loops. AI makes their presence more valuable and their absence more costly.
### **AI is an accelerant, not a fix**
It reveals and amplifies existing engineering culture. Strong quality practices get faster. Weak practices accumulate debt faster. The variation in CFR outcomes isn't noise; it's the signal. The organisations seeing genuine gains are those already practising modern software engineering. Those practices remain rare.
My advice for engineering leaders:
1. **Tackle system-level friction first.** Four hours saved writing code doesn't matter if you lose six to meetings, context switching, and poor CI infrastructure and tooling.
2. **Adopt modern engineering practices.** The gains from adopting a continuous delivery approach dwarf what AI alone can deliver.
3. **Don't expect AI to fix broken processes.** If review is shallow, testing is weak, or deployment is slow, AI amplifies those problems.
4. **Invest in structured enablement.** The correlation between training quality and outcomes is strong.
5. **Track throughput properly alongside quality.** More PRs merged isn't a win if it doesn't actually result in shipping faster, or if your CFR goes up. Measure end-to-end cycle times, CFR, MTTR, and maintainability.
https://blog.robbowley.net/2025/11/05/findings-from-dxs-2025-report-ai-wont-save-you-from-your-engineering-culture/