Replay Time Travelogue: How Replay MCP Helped Find a React Bug Faster than Dan Abramov Did
We’ve always said that Replay time-travel recordings enable developers to solve really hard bugs by giving them the ability to inspect the app’s behavior at any point in time. This is especially true for timing bugs and race conditions, which can be impossible to debug with standard developer tools: pausing at a breakpoint changes the timing and can prevent the race condition from ever happening. Now, with Replay MCP, we’re giving agents those same time-travel investigation superpowers.
I recently tried using Replay MCP to investigate a complex React internals bug that Dan Abramov had already investigated and eventually fixed via an agent (but only after some failed attempts). Would Replay enable an agent to find the right answer? And how long would it take?
Background: The React useDeferredValue Bug
In early February, former React core team member and well-known React expert Dan Abramov filed React issue #35821: useDeferredValue gets stuck with a stale value. He reported that he had seen useDeferredValue get “stuck” in prod builds and never re-render with the updated result. He included a fairly minimal repro with two textboxes: as you type in the first textbox, the entries are sent to the server and reflected into the second textbox via useDeferredValue. This worked fine in dev builds, but in prod builds it would semi-consistently get stuck.
A month later, Dan filed React PR #36134: Fix useDeferredValue getting stuck. The actual fix was four lines of code deep in ReactFiberWorkLoop.js, ensuring React’s internal “lanes” data structure got updated properly.
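If “lanes” is unfamiliar: React tracks pending, suspended, and pinged work as bitmasks. Here is a tiny standalone model (my own simplification with made-up variable names, not React’s actual code) of why the order of those bitmask updates matters:

```javascript
// Simplified model of React's lane bitmasks (illustration only, not React source).
const DeferredLane = 0b1000; // one bit per "lane" of work

let suspendedLanes = 0; // lanes currently suspended on data
let pingedLanes = 0;    // suspended lanes whose data has since arrived

// A "ping" marks a suspended lane as ready to retry:
function ping(lanes) {
  pingedLanes |= suspendedLanes & lanes;
}

// If the ping fires BEFORE the lane is recorded as suspended,
// the bitwise AND is 0 and the ping is silently lost:
ping(DeferredLane);
console.log(pingedLanes); // 0 — ping had no effect

// The correct ordering records the suspension first:
suspendedLanes |= DeferredLane;
ping(DeferredLane);
console.log(pingedLanes); // 8 — lane is now marked ready to retry
```

The real fix operates on React’s actual lane bookkeeping in ReactFiberWorkLoop.js, but the ordering hazard has this same general shape.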
Interestingly, the PR itself was described as “Written/debugged by Claude”.
Dan’s Agent Investigation Thoughts
After filing the PR, Dan posted his thoughts on the investigation process on Bluesky:
💡 i tried to get Claude to fix the PR a month ago and it spent a ton of tokens jumping between wrong conclusions, unable to even write a failing test. hopeless.
Today Andrew Clark gave me a tip which led to a fix in just five minutes.
the tip was to tell claude to instrument react production bundle with logs. that's all that was really needed
my workflow was:
tell it to repro the bug with my real repro via plawright
then tell it to study react source code and relevant parts (which i suggested)
add logs to repro
fix
this is just to say that the most valuable thing it needs to find bugs is information over time, especially if you're dealing with some mutable system. with the right log granularity and a high level idea of what algorithm should be doing, it'll find it
i think there's something to this particular sequence of
reliably confirm the repro
study the relevant code (without looking for the bug yet)
instrument the repro with logs
only THEN, try to pinpoint the problem
red/green workflow to verify fix
will try it again in the future
Dan later clarified that the “month” was really just two sessions a month apart: a first session where he tried to have Claude build a repro and it failed, and a second session where he instructed it to add logs and got the correct solution.
I think Dan’s point about “information over time” is critical and 100% accurate.
This is exactly why Replay exists, and why we’ve built Replay MCP!
Once you have a Replay recording of a bug, you can investigate it as much and as deeply as you want. The runtime execution becomes data you can query. When did React render, and why? How many times did a given line of code execute? What was the value of x every time this line of code ran?
Given that, I wanted to compare how well a standard AI agent could investigate the same bug, given only the bug report and access to Replay recordings of the issue.
Agentic Time-Travel Debugging with Replay MCP
Let’s step back and recap what Replay MCP actually is.
Replay is a time-travel debugger for web apps. The Replay recording browser captures DVR-style recordings of an entire web app. Because we capture the entire browser’s behavior and inputs, we can replay the app’s execution exactly the way it ran during the original recording. Unlike session replay or prod monitoring tools, Replay lets you do time-travel debugging and inspect the app’s behavior at any point in time: see which lines of code executed, evaluate log statements for every time a line ran, view the DOM tree and React component tree at any point in time, and much more. This makes it possible to investigate and solve bugs in ways no other tool can.
Replay DevTools is our debugging UI for humans - browser devtools with time-travel built in. Replay MCP gives agents those same time-travel debugging capabilities. Agents can open a recording and use the MCP tools to investigate the same way a human would: looking at console messages, adding logpoints to evaluate expressions each time a line of code ran, getting screenshots and stack traces, and getting framework-specific insights into libraries like React, Redux, Zustand, and TanStack Query. This means agents can now do the investigation work for you automatically!
Investigation Process and Setup
For this experiment, I used my own personal agent setup: OpenCode 1.4 and Opus 4.6. I have some file search and context management plugins enabled, but otherwise no specific skills or custom behaviors.
In other investigations, I’ve found that the context and investigation prompts have a huge influence on the results: telling an agent how deep to go, giving it directions on scientific method steps, providing context on the available codebase.
I did an initial run to see if the agent could even get close to the correct answer. I was thrilled to see that just by analyzing the Replay recordings, the agent successfully identified the root cause and the fix in under 10 minutes!
That alone is an amazing result. As Dan described, his own agent originally struggled with the investigation, and only succeeded once he used Andrew Clark’s hint and had it rebuild React with logging added. In comparison, just having Replay recordings available to investigate was enough for an agent to solve it right away!
With that as a baseline, I set up a proper experiment: how much do prompts and investigation instructions matter? I kicked off four parallel agent investigation sessions. Each agent session was given the same access to the Replay recordings of the bug and a local copy of the demo app source, but with varying prompt instructions:
- Light details, bug repro, only told to “write a bug report with a root cause and suggested fix”
- Same repro steps, but a detailed 8-step investigation methodology including explaining why the problem is happening
- Additional summary of React’s internal scheduling system concepts
- Additional list of Replay MCP Tools and their purposes
How would they do? What differences would we see between them in results or investigation times?
Investigation Session Results
I was thrilled to see that with Replay recordings of the bug available, all 4 agent sessions successfully used Replay MCP to nail the actual root cause and produce valid suggested fixes, all in under 30 minutes!
Here’s how they tackled the investigation.
Agent #1 (Basic Instructions): 28 Minutes
Agent #1 was given the least context and instructions, so it spent the most time trying to orient itself and understand React’s internals. It also went down the most rabbit holes chasing false leads :)
After starting with the Replay MCP RecordingOverview tool and seeing that the prod recording ended with a render commit mismatch and no SuspenseResumed commit, it dug into the implementation of useDeferredValue. It got stuck on promise semantics for several minutes before concluding it was a scheduling problem.
It used the Logpoint tool in pingSuspendedRoot and confirmed there were ping issues. It took another 10 minutes of tracing through RSC promise resolution, including checking call stacks to confirm portions of the call stack were synchronous, before it found the relevant Suspense ternary condition and analyzed that.
It ultimately proposed three fixes: fixing the pingSuspendedRoot ternary (matching PR #36134), making pings async, and updating pinged lanes after renders complete.
This was the longest session, but it did a remarkably thorough job of tracing through the complexity of React’s internals and using Replay MCP Logpoints and other tools to understand what was going on.
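The synchronous-ping trigger the agent traced can be demonstrated in isolation. This sketch is my own illustration (not the RSC client's actual code): a custom thenable that invokes its callback inline when already fulfilled, unlike a native promise, which always defers to a microtask:

```javascript
// A custom thenable that, like an already-fulfilled RSC chunk,
// invokes its callback synchronously inside .then() (illustration only).
const eagerThenable = {
  status: "fulfilled",
  value: 42,
  then(resolve) {
    if (this.status === "fulfilled") resolve(this.value); // no microtask!
  },
};

const order = [];

// Native promises always defer callbacks to a microtask:
Promise.resolve(42).then(() => order.push("native then"));

// The eager thenable runs its callback before .then() even returns:
eagerThenable.then(() => order.push("eager then"));
order.push("sync code after .then calls");

queueMicrotask(() => {
  console.log(order);
  // → ["eager then", "sync code after .then calls", "native then"]
});
```

A caller that attaches a listener to the eager thenable gets re-entered on its own call stack, which is exactly the hazard the agent confirmed with Replay call stacks.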
Agent #1 prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
App: Next.js app using React 19 with server actions. Source at ~/projects/temp/react-udv-bug.
Behavior:
- A textarea input feeds text to a server action (renderAction) which returns Promise<ReactNode>
- The promise is wrapped in useDeferredValue, then consumed via use() inside <Suspense>
- In dev mode (next dev): the deferred output box tracks the input and catches up after typing stops
- In prod mode (next build && next start): the deferred output frequently gets stuck and never catches up to the final input value
Recordings:
- Dev (working): 06447062-f403-49f3-8bd8-476a65b448d2
- Prod (broken): c0b7c6fb-8f2a-461c-9fca-0ba2e9be3083
In the broken prod recording, the user typed "hello world from prod" but the deferred output is stuck showing "hello world from pro" — one character behind, permanently.
What we know:
No console errors in either recording
The server action POST requests all return 200 OK with valid data
The app code is identical between dev and prod — only the build mode differs
It only manifests in production builds
Goal: Figure out why the deferred value stops updating in production and identify the root cause.
Deliverable
Write a detailed bug report with:
Bug Description — What happens
Root Cause Analysis — What exactly causes the deferred value to get stuck in prod but not dev
Suggested Fix — What code change would fix this
Key Evidence Trail — Table of tools used and findings at each step
Agent #1 final analysis
💡 Bug Report: useDeferredValue Permanently Stalls in React 19 Production with Server Action RSC Payloads
1. Bug Description
When using useDeferredValue with a promise returned from a server action (the useDeferredValue(promise) pattern), the deferred output frequently gets permanently stuck in production builds. The deferred value stops updating and never catches up to the final input.
Repro: A textarea input calls renderAction(value) (a server action returning Promise<ReactNode>), wraps it in useDeferredValue, then consumes via use() inside <Suspense>. In dev mode, the deferred output tracks correctly. In prod, it stalls — e.g., user types "hello world from prod" but deferred output shows "hello world from pro" forever.
Affected version: React 19.3.0-canary (via Next.js 16.1.6 production build)
2. Root Cause Analysis
This is a race condition in React's concurrent renderer where a synchronous ping from an RSC lazy chunk fires during an active render, causing the deferred lane to be permanently lost.
The Bug Mechanism
The failure occurs in pingSuspendedRoot (react-dom-client.production.js L13073-13089) when three conditions converge simultaneously:
- root.suspendedLanes = 0 — The lane hasn't been marked as suspended yet because markRootSuspended runs after the render completes, but the ping fires during the render
- executionContext & 2 is truthy — The ping fires inside the render context
- workInProgressRootExitStatus = 4 (RootSuspended) — Set by renderDidSuspendDelayIfPossible in throwException, before attachPingListener is called
This creates a double failure:
Failure 1 — Lane not pinged (L13076):

```js
root.pingedLanes |= root.suspendedLanes & pingedLanes;
//                  ^^^^^^^^^^^^^^^^^^^ = 0, so pingedLanes stays 0
```

Failure 2 — No recovery path (L13078-13088):

```js
workInProgressRoot === root                            // TRUE
  && (workInProgressRootRenderLanes & pingedLanes)     // TRUE
  && (4 === workInProgressRootExitStatus               // TRUE (RootSuspended)
    ? 0 === (executionContext & 2)                     // FALSE! In render context.
        && prepareFreshStack(root, 0)                  // NOT CALLED
    : (workInProgressRootPingedLanes |= pingedLanes))  // NOT REACHED (wrong ternary branch)
```

The ternary evaluates 4 === workInProgressRootExitStatus as true, taking the true-branch where executionContext & 2 blocks prepareFreshStack. The else-branch (workInProgressRootPingedLanes |= pingedLanes) is never reached. The lane is lost with no recovery path.
How the Synchronous Ping Happens
The key trigger is the RSC (React Server Components) client's ReactPromise.prototype.then implementation (react-server-dom-turbopack-client.browser.production.js:539-563):

```js
// L542-549
if (this.status === "resolved_model") initializeModelChunk(this);
if (this.status === "fulfilled") {
  if (typeof resolve === "function") resolve(this.value); // SYNCHRONOUS!
}
```

When attachPingListener calls .then(pingSuspendedRoot) on an RSC chunk that is already fulfilled (or transitions to fulfilled during initializeModelChunk), the ping callback fires synchronously, while still inside throwAndUnwindWorkLoop → throwException, which is inside renderRootConcurrent with executionContext = 2.
Detailed Timeline (from prod recording c0b7c6fb)

| Time | Event | Key State |
| --- | --- | --- |
| 5147ms | User types "d" (final keystroke) | Sync render, lanes=2 |
| 5150ms | Commit #37 (last ever) | markRootFinished clears suspendedLanes=0, pingedLanes=0 |
| 5150ms | 1st deferred render starts | use(promise) → pending → SuspenseException |
| 5151ms | Render yields after suspension | markRootSuspended sets suspendedLanes=524800 |
| 5158ms | Pings 1 & 2 (fetch response) | executionContext=0, suspendedLanes=524800 → pings work correctly |
| 5159ms | 2nd deferred render starts | prepareFreshStack, promise now fulfilled, render proceeds |
| 5163ms | RSC lazy chunk encountered | Pending → SuspenseException → yield |
| 5169ms | Re-entry: chunk not resolved | throwAndUnwindWorkLoop → throwException → renderDidSuspendDelayIfPossible sets exitStatus=4 |
| 5169ms | attachPingListener → .then() | RSC chunk .then() fires synchronously (chunk already fulfilled) |
| 5169ms | pingSuspendedRoot (THE BUG) | executionContext=2, suspendedLanes=0, pingedLanes=0 → lane 524288 LOST |
| 5191ms | Render completes | markRootSuspended sets suspendedLanes=524288 but pingedLanes=0 — permanently stuck |

Why Dev Mode Works
In development builds, the RSC client likely doesn't trigger synchronous resolution in .then(), or React's dev-mode Suspense handling processes thenables differently. The ping always fires asynchronously (when executionContext = 0), so root.suspendedLanes is properly populated by the time the ping runs.
Evidence: Dev recording shows 125 render commits (every keystroke gets both a SetState and SuspenseResumed). Prod shows only 38 commits, with the last being a SetState with no follow-up SuspenseResumed.
3. Suggested Fix
There are two complementary fixes, either of which would prevent this class of bug:
Fix A: Handle synchronous pings during render in pingSuspendedRoot
The ternary at L13078-13088 has a logic gap. When exitStatus === 4 and executionContext & 2, the code should still update workInProgressRootPingedLanes so the render can retry the pinged lanes when it completes:

```js
// In pingSuspendedRoot, after L13076:
if (workInProgressRoot === root) {
  if ((workInProgressRootRenderLanes & pingedLanes) === pingedLanes) {
    if (4 === workInProgressRootExitStatus || ...) {
      if (0 === (executionContext & 2)) {
        prepareFreshStack(root, 0);
      } else {
        // NEW: Even if we can't restart now, mark lanes as pinged
        // so the scheduler picks them up after render completes
        workInProgressRootPingedLanes |= pingedLanes;
      }
    } else {
      workInProgressRootPingedLanes |= pingedLanes;
    }
  }
}
```

Fix B: Prevent synchronous ping resolution from RSC chunks
In attachPingListener or in the RSC client's ReactPromise.prototype.then, ensure the ping callback is always invoked asynchronously (e.g., via microtask), so it never fires inside the render context:

```js
// In attachPingListener, wrap the ping:
wakeable.then(
  () => queueMicrotask(() => pingSuspendedRoot(root, pingedLanes)),
  () => queueMicrotask(() => pingSuspendedRoot(root, pingedLanes))
);
```

Fix C: Also update root.pingedLanes from workInProgressRootPingedLanes after render
In performWorkOnRoot, after renderRootConcurrent returns with exitStatus = 4 (RootSuspended), propagate any in-render pinged lanes to the root before calling markRootSuspended:

```js
// Before markRootSuspended in the exitStatus=4 path:
root.pingedLanes |= workInProgressRootPingedLanes;
```

Recommended approach: Fix A is the most targeted — it addresses the exact ternary logic gap without changing ping timing semantics. Fix B is a defense-in-depth measure. Fix C ensures the information isn't lost even if other paths hit the same issue.
4. Key Evidence Trail
| Step | Tool | Finding |
| --- | --- | --- |
| 1 | RecordingOverview (both recordings) | Dev: 125 commits, Prod: 38 commits. Same React 19.3.0-canary. |
| 2 | ReactRenders commits (prod) | Last commit #37 is SetState (2 fibers), no SuspenseResumed follows. Recording continues 5+ seconds. |
| 3 | NetworkRequest detail (prod) | All 21 POSTs return 200 OK. Final POST body "hello world from prod" returns correct data at 5158ms. |
| 4 | ReactRenders trigger-detail | Commit #37 triggered by keypress "d". Previous commit #36 was SuspenseResumed from network. |
| 5 | SearchSources / ReadSource | Found updateDeferredValueImpl (L5407-5424): sync renders defer, deferred renders accept. |
| 6 | Logpoint on updateDeferredValueImpl | Last 8 hits show hit #6 (5150ms) and #8 (5159ms) accepting deferred value — but no render commits follow. |
| 7 | DescribePoint at hit #8 (Point:162) | Promise is fulfilled with correct "hello world from prod" ReactNode. |
| 8 | SearchSources for pingSuspendedRoot | Found L13073-13089: lane pinged via root.pingedLanes \|= root.suspendedLanes & pingedLanes. |
| 9 | Logpoint on pingSuspendedRoot L13076 | 3 pings after final keystroke. Pings 1-2 at 5158ms work (suspendedLanes=524800). Ping 3 at 5169ms fails (suspendedLanes=0, executionContext=2). |
| 10 | Logpoint on markRootSuspended L12149 | 4 hits. Hit 4 at 5191ms: suspendedLanes goes 0→524288 with pingedLanes=0. Permanent stuck state. |
| 11 | ReadSource + DescribePoint on throwException L6205-6218 | renderDidSuspendDelayIfPossible() called BEFORE attachPingListener() — sets exitStatus=4 before ping can fire. |
| 12 | GetStack at Point:182 (broken ping) | Call stack proves synchronous invocation: pingSuspendedRoot ← RSC .then() ← attachPingListener ← throwException ← throwAndUnwindWorkLoop ← renderRootConcurrent. |
| 13 | ReadSource RSC client L539-563 | ReactPromise.prototype.then: if status is "fulfilled", calls resolve(this.value) synchronously. |
| 14 | ReadSource performWorkOnRoot L11847-11851 | exitStatus=0 (yielded): no markRootSuspended called, so root.suspendedLanes stays 0 during re-entry render. |
| 15 | Evaluate at Point:182 | Confirmed: root.suspendedLanes=0, root.pingedLanes=0, executionContext=2, workInProgressRootExitStatus=4. All conditions for the double failure met. |

Summary
This is a React 19 scheduler bug where a synchronous ping from an RSC lazy chunk fires during an active concurrent render, hitting a logic gap in pingSuspendedRoot's ternary condition. The deferred lane is neither pinged on the root nor recorded in workInProgressRootPingedLanes, causing it to be permanently suspended with no recovery mechanism. The bug is prod-only because the RSC client in production resolves chunks synchronously in .then(), while the dev bundle resolves them asynchronously.
Agent #2 (Investigation Methodology): 17 Minutes
Agent #2 also started with RecordingOverview and quickly identified the commit mismatch. It then used Replay MCP’s React render trigger details to trace the causation chain from keystrokes to render commits.
It made its way to pingSuspendedRoot and used the Logpoint tool to check the hits and values inside. It got briefly sidetracked on scheduling behavior before coming back to the rendering logic.
Along the way, it identified a second potential bug that none of the other runs found: an isThenableResolved(thenable) call that checks whether a promise is resolved, where RSC promises use a different status value ("resolved_model" instead of "fulfilled"). It suggested a tweak to this logic to allow resuming synchronously instead of re-throwing.
In the end it also suggested the common “re-ping lanes” fix that the other agents and the PR arrived at, but the isThenableResolved observation was a fascinating insight, derived solely from inspecting the runtime behavior in this one recording.
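To make that insight concrete, here is a hedged sketch in plain JavaScript (my simplification; the real check lives in React's production bundle) of how a fulfilled/rejected-only status check misclassifies an RSC chunk whose data has already arrived, plus the kind of tweak Agent #2 proposed:

```javascript
// Simplified version of a "is this thenable done?" check that only
// recognizes standard promise states (illustration, not React source).
function isThenableResolved(thenable) {
  const status = thenable.status;
  return status === "fulfilled" || status === "rejected";
}

// An RSC chunk whose bytes have arrived but aren't deserialized yet
// reports an intermediate status:
const rscChunk = { status: "resolved_model" };

console.log(isThenableResolved(rscChunk)); // false — treated as still pending

// The suggested tweak: treat intermediate resolved states as done too.
function isThenableResolvedPatched(thenable) {
  const status = thenable.status;
  return (
    status === "fulfilled" ||
    status === "rejected" ||
    status === "resolved_model" ||
    status === "resolved_module"
  );
}

console.log(isThenableResolvedPatched(rscChunk)); // true
```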
Agent #2 prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
App: Next.js app using React 19 with server actions. Source at ~/projects/temp/react-udv-bug.
Behavior:
- A textarea input feeds text to a server action (renderAction) which returns Promise<ReactNode>
- The promise is wrapped in useDeferredValue, then consumed via use() inside <Suspense>
- In dev mode (next dev): the deferred output box tracks the input and catches up after typing stops
- In prod mode (next build && next start): the deferred output frequently gets stuck and never catches up to the final input value
Recordings:
- Dev (working): 06447062-f403-49f3-8bd8-476a65b448d2
- Prod (broken): c0b7c6fb-8f2a-461c-9fca-0ba2e9be3083
In the broken prod recording, the user typed "hello world from prod" but the deferred output is stuck showing "hello world from pro" — one character behind, permanently.
What we know:
No console errors in either recording
The server action POST requests all return 200 OK with valid data
The app code is identical between dev and prod — only the build mode differs
It only manifests in production builds
Goal: Figure out why the deferred value stops updating in production and identify the root cause.
Investigation Methodology
Follow this structured process:
Step 1: Establish the baseline
Get an overview of both recordings. Compare high-level stats: render commit counts, network request counts, timing. The quantitative difference between dev and prod is your first clue.
Step 2: Confirm the visual bug
Take screenshots at the end of both recordings to verify what each shows. Confirm the prod recording is stuck one character behind.
Step 3: Compare render commit patterns
Look at the sequence of React render commits in both recordings. Focus on the tail end — the last few commits. In a working recording, you'd expect the render pattern to complete cleanly. In the broken one, something is missing or different. What's the last commit in each? What triggered it? What components re-rendered?
Step 4: Map the data flow end-to-end
Trace the full lifecycle: keystroke → server action POST → response → promise resolution → deferred value update → component re-render. Verify each step completes in both recordings. If all steps complete but the render doesn't happen, the bug is in React's scheduling, not the data flow.
Step 5: Find the divergence point
The app code is identical. The data flow completes in both. So what's different at the React runtime level? Read the relevant React source code in the recorded bundles. Focus on understanding the algorithm before looking for bugs — understand what SHOULD happen first, then look for where reality diverges.
Step 6: Trace state through the critical path
Once you've identified the relevant scheduling/reconciler functions, trace mutable state values through the critical time window. Focus on the last update that fails in prod — what state does React see? Compare the same state in the dev recording for the equivalent update. The divergence in state values IS the root cause.
Step 7: Explain why dev vs prod
Dev and prod React builds differ: dev is slower (extra validation, warnings, double-renders in StrictMode), prod is faster (optimized, no dev checks). This timing difference can change when asynchronous callbacks (microtasks, promise .then()) fire relative to React's render cycle. If a callback fires at a different point in the render lifecycle between dev and prod, that's likely the mechanism.
Step 8: Red-team your conclusion
Before finalizing: Does your explanation account for WHY it's prod-only? Does your fix address the root cause or just a symptom? Could there be other scenarios that trigger the same bug?
Deliverable
Write a detailed bug report with:
Bug Description — What happens
Root Cause Analysis — What exactly causes the deferred value to get stuck in prod but not dev
Suggested Fix — What code change would fix this
Key Evidence Trail — Table of tools used and findings at each step
Agent #2 final analysis
💡 Bug Report: useDeferredValue Gets Permanently Stuck in Production with RSC Server Actions
1. Bug Description
When using useDeferredValue with a promise returned by a React Server Action inside <Suspense>, the deferred output permanently stops updating in production builds. The last update is lost — the output gets stuck one (or more) characters behind the input and never catches up, even though the server action completes successfully and returns correct data.
Reproduction: Type rapidly into a textarea that triggers a server action on each keystroke. The server action returns Promise<ReactNode>, which is passed through useDeferredValue and consumed via use() inside <Suspense>. In dev mode, output catches up after typing stops. In prod mode, output freezes permanently behind.
2. Root Cause Analysis
This is a race condition between React's Suspense ping mechanism and RSC chunk resolution, exposed only in production due to faster React execution.
The Two Interacting Bugs
Bug A: isThenableResolved doesn't recognize RSC intermediate states
React's isThenableResolved function (react-dom-client.production.js:3394) only checks for status === "fulfilled" or status === "rejected". But RSC flight client chunks use an intermediate "resolved_model" status, meaning HTTP data has arrived but hasn't been fully deserialized yet. When React's renderRootConcurrent resumes a suspended render (case 7: SuspendedOnData) and checks isThenableResolved(thrownValue), it returns false for "resolved_model" chunks and falls through to throwAndUnwindWorkLoop — treating already-available data as still-pending.
Bug B: Synchronous ping fires before lane is marked suspended, creating a dead lane
After throwAndUnwindWorkLoop, React calls attachPingListener which attaches .then(pingSuspendedRoot) to the RSC chunk. But RSC's ReactPromise.prototype.then (line 539) detects the "resolved_model" status, synchronously calls initializeModelChunk() (transitioning to "fulfilled"), and synchronously invokes pingSuspendedRoot — all within the same call stack as attachPingListener.
At this moment, the render is still in progress. root.suspendedLanes is 0 because markRootSuspended hasn't run yet. So pingSuspendedRoot computes:

```js
root.pingedLanes |= root.suspendedLanes & pingedLanes
//                  ^^^^^^^^^^^^^^^^^^^
//                  = 0 (not yet set!)
// Result: pingedLanes stays 0. Ping is wasted.
```

Later (~26ms), the render completes with exit status 4 (RootSuspended). markRootSuspended finally sets root.suspendedLanes |= 524288. But the ping already fired and won't fire again. The lane is now:
- pendingLanes: has bit 19 ✓
- suspendedLanes: has bit 19 ✓
- pingedLanes: 0 ✗
getNextLanes computes nonIdlePendingLanes & ~suspendedLanes = 0. The lane is permanently stuck — pending but unreachable.
The Fatal Timeline (from prod recording)

| Time (ms) | Event | State |
| --- | --- | --- |
| 5147 | Final keystroke "d" | setPromise(renderAction("hello world from prod")) |
| 5149 | POST request sent | Server action in flight |
| 5158 | POST response arrives | RSC chunk status → "resolved_model" |
| 5165 | renderRootConcurrent case 7 | isThenableResolved("resolved_model") → false |
| 5165 | throwAndUnwindWorkLoop | Suspense boundary catches, render continues unwinding |
| 5168 | attachPingListener | .then(pingSuspendedRoot) attached to chunk |
| 5169 | RSC .then() fires synchronously | initializeModelChunk → "fulfilled" → pingSuspendedRoot |
| 5169 | Wasted ping | suspendedLanes=0, so pingedLanes stays 0 |
| 5191 | Render completes | Exit status 4 (RootSuspended) |
| 5191 | markRootSuspended | suspendedLanes = 524288 (too late for ping) |
| 5192 | scheduleTaskForRootDuringMicrotask | getNextLanes → 0. Dead lane. No more renders. |

Why Dev Mode Doesn't Hit This
Dev React has extra overhead: StrictMode double-renders, validation checks, debug IO tracking in trackUsedThenable (lines 3838-3854). This extra time creates a window where the RSC chunk transitions from "resolved_model" → "fulfilled" before React's case 7 check runs. So isThenableResolved returns true, replaySuspendedUnitOfWork is called instead of throwAndUnwindWorkLoop, and the value is consumed directly — no ping needed.
3. Suggested Fix
There are two complementary fixes, either of which would prevent the dead lane:
Fix A (React reconciler): Recognize RSC intermediate resolved states
In isThenableResolved, also check for "resolved_model" (and potentially other RSC intermediate resolved states like "resolved_module"):

```js
// react-reconciler: ReactFiberThenable.js
function isThenableResolved(thenable) {
  var status = thenable.status;
  return (
    status === "fulfilled" ||
    status === "rejected" ||
    status === "resolved_model" ||
    status === "resolved_module"
  );
}
```

This way, when case 7 checks the thenable, it recognizes the data has arrived and calls replaySuspendedUnitOfWork instead of throwAndUnwindWorkLoop. The render succeeds without needing a ping at all.
Fix B (React reconciler): Guard against synchronous pings during render
In pingSuspendedRoot, if the lane isn't yet in suspendedLanes (because render is still in progress), defer the ping or ensure the lane gets re-pinged after markRootSuspended:

```js
// react-reconciler: ReactFiberWorkLoop.js
function markRootSuspended(root, suspendedLanes, ...) {
  root.suspendedLanes |= suspendedLanes;
  root.pingedLanes &= ~suspendedLanes;
  // NEW: Check if any of these lanes were pinged while render was in progress
  // If so, immediately re-ping them
  if (root._pendingPingsDuringRender & suspendedLanes) {
    root.pingedLanes |= root._pendingPingsDuringRender & suspendedLanes;
    root._pendingPingsDuringRender = 0;
  }
}
```

Or, alternatively in pingSuspendedRoot, detect that a render is in progress and record the ping for later application.
Fix A is the cleaner solution — it avoids the ping mechanism entirely for already-resolved data and handles the root cause (misidentification of RSC chunk state). Fix B is a defense-in-depth measure for any other scenario where synchronous pings could race with in-progress renders.
4. Key Evidence Trail
| Step | Tool | Finding |
| --- | --- | --- |
| Baseline comparison | RecordingOverview (both recordings) | Dev: 125 commits, 318ms render. Prod: 38 commits, 107ms render. 3x render difference. |
| Visual confirmation | Screenshot (both, final timestamp) | Dev output matches input. Prod stuck at "hello world from pro" — one char behind. |
| Last render commits | ReactRenders commits mode (prod) | Last commit #37 at 5.15s is a setState for the typing event, not a Suspense resolution. No commit for final value. |
| Network verification | NetworkRequest detail (prod, last 2 POSTs) | Both return 200 OK. Final POST contains "hello world from prod" — correct data. |
| Trigger chains | ReactRenders trigger-detail (prod, commits 35-37) | Commit 36 triggered by network response for "hello world from pro". No commit triggered by final response. |
| Ping analysis | Logpoint on pingSuspendedRoot | 3 pings after final keystroke. Ping #3 (Point:114, 5169ms): suspendedLanes=0, ping is a no-op. |
| Re-suspension | Logpoint on markRootSuspended | Point:118 (5191ms): Lane 524288 re-suspended. pingedLanes already 0. |
| Scheduling dead-end | Logpoint on scheduleTaskForRootDuringMicrotask | After re-suspension: getNextLanes returns 0. Lane stuck forever. |
| isThenableResolved | Logpoint on line 3394 + DescribePoint | Returns false for status === "resolved_model". Only checks "fulfilled"/"rejected". |
| Case 7 path | DescribePoint on renderRootConcurrent case 7 | isThenableResolved false → throwAndUnwindWorkLoop instead of replaySuspendedUnitOfWork. |
| RSC chunk .then() | ReadSource on RSC client code | ReactPromise.prototype.then synchronously resolves "resolved_model" → "fulfilled" and invokes callback inline. |
| Dev vs prod timing | DescribePoint comparison of trackUsedThenable | Dev has extra debug IO code (lines 3838-3854) creating enough delay for chunk to reach "fulfilled" before case 7 check. |

5. Impact Assessment
Severity: High — causes permanent UI staleness with no recovery
Scope: Any Next.js app using useDeferredValue + use() + RSC server actions with rapid updates
Workarounds: None clean. Adding artificial delays would mask but not fix. Avoiding useDeferredValue with server actions works but defeats the purpose.
React version affected: 19.3.0-canary-f93b9fd4-20251217 (and likely all React 19 builds with RSC support)
Agent #3 (React Scheduling Concepts): 8 Minutes
Agent #3 started with RecordingOverview to check the output, and Screenshot to confirm the problem visually. Since it already had the terminology to describe React’s scheduling internals, it made a beeline to functions like markRootSuspended, even without any actual knowledge of the implementation.
It used the Logpoint tool to check hits for `pingSuspendedRoot` and `markRootSuspended`, and quickly identified the relevant ternary logic as the culprit. It then compared against the dev recording's line hits and arrived at the pinged-lanes solution, without any major rabbit trails.
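The "pinged lanes" ordering problem at the heart of this bug can be shown with a small standalone sketch. This is an illustrative model I wrote, not React's actual code; it only borrows the field names and the lane value (524288, i.e. `0x80000`) that appear in the agents' reports, and shows why it matters whether the ping fires before or after `markRootSuspended`:

```javascript
// Illustrative model only, NOT React's actual code. Field names and the
// DeferredLane value (524288 / 0x80000) come from the agents' reports.
const DeferredLane = 0x80000;

function simulate({ pingArrivesMidRender }) {
  const root = { pendingLanes: DeferredLane, suspendedLanes: 0, pingedLanes: 0 };
  const workInProgressRootPingedLanes = 0; // never updated in the buggy path

  const ping = () => {
    // pingSuspendedRoot's root-level bookkeeping: only effective once the
    // lane has already been marked suspended on the root.
    root.pingedLanes |= root.suspendedLanes & DeferredLane;
  };

  // Prod timing: the .then() microtask fires during the render,
  // before markRootSuspended has run, so the ping is a no-op.
  if (pingArrivesMidRender) ping();

  // markRootSuspended: excludes lanes pinged during this render
  // (but workInProgressRootPingedLanes was never set).
  const suspendedLanes = DeferredLane & ~workInProgressRootPingedLanes;
  root.suspendedLanes |= suspendedLanes;
  root.pingedLanes &= ~suspendedLanes;

  // Dev timing: the ping fires after the render completes,
  // when root.suspendedLanes is already set, so it sticks.
  if (!pingArrivesMidRender) ping();

  // getNextLanes-style check: is any work still schedulable?
  return (root.pendingLanes & ~root.suspendedLanes) |
         (root.pingedLanes & root.pendingLanes);
}

simulate({ pingArrivesMidRender: false }); // dev ordering: 524288, retry gets scheduled
simulate({ pingArrivesMidRender: true });  // prod ordering: 0, lane stuck forever
```

Same bitmask arithmetic, opposite outcomes: the only difference is which side of `markRootSuspended` the ping lands on.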
Agent #3 prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
App: Next.js app using React 19 with server actions. Source at `~/projects/temp/react-udv-bug`.

Behavior:

- A textarea input feeds text to a server action (`renderAction`) which returns `Promise<ReactNode>`
- The promise is wrapped in `useDeferredValue`, then consumed via `use()` inside `<Suspense>`
- In dev mode (`next dev`): the deferred output box tracks the input and catches up after typing stops
- In prod mode (`next build && next start`): the deferred output frequently gets stuck and never catches up to the final input value

Recordings:

- Dev (working): `06447062-f403-49f3-8bd8-476a65b448d2`
- Prod (broken): `c0b7c6fb-8f2a-461c-9fca-0ba2e9be3083`

In the broken prod recording, the user typed "hello world from prod" but the deferred output is stuck showing "hello world from pro" — one character behind, permanently.
What we know:
No console errors in either recording
The server action POST requests all return 200 OK with valid data
The app code is identical between dev and prod — only the build mode differs
It only manifests in production builds
Goal: Figure out why the deferred value stops updating in production and identify the root cause.
Investigation Methodology
Follow this structured process:
Step 1: Establish the baseline
Get an overview of both recordings. Compare high-level stats: render commit counts, network request counts, timing. The quantitative difference between dev and prod is your first clue.
Step 2: Confirm the visual bug
Take screenshots at the end of both recordings to verify what each shows. Confirm the prod recording is stuck one character behind.
Step 3: Compare render commit patterns
Look at the sequence of React render commits in both recordings. Focus on the tail end — the last few commits. In a working recording, you'd expect the render pattern to complete cleanly. In the broken one, something is missing or different. What's the last commit in each? What triggered it? What components re-rendered?
Step 4: Map the data flow end-to-end
Trace the full lifecycle: keystroke → server action POST → response → promise resolution → deferred value update → component re-render. Verify each step completes in both recordings. If all steps complete but the render doesn't happen, the bug is in React's scheduling, not the data flow.
Step 5: Find the divergence point
The app code is identical. The data flow completes in both. So what's different at the React runtime level? Read the relevant React source code in the recorded bundles. Focus on understanding the algorithm before looking for bugs — understand what SHOULD happen first, then look for where reality diverges.
Step 6: Trace state through the critical path
Once you've identified the relevant scheduling/reconciler functions, trace mutable state values through the critical time window. Focus on the last update that fails in prod — what state does React see? Compare the same state in the dev recording for the equivalent update. The divergence in state values IS the root cause.
Step 7: Explain why dev vs prod
Dev and prod React builds differ: dev is slower (extra validation, warnings, double-renders in StrictMode), prod is faster (optimized, no dev checks). This timing difference can change when asynchronous callbacks (microtasks, promise `.then()`) fire relative to React's render cycle. If a callback fires at a different point in the render lifecycle between dev and prod, that's likely the mechanism.

Step 8: Red-team your conclusion
Before finalizing: Does your explanation account for WHY it's prod-only? Does your fix address the root cause or just a symptom? Could there be other scenarios that trigger the same bug?
Conceptual Context: React's Scheduling System
These concepts will help you navigate the React internals you'll encounter:
Lanes: React uses a bitmask-based priority system called "lanes." Each update is assigned a lane (e.g., SyncLane, TransitionLane, DeferredLane). `useDeferredValue` schedules work on deferred lanes, which are lower priority than user-initiated updates.

Suspension and pinging: When a component throws a Promise (Suspense), React marks the update's lanes as "suspended." When the promise resolves, a "ping" fires that marks those lanes as "pinged," telling the scheduler to retry. The lifecycle is: pending → rendering → suspended → pinged → retry rendering → committed.

`workInProgress` state: During rendering, React tracks the current render's state in module-scoped variables prefixed with `workInProgress` — things like what root is being rendered, what lanes, what the exit status is, and which lanes have been pinged during this render. These are mutable and change throughout the render cycle.

`markRootSuspended`: When a render suspends, this function marks the lanes as suspended on the root. It uses `workInProgressRootPingedLanes` to exclude lanes that were pinged during the current render (so they don't get stuck as suspended). It also adds lanes to `warmLanes`, which affects future scheduling.

Concurrent rendering and yielding: In prod, React uses time-slicing — renders can yield back to the browser mid-work and resume later. This means microtasks (like promise `.then()` callbacks) can fire between render chunks. In dev mode, renders tend to run longer without yielding due to extra dev checks, so microtasks fire after the render completes.

The key timing question: When a suspended promise resolves, the `.then()` ping callback might fire during an active render (prod, due to yielding) or after the render completes (dev, due to slower execution). If the ping handling code doesn't correctly handle the "ping arrives mid-render" case, the lane can get stuck.

Deliverable
Write a detailed bug report with:
Bug Description — What happens
Root Cause Analysis — What exactly causes the deferred value to get stuck in prod but not dev
Suggested Fix — What code change would fix this
Key Evidence Trail — Table of tools used and findings at each step
Agent #3 final analysis
💡 Bug Report: `useDeferredValue` Permanently Stalls in Production React Builds

1. Bug Description

When using `useDeferredValue` with a server action that returns `Promise<ReactNode>`, wrapped in `<Suspense>` and consumed via `use()`, the deferred output frequently stops updating in production builds. The user types "hello world from prod" but the deferred display is stuck showing "hello world from pro" — one character behind, permanently. The same app code works correctly in dev mode.

2. Root Cause Analysis

The bug is in React's `pingSuspendedRoot` function — specifically, a code path that silently drops a Suspense ping when it arrives during an active concurrent render with `RootSuspendedWithDelay` exit status.

The mechanism step by step:
1. User types the final 'd' in "prod" (5149ms). React:
   - Schedules a new server action POST for "hello world from prod"
   - Begins a concurrent render on lane 524288 (DeferredLane)

2. The server action responds quickly (5155ms, ~6ms round-trip). The promise's `.then()` callback fires `pingSuspendedRoot` at 5169ms — while React is still mid-render of the same lane due to concurrent time-slicing.

3. Inside `pingSuspendedRoot` (line 13073), the ping enters the critical branch:

```js
function pingSuspendedRoot(root, wakeable, pingedLanes) {
  // Line 13076: Try to mark root.pingedLanes
  root.pingedLanes |= root.suspendedLanes & pingedLanes;
  //                  ^^^^^^^^^^^^^^^^^^^
  // BUG: root.suspendedLanes is STILL 0 here!
  // The lane hasn't been marked suspended yet (render is still in progress).
  // So this is: 0 |= (0 & 524288) = 0. NO EFFECT.
  root.warmLanes &= ~pingedLanes; // warmLanes also 0 at this point

  // Line 13078: Are we currently rendering this root+lanes?
  workInProgressRoot === root &&                                   // TRUE (same root)
  (workInProgressRootRenderLanes & pingedLanes) === pingedLanes && // TRUE (same lane)
    // Line 13080: Ternary — exitStatus check
    (4 === workInProgressRootExitStatus ||  // TRUE (RootSuspendedWithDelay = 4)
      (...))
      // FIRST BRANCH taken:
      ? 0 === (executionContext & 2) && prepareFreshStack(root, 0)
        // BUG: executionContext IS 2 (RenderContext)! We're mid-render!
        // So 0 === (2 & 2) is FALSE. prepareFreshStack is NOT called.
        // This entire branch is a no-op.
      // ELSE BRANCH (never reached because ternary was truthy):
      : (workInProgressRootPingedLanes |= pingedLanes);
        // This would have saved us! But it's unreachable.

  ensureRootIsScheduled(root); // Called, but getNextLanes returns 0
}
```

4. The render completes and calls `markRootSuspended` (line 12141):

```js
function markRootSuspended(root, suspendedLanes, spawnedLane, didAttemptEntireTree) {
  suspendedLanes &= ~workInProgressRootPingedLanes; // 524288 &= ~0 = 524288 (no exclusion)
  root.suspendedLanes |= suspendedLanes;            // Lane 524288 marked SUSPENDED
  root.pingedLanes &= ~suspendedLanes;              // Clears any pings (already 0)
  didAttemptEntireTree && (root.warmLanes |= suspendedLanes); // Lane marked WARM
}
```

5. The lane is now permanently stuck:
   - `root.pendingLanes = 524288` ✓ (work pending)
   - `root.suspendedLanes = 524288` ✗ (excluded from scheduling)
   - `root.pingedLanes = 0` ✗ (no ping recorded to rescue it)
   - `root.warmLanes = 524288` ✗ (excluded from cold-retry path)

6. `getNextLanes` (line 706) returns 0 — all three rescue paths fail:
   - `pendingLanes & ~suspendedLanes` = 0 (suspended)
   - `pingedLanes & nonIdlePendingLanes` = 0 (not pinged)
   - `nonIdlePendingLanes & ~warmLanes` = 0 (warm)

The scheduler permanently sees no work. The UI is stuck.
Why dev mode works

In development mode, React runs significantly slower due to:

- Extra validation checks, warning infrastructure, DevTools hooks
- `StrictMode` double-rendering

This means renders take longer and never yield to the browser mid-render. Promise `.then()` microtask callbacks therefore always fire after the render completes. In every single ping across 126 hits in the dev recording, `workInProgressRoot !== root` and `executionContext === 0` — the ping always takes the fast path of just setting `root.pingedLanes` (line 13076, where `root.suspendedLanes` is already set) and calling `ensureRootIsScheduled`. The buggy ternary at line 13078 is never entered.

In production, concurrent time-slicing allows yields mid-render, so microtasks from resolved promises can fire during the render. This creates the timing window where the ping arrives before `markRootSuspended` runs.

3. Suggested Fix

The fix is in `pingSuspendedRoot`. When the ping fires mid-render (`executionContext` has the RenderContext bit set) and the exit status prevents `prepareFreshStack`, the function must still record the ping in `workInProgressRootPingedLanes` so that `markRootSuspended` can exclude the lane from suspension.

Option A (minimal fix): When the `prepareFreshStack` guard fails, fall through to set `workInProgressRootPingedLanes`:

```js
// In pingSuspendedRoot, replace the ternary at lines 13080-13086:

// BEFORE (buggy):
(4 === workInProgressRootExitStatus || (...))
  ? 0 === (executionContext & 2) && prepareFreshStack(root, 0)
  : (workInProgressRootPingedLanes |= pingedLanes)

// AFTER (fixed):
if (4 === workInProgressRootExitStatus || (...)) {
  if (0 === (executionContext & 2)) {
    prepareFreshStack(root, 0);
  } else {
    // Ping arrived mid-render. Can't restart now, but record the ping
    // so markRootSuspended won't mark this lane as suspended.
    workInProgressRootPingedLanes |= pingedLanes;
  }
} else {
  workInProgressRootPingedLanes |= pingedLanes;
}
```

Option B (defensive): Additionally ensure `root.pingedLanes` is set even when the lane isn't yet suspended, so the ping survives regardless of ordering:

```js
// At the top of pingSuspendedRoot, unconditionally record the ping on the root
// for lanes that are pending (not just suspended):
root.pingedLanes |= root.suspendedLanes & pingedLanes;
root.pingedLanes |= root.pendingLanes & pingedLanes; // NEW: catch pre-suspension pings
```

Option A is the more surgical fix addressing the exact logic error. Option B is belt-and-suspenders.
4. Key Evidence Trail
| Step | Tool | Finding |
|---|---|---|
| Recording overview | `RecordingOverview` (both) | Dev: 125 commits, 63 POSTs. Prod: 38 commits, 21 POSTs. Massive gap. |
| Visual confirmation | `Screenshot` (both at final timestamps) | Dev shows "hello world from prod" in output. Prod shows "hello world from pro" — stuck. |
| Render commit tails | `ReactRenders` commit mode (both) | Dev ends with SuspenseResumed at 8.47s. Prod ends with SetState at 5.15s — no final SuspenseResumed. |
| Network completion | `NetworkRequest` detail (prod #36) | Final POST "hello world from prod" returns 200 OK at 5155ms with valid RSC payload. Data arrived. |
| App source code | `grep` / `ika_get` on `app/client.tsx` | `useDeferredValue(promise)` → `use()` in `<Suspense>`. Standard pattern. |
| `pingSuspendedRoot` analysis | `ReadSource` + `Logpoint` on line 13073 | 46 total hits. Hit #5 (Point:114, 5169ms) fires mid-render: `executionContext=2`, `root.suspendedLanes=0`, `workInProgressRootExitStatus=4`. Ping silently dropped. |
| `markRootSuspended` aftermath | `Logpoint` on line 12147 + post-state eval | After line 12161: `root.suspendedLanes=524288`, `root.pingedLanes=0`, `root.warmLanes=524288`. Lane permanently stuck. |
| `getNextLanes` deadlock | `ReadSource` on line 706 | Three rescue paths all return 0 with stuck state. No work scheduled. |
| Dev comparison | `Logpoint` on line 13073 (dev recording) | All 126 pings: `executionContext=0`, `workInProgressRoot !== root`. Buggy branch never entered. |
| `markRootSuspended` exclusion logic | `ReadSource` on line 12147 | `suspendedLanes &= ~workInProgressRootPingedLanes` — would exclude pinged lanes, but `workInProgressRootPingedLanes=0` because the ping was dropped. |

5. Summary
This is a race condition in React's concurrent rendering scheduler. When a Suspense promise resolves during an active concurrent render (via a microtask firing during a yield), `pingSuspendedRoot` enters a code path where:

- `root.pingedLanes` can't be set (lane not yet suspended on root)
- `prepareFreshStack` can't be called (currently in render context)
- `workInProgressRootPingedLanes` isn't set (unreachable else-branch of the ternary)

The ping is completely lost. The subsequent `markRootSuspended` marks the lane as suspended with no ping, and `getNextLanes` can never find work for it again. The fix is to ensure `workInProgressRootPingedLanes` is set in the mid-render case, so `markRootSuspended` excludes the lane from suspension.
Agent #4 (React Concepts + Replay Tools Overview): 7 Minutes
Agent #4 started off the same way as #3, with RecordingOverview and Screenshot tool calls. After scanning the source for pingSuspendedRoot in the React bundle, it tried to use Logpoint calls but struggled a bit with the syntax, so it switched to Evaluate instead to similarly retrieve real values in scope at various execution points.
It found the relevant ternary logic in 4 minutes, confirmed the issue and compared with the dev build, did some red-team review, and wrote up the final report with the correct pinged lanes solution.
This run was the most efficient: it read the source, knew where to instrument, and produced the best final report.
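To make the pinged-lanes fix concrete, here's a toy model, my own sketch rather than React source, of why recording the mid-render ping in `workInProgressRootPingedLanes` rescues the lane. The field names and lane value come from the agents' reports:

```javascript
// Toy model (not React's code) of the pinged-lanes fix the agents proposed.
// DeferredLane and the field names come from the quoted agent reports.
const DeferredLane = 0x80000; // 524288

function renderWithMidRenderPing({ fixed }) {
  const root = { pendingLanes: DeferredLane, suspendedLanes: 0 };
  let workInProgressRootPingedLanes = 0;

  // A ping arrives while the render is still active (prod timing).
  // Unfixed React drops it: the lane isn't suspended on the root yet,
  // and the mid-render ternary branch is a no-op.
  if (fixed) {
    // The fix: record the ping in the work-in-progress state instead.
    workInProgressRootPingedLanes |= DeferredLane;
  }

  // markRootSuspended: lanes pinged during this render are excluded
  // from being marked suspended.
  root.suspendedLanes |= DeferredLane & ~workInProgressRootPingedLanes;

  // getNextLanes-style check: any schedulable work left?
  return root.pendingLanes & ~root.suspendedLanes;
}

renderWithMidRenderPing({ fixed: false }); // 0: lane permanently stuck
renderWithMidRenderPing({ fixed: true });  // 524288: retry render can be scheduled
```

The one-line recording of the ping is enough to keep the lane out of `suspendedLanes`, which is exactly the shape of the change in PR #36134's area of the work loop.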
Agent #4 prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
App: Next.js app using React 19 with server actions. Source at `~/projects/temp/react-udv-bug`.

Behavior:

- A textarea input feeds text to a server action (`renderAction`) which returns `Promise<ReactNode>`
- The promise is wrapped in `useDeferredValue`, then consumed via `use()` inside `<Suspense>`
- In dev mode (`next dev`): the deferred output box tracks the input and catches up after typing stops
- In prod mode (`next build && next start`): the deferred output frequently gets stuck and never catches up to the final input value

Recordings:

- Dev (working): `06447062-f403-49f3-8bd8-476a65b448d2`
- Prod (broken): `c0b7c6fb-8f2a-461c-9fca-0ba2e9be3083`

In the broken prod recording, the user typed "hello world from prod" but the deferred output is stuck showing "hello world from pro" — one character behind, permanently.
What we know:
No console errors in either recording
The server action POST requests all return 200 OK with valid data
The app code is identical between dev and prod — only the build mode differs
It only manifests in production builds
Goal: Figure out why the deferred value stops updating in production and identify the root cause.
Investigation Methodology
Follow this structured process:
Step 1: Establish the baseline
Get an overview of both recordings. Compare high-level stats: render commit counts, network request counts, timing. The quantitative difference between dev and prod is your first clue.
Step 2: Confirm the visual bug
Take screenshots at the end of both recordings to verify what each shows. Confirm the prod recording is stuck one character behind.
Step 3: Compare render commit patterns
Look at the sequence of React render commits in both recordings. Focus on the tail end — the last few commits. In a working recording, you'd expect the render pattern to complete cleanly. In the broken one, something is missing or different. What's the last commit in each? What triggered it? What components re-rendered?
Step 4: Map the data flow end-to-end
Trace the full lifecycle: keystroke → server action POST → response → promise resolution → deferred value update → component re-render. Verify each step completes in both recordings. If all steps complete but the render doesn't happen, the bug is in React's scheduling, not the data flow.
Step 5: Find the divergence point
The app code is identical. The data flow completes in both. So what's different at the React runtime level? Read the relevant React source code in the recorded bundles. Focus on understanding the algorithm before looking for bugs — understand what SHOULD happen first, then look for where reality diverges.
Step 6: Trace state through the critical path
Once you've identified the relevant scheduling/reconciler functions, trace mutable state values through the critical time window. Focus on the last update that fails in prod — what state does React see? Compare the same state in the dev recording for the equivalent update. The divergence in state values IS the root cause.
Step 7: Explain why dev vs prod
Dev and prod React builds differ: dev is slower (extra validation, warnings, double-renders in StrictMode), prod is faster (optimized, no dev checks). This timing difference can change when asynchronous callbacks (microtasks, promise `.then()`) fire relative to React's render cycle. If a callback fires at a different point in the render lifecycle between dev and prod, that's likely the mechanism.

Step 8: Red-team your conclusion
Before finalizing: Does your explanation account for WHY it's prod-only? Does your fix address the root cause or just a symptom? Could there be other scenarios that trigger the same bug?
Conceptual Context: React's Scheduling System
These concepts will help you navigate the React internals you'll encounter:
Lanes: React uses a bitmask-based priority system called "lanes." Each update is assigned a lane (e.g., SyncLane, TransitionLane, DeferredLane). `useDeferredValue` schedules work on deferred lanes, which are lower priority than user-initiated updates.

Suspension and pinging: When a component throws a Promise (Suspense), React marks the update's lanes as "suspended." When the promise resolves, a "ping" fires that marks those lanes as "pinged," telling the scheduler to retry. The lifecycle is: pending → rendering → suspended → pinged → retry rendering → committed.

`workInProgress` state: During rendering, React tracks the current render's state in module-scoped variables prefixed with `workInProgress` — things like what root is being rendered, what lanes, what the exit status is, and which lanes have been pinged during this render. These are mutable and change throughout the render cycle.

`markRootSuspended`: When a render suspends, this function marks the lanes as suspended on the root. It uses `workInProgressRootPingedLanes` to exclude lanes that were pinged during the current render (so they don't get stuck as suspended). It also adds lanes to `warmLanes`, which affects future scheduling.

Concurrent rendering and yielding: In prod, React uses time-slicing — renders can yield back to the browser mid-work and resume later. This means microtasks (like promise `.then()` callbacks) can fire between render chunks. In dev mode, renders tend to run longer without yielding due to extra dev checks, so microtasks fire after the render completes.

The key timing question: When a suspended promise resolves, the `.then()` ping callback might fire during an active render (prod, due to yielding) or after the render completes (dev, due to slower execution). If the ping handling code doesn't correctly handle the "ping arrives mid-render" case, the lane can get stuck.

Replay MCP Tool Strategy
You have access to powerful runtime inspection tools. Here's how to use them effectively for this investigation:
- `RecordingOverview` — Start here for each recording. Gets you error counts, render commit stats, network summary, environment info. Run on BOTH recordings first.
- `ReactRenders` — Your primary comparison tool. Use `mode: commits` to see the full commit sequence. Use `mode: commit` with `commitIndex` to drill into individual commits. Focus on the last few commits in each recording — where does prod diverge from dev?
- `NetworkRequest` — Verify all server action POSTs returned successfully. Use to rule out data-level issues early. Check timing of responses relative to render commits.
- `SearchSources` — Find functions in the recorded bundles by name. React 19 ships unminified production bundles, so function names are preserved. Search for scheduling and reconciler functions to find their locations.
- `ReadSource` — Read source code with execution hit counts per line. This is a built-in static profiler — you can see which branches executed and how many times. Use this to understand code flow without needing to instrument anything.
- `Logpoint` — Your most powerful investigation tool. Evaluates any JavaScript expression at a specific source line, every time that line executes. Returns the expression value at each hit with timestamps. This is equivalent to adding `console.log` statements retroactively — unlimited, no rebuild, no reproduction needed.
- `DescribePoint` / `Evaluate` — Inspect full program state at a specific execution point. Use after Logpoint identifies an interesting moment — drill in to see all variables, the call stack, and dependency chains.
- `Screenshot` — Confirm visual state at timestamps. Use at the end of each recording to verify the bug visually.

Investigation pattern: Overview both recordings → Compare render commits → Read source to understand the algorithm → Instrument with Logpoints to trace state through the critical window → Find where dev and prod state diverges → Explain the mechanism.
Deliverable
Write a detailed bug report with:
Bug Description — What happens
Root Cause Analysis — What exactly causes the deferred value to get stuck in prod but not dev
Suggested Fix — What code change would fix this
Key Evidence Trail — Table of tools used and findings at each step
Agent #4 final analysis
💡 Bug Report: React `useDeferredValue` Stuck in Production — Race Condition in `pingSuspendedRoot`

1. Bug Description

When using `useDeferredValue` with a `Promise<ReactNode>` (fed by a server action) inside `<Suspense>`, the deferred output permanently stops updating in production builds. The final typed character never renders — the deferred value gets stuck one character behind the input. This does not happen in dev mode.

In the investigated case, the user typed "hello world from prod" but the output is permanently stuck showing "hello world from pro".
2. Root Cause Analysis
The bug is a race condition in React's `pingSuspendedRoot` function in `react-dom-client.production.js`. It only manifests in production because production builds use concurrent time-slicing, which allows microtasks (promise `.then()` callbacks) to fire during an active render.

The Flow That Breaks

1. User types the final character ('d'). React starts rendering the deferred lane (lane `524288` = `0x80000`).
2. During rendering, the component suspends (throws a Promise) because the server action response hasn't arrived yet. React sets `workInProgressRootExitStatus = 4` (RootSuspended) but continues in the render context (`executionContext = 2`).
3. The server action response arrives at ~5155ms. The promise resolves, firing the `.then()` ping callback at 5169ms — while the render is still active.
4. `pingSuspendedRoot` fires mid-render with this state. Three things fail simultaneously.
5. At 5191ms, `markRootSuspended` runs after the render completes.

Why Dev Mode Works

In dev mode, React is slower due to extra validation, warnings, and StrictMode double-renders. It doesn't yield to the browser mid-render as often. The Replay recording confirms this: the condition `workInProgressRoot === root` inside `pingSuspendedRoot` has 0 hits in dev. The ping `.then()` callback always fires after the render completes, so `pingSuspendedRoot` takes the early-exit path (no active render) and `ensureRootIsScheduled` correctly reschedules the retry.

3. The Exact Bug in Code
File: `packages/react-reconciler/src/ReactFiberWorkLoop.js` (in the React repo)

Function: `pingSuspendedRoot`

Prod bundle location: `react-dom-client.production.js`, lines 13078-13088

```js
// Current code (simplified):
if (workInProgressRoot === root &&
    (workInProgressRootRenderLanes & pingedLanes) === pingedLanes) {
  if (exitStatus === RootSuspended ||
      (exitStatus === RootSuspendedWithDelay && ...)) {
    // TRUE branch: try to restart
    if ((executionContext & RenderContext) === 0) {
      prepareFreshStack(root, 0); // Can't do this mid-render!
    }
    // If mid-render, NOTHING happens — ping is silently dropped
  } else {
    // FALSE branch: record the ping for later
    workInProgressRootPingedLanes |= pingedLanes; // Never reached when exitStatus===4
  }
}
```

The bug: When `exitStatus === 4` (suspended) AND `executionContext` has the render bit set (mid-render), neither `prepareFreshStack` nor `workInProgressRootPingedLanes |= pingedLanes` executes. The ping is completely lost.

4. Suggested Fix

When the render has suspended and we're mid-render (can't restart), we should still record the ping in `workInProgressRootPingedLanes` so that `markRootSuspended` will exclude the pinged lane from being marked suspended:

```js
// In pingSuspendedRoot, the ternary handling for exitStatus === 4:
if (exitStatus === RootSuspended ||
    (exitStatus === RootSuspendedWithDelay && ...)) {
  if ((executionContext & RenderContext) === 0) {
    prepareFreshStack(root, 0);
  } else {
    // FIX: Record the ping even when we can't restart mid-render.
    // markRootSuspended will check workInProgressRootPingedLanes
    // and exclude this lane from being marked suspended.
    workInProgressRootPingedLanes |= pingedLanes;
  }
} else {
  workInProgressRootPingedLanes |= pingedLanes;
}
```

This ensures that when `markRootSuspended` runs at line 12147 (`suspendedLanes &= ~workInProgressRootPingedLanes`), the pinged lane is excluded and won't be permanently stuck. The subsequent `ensureRootIsScheduled` (line 13089, which always runs) will then correctly schedule a retry render.

5. Key Evidence Trail
| Step | Tool | Recording | Finding |
|---|---|---|---|
| 1 | `RecordingOverview` | Both | Dev: 125 commits, 63 POSTs. Prod: 38 commits, 21 POSTs. 3x fewer renders in prod. |
| 2 | `Screenshot` | Both | Dev shows "hello world from dev" (complete). Prod shows "hello world from pro" (stuck). |
| 3 | `ReactRenders` (commits) | Prod | Last commit #37 is SetState for final keystroke. No SuspenseResumed follows — missing final commit. |
| 4 | `ReactRenders` (commits) | Dev | Last commits #123→#124 show SetState→SuspenseResumed. Pattern completes. |
| 5 | `NetworkRequest` (detail) | Prod | POST #36 at 5149ms sends "hello world from prod", gets 200 OK. Response arrives at 5155ms with valid RSC data. Data is fine. |
| 6 | `SearchSources` + `ReadSource` | Prod | Located `pingSuspendedRoot` (L13073), `markRootSuspended` (L12141), `prepareFreshStack` (L12184). |
| 7 | `Logpoint` + `Evaluate` | Prod | At Point:114 (5169ms ping): `exitStatus=4`, `executionContext=2`, `root.suspendedLanes=0`, `workInProgressRootPingedLanes=0`. Confirmed ping is lost. |
| 8 | `Logpoint` + `Evaluate` | Prod | At Point:118 (5191ms markRootSuspended): `workInProgressRootPingedLanes=0`, lane `524288` permanently marked suspended+warm. |
| 9 | `ReadSource` (hit counts) | Dev | `pingSuspendedRoot` L10047: `workInProgressRoot === root` has 0 hits. Dev pings never fire mid-render. Confirms timing difference is the mechanism. |
| 10 | `ReadSource` (hit counts) | Prod | `pingSuspendedRoot` L13087: 4 hits on the mid-render path. L13085 (`prepareFreshStack`): 0 hits. Confirmed: entered the ternary 4 times but `prepareFreshStack` never executed. |

6. Red-Team Analysis
Does this explain why it's prod-only? Yes. The race requires time-slicing (concurrent rendering with yields), which only happens in production. Dev mode's extra overhead means renders don't yield, so promise callbacks fire after renders complete.

Is this a React bug or an app bug? React bug. The app correctly uses `useDeferredValue` + `Suspense` + server actions. The race is entirely within React's scheduler.

Could this affect other patterns? Yes. Any pattern where a promise resolves during an active concurrent render of the same lane — `useDeferredValue`, `lazy()`, data fetching with Suspense — could theoretically hit this. The `useDeferredValue` + server action pattern is particularly susceptible because server action responses arrive asynchronously at unpredictable times.

Is the fix complete? The suggested fix handles the specific case where `exitStatus === 4 && (executionContext & 2) !== 0`. An alternative fix could be in `markRootSuspended` itself (checking `root.pingedLanes` as a fallback), but fixing `pingSuspendedRoot` at the point where the ping is dropped is cleaner and more direct.
Analyzing the Agent Results
I went into this hoping the Replay recordings would demonstrate that Replay runtime data makes it easier to solve hard bugs. I've certainly experienced that myself working on Replay and using it over the last few years! So it was extremely satisfying to see that every single one of the agent runs found this complex React bug and proposed the correct fix, based only on the Replay recordings of the bug!
As Dan noted: his own agent wasn’t able to solve the issue itself. It wasn’t until Andrew Clark pointed him in the right direction that his agent was able to add the right log calls to React’s source, rebuild, analyze the logs, repeat the process, and eventually converge on the right answer.
Replay MCP’s tools gave my agents the ability to analyze the runtime behavior without having to keep rebuilding React! The Logpoint tool allowed agents to dynamically evaluate an expression every time a line of code ran, while the Sources tool and its built-in “hit counters per line” values acted as an impromptu profiler to help guide the investigation.
Comparing Agent Behaviors
Looking at the prompts and the results, I see a few key differences:
| Dimension | Run 1 (Baseline) | Run 2 (+Method) | Run 3 (+Context) | Run 4 (+Tools) |
|---|---|---|---|---|
| Duration | ~28 min | ~17 min | ~8 min | ~7 min |
| Messages | 154 | 111 | 47 | 42 |
| Context compressions | 6 | 4 | 2 | 2 |
| Found ternary bug | ✅ | ✅ | ✅ | ✅ |
| Fix matches PR #36134 | ✅ (Fix A of 3) | Partial (Fix B) | ✅ | ✅ |
| Found isThenableResolved gap | ❌ | ✅ (unique) | ❌ | ❌ |
| Proved sync call chain | ✅ (GetStack) | ✅ (ReadSource) | ❌ | ❌ |
| Quantitative hit counts | ✅ | ✅ | ✅ (best) | ✅ |
| Dev vs prod comparison | ✅ | ✅ | ✅ | ✅ |
| False lead time | ~8 min | ~4 min | ~1 min | <1 min |
| Red-team analysis | Moderate | Moderate | Light | Best |
| Number of fixes proposed | 3 | 2 | 1 | 1 |
| Bottom-up exploration | Extensive | Moderate | Minimal | Minimal |
Replay recordings and MCP tools were sufficient to let all of my agents find and fix the bug in less than half an hour.
Even with the most basic instructions, just having the recording and MCP tools was enough to let Agent #1 eventually figure out the root cause and propose a valid solution for an otherwise unsolvable bug.
Modern agents and models are extremely powerful and capable
We’ve all seen this over the last year, but it’s still incredible to me that an AI can dive into a codebase or problem space, orient itself just by reading some files, and produce real value.
Prompting and context are still critical
Just giving a more detailed step-by-step investigation pattern cut the investigation time in half. A few paragraphs describing the concepts behind React’s scheduling internals cut it in half again. And as I’ve seen in some other investigations I’ll cover in a future post, having relevant skills files available can produce drastically better investigation results.
Replay MCP’s tools are exactly the solution Dan Abramov described
Dan said his agent “instrumented the React codebase with logs”, and really needed “information over time”. That’s exactly what Replay MCP provided! Tools like RecordingOverview and ReactRenders to surface info on React behavior and error messages; Logpoint and Evaluate to extract specific values at various points in time; Sources to view source files in the bundle and see the hit counts to understand execution; Screenshot to visually inspect the UI at a given point in time; NetworkRequests to check the requests and results; and even more niche tools like GetStack and DescribePoint to inspect the JS execution flow.
Replay MCP provides all these and more, enabling agents to actually understand the runtime behavior over time, without having to rebuild the app with more logging. Capture the recording once with all the tricky timing behavior, investigate as deeply as needed, automatically.
What do Replay MCP’s tools look like?
Here are some of the actual MCP tool calls the agents ran during this investigation:
RecordingOverview
```
# Recording Overview
## Recording Info & Environment
- **URL:** http://localhost:3000/
- **Duration:** 10.3s
- **React:** v19.3.0-canary-f93b9fd4-20251217 (production)
- **Libraries:** NextJS, React
## Console Errors & Warnings
App healthy — no console errors or warnings detected.
## Screenshot
**Final recording state** — screenshot available at timestamp 9955ms.
Use the **Screenshot** tool with `timestamp: 9955` to view it.
## Network Summary
- **Total requests:** 37
- **Failed:** 1
- **Slow:** 0
### API Patterns
- `POST /` — 21 requests
- `GET /` — 1 request
## React Render Summary
- **Total commits:** 38
- **Total render time:** 107ms
### Costliest Commits
- Commit #24: 54ms, 4 fibers — SuspenseResumed
- Commit #13: 4ms, 4 fibers — SuspenseResumed
- Commit #22: 4ms, 4 fibers — SuspenseResumed
- Commit #19: 4ms, 4 fibers — SuspenseResumed
- Commit #9: 4ms, 4 fibers — SuspenseResumed
## Suggested Next Steps
1. **NetworkRequest** — Investigate failed network requests
   _1 failed request_
2. **ReactRenders** — Analyze React render performance
   _38 commits, 107ms total render time_
3. **Screenshot** — View screenshot of the application state
   _Final recording state at 9955ms_
```
ReactRenders - List
```
# Commits
| # | Time | Duration | Fibers | Mutating | Wasted | Waste% | Trigger |
| -: | ----: | -------: | -----: | -------: | -----: | -----: | --------------- |
| 28 | 4.52s | 3.3ms | 4 | 1 | 3 | 75.0% | SuspenseResumed |
| 29 | 4.56s | 15µs | 2 | 1 | 1 | 50.0% | SetState |
| 30 | 4.66s | 1.6ms | 3 | 1 | 2 | 66.7% | SetState |
| 31 | 4.67s | 3.3ms | 4 | 1 | 3 | 75.0% | SuspenseResumed |
| 32 | 4.76s | 4µs | 2 | 1 | 1 | 50.0% | SetState |
| 33 | 4.77s | 3.3ms | 4 | 1 | 3 | 75.0% | SuspenseResumed |
| 34 | 4.83s | 4µs | 2 | 1 | 1 | 50.0% | SetState |
| 35 | 5.07s | 3.3ms | 4 | 1 | 3 | 75.0% | SetState |
| 36 | 5.08s | 3.3ms | 4 | 1 | 3 | 75.0% | SuspenseResumed |
| 37 | 5.15s | 4µs | 2 | 1 | 1 | 50.0% | SetState |
Showing 29–38 of 38
## Next Steps
- Use mode=commit&commitIndex=N to inspect a specific commit
- Use mode=summary to see the recording overview
```
ReactRenders - Commit
```
# Commit #37
**Timestamp:** 5.15s
**Render duration:** 4µs
**Fibers rendered:** 2 (1 mutating, 1 wasted — 50.0%)
## Triggers
- **SetState**
  - Source: TestPreviewClient (client.tsx:38:25)
  - Point: 11682667932122379955827658622763010
## Components
| Component | Instances | Renders | Wasted | Duration | DOM Muts |
| ----------------- | --------: | ------: | -----: | -------: | -------: |
| Resolved | 1 | 1 | 1 | 4µs | 0 |
| TestPreviewClient | 1 | 1 | 0 | 0µs | 1 |
## DOM Mutations (1 total)
- TestPreviewClient: 1
## Next Steps
- Use mode=component&componentName=X to investigate a specific component
- Use mode=commits to browse other commits
- Use mode=summary to see the recording overview
```
ReadSource
```
hits | L | code
------|------|------
0 | 12116 | for (var i = 0; i < tag.length; i++) {
0 | 12117 | var check = tag[i],
0 | 12118 | getSnapshot = check.getSnapshot;
0 | 12119 | check = check.value;
  | 12120 | try {
0 | 12121 | if (!objectIs(getSnapshot(), check)) return !1;
  | 12122 | } catch (error) {
0 | 12123 | return !1;
  | 12124 | }
  | 12125 | }
  | 12126 | tag = node.child;
87 | 12127 | if (node.subtreeFlags & 16384 && null !== tag)
0 | 12128 | (tag.return = node), (node = tag);
  | 12129 | else {
87 | 12130 | if (node === finishedWork) break;
0 | 12131 | for (; null === node.sibling; ) {
0 | 12132 | if (null === node.return || node.return === finishedWork) return !0;
0 | 12133 | node = node.return;
  | 12134 | }
0 | 12135 | node.sibling.return = node.return;
0 | 12136 | node = node.sibling;
  | 12137 | }
  | 12138 | }
87 | 12139 | return !0;
87 | 12140 | }
  | *12141 | function markRootSuspended(
  | 12142 | root,
  | 12143 | suspendedLanes,
  | 12144 | spawnedLane,
  | 12145 | didAttemptEntireTree
  | 12146 | ) {
67 | 12147 | suspendedLanes &= ~workInProgressRootPingedLanes;
67 | 12148 | suspendedLanes &= ~workInProgressRootInterleavedUpdatedLanes;
67 | 12149 | root.suspendedLanes |= suspendedLanes;
67 | 12150 | root.pingedLanes &= ~suspendedLanes;
67 | 12151 | didAttemptEntireTree && (root.warmLanes |= suspendedLanes);
67 | 12152 | didAttemptEntireTree = root.expirationTimes;
134 | 12153 | for (var lanes = suspendedLanes; /*<HIT>*/0 < lanes; ) {
67 | 12154 | var index$6 = 31 - clz32(lanes),
67 | 12155 | lane = 1 << index$6;
67 | 12156 | didAttemptEntireTree[index$6] = -1;
67 | 12157 | lanes &= ~lane;
  | 12158 | }
67 | 12159 | 0 !== spawnedLane &&
0 | 12160 | markSpawnedDeferredLane(root, spawnedLane, suspendedLanes);
  | 12161 | }
  | 12162 | function flushSyncWork$1() {
21 | 12163 | return 0 === (executionContext & 6)
21 | 12164 | ? (flushSyncWorkAcrossRoots_impl(0, !1), !1)
  | 12165 | : !0;
  | 12166 | }
```
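The loop at the bottom of `markRootSuspended` walks the suspended lanes one bit at a time: `31 - clz32(lanes)` yields the index of the highest set bit, which is then cleared before the next iteration. A standalone sketch of that iteration (the helper name is mine; the bit tricks match the bundled source above):

```javascript
// Walk a lanes bitmask the way markRootSuspended does above:
// peel off the highest set bit each iteration via 31 - clz32(lanes).
function laneIndices(lanes) {
  const indices = [];
  while (lanes > 0) {
    const index = 31 - Math.clz32(lanes); // index of the highest set bit
    indices.push(index);
    lanes &= ~(1 << index); // clear that lane and continue
  }
  return indices;
}

console.log(laneIndices(524800)); // [ 19, 9 ], lanes 2^19 and 2^9
```

With 524800 (a suspendedLanes value seen in the Logpoint data), the loop runs twice, touching lanes 19 and 9, which matches the doubled hit count on the loop header (134 vs. 67).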
Logpoint
```
Found 46 total hits at turbopack:///[project]/node_modules/.pnpm/next@16.1.6_react-dom@19.2.4_react@19.2.4__react@19.2.4/node_modules/next/dist/compiled/react-dom/cjs/react-dom-client.production.js:13076 (showing first 20)

Hit 1 at point Point:239 @ 2676.17ms:
Expression: ["ping", pingedLanes, "suspended", root.suspendedLanes, "renderLanes", workInProgressRootRenderLanes, "exit", workInProgressRootExitStatus, "isRoot", workInProgressRoot === root, "execCtx", executionContext]
Result: [
  "ping",
  524288,
  "suspended",
  524800,
  "renderLanes",
  0,
  "exit",
  4,
  "isRoot",
  false,
  "execCtx",
  0
]

Hit 2 at point Point:240 @ 2676.21ms:
Expression: ["ping", pingedLanes, "suspended", root.suspendedLanes, "renderLanes", workInProgressRootRenderLanes, "exit", workInProgressRootExitStatus, "isRoot", workInProgressRoot === root, "execCtx", executionContext]
Result: [
  "ping",
  512,
  "suspended",
  524800,
  "renderLanes",
  0,
  "exit",
  4,
  "isRoot",
  false,
  "execCtx",
  0
]

Hit 3 at point Point:241 @ 2841.09ms:
Expression: ["ping", pingedLanes, "suspended", root.suspendedLanes, "renderLanes", workInProgressRootRenderLanes, "exit", workInProgressRootExitStatus, "isRoot", workInProgressRoot === root, "execCtx", executionContext]
Result: [
  "ping",
  1048576,
  "suspended",
  1049600,
  "renderLanes",
  0,
  "exit",
  4,
  "isRoot",
  false,
  "execCtx",
  0
]

// snip
```
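Those lane values are plain bitmasks, so an agent (or a human reading along) can decode them directly. A quick sketch of the arithmetic on the hits above; the interpretation comments are mine:

```javascript
// Lane values captured by the Logpoint hits above, read as bitmasks.
const hit1 = { ping: 524288, suspended: 524800 };   // ping = 2^19, suspended = 2^19 | 2^9
const hit3 = { ping: 1048576, suspended: 1049600 }; // ping = 2^20, suspended = 2^20 | 2^10

// In each hit, the pinged lane is still marked suspended on the root,
// even though no render is in progress (renderLanes 0, exit status 4):
console.log((hit1.suspended & hit1.ping) === hit1.ping); // true
console.log((hit3.suspended & hit3.ping) === hit3.ping); // true

// Stripping the pinged lane leaves one extra suspended lane per hit:
console.log(hit1.suspended & ~hit1.ping); // 512  (lane 9)
console.log(hit3.suspended & ~hit3.ping); // 1024 (lane 10)
```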
Replay MCP: The Time-Travel Superpower Your Agents Need
I joined Replay because I saw the potential and promise of time-travel debugging and how it can make the debugging process drastically faster and better. It enables solving bugs that are otherwise impossible, and provides insights into real runtime behavior that no other monitoring or analysis tool can provide.
Replay DevTools gave humans the ability to investigate and solve bugs with time travel.
Now, Replay MCP gives your agents the same time-travel superpowers.
You can add Replay MCP to your own agents and workflows today! Plug it in, make a recording of a bug or a failed test, and let your agent do the time-travel investigative work for you.
And, coming soon: we’re working on expanding our existing E2E Playwright and Cypress recording integrations to automatically investigate test failures and provide analysis and recommended fixes! This will help your team fix flaky tests, ensure PR code quality, and improve shipping velocity.
Try out Replay MCP and our E2E Test Suites integrations today, and stay tuned - we’ve got a lot more awesome time-travel debugging capabilities on the way!