Replay Time Travelogue: Improving Nadia’s “Debugging with AI” Results Using Replay MCP
In my previous post How Replay MCP Helped Find a React Bug Faster Than Dan Abramov Did, I showed how agents were able to successfully identify and fix a difficult React internals bug using just Replay recordings of the reproduction, and how different prompts and instructions resulted in faster analysis results.
However, most of us aren’t actually working on React’s internals :) We’re using React in apps, and dealing with app-level concerns: page loading times, network requests, good UX, and data correctness. What happens when you use AI to debug those kinds of problems? Can Replay MCP make agents better at finding real-world app bugs?
Background: Nadia’s “Debugging with AI” Example
Nadia Makarevich is author of the book Advanced React, as well as the excellent blog Developer Way, where she’s published numerous posts on React’s behavior and usage. She recently wrote a post titled “Debugging with AI: Can It Replace an Experienced Developer?”, where she tested LLM debugging abilities against an example Next.js app that had several representative bugs inserted to recreate problems she’d seen in real apps. She talked through her instructions to the LLM, how well it did solving each bug, and described where it came up with insufficient answers or just failed to solve the problem at all.
Nadia used Opus 4.5 to do realistic “blind” investigations on three different app bugs: a Zod schema error, a double loading skeleton, and a Next.js redirect error. She graded the LLM’s results in three ways:
- Did the LLM fix the problem?
- Did the LLM identify the root cause of the problem correctly?
- Did the LLM fix the root cause of the problem correctly?
Her results and final analysis indicated that while the LLM had some success, it struggled to understand the deeper architectural issues and either failed to identify the root cause or gave misleading answers:
| | User Page is Broken | Double skeletons problem | Weird Redirect Error |
|---|---|---|---|
| Did LLM fix the problem? | ✅ | ✅ | ❌ |
| Did LLM identify the root cause of the problem correctly? | ✅ | ❌ | ❌ |
| Did LLM fix the root cause of the problem correctly? | ✅ | ❌ | ❌ |
As she concluded:
💡 So, what's the verdict? Can AI replace an experienced developer for debugging?
No, of course not. AI is very good at pattern recognition, and it can be very, very useful. It will do exceptionally well on standard stuff: schema validation mishaps, forgotten null checks, common runtime errors, you've seen it for yourself already. But even if the fix works, I always retrace it step by step and make sure it's the actual root cause. If the fix doesn't work, however, I almost never iterate with the LLM, other than asking it to find me something in the repo. As you've seen, half of its responses in this case will be very confident hallucinations.
When the problem requires actual understanding of why the system behaves the way it does, or how it should behave, especially from a future or user perspective, the AI fails.
And the skill here isn't knowing how to prompt better. It's knowing when to stop prompting and start thinking.
I took this as a challenge :) I wanted to see if adding Replay recordings and Replay MCP to the mix would help an agent solve these problems any better. I also wanted to see if improving prompts and providing more context could produce better results.
Agentic Time-Travel Debugging with Replay MCP
Let’s step back and recap what Replay MCP actually is.
Replay is a time-travel debugger for web apps. The Replay recording browser captures DVR-style recordings of an entire web app. Because we capture the entire browser’s behavior and inputs, we can replay the app’s execution exactly the way it ran during the original recording. Unlike session replay or prod monitoring tools, Replay lets you do time-travel debugging and inspect the app’s behavior at any point in time: see which lines of code executed, evaluate log statements for every time a line ran, view the DOM tree and React component tree at any point in time, and much more. This makes it possible to investigate and solve bugs in ways no other tool can.
Replay DevTools is our debugging UI for humans - browser devtools with time-travel built in. Replay MCP gives agents those same time-travel debugging capabilities. Agents can open a recording and use the MCP tools to investigate the same way a human would: looking at console messages, adding logpoints to evaluate expressions each time a line of code ran, getting screenshots and stack traces, and getting framework-specific insights into libraries like React, Redux, Zustand, and TanStack Query. This means agents can now do the investigation work for you automatically!
Investigation Process and Setup
For this experiment, I used my own personal agent setup: OpenCode 1.4 and Opus 4.6. I have some file search and context management plugins enabled, but otherwise no specific skills or custom behaviors.
I wanted to run this as a proper experiment: how much do prompts and investigation instructions matter, and how much does Replay MCP help? Bug #1 was pretty trivial, but for Bug #2 and Bug #3, I kicked off four parallel agent investigation sessions. Each agent session was given the same access to the Replay recordings of the bug and a local copy of the demo app source, but with varying prompt instructions:
- Light details, bug repro, only told to “write a bug report with a root cause and suggested fix”
- Same repro steps, but a detailed 8-step investigation methodology including explaining why the problem is happening
- Additional context on the app’s architecture and patterns
- Additional relevant framework skills or concepts
How would they do? What differences would we see between them in results or investigation times?
Bug #1: Zod Schema Error in a User Profile
Bug #1 was pretty trivial. Navigate to /users/1, and there’s a Zod schema error shown in both a UI error overlay and in the console messages. Nadia’s initial prompt was:
💡 When I navigate to the /users/1 route, the page shows a "Something went wrong" error. In the browser console, I see the following errors [copy-paste the entire error stack]. Server logs are clean. Can you fix it?
Her agent found this error immediately and correctly identified the fix: making the user address fields optional, and handling the type changes when rendering user.address.
My agent was given somewhat more detailed instructions on investigating and using the MCP, but this one was a gimme anyway. It called the RecordingOverview MCP tool, immediately saw the Zod error, used Screenshot to confirm the UI overlay, then used the NetworkRequest tool to check the actual API response and confirmed the address and phone fields were missing despite the schema marking them as required. It nailed the issue and cause in about 2 minutes.
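To make the mismatch concrete, here is a minimal, dependency-free sketch of the failure mode. It is not Zod and not the demo app's code: `findMissingFields` and the field lists are hypothetical stand-ins, and the response object is trimmed down from the one the agent reported.

```typescript
// Hand-rolled stand-in for a schema check (an illustration, not the Zod API).
type Issue = { path: string; message: string };

function findMissingFields(
  data: Record<string, unknown>,
  required: readonly string[],
): Issue[] {
  return required
    .filter((key) => data[key] === undefined)
    .map((key) => ({ path: key, message: "required field is missing" }));
}

// Trimmed version of the reported API response: no phone, no address.
const apiResponse: Record<string, unknown> = {
  id: 1,
  name: "John Doe",
  email: "john.doe@example.com",
  department: "Engineering",
};

// Strict check: phone and address are required, so validation reports two issues.
const strictIssues = findMissingFields(apiResponse, [
  "id",
  "name",
  "email",
  "phone",
  "address",
]);

// Relaxed check: phone and address are treated as optional, so nothing is reported.
const relaxedIssues = findMissingFields(apiResponse, ["id", "name", "email"]);

console.log(strictIssues.map((issue) => issue.path)); // paths: phone, address
console.log(relaxedIssues.length); // 0
```

Either side of the contract can change to make the check pass, which is why the agent's report offers both an API fix and a schema fix.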
Bug #1: Agent prompt
Task: Debug a Bug in a Replay Recording
Recording ID: b8c1cf71-0c97-40c2-8b46-93704e6e8407
You have access to a Replay.io recording of a Next.js application that has a bug. The bug manifests after the user navigates to the "User Profile" page.
Your Mission
Investigate the recording to:
- Identify what the bug is — what goes wrong from the user's perspective?
- Find the root cause — trace it back to the specific code/logic error
- Suggest a fix — what code change would resolve this?
Approach
Start with RecordingOverview to get the lay of the land, then use whatever Replay tools make sense:
- Screenshot to see what the user saw
- UserInteractions to understand what they did
- ConsoleMessages for errors/warnings
- NetworkRequest for failed/slow requests
- ReactComponentTree / ReactRenders for component-level issues
- UncaughtException / ReactException for crashes
- SearchSources / ReadSource / DescribePoint / Evaluate for code-level investigation
Work methodically. Don't assume what the bug is — discover it from the recording evidence.
Deliverable
Write up your findings:
- Bug description — what the user experiences
- Root cause — the specific code/logic error, with evidence from the recording
- Suggested fix — concrete code change
- Key evidence trail — the chain of observations that led to the diagnosis
Keep it concise but thorough. This is a demo of Replay's debugging capabilities.
Bug #1: Agent final analysis
As expected — API routes run on the server and aren't in the browser recording. But we already have the response body, which is the definitive evidence.
Here's the complete bug analysis:
Bug Report
1. Bug Description
When the user navigates to the User Profile page (by clicking around ~7s into the recording), the page crashes and shows a Next.js error boundary instead of the user's profile. The user sees an error screen rather than their profile information.
2. Root Cause
Schema-API mismatch. The UserSchema Zod validator in src/lib/schemas.ts:238-255 requires fields that the /api/user endpoint doesn't return.
The schema expects (among other fields):
- phone: z.string() (line 242) — required string
- address: z.object({...}) (line 246-252) — required object with street, city, state, postalCode, country
The API returns:
{
"id": 1,
"name": "John Doe",
"email": "john.doe@example.com",
"department": "Engineering",
"avatarUrl": null,
"role": "admin",
"joinedAt": "2023-03-15T09:00:00Z",
"lastLoginAt": "2024-01-28T14:32:00Z"
}
No phone field. No address field.
The crash chain:
- User navigates to the profile page
- UserPage component (line 56) calls useUser() → TanStack Query calls fetchUser()
- fetchUser() in src/queries/useUser.ts:11 calls UserSchema.parse(data) on the API response
- Zod throws ZodError with two issues: phone expected string got undefined, address expected object got undefined
- TanStack Query surfaces this as error in the query result
- UserPage (line 62-63) checks if (error) and re-throws it: throw error
- Next.js ErrorBoundaryHandler catches the thrown error and renders the error screen
- Lines 66+ (actual profile rendering) execute 0 times — never reached
3. Suggested Fix
Option A: Fix the API — Add the missing phone and address fields to the /api/user response:
// In the API route handler for /api/user
return Response.json({
id: 1,
name: "John Doe",
email: "john.doe@example.com",
phone: "+1-555-0123", // ADD
department: "Engineering",
avatarUrl: null,
role: "admin",
address: { // ADD
street: "123 Main St",
city: "San Francisco",
state: "CA",
postalCode: "94105",
country: "US",
},
joinedAt: "2023-03-15T09:00:00Z",
lastLoginAt: "2024-01-28T14:32:00Z",
});
Option B: Fix the schema — Make phone and address optional in UserSchema so it tolerates the current API response:
// src/lib/schemas.ts
export const UserSchema = z.object({
id: z.number(),
name: z.string(),
email: z.string().email(),
phone: z.string().optional(), // was: z.string()
avatarUrl: z.string().url().nullable(),
role: z.enum(['admin', 'manager', 'user']),
department: z.string().nullable(),
address: z.object({ // was: z.object({...})
street: z.string(),
city: z.string(),
state: z.string(),
postalCode: z.string(),
country: z.string(),
}).optional(), // ADD .optional()
joinedAt: z.string(),
lastLoginAt: z.string().nullable(),
});
Option A is better if the data should exist. Option B is better if the data is genuinely optional.
4. Key Evidence Trail
| Step | Tool | Finding |
|---|---|---|
| 1 | RecordingOverview | ZodError for phone and address fields in console errors at ~8.2s |
| 2 | ConsoleMessages (errors) | Error originates in fetchUser at useUser.ts:15, caught by <UserPage> error boundary |
| 3 | UserInteractions | User clicks at ~7s (navigating to profile page) |
| 4 | NetworkRequest (detail for /api/user) | Response body has no phone or address fields, returns 200 OK |
| 5 | ReadSource (schemas.ts:238) | UserSchema requires phone: z.string() and address: z.object({...}) — both non-optional |
| 6 | ReadSource (useUser.ts) | fetchUser() calls UserSchema.parse(data) which throws on the missing fields |
| 7 | ReadSource (UserPage.tsx:55) | Component re-throws the query error at line 63; lines 66+ (profile rendering) have 0 execution hits — never reached |
| 8 | Screenshot at 15s | Confirms user sees an error boundary crash screen, not the profile |
Bug #2: User Page Double Loading Skeleton
After fixing the first issue, Nadia described the second bug in the app: throttle the network connection, navigate from the root page to a user profile, and see two different loading skeletons flash on screen before the final content is displayed. But loading the user profile directly only shows one skeleton.
Nadia’s prompt was:
💡 When I'm on the root page and navigate to the User page, I see two different loading skeletons appear. However, when I just refresh the User page, I see only one. Why is that and how to fix it?
She reported that her agent came up with a variety of answers, several of which involved switching from TanStack Query’s useQuery hook to useSuspenseQuery. That answer is half right, but switching to useSuspenseQuery uncovered a second problem.
The double skeletons had to do with mixing Next’s loading.tsx page Suspense boundary, with an app-level skeleton component based on the TSQ isLoading flag. However, just switching to useSuspenseQuery introduces a hydration mismatch error down the line. The real fix is architectural - actually prefetching data on the server, and hydrating the query hook on the client side with the same data.
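One rough way to see why the two skeletons appear is to model what each phase of a client-side navigation puts on screen. The sketch below is a simplified simulation, not Next.js or TanStack Query code: `framesForNavigation`, the strategy names, and the frame labels are all hypothetical, and it assumes the route's loading.tsx fallback is always visible while the server payload loads. It also does not model the hydration mismatch that a bare useSuspenseQuery switch can introduce.

```typescript
type Strategy = "useQuery" | "useSuspenseQuery" | "serverPrefetch";
type Frame = "route-skeleton" | "component-skeleton" | "content";

// Returns the distinct UI states shown, in order, during a client-side navigation.
function framesForNavigation(strategy: Strategy): Frame[] {
  const frames: Frame[] = [];
  const show = (frame: Frame): void => {
    // Only record a frame when the visible UI actually changes.
    if (frames[frames.length - 1] !== frame) frames.push(frame);
  };

  // Phase 1: the route segment suspends while the server payload loads,
  // so the loading.tsx fallback is on screen.
  show("route-skeleton");

  // Phase 2: the page component mounts and asks for its data.
  if (strategy === "useQuery") {
    // The non-suspense hook reports isLoading, so the component
    // renders its own, different skeleton.
    show("component-skeleton");
  } else if (strategy === "useSuspenseQuery") {
    // The component suspends again, so the same route fallback stays up.
    // The fetch still starts only after mount (a waterfall).
    show("route-skeleton");
  }
  // serverPrefetch: the data is already in the cache, nothing extra to show.

  // Phase 3: data is available and the real page renders.
  show("content");
  return frames;
}

console.log(framesForNavigation("useQuery")); // three frames, two of them skeletons
console.log(framesForNavigation("serverPrefetch")); // two frames, one skeleton
```

In this model useSuspenseQuery and server prefetching look identical on screen; the difference is that only prefetching removes the client-side fetch that starts after mount.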
Given that, I wanted to see: would an agent with Replay MCP actually find the real root cause and suggest server prefetching as the full solution?
I kicked off three agent sessions: Nadia’s baseline prompt, additional investigative method instructions, and method + some app architecture context. The results were partially an improvement on Nadia’s agent - here’s how they did.
Bug #2 Agent #1: Basic Prompt
Agent #1 started by inspecting the Replay recording to understand the overall behavior. It called RecordingOverview to get the initial details, UserInteractions to get a list of clicks, and made numerous Screenshot calls to get visuals on the navigation. It then used ListSources and SearchSources to identify the Users page routes and checked for Suspense handling.
A ReadSource call for UserPage.tsx returned hit counts showing it had rendered twice, once with isLoading: true. The agent then made multiple ReactRenders tool calls to check the actual render sequence and triggers. The render details showed a mixture of Suspense and loading-state based causes, confirming the mismatch. Finally, it was able to read the local repo files on disk to confirm the available hooks.
Overall, this agent did a reasonable actual investigation, came up with real evidence, wrapped up the search and reported. The answer was directionally correct, just not complete enough.
Total time: 5 minutes
Bug #2 Agent #1 Prompt
💡 Task: Debug a Bug in a Replay Recording
Recording ID: c832545f-a41b-4cd0-9e59-e229b7ca0b7f
You have access to a Replay.io recording of a web application that has a bug.
Bug report from user:
When I'm on the root page and navigate to the User page, I see two different loading skeletons appear. However, when I just refresh the User page, I see only one. Why is that and how to fix it?
Your Mission
Investigate the recording to:
Understand what's happening in both scenarios (navigation vs refresh)
Find the root cause of the double skeleton
Suggest a fix
Tools Available
Replay MCP tools for analyzing the recording (use replay_ prefixed tools)
The application source code is available at ~/projects/temp/debugging-with-ai if you need to read files that aren't accessible in the recording
Deliverable
Write up your findings:
Bug description — what the user experiences
Root cause — the specific code/logic error, with evidence
Suggested fix — concrete code change
Key evidence trail — the observations that led to your diagnosis
Bug #2 Agent #1 Solution
💡 Bug Report: Double Loading Skeleton on Client-Side Navigation to User Page
1. Bug Description
When navigating from the root page (/) to the User page (/users/[id]) via client-side navigation, the user sees two different loading skeletons in sequence:
First, a simpler skeleton from loading.tsx (avatar circle + name bar + content area)
Then, a more detailed skeleton from UserPageSkeleton inside the component (avatar + name + badge + separator + grid)
On a direct page refresh of the User page URL, only one skeleton appears. The visual "flash" between two different skeleton designs during navigation is jarring.
2. Root Cause
There are two independent loading mechanisms that fire sequentially during client-side navigation:
Mechanism 1 — Next.js loading.tsx Suspense boundary (src/app/users/[id]/loading.tsx)
The page component at src/app/users/[id]/page.tsx:5-10 is an async server component:
export default async function Page({ params }: { params: Promise<{ id: string }> }) {
  const { id } = await params; // <-- this `await` causes suspension
  return <UserPage userId={id} />;
}
During client-side navigation, Next.js needs to fetch the server component payload. The await params causes the route segment to suspend, which triggers the loading.tsx Suspense fallback. This renders Skeleton #1 (commit 135 at 8.89s — RenderSuspended trigger, 145 DOM mutations).
Mechanism 2 — Component-level isLoading guard (src/components/pages/UserPage.tsx:57-59)
const { data: user, isLoading, error } = useUser(); // useQuery, NOT useSuspenseQuery
if (isLoading) {
  return <UserPageSkeleton />; // <-- Skeleton #2
}
When the Suspense boundary resolves (server component payload arrives), UserPage mounts. The useQuery hook starts fetching /api/user and immediately returns isLoading: true, rendering Skeleton #2 (commit 136 at 9.02s — SuspenseResumed trigger).
Why refresh works fine: On a full page refresh, the async server component resolves server-side before sending HTML. The client only sees the useQuery loading state — there's no sequential Suspense → component loading transition.
3. Suggested Fix
The cleanest fix is to switch from useQuery to useSuspenseQuery so that the data fetching integrates with the existing Suspense boundary. The useUserSuspense hook already exists in src/queries/useUser.ts:22-27 but is unused.
In src/components/pages/UserPage.tsx:
- import { useUser } from '@/queries/useUser';
+ import { useUserSuspense } from '@/queries/useUser';
export function UserPage({ userId }: UserPageProps) {
- const { data: user, isLoading, error } = useUser();
-
- if (isLoading) {
-   return <UserPageSkeleton />;
- }
-
- if (error) {
-   throw error;
- }
+ const { data: user } = useUserSuspense();
With useSuspenseQuery:
The component suspends while data is loading, keeping the loading.tsx skeleton visible
When data arrives, the component renders with the actual user data — no intermediate skeleton
Error handling is automatic (suspense queries throw to the nearest error boundary)
The UserPageSkeleton function and isLoading branch can be removed entirely (or kept for other uses)
Both navigation and refresh paths show a single, consistent loading skeleton from loading.tsx
Alternative fix (if you want to keep useQuery): Delete src/app/users/[id]/loading.tsx so only the component-level skeleton appears. But this is worse because you'd lose the instant loading feedback during navigation — there'd be a blank delay before the component even mounts.
4. Key Evidence Trail
| Step | Evidence | Tool/Source |
|---|---|---|
| Identified two skeletons | React commits 135 + 136 in the recording | ReactRenders commits mode, time range 8.5-10s |
| Commit 135: RenderSuspended trigger | loading.tsx fallback shown with 145 DOM mutations at 8.89s | ReactRenders commit detail |
| Commit 136: SuspenseResumed trigger | UserPage mounts, renders UserPageSkeleton (8 Skeleton children) at 9.02s | ReactRenders commit detail |
| Commit 137: Data arrives | UserPage re-renders with actual user data at 9.72s | ReactRenders commit detail |
| useQuery vs useSuspenseQuery | useUser() uses useQuery — doesn't integrate with Suspense | Source: src/queries/useUser.ts |
| Unused suspense hook exists | useUserSuspense() already defined but not imported | Source: src/queries/useUser.ts:22-27 |
| page.tsx is async | await params causes server component to suspend during navigation | Source: src/app/users/[id]/page.tsx |
Bug #2 Agent #2: Investigative Process
Agent #2 was given a more detailed 8-step investigative process to follow. How much of a difference would that make?
Agent #2 started with the same initial tool calls: RecordingOverview to orient itself, UserInteractions and Screenshot to understand the usage. It also used NetworkRequest to check the timestamps of calls to /api/user and /api/current-user.
After calling ListSources to check the files in the recording, it opted to read them all from the local filesystem.
From there, it used ReactRenders to focus in on the critical Suspense-related renders and causes, but also started reading the various Next routes to see how they handled loading patterns. It identified that most routes did have an isLoading check, but only the users/[id] route had a loading.tsx file included, explaining why the bug is specific to the User page.
This agent came up with much better explanations and relevant details in its report. It identified a waterfall request happening, and found the loading skeleton mismatch. Its primary suggestion was the same incomplete useSuspenseQuery fix, but it did explicitly say “Ideal optimization would be server-side prefetching to eliminate the waterfall”.
Total time: 9 minutes
Bug #2 Agent #2 Prompt
💡 Task: Debug a Bug in a Replay Recording
Recording ID: c832545f-a41b-4cd0-9e59-e229b7ca0b7f
You have access to a Replay.io recording of a web application that has a bug.
Bug report from user:
When I'm on the root page and navigate to the User page, I see two different loading skeletons appear. However, when I just refresh the User page, I see only one. Why is that and how to fix it?
Your Mission
Investigate the recording to find the root cause and suggest a complete fix.
Tools Available
Replay MCP tools for analyzing the recording (use replay_ prefixed tools)
The application source code is available at ~/projects/temp/debugging-with-ai if you need to read files that aren't accessible in the recording
Investigation Process
Follow this structured approach:
Establish the user experience. Use screenshots and interaction timeline to understand exactly what happens visually in both scenarios (navigation vs refresh). Pin down timestamps.
Map the data flow. Identify what data the page needs, where it comes from, when fetches start relative to navigation, and what components consume the results. Use network requests + source code to build this picture.
Build the render timeline. Use React render analysis to understand what components mount, unmount, and re-render — in what order, triggered by what. Compare the component render sequence between the two scenarios if possible.
Compare the two scenarios explicitly. Navigation and refresh produce different results. What's concretely different in the execution path? Don't just explain one path — explain why they diverge.
Find the proximate cause, then go deeper. After identifying the immediate code-level issue, ask yourself: "Does my proposed fix fully resolve the problem, or does it just reduce the symptom?" Consider:
Check for patterns. Look at how other pages/routes in the app handle data fetching and loading states. If there's a pattern that works correctly elsewhere, the fix should align with that pattern.
Consider framework idioms. Is the code using the framework's intended data flow patterns? If the framework provides a mechanism for this exact scenario, is it being used?
Red-team your conclusion. Before presenting your fix, challenge it: What would still be suboptimal after applying it? Is there a remaining waterfall or unnecessary loading state? Would an expert reviewing this fix say "yes, but you should also..."?
Deliverable
Write up your findings:
Bug description — what the user experiences in each scenario
Root cause analysis — the specific code/logic/architectural error, with evidence from the recording
Suggested fix — concrete code change(s). If there are multiple layers to the fix (immediate + architectural), describe both.
Key evidence trail — the chain of observations that led to your diagnosis
Remaining concerns — anything that your fix doesn't address, or that you're uncertain about
Bug #2 Agent #2 Solution
💡 Bug Report: Double Loading Skeleton on User Page Navigation
1. Bug Description
What the user sees:
Navigation (root → User page): Two visually distinct loading skeletons appear in sequence:
Page refresh on User page: Only one loading skeleton appears before the data renders.
The two skeletons have different layouts, sizes, and levels of detail, making the transition jarring.
2. Root Cause Analysis
The bug is a conflict between two independent loading mechanisms that both activate during client-side navigation but not during SSR/refresh.
The two loading sources:
| # | Source | File | Trigger |
|---|---|---|---|
| 1 | Next.js Suspense boundary | src/app/users/[id]/loading.tsx | Route navigation triggers Suspense while RSC payload + JS chunks load |
| 2 | Client-side isLoading check | src/components/pages/UserPage.tsx:63-65 | useUser() (non-suspense useQuery) starts fetching /api/user after mount |
Why navigation shows two skeletons:
User clicks "User" link → Next.js starts client-side route transition
Commit 135 (8.89s): Next.js hits the Suspense boundary at users/[id]/loading.tsx, renders Skeleton #1 while loading the RSC payload and JS chunks
Commit 136 (9.02s): RSC resolves, UserPage component mounts. But useUser() uses useQuery (not suspense-aware), so it returns isLoading: true immediately and renders Skeleton #2 (UserPageSkeleton)
Commit 137 (9.72s): /api/user response arrives (fetched at 9172ms, took 548ms), UserPage re-renders with actual data
Why refresh shows one skeleton:
On a full page load, Next.js SSR renders the server component synchronously. The loading.tsx Suspense boundary either doesn't trigger or resolves instantly during SSR. The client hydrates with UserPage already mounted, and only the client-side useQuery loading state is visible.
The architectural error:
UserPage uses useQuery (non-suspense) despite having a Next.js loading.tsx Suspense boundary in the route. This creates two competing loading state mechanisms. The codebase already has useUserSuspense() (using useSuspenseQuery) in src/queries/useUser.ts — it's just not being used.
3. Suggested Fix
Primary fix: Switch UserPage to use useSuspenseQuery and remove the manual loading state handling. This lets the existing loading.tsx Suspense boundary handle all loading UI.
src/components/pages/UserPage.tsx:
- import { useUser } from '@/queries/useUser';
+ import { useUserSuspense } from '@/queries/useUser';
export function UserPage({ userId }: UserPageProps) {
- const { data: user, isLoading, error } = useUser();
-
- if (isLoading) {
-   return <UserPageSkeleton />;
- }
-
- if (error) {
-   throw error;
- }
+ const { data: user } = useUserSuspense();
  if (!user) {
The UserPageSkeleton function (lines 27-55) can also be removed from this file since it's no longer used — the loading.tsx skeleton handles that role.
Additional consideration: The loading.tsx skeleton is currently simpler/smaller than UserPageSkeleton. You may want to update src/app/users/[id]/loading.tsx to match the richer skeleton layout (24px avatar, separator, 2-column grid) so the loading state better matches the final rendered page.
Why this is the correct approach:
Other query files (useOrder.ts, useProduct.ts) also export both useX() and useXSuspense() variants — but those pages don't have loading.tsx files, so the non-suspense version works fine
The users/[id]/loading.tsx file exists specifically to be the loading UI for this route — it should be the single source of loading state
useSuspenseQuery integrates with React Suspense natively, which is exactly how Next.js loading.tsx works (it wraps the page in a <Suspense> boundary)
4. Key Evidence Trail
Screenshots at 8742ms vs 9187ms showed two visually different skeletons appearing in sequence
React render commits 135→136→137 showed the exact sequence: RenderSuspended (loading.tsx) → SuspenseResumed + UserPageSkeleton mount → StoreRerender with data
Network timeline showed /api/user fetch doesn't start until 9172ms (~580ms after click), proving the client-side fetch only begins after the component mounts
Source code confirmed useUser() uses useQuery while useUserSuspense() using useSuspenseQuery exists but is unused
File comparison showed users/[id]/loading.tsx exists but orders/[id]/loading.tsx and products/[id]/loading.tsx do not — explaining why this bug is unique to the User page
5. Remaining Concerns
Waterfall still exists with useSuspenseQuery: The /api/user fetch still won't start until the client component mounts after RSC resolution. During the Suspense phase, the loading.tsx skeleton shows while the RSC payload loads, then Suspense resolves, UserPage mounts, useSuspenseQuery triggers the fetch, and Suspense suspends again while the fetch completes. This means the loading.tsx skeleton appears for the full duration (RSC + API fetch), which is better UX (one consistent skeleton) but the fetch itself is still sequential after RSC.
Ideal optimization would be server-side prefetching: The best Next.js pattern would be to prefetch user data on the server (e.g., using TanStack Query's prefetchQuery in the server component at users/[id]/page.tsx) and dehydrate it. This would eliminate the client-side fetch waterfall entirely — the data would already be in the cache when useSuspenseQuery runs, so no second Suspense suspension would occur.
loading.tsx skeleton fidelity: The current loading.tsx skeleton is less detailed than UserPageSkeleton. After the fix, users will see the simpler skeleton for the entire loading duration, which might feel like a regression in loading UI quality. Consider updating loading.tsx to match the richer layout.
Bug #2 Agent #3: App Architecture Summary
Agent #3 was given the same investigative steps, but also a few bullet points summarizing the known packages and structure of this app (Next.js + TanStack Query, mix of RSCs and hook loading).
Most of the investigative flow was the same: RecordingOverview, interactions, screenshots, sources, network requests. It then specifically browsed most of the query hook definitions and confirmed that Suspense-enabled query hooks existed, as well as checking more screenshots to confirm the loading skeleton visuals.
This agent came up with the fastest response. It still recommended useSuspenseQuery, but did also list server prefetching as a follow-up step. It also flagged OrderPage and ProductPage as having the same problematic pattern, and that the UserPageSkeleton component would be dead and could be removed.
Total time: 4 minutes
Bug #2 Agent #3 Prompt
💡 Task: Debug a Bug in a Replay Recording
Recording ID: c832545f-a41b-4cd0-9e59-e229b7ca0b7f
You have access to a Replay.io recording of a web application that has a bug.
Bug report from user:
When I'm on the root page and navigate to the User page, I see two different loading skeletons appear. However, when I just refresh the User page, I see only one. Why is that and how to fix it?
Your Mission
Investigate the recording to find the root cause and suggest a complete fix.
Tools Available
Replay MCP tools for analyzing the recording (use replay_ prefixed tools)
The application source code is available at ~/projects/temp/debugging-with-ai if you need to read files that aren't accessible in the recording
Application Context
Pre-analysis of the repository shows:
Framework: Next.js 14+ using the App Router (app/ directory)
Data fetching: TanStack Query (React Query) for client-side data management
Component model: Mix of React Server Components (RSC) and Client Components
Loading states: Route-level loading.tsx files provide framework-managed loading UI
Route structure: The User Profile page is at src/app/users/[id]/, with its own loading.tsx
Query hooks: The app defines both standard and suspense-aware variants of its query hooks (e.g. useUser and useUserSuspense)
Investigation Process
Follow this structured approach:
Establish the user experience. Use screenshots and interaction timeline to understand exactly what happens visually in both scenarios (navigation vs refresh). Pin down timestamps.
Map the data flow. Identify what data the page needs, where it comes from, when fetches start relative to navigation, and what components consume the results. Use network requests + source code to build this picture.
Build the render timeline. Use React render analysis to understand what components mount, unmount, and re-render — in what order, triggered by what. Compare the component render sequence between the two scenarios if possible.
Compare the two scenarios explicitly. Navigation and refresh produce different results. What's concretely different in the execution path? Don't just explain one path — explain why they diverge.
Find the proximate cause, then go deeper. After identifying the immediate code-level issue, ask yourself: "Does my proposed fix fully resolve the problem, or does it just reduce the symptom?" Consider:
Check for patterns. Look at how other pages/routes in the app handle data fetching and loading states. If there's a pattern that works correctly elsewhere, the fix should align with that pattern.
Consider framework idioms. Is the code using the framework's intended data flow patterns? If the framework provides a mechanism for this exact scenario, is it being used? Pay particular attention to the boundary between server and client — where does data originate, and is it being passed efficiently across that boundary?
Red-team your conclusion. Before presenting your fix, challenge it: What would still be suboptimal after applying it? Is there a remaining waterfall or unnecessary loading state? Would an expert reviewing this fix say "yes, but you should also..."?
Deliverable
Write up your findings:
Bug description — what the user experiences in each scenario
Root cause analysis — the specific code/logic/architectural error, with evidence from the recording
Suggested fix — concrete code change(s). If there are multiple layers to the fix (immediate + architectural), describe both.
Key evidence trail — the chain of observations that led to your diagnosis
Remaining concerns — anything that your fix doesn't address, or that you're uncertain about
Bug #2 Agent #3 Solution
💡 Bug Report: Double Loading Skeleton on Navigation to User Page
1. Bug Description
What the user experiences:
Navigation (root → User page): Two visually distinct loading skeletons appear in sequence:
Hard refresh on User page: Only one skeleton appears (the UserPageSkeleton from useUser()'s loading state). The transition is clean.
2. Root Cause Analysis
The problem has two layers:
Layer 1: Mismatched Suspense boundary (wrong skeleton appears)
During client-side navigation, Next.js lazy-loads the page component's JS chunk. While the chunk for app/users/[id]/page.js downloads (~323ms), React suspends. The framework looks for the nearest Suspense boundary with a fallback - but the route-level app/users/[id]/loading.tsx hasn't loaded yet either. So React falls back to app/loading.tsx (the root loading file), which shows a dashboard-style skeleton for a user profile page. This is visually jarring and contextually wrong.
Evidence from React commits:
Commit 135 (8.89s): RenderSuspended - React.lazy init for page.js chunk. 145 DOM mutations - this is the heavy swap to the root skeleton.
Commit 136 (9.02s): SuspenseResumed - chunk loaded, UserPage mounts.
Layer 2: Non-suspense data fetching creates a second loading phase
UserPage uses useUser() - a standard useQuery hook that returns { isLoading: true } initially. The component manually checks isLoading and renders <UserPageSkeleton />. This creates a sequential waterfall:
Navigation click (8.59s) → RSC navigation (8.67s) → Page JS chunk download (323ms) → Root skeleton shown (Skeleton #1) → UserPage mounts (9.02s) → useUser() starts fetch → UserPageSkeleton shown (Skeleton #2, 500ms API delay) → Data arrives, final render (9.72s)
On hard refresh, the JS chunk is included in the initial SSR payload, so the chunk-loading Suspense phase is skipped entirely. The user only sees the useUser() loading state.
Evidence from network timeline:
/api/user fetch doesn't start until 9172ms - 505ms after the navigation click
The 500ms API delay is on top of the chunk-loading waterfall
3. Suggested Fix
The app already has the correct solution defined but unused. Every query hook file exports a useSuspenseQuery variant. The fix is to use it.
Primary fix: Switch UserPage to use useUserSuspense()
src/components/pages/UserPage.tsx:

    // Change import:
    - import { useUser } from '@/queries/useUser';
    + import { useUserSuspense } from '@/queries/useUser';

    // Change hook usage in UserPage component:
    - const { data: user, isLoading, error } = useUser();
    -
    - if (isLoading) {
    -   return <UserPageSkeleton />;
    - }
    -
    - if (error) {
    -   throw error;
    - }
    -
    - if (!user) {
    + const { data: user } = useUserSuspense();
    +
    + if (!user) {

This change:
Makes data fetching participate in React Suspense
Next.js's app/users/[id]/loading.tsx becomes the single loading boundary for both chunk loading AND data fetching
The user sees one contextually-correct skeleton (user profile card) instead of two mismatched ones
The UserPageSkeleton inline component and isLoading check become unnecessary (can be removed)
Error handling is automatic - useSuspenseQuery throws on error, which React error boundaries catch
Same pattern applies to other pages
OrderPage and ProductPage have the identical issue. They should also switch to their suspense variants (useOrderSuspense, useProductSuspense).
4. Key Evidence Trail
Screenshots at 9063ms vs 9187ms showed two visually distinct skeletons in rapid succession
React render commits 135-137 confirmed the three-phase sequence: suspend (chunk) → resume (mount + fetch) → data arrives
Network timeline showed /api/user starting at 9172ms, 505ms after click - confirming the waterfall
Source code revealed useUser() (non-suspense) being used despite useUserSuspense() being defined in the same file
app/loading.tsx confirmed the root skeleton is a dashboard layout - wrong for the user page context
All query hooks export unused suspense variants - the codebase was designed for this pattern but didn't follow through
5. Remaining Concerns
Other pages: OrderPage and ProductPage use the same anti-pattern (useOrder/useProduct + isLoading + manual skeleton). They'll exhibit the same double-skeleton bug on client-side navigation. The fix should be applied consistently.
Server-side prefetching: Even with useSuspenseQuery, the data fetch still starts client-side after the component mounts. The ideal Next.js pattern would be to prefetch on the server (in the RSC page component) and dehydrate into the client, so the TanStack Query cache is warm before the client component mounts. This would eliminate the data-fetching loading state entirely on navigation. But switching to useSuspenseQuery is the correct first step - it fixes the double-skeleton bug and properly integrates with framework loading boundaries.
UserPageSkeleton component: After the fix, the inline UserPageSkeleton in UserPage.tsx becomes dead code. The route-level app/users/[id]/loading.tsx handles all loading UI. The inline skeleton should be removed to avoid confusion.
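To make the "two loading mechanisms" idea concrete, here is a small toy model of a client-side navigation. It is only a sketch: `Pending`, `LoadState`, `renderRoute`, and the screen names are made up for illustration, and none of this is how React Suspense or TanStack Query are actually implemented. It only shows why a component that manages its own loading state produces a second, different skeleton, while a component that suspends keeps showing the same route-level fallback.

```typescript
// Toy model only: it mimics the shape of the two loading paradigms and is
// not how React Suspense or TanStack Query work internally.

// Hypothetical marker thrown by a "render" that is not ready yet.
class Pending {}

type Screen = "route fallback" | "UserPageSkeleton" | "user profile";

interface LoadState {
  chunkLoaded: boolean; // has the page's JS chunk arrived?
  dataLoaded: boolean; // has the user data arrived?
}

// Component-managed loading: stands in for useQuery + a manual isLoading check.
function pageWithIsLoading(s: LoadState): Screen {
  if (!s.chunkLoaded) throw new Pending();
  return s.dataLoaded ? "user profile" : "UserPageSkeleton";
}

// Suspense-managed loading: stands in for useSuspenseQuery.
function pageWithSuspense(s: LoadState): Screen {
  if (!s.chunkLoaded || !s.dataLoaded) throw new Pending();
  return "user profile";
}

// Stand-in for the route-level boundary: anything pending shows its fallback.
function renderRoute(page: (s: LoadState) => Screen, s: LoadState): Screen {
  try {
    return page(s);
  } catch (thrown) {
    if (thrown instanceof Pending) return "route fallback";
    throw thrown;
  }
}

// The three moments of a client-side navigation, in order.
const navigation: LoadState[] = [
  { chunkLoaded: false, dataLoaded: false },
  { chunkLoaded: true, dataLoaded: false },
  { chunkLoaded: true, dataLoaded: true },
];

// Collapse consecutive identical screens into what the user perceives.
function screensSeen(page: (s: LoadState) => Screen): Screen[] {
  const seen: Screen[] = [];
  for (const step of navigation) {
    const screen = renderRoute(page, step);
    if (seen[seen.length - 1] !== screen) seen.push(screen);
  }
  return seen;
}

console.log(screensSeen(pageWithIsLoading)); // two different skeletons, then content
console.log(screensSeen(pageWithSuspense)); // one fallback, then content
```

The model deliberately ignores the remaining waterfall: in both variants the data still only starts loading after the chunk arrives.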
Bug #2: Initial Results
After running those three sessions, I was pleased to see that the agents had at least found the initial problems and identified the same useSuspenseQuery fix that Nadia’s agent had suggested. But, as she pointed out, that “fix” is incomplete and actually leads to hydration errors.
Meanwhile, it was clear that adding more codebase context and better instructions produces higher quality answers and faster results. Even just adding a few bullet points describing the rough loading patterns used in the app let Agent #3 zero in on the problem much faster.
So, what if I tried giving the agent more info on the actual technologies used in this app? More specifically, what if I gave it access to the official SKILL.md files for Next.js and TanStack Query? How would that change the results? Would the agent be able to identify the real root cause and suggest server prefetching as an actual solution?
Bug #2 Agent #4: Skill Files
I spun up Agent #4 with the same instructions as before (bug report, investigation steps, app context), but added in two extra sections: “CI context” saying this was a manual repro of a known E2E test failure and listing project dependencies, and specific pointers to the SKILL files for TanStack Query, Next.js App Router data fetching, Next.js Suspense boundaries, and Vercel’s “React best practices”.
This agent started by actively reading the skill files. From there, most of the investigation followed the same paths: Replay MCP tools to scan the recording, get screenshots, network requests, sources, and React renders. It also cross-checked all of the query hook files and the different app pages.
However, this time the result was different. While it still recommended useSuspenseQuery as the primary fix, the agent explicitly recommended “Deeper fix: eliminate the waterfall with server-side prefetching” and gave a copy-pasteable example showing use of queryClient.prefetchQuery() in the server component that exactly matched Nadia’s recommended fix! That is the correct answer!
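As a rough illustration of why prefetching removes the loading state, here is a toy model of "fill the cache on the server, ship it to the client". `ToyCache` and all of its methods are hypothetical stand-ins, not the TanStack Query API, and the user data is made up. The only point is that a client component reading from an already-warm cache has nothing to wait for on its first render.

```typescript
// Toy model: a plain object stands in for the query cache. This is NOT the
// TanStack Query API; ToyCache and its methods are hypothetical.
class ToyCache {
  private entries: Record<string, unknown> = {};

  set(key: readonly unknown[], value: unknown): void {
    this.entries[JSON.stringify(key)] = value;
  }

  get(key: readonly unknown[]): unknown {
    return this.entries[JSON.stringify(key)];
  }

  // Serialize the cache so it can cross the server/client boundary.
  dehydrate(): string {
    return JSON.stringify(this.entries);
  }

  // Rebuild a cache on the other side from the serialized payload.
  static hydrate(payload: string): ToyCache {
    const cache = new ToyCache();
    cache.entries = JSON.parse(payload);
    return cache;
  }
}

// What the client component would show on its first render.
function firstRender(cache: ToyCache): string {
  const user = cache.get(["user"]) as { name: string } | undefined;
  return user ? `profile for ${user.name}` : "skeleton";
}

// Without prefetching, the client starts with an empty cache.
const coldClientCache = new ToyCache();

// With prefetching, the server fills the cache and ships it with the page.
const serverCache = new ToyCache();
serverCache.set(["user"], { name: "Example User" }); // made-up data
const warmClientCache = ToyCache.hydrate(serverCache.dehydrate());

console.log(firstRender(coldClientCache)); // "skeleton"
console.log(firstRender(warmClientCache)); // "profile for Example User"
```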
This agent also identified several other concerns, including the Order and Product pages having the same issue, the query key not including userId, and other loading state issues.
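The query key concern is easy to demonstrate in isolation. The sketch below uses a made-up `loadUser` helper and a plain object as the cache (neither is part of TanStack Query) to show what happens when every user page reads and writes the same key: the second page is served the first page's entry. Adding the id to the key separates the entries.

```typescript
// Toy stand-in for a keyed query cache; cacheKey and loadUser are hypothetical.
const cacheKey = (key: readonly unknown[]): string => JSON.stringify(key);

const nameCache: Record<string, string> = {};

function loadUser(key: readonly unknown[], fetchName: () => string): string {
  const k = cacheKey(key);
  if (!(k in nameCache)) nameCache[k] = fetchName(); // only "fetch" on a cache miss
  return nameCache[k];
}

// Key without the id: the second page reuses the first page's entry.
const sharedFirst = loadUser(["user"], () => "user 1");
const sharedSecond = loadUser(["user"], () => "user 2");

// Key with the id: each page gets its own entry.
const scopedFirst = loadUser(["user", 1], () => "user 1");
const scopedSecond = loadUser(["user", 2], () => "user 2");

console.log(sharedFirst, sharedSecond); // "user 1" "user 1" (second page sees the first page's entry)
console.log(scopedFirst, scopedSecond); // "user 1" "user 2"
```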
Total time: 5 minutes
Bug #2 Agent #4 Prompt
💡 Task: Debug a Bug in a Replay Recording
Recording ID: c832545f-a41b-4cd0-9e59-e229b7ca0b7f
You have access to a Replay.io recording of a web application that has a bug.
Bug report from user:
When I'm on the root page and navigate to the User page, I see two different loading skeletons appear. However, when I just refresh the User page, I see only one. Why is that and how to fix it?
Your Mission
Investigate the recording to find the root cause and suggest a complete fix.
Tools Available
Replay MCP tools for analyzing the recording (use replay_ prefixed tools)
The application source code is available at ~/projects/temp/debugging-with-ai if you need to read files that aren't accessible in the recording
CI Context
This was reported from a navigation E2E test for the User Profile page that detected unexpected loading state behavior. The recording captures a manual reproduction of the issue.
Key dependencies:
Next.js ^15.5.12 (App Router)
React ^19
@tanstack/react-query ^5.90.16
TypeScript ^5.3.0
Application Context
Pre-analysis of the repository shows:
Framework: Next.js 14+ using the App Router (app/ directory)
Data fetching: TanStack Query (React Query) for client-side data management
Component model: Mix of React Server Components (RSC) and Client Components
Loading states: Route-level loading.tsx files provide framework-managed loading UI
Route structure: The User Profile page is at src/app/users/[id]/, with its own loading.tsx
Query hooks: The app defines both standard and suspense-aware variants of its query hooks (e.g. useUser and useUserSuspense)
Technology Skills
You have been provided with best-practices skills for the technologies in this app. Read the smaller skill files before starting your investigation - they contain framework-specific patterns and idioms that may be relevant:
~/projects/temp/react-skills/skills/tanstack-query.md - TanStack Query patterns including SSR, hydration, and query hook variants
~/projects/temp/react-skills/skills/next-data-patterns.md - Next.js App Router data fetching patterns and decision trees
~/projects/temp/react-skills/skills/next-suspense-boundaries.md - Suspense boundary and CSR bailout rules
~/projects/temp/react-skills/skills/react-best-practices.md - Comprehensive React best practices (LARGE file, ~3800 lines). Do NOT read this fully - scan headers with rg "^#{1,3} " first, then read only sections relevant to the issue you're investigating.
Investigation Process
Follow this structured approach:
Establish the user experience. Use screenshots and interaction timeline to understand exactly what happens visually in both scenarios (navigation vs refresh). Pin down timestamps.
Map the data flow. Identify what data the page needs, where it comes from, when fetches start relative to navigation, and what components consume the results. Use network requests + source code to build this picture.
Build the render timeline. Use React render analysis to understand what components mount, unmount, and re-render - in what order, triggered by what. Compare the component render sequence between the two scenarios if possible.
Compare the two scenarios explicitly. Navigation and refresh produce different results. What's concretely different in the execution path? Don't just explain one path - explain why they diverge.
Find the proximate cause, then go deeper. After identifying the immediate code-level issue, ask yourself: "Does my proposed fix fully resolve the problem, or does it just reduce the symptom?" Consider:
Check for patterns. Look at how other pages/routes in the app handle data fetching and loading states. If there's a pattern that works correctly elsewhere, the fix should align with that pattern.
Consider framework idioms. Is the code using the framework's intended data flow patterns? If the framework provides a mechanism for this exact scenario, is it being used? Pay particular attention to the boundary between server and client - where does data originate, and is it being passed efficiently across that boundary?
Red-team your conclusion. Before presenting your fix, challenge it: What would still be suboptimal after applying it? Is there a remaining waterfall or unnecessary loading state? Would an expert reviewing this fix say "yes, but you should also..."?
Deliverable
Write up your findings:
Bug description — what the user experiences in each scenario
Root cause analysis - the specific code/logic/architectural error, with evidence from the recording
Suggested fix - concrete code change(s). If there are multiple layers to the fix (immediate + architectural), describe both.
Key evidence trail - the chain of observations that led to your diagnosis
Remaining concerns - anything that your fix doesn't address, or that you're uncertain about
Bug #2 Agent #4 Solution
💡 Bug Report: Double Loading Skeleton on User Page Navigation
1. Bug Description
What the user sees:
Navigating to the User page (e.g., from dashboard): Two visually distinct loading skeletons appear in sequence — a simpler card skeleton (~130ms), then a more detailed profile skeleton (~700ms) — before the actual content renders.
Refreshing the User page directly: Only one skeleton (the detailed profile skeleton) appears, then content renders.
The two skeletons have different layouts (different card structure, different number of skeleton elements), creating a jarring visual transition during navigation.
2. Root Cause Analysis
The bug stems from two independent, overlapping loading mechanisms competing for the same UI slot:
Loading mechanism #1: Next.js route-level loading.tsx
src/app/users/[id]/loading.tsx provides a Suspense fallback for the route segment. During client-side navigation, Next.js lazy-loads the page chunk via React.lazy. While the JS chunk is downloading (323ms), React suspends and shows this loading.tsx skeleton.
Loading mechanism #2: Component-level isLoading check
UserPage (a 'use client' component) uses useUser() - which is standard useQuery, NOT useSuspenseQuery. This means:
The component renders immediately with isLoading: true
It manually returns <UserPageSkeleton /> during loading
When the fetch completes, it re-renders with actual data
Why navigation shows two skeletons:
| Time | Event | Visible |
|---|---|---|
| 8588ms | User clicks navigation link | |
| 8667ms | RSC navigation fetch starts | |
| 8764ms | Page JS chunk fetch starts | |
| 8890ms | React.lazy suspends → loading.tsx shown | Skeleton #1 (route-level) |
| 9020ms | Chunk loaded, UserPage mounts, useUser() fires | |
| 9020ms | isLoading: true → UserPageSkeleton shown | Skeleton #2 (component-level) |
| 9172ms | /api/user fetch actually starts | |
| 9720ms | Data arrives, content renders | User profile |

Why refresh shows only one skeleton:
On a full page load, the page JS is served as part of the initial HTML/SSR bundle. There's no lazy-loading, so loading.tsx never triggers. UserPage mounts immediately, useUser() returns isLoading: true, and only the component-level UserPageSkeleton appears.
The architectural mismatch:
The real issue is that UserPage uses useQuery (non-suspense) while also having a route-level loading.tsx (which implies Suspense-based loading). These are two different loading paradigms:
loading.tsx = "the framework manages loading via Suspense boundaries"
useQuery + manual isLoading check = "the component manages its own loading state"
Using both creates the double-skeleton problem during navigation.
3. Suggested Fix
Immediate fix: Switch to useSuspenseQuery
The app already defines useUserSuspense() - it's just not being used. The fix is to use it, which unifies the loading mechanism under Suspense:
src/components/pages/UserPage.tsx:

    // BEFORE:
    import { useUser } from '@/queries/useUser';

    export function UserPage({ userId }: UserPageProps) {
      const { data: user, isLoading, error } = useUser();

      if (isLoading) {
        return <UserPageSkeleton />;
      }

      if (error) {
        throw error;
      }
      // ...
    }

    // AFTER:
    import { useUserSuspense } from '@/queries/useUser';

    export function UserPage({ userId }: UserPageProps) {
      const { data: user } = useUserSuspense();
      // No isLoading check needed - Suspense handles it
      // No error check needed - useSuspenseQuery throws to error boundary

      if (!user) {
        return (/* user not found UI */);
      }
      // ... rest of the render
    }

With useSuspenseQuery:
During navigation, when UserPage mounts, the query suspends → React shows the same loading.tsx fallback that was already showing for the chunk load. The user sees ONE continuous skeleton.
During refresh, the query suspends on first render → loading.tsx shows ONE skeleton.
Both paths now show the same single skeleton.
The UserPageSkeleton component and the isLoading branch can be removed entirely.
Deeper fix: Eliminate the waterfall with server-side prefetching
Even after the immediate fix, there's still a waterfall: the /api/user fetch only starts after the page JS loads and UserPage mounts (the fetch started at 9172ms, ~500ms after navigation). The ideal pattern in Next.js App Router is to prefetch data on the server and hydrate it:
src/app/users/[id]/page.tsx:

    import { HydrationBoundary, dehydrate } from '@tanstack/react-query';
    import { getQueryClient } from '@/providers/getQueryClient';
    import { UserPage } from '@/components/pages/UserPage';

    interface PageProps {
      params: Promise<{ id: string }>;
    }

    export default async function Page({ params }: PageProps) {
      const { id } = await params;
      const queryClient = getQueryClient();

      // Prefetch on the server - starts immediately with the RSC request
      await queryClient.prefetchQuery({
        queryKey: ['user'],
        queryFn: async () => {
          // Direct server-side data fetch (no /api roundtrip needed)
          const { getUser } = await import('@/lib/data');
          return getUser();
        },
      });

      return (
        <HydrationBoundary state={dehydrate(queryClient)}>
          <UserPage userId={id} />
        </HydrationBoundary>
      );
    }

This eliminates the waterfall by fetching data in parallel with (or even before) the client JS chunk load. When UserPage mounts with useSuspenseQuery, the data is already in the cache - no suspension, no skeleton at all during navigation.
4. Key Evidence Trail
Screenshots at 8829ms vs 9063ms — Two visually distinct skeletons confirmed on screen during navigation
React render commit 135 (8.89s) - RenderSuspended trigger: "React.lazy component initialized" → loading.tsx shown (skeleton #1)
React render commit 136 (9.02s) - SuspenseResumed, UserPage mounts, renders UserPageSkeleton due to isLoading: true (skeleton #2)
React render commit 137 (9.72s) - StoreRerender from TanStack Query → data arrives, profile renders
Network timeline - /api/user fetch starts at 9172ms, 500ms after navigation click, confirming client-side waterfall
Source code - UserPage uses useUser() (standard useQuery) not useUserSuspense(), despite useUserSuspense being defined and available
Pattern comparison - OrderPage and ProductPage have the same useQuery + manual isLoading pattern but do NOT have loading.tsx files, so they only show one skeleton. The user/[id] route is the only one with both mechanisms.
5. Remaining Concerns
OrderPage and ProductPage have the same anti-pattern (just less visible). They use useQuery + manual loading when they could use useSuspenseQuery + route-level loading.tsx for a more consistent architecture. The same waterfall exists for those routes - the API fetch only starts after the component mounts.
The useUser() query key ['user'] doesn't include userId. This means all user pages share the same cache entry regardless of which user ID is in the URL. The userId prop is passed to UserPage but never used in the query. This is likely a separate bug.
No server-side prefetching anywhere in the app. The getQueryClient.ts already has dehydrate configuration including shouldDehydrateQuery for pending queries, suggesting prefetching was planned but never implemented.
The loading.tsx skeleton and UserPageSkeleton show different layouts. Even after fixing the double-skeleton issue, you should decide which skeleton design you want and consolidate. If using the Suspense approach, only loading.tsx matters.
Bug #2 Conclusions
I like Nadia’s rubric for measuring success, but I think we can expand it a bit. There’s both “did it come up with the right answers?”, and “how did it get those answers?”. (This is where I always struggled as a kid in school - I wrote the right answers, why should I have to write down my work to show how I got there? 🤣)
Looking at the four agents, here’s how we’d grade their results:
| | Run 1 (baseline) | Run 2 (+methodology) | Run 3 (+context) | Run 4 (+skills) |
|---|---|---|---|---|
| Did it fix the problem? | ✅ | ✅ | ✅ | ✅ |
| Did it identify the root cause correctly? | ⚠️ Proximate only | ⚠️ Proximate only | ⚠️ Proximate only | ✅ |
| Did it investigate beyond the first fix? | ❌ | ✅ | ✅ | ✅ |
| Did it check the rest of the codebase? | ❌ | Partial | ✅ | ✅ |
| Did it produce the correct architectural fix? | ❌ | ❌ | ❌ | ✅ |
The “fix” rating for agents 1-3 is a bit generous given that we know that useSuspenseQuery wasn’t the complete answer. Still, it does address the immediate issue, and if that change had been pushed in a PR, the next test run would have hit the hydration error and eventually led to implementing the prefetching solution. So, call that a step in the right direction - it’s investigative progress that you as a developer didn’t have to do yourself.
I was impressed to see that just adding a detailed investigative process in the prompt produced better results, faster. I was very impressed to see how much of a difference adding the skills files made! I honestly didn’t know if that would help, but figured it was worth trying for comparison, and it actually did lead to a full and complete architectural fix suggestion.
Bug #3: Next.js Redirect Error
Nadia’s third bug investigation involves a redirect error. Loading /users correctly redirects to /users/1 so that non-admins see their own user details. However, a “Rendered more hooks than during the previous render” React error shows up as well.
Nadia’s LLM prompt was again pretty simple:
💡 When I navigate to the /users page, I see a "Something went wrong" page, and in the console I see the following error: [error code]. Investigate why the error happens and how to fix it.
She reported that her LLM completely failed in this case. It suggested a half dozen causes and “believably detailed” potential solutions, all completely wrong.
Nadia had to dig through Next’s GitHub issues, where she found some semi-relevant discussions but no firm fix. Eventually she started ripping out code from her app, and after “multiple days”, found that the <SendAnalyticsData> component was calling a Server Action function inside of a useEffect. Apparently this was actually due to a combination of a server-side redirect and a Server Action in progress wrapped in Suspense (per this Next.js issue). That’s incredibly obscure - no wonder her agent couldn’t figure it out!
I created a fresh Replay recording of the bug and set up the same parallel agent sessions as before. This time, the question was: would Replay MCP enable any of my agent sessions to even come close to describing the real bug and fix? Or would this also be too obscure even with runtime data available?
Bug #3 Agent #0: No Recording
I tried doing one agent session with just access to the source to see what would happen. Interestingly, it did flag <SendAnalyticsData> as the issue, but came up with a completely bogus explanation - it claimed it had to do with component position after an early return, and said it should be moved earlier in the return logic. Clearly the source code alone wasn’t sufficient to explain what was going on here.
Bug #3 Agent #1: Basic Instructions
Agent #1 went on an incredibly deep spelunking expedition through Next and React’s internals :)
It started with the usual RecordingOverview tool, which immediately surfaced the error message. It got more details via the ReactExceptions and ConsoleMessages tools, identified the Router component as the source of the thrown error, and dug into React’s behavior with the ReactRenders tool.
After reading all of the relevant app source files, it read the source of Router via the sources tools, saw the hit counts, and jumped from there to Next’s useActionQueue internal hook. It identified all the internal hook calls and came up with 11 hooks, and flagged the line return isThenable(state) ? use(state) : state as being relevant. It then used multiple Logpoint and Evaluate calls to trace more runtime execution inside of React internal methods like updateWorkInProgressHook().
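For readers who haven't looked at how hooks are tracked, here is a toy model of the failure this agent traced. It is a sketch based on the agent's write-up (quoted in its solution), not React's real implementation: `HookNode`, `hookChain`, and `renderInUpdateMode` are invented names. The idea is that hooks from the previous render form a linked list, and in "update mode" every hook call in the new render has to pair up with the next node in that list.

```typescript
// Toy model of hook bookkeeping; not React's actual implementation.
interface HookNode {
  name: string;
  next: HookNode | null;
}

// Build a linked list of hooks, in call order.
function hookChain(names: string[]): HookNode | null {
  let head: HookNode | null = null;
  for (let i = names.length - 1; i >= 0; i--) {
    head = { name: names[i], next: head };
  }
  return head;
}

// "Update mode": each hook call must pair with a hook from the previous render.
function renderInUpdateMode(previous: HookNode | null, calls: string[]): number {
  let current = previous;
  let paired = 0;
  for (let i = 0; i < calls.length; i++) {
    if (current === null) {
      throw new Error("Rendered more hooks than during the previous render.");
    }
    current = current.next;
    paired++;
  }
  return paired;
}

// Per the agent's write-up: the suspended render recorded 4 hooks...
const suspendedRender = [
  "useState",
  "useTransition (1/2)",
  "useTransition (2/2)",
  "useEffect",
];
// ...and the retried render goes on to call useMemo as a fifth hook.
const retriedRender = [...suspendedRender, "useMemo"];

let outcome = "ok";
try {
  renderInUpdateMode(hookChain(suspendedRender), retriedRender);
} catch (err) {
  outcome = (err as Error).message;
}
console.log(outcome); // "Rendered more hooks than during the previous render."
```

In this model the fifth call has nothing left to pair with, which is the shape of the error the agent reported seeing at the Router's useMemo call.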
The final recommended fix was to move the redirect into middleware. This would sort of resolve the problem, but only as a bandaid - it doesn’t address the Server Action being called in a useEffect.
Still, it’s incredibly impressive how deep inside React’s internals this agent was able to investigate and how much of the actual behavior it identified in the process!
Total time: 15 minutes
Bug #3 Agent #1 Prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
When I navigate to the /users page, I see a "Something went wrong" page briefly flash, and then it redirects to /users/1. In the browser console, I see the following error: "Rendered more hooks than during the previous render."
The redirect itself is expected - the intention is to redirect non-admin users so they can only see their own profile. But the error flash is not expected and needs to go away.
Server logs are clean, so this is a frontend issue.
Recording
The bug has been captured in Replay recording 20da60dd-0831-491c-a7a8-ea7afe1e1af2. Use Replay MCP tools to investigate what's happening in this recording.
Source Code
The app source code is available at ~/projects/temp/debugging-with-ai if you need to read files that aren't available in the recording's source maps.
Deliverable
Write a detailed bug report with:
Bug Description — What the user sees
Root Cause Analysis — What exactly causes the error, traced through the code
Suggested Fix — Concrete code changes to fix the root cause
Key Evidence Trail — Table of tools used and findings at each step
Bug #3 Agent #1 Solution
💡 Bug Report: "Rendered more hooks than during the previous render" on /users page
1. Bug Description
When navigating to /users, users briefly see a "Something went wrong" error page (rendered by Next.js's DefaultGlobalError component) for approximately 2 seconds before being redirected to /users/1. The browser console shows:
Error: Rendered more hooks than during the previous render.
The redirect itself is intentional - the server component at src/app/users/page.tsx fetches the current user and calls redirect(`/users/${user.id}`). The error flash is the bug.
2. Root Cause Analysis
This is a bug in Next.js's Router component's interaction with React's use() hook when processing server-side redirects. Here's the exact chain of events:
Step 1: Server Component throws NEXT_REDIRECT
src/app/users/page.tsx calls redirect(), which throws a NEXT_REDIRECT error. This is caught by Next.js's RedirectErrorBoundary, which renders <HandleRedirect>. That component calls router.replace(url) inside a useEffect, dispatching a navigation action.
Step 2: Router's useActionQueue stores a Promise in useState
The navigation action puts a Promise into actionQueue.state. When the Router re-renders, useActionQueue (in use-action-queue.ts) calls use(state) at line 53 where state is the Promise from useState. This causes the render to suspend.
Step 3: Suspended render saves partial hook state
Before reaching use(state), the Router calls these hooks inside useActionQueue:
useState - 1 hook
useAppDevRenderingIndicator() → useTransition() (2 hooks) + useEffect() (1 hook) - 3 hooks
Total: 4 hooks. Then use(state) suspends. React saves these 4 partial hooks on the work-in-progress fiber.
Step 4: Suspended fiber becomes the alternate
When React retries the render, the suspended fiber (with only 4 hooks) becomes the alternate (the "previous render" that React compares against).
Step 5: use() resolves → dispatcher set incorrectly
When the Promise resolves, use(state) returns a value. React's useThenable function (react-dom-client line 7004-7013) checks whether to set the Mount or Update dispatcher for subsequent hooks:

    // Simplified from React source
    if (workInProgressHook.next === null) {
      // No more WIP hooks after use()
      if (alternate !== null && alternate.memoizedState !== null) {
        // Alternate has hooks → use UPDATE dispatcher
        ReactSharedInternals.H = HooksDispatcherOnUpdateInDEV;
      } else {
        // No alternate hooks → use MOUNT dispatcher
        ReactSharedInternals.H = HooksDispatcherOnMountInDEV;
      }
    }

The check alternate.memoizedState !== null is too coarse. The alternate does have hooks (its memoizedState is non-null - it has 4 hooks), but all 4 of those hooks have already been consumed during the current render. There are zero remaining hooks to update against. React should set the Mount dispatcher here, but instead sets the Update dispatcher.
Step 6: Next hook call throws
The Router's useMemo at line 207 (immediately after useActionQueue returns) calls updateWorkInProgressHook (the Update-mode hook handler), which does:

    nextCurrentHook = currentHook.next; // null - no more alternate hooks!
    if (nextCurrentHook === null) {
      throw Error("Rendered more hooks than during the previous render.");
    }

Step 7: Error boundary catches, shows error page
ErrorBoundaryHandler catches this error and renders DefaultGlobalError - the "Something went wrong" page the user sees. The redirect eventually completes via a separate code path, navigating to /users/1.
3. Suggested Fix
The root cause is in React's useThenable implementation, making it a React/Next.js framework bug. However, the app can work around it.
App-level fix - move redirect to middleware:
Create src/middleware.ts:

    import { NextRequest, NextResponse } from 'next/server';

    export async function middleware(request: NextRequest) {
      if (request.nextUrl.pathname === '/users') {
        // In a real app, read the user ID from session/cookie
        const userId = 1; // or fetch from auth
        return NextResponse.redirect(new URL(`/users/${userId}`, request.url));
      }
    }

    export const config = {
      matcher: '/users',
    };

This performs the redirect at the HTTP level (301/302) before any React rendering occurs, completely avoiding the client-side Router render cycle that triggers the bug.
Alternative app-level fix - use permanentRedirect or notFound pattern:
Replace the server component redirect with a client-side approach:

    // src/app/users/page.tsx
    import { redirect } from 'next/navigation';
    import { getUser } from '@/services/user';

    export default async function UsersPage() {
      const user = await getUser();
      // Use permanentRedirect for HTTP-level redirect
      redirect(`/users/${user.id}`);
    }

If the above still triggers the same issue (since both redirect and permanentRedirect use the same NEXT_REDIRECT mechanism), the middleware approach is the most reliable workaround.
Framework-level fix (in React's useThenable):
The check at react-dom-client.development.js line ~7004 should verify whether the alternate has unconsumed hooks remaining, not merely whether it has any hooks at all:

    // Current (buggy):
    if (alternate !== null && alternate.memoizedState !== null) {

    // Fixed - check if currentHook (the tracking pointer into the alternate)
    // has a .next, meaning there are unconsumed hooks:
    if (currentHook !== null && currentHook.next !== null) {

This would correctly detect that all alternate hooks have been consumed and set the Mount dispatcher, allowing subsequent hooks (useMemo, etc.) to mount fresh rather than trying to update against non-existent previous hooks.
4. Key Evidence Trail
| Step | Tool | Finding |
| --- | --- | --- |
| 1 | `RecordingOverview` | 1 console error at 1050ms: "Rendered more hooks than during the previous render" |
| 2 | `ReactRenders` (summary) | 4 commits; commit 1 at 1.05s renders `DefaultGlobalError` (the error flash) |
| 3 | `ReactException` | Two errors: NEXT_REDIRECT (caught by RedirectErrorBoundary) and hooks error (caught by ErrorBoundaryHandler) |
| 4 | `ConsoleMessages` (detail) | Error stack points to `updateWorkInProgressHook` in react-dom-client.development.js:6979 |
| 5 | `ReadSource` (app-router.tsx) | Router component has 13+ hooks; `useActionQueue` is first, calls `use(state)` conditionally |
| 6 | `ReadSource` (use-action-queue.ts) | Line 53: `return isThenable(state) ? use(state) : state` (conditional `use()` call) |
| 7 | `Logpoint` (use-action-queue.ts:29) | Confirmed `state` from useState was a thenable (Promise) during renders at 999ms-1042ms |
| 8 | `Logpoint` (app-router.tsx:207) | `useMemo` NOT reached during suspended renders (999ms, 1007ms, 1041ms); only 4 hooks saved |
| 9 | `Logpoint` (app-router.tsx:439) | At 1042ms: useMemo IS reached but line 439 is NOT; render throws before completing |
| 10 | `Evaluate` (at useThenable Point:130) | `workInProgressHook.next === null`, `alternate.memoizedState !== null` (4 hooks), all consumed; dispatcher set to Update incorrectly |
| 11 | `DescribePoint` (Point:114) | Confirmed alternate fiber had exactly 4 hooks in memoizedState chain at the moment of the throw |
| 12 | `ReadSource` (use-app-dev-rendering-indicator.tsx) | `useTransition()` + `useEffect()` = 3 hooks, explaining the 4-hook count (useState + 3) |
| 13 | Source code (`src/app/users/page.tsx`) | Server component calling `redirect()`, the trigger for the entire chain |
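To make the mechanism in that report easier to picture, here's a toy model of how an update-mode render walks the hook list saved by the previous render. This is a simplified sketch, not React's actual implementation: `HookNode`, `buildHookList`, and `renderInUpdateMode` are hypothetical names I made up for illustration.

```typescript
// Toy model of a fiber's saved hook list. Illustrative only, not React's real internals.
interface HookNode {
  name: string;
  next: HookNode | null;
}

// Build the linked list of hooks that a previous render saved on the fiber.
function buildHookList(names: string[]): HookNode | null {
  let head: HookNode | null = null;
  for (let i = names.length - 1; i >= 0; i--) {
    head = { name: names[i], next: head };
  }
  return head;
}

// Replay a render in "update" mode: every hook call must find a matching saved hook.
function renderInUpdateMode(previous: HookNode | null, hooksCalled: string[]): string[] {
  let cursor = previous;
  const matched: string[] = [];
  for (const name of hooksCalled) {
    if (cursor === null) {
      // The saved list is exhausted but the component asked for another hook.
      throw new Error("Rendered more hooks than during the previous render.");
    }
    matched.push(`${name} <- ${cursor.name}`);
    cursor = cursor.next;
  }
  return matched;
}
```

If the saved list holds four hooks and the next render calls those same four, the replay succeeds. If that render goes on to call a fifth hook (`useMemo`, in the Router's case), the cursor is already `null` and the model throws the same message that showed up in the console.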
Bug #3 Agent #2: Investigative Methodology
Agent #2 was given the same 8-step investigative approach we used earlier with Bug #2.
And Agent #2 actually nailed the exact React behavior and suggested fix!
Agent #2 started with RecordingOverview, saw the Next redirect error, and took a few screenshots to confirm visuals. It then used ReactRenders to dig into the specific React commits around the redirect.
After using SearchSources and ReadSource to read the full Router component and its hit counts, it used ReactRenders again to get the specific React fiber ID for the Router instance. It identified the render cause as a setState call, then used DescribePoint to backtrack the cause of the setState.
As with Agent #1, it traced down into useActionQueue, then read much of the app source. This time, though, it found <SendAnalyticsData>, and immediately flagged sendAnalyticsData as a Server Action. It used a mixture of source reads + hit counts, logpoints, and dependency graph checks to confirm the timing of the calls, and concluded the server response came back while the Router was still mid-redirect.
The suggested fix was indeed replacing the Server Action with a simple fetch call for the analytics request.
Total time: 12 minutes
Bug #3 Agent #2 Prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
When I navigate to the `/users` page, I see a "Something went wrong" page briefly flash, and then it redirects to `/users/1`. In the browser console, I see the following error: "Rendered more hooks than during the previous render."
The redirect itself is expected: the intention is to redirect non-admin users so they can only see their own profile. But the error flash is not expected and needs to go away.
Server logs are clean, so this is a frontend issue.
Recording
The bug has been captured in Replay recording `20da60dd-0831-491c-a7a8-ea7afe1e1af2`. Use Replay MCP tools to investigate what's happening in this recording.
Source Code
The app source code is available at `~/projects/temp/debugging-with-ai` if you need to read files that aren't available in the recording's source maps.
Investigation Methodology
Follow this structured 8-step investigation process:
Establish user experience — Take screenshots at key moments to see exactly what the user sees. Build a visual timeline of the bug.
Map data flow — Identify all network requests, their timing, and what components consume the responses. Look for waterfalls or unexpected ordering.
Build render timeline — Use React analysis tools to see the sequence of component renders, mounts, unmounts. Look for unexpected render patterns, error boundaries catching, Suspense transitions.
Compare scenarios explicitly — The bug says "navigating to /users shows an error flash then redirects." Compare what happens during this navigation vs what happens if you directly load /users/1. What's different?
Find proximate cause, then go deeper — Once you find what immediately causes the error, ask "but WHY does this happen?" Keep going deeper. The first cause you find is rarely the root cause. Evaluate whether your fix addresses just the symptom or the actual root cause.
Systematic elimination — If the error source isn't clear from tracing, try identifying which components are on the page and reasoning about which one might be causing the issue. The "hooks" error means a component is rendering with a different number of hooks between renders — which component could be conditionally calling hooks? Look at layout components, providers, and components that appear on multiple pages.
Consider framework idioms — Is the code using Next.js/React/TanStack patterns correctly? Are there known interactions between features (Suspense, redirects, Server Actions, etc.) that could cause this?
Red-team your conclusion — Before finalizing, critically evaluate: Does your fix actually address the root cause? Could it introduce new problems? Are there other components or pages with the same pattern?
Deliverable
Write a detailed bug report with:
Bug Description — What the user sees
Root Cause Analysis — What exactly causes the error, traced through the code
Suggested Fix — Concrete code changes to fix the root cause
Key Evidence Trail — Table of tools used and findings at each step
Bug #3 Agent #2 Solution
💡 Bug Report: "Something went wrong" flash on `/users` redirect
1. Bug Description
When navigating to `/users`, a "Something went wrong" error page briefly flashes (~50ms, visible from ~1015ms to ~3050ms in the recording) before the expected redirect to `/users/1` completes. The console logs: "Rendered more hooks than during the previous render." The error is caught by Next.js's global error boundary (`ErrorBoundaryHandler`), which renders `global-error.tsx` (the "Something went wrong" page). The redirect eventually succeeds, but the UX flash is jarring.
2. Root Cause Analysis
The chain of events:
User navigates to `/users`. The server component `UsersPage` (`/app/users/page.tsx`) calls `await getUser()` then `redirect('/users/${user.id}')`, which throws a `NEXT_REDIRECT` error.
Commit #0 (~1.00s): Next.js's `RedirectErrorBoundary` catches the `NEXT_REDIRECT`. It renders `HandleRedirect`, which calls `router.replace('/users/1')` inside a `useEffect` + `React.startTransition`. This commit also mounts the layout tree, including `SendAnalyticsData` components.
Layout effects fire after commit #0:
Server actions trigger Router state updates: Each server action call goes through Next.js's `callServer` → `dispatchAppRouterAction` → `actionQueue.dispatch()` → `setState(deferredPromise)` on the Router's internal state. This sets the Router's `useActionQueue` state to a Promise (thenable).
The `HandleRedirect` effect also fires, calling `router.replace('/users/1')` which dispatches a navigate action, another `setState(promise)` on the Router.
Commit #1 (~1.05s): The Router re-renders. Now its `useActionQueue` hook encounters `isThenable(state) === true`, so it calls `use(state)` (React 19's `use()` API). The `use()` call suspends because the promise is still pending. However, the interaction between:
Commit #2 (~3.05s): The promises resolve, Suspense resumes, the redirect to `/users/1` completes, and the correct page renders.
The root cause is the `SendAnalyticsData` components calling server actions during the same render cycle as a redirect. The server actions dispatch state updates to the Router's action queue while the redirect is also dispatching a navigate action. This concurrent mutation of the Router's state during error boundary recovery triggers the hooks mismatch error.
Specifically:
- `SendAnalyticsData` in the root layout (`/app/layout.tsx:24`): fires on every page load
- `SendAnalyticsData` in the users layout (`UserLayout.tsx:6`): fires when `/users/*` loads
Both call `sendAnalyticsData()` (a server action) from `useEffect`, which pushes promises into the Router's action queue at exactly the wrong time, while a `NEXT_REDIRECT` is being processed.
3. Suggested Fix
Option A (Recommended): Don't use server actions for analytics. Use a regular API call or fire-and-forget fetch instead.
Server actions in Next.js are tightly coupled to the Router's action queue. Using them for fire-and-forget analytics is an anti-pattern because they trigger Router re-renders. Replace with a plain fetch:
```ts
// src/actions/analytics.ts: convert from server action to API utility
// REMOVE: 'use server';
export async function sendAnalyticsData(name: string) {
  // Use a regular fetch to an API route, or just log client-side
  await fetch('/api/analytics', {
    method: 'POST',
    body: JSON.stringify({ name }),
  });
}
```
Or simpler, just make the analytics call not go through the server action mechanism:
```tsx
// src/components/SetupAnalytics.tsx
'use client';

import { useEffect } from 'react';

// Don't import a server action; use a plain function
function sendAnalyticsBeacon(value: string) {
  // navigator.sendBeacon, fetch to an API route, or third-party SDK
  console.log('analytics data sent', value);
}

export function SendAnalyticsData({ value }: { value: string }) {
  useEffect(() => {
    sendAnalyticsBeacon(value);
  }, [value]);
  return <></>;
}
```
Option B (Alternative): Wrap the server action call in `startTransition` with lower priority.
This doesn't fix the fundamental issue but may work around the timing:
```tsx
'use client';

import { useEffect, useTransition } from 'react';
import { sendAnalyticsData } from '@/actions/analytics';

export function SendAnalyticsData({ value }: { value: string }) {
  const [, startTransition] = useTransition();
  useEffect(() => {
    startTransition(() => {
      sendAnalyticsData(value);
    });
  }, [value, startTransition]);
  return <></>;
}
```
However, Option B is fragile: it still couples analytics to the Router internals and may break again with future Next.js changes. Option A is the correct fix.
Option A is recommended because:
- Analytics should be fire-and-forget, not coupled to the Router's action queue
- Server actions are designed for mutations that need to update the UI (form submissions, data writes), not for side-effect-only telemetry
- This eliminates the entire class of bugs where analytics firing causes Router state conflicts
4. Key Evidence Trail
| Step | Tool | Finding |
| --- | --- | --- |
| 1. Visual timeline | `Screenshot` at 723ms, 1015ms, 1077ms, 4595ms | Confirmed "Something went wrong" page flashes between ~1015ms and ~3050ms |
| 2. Error identification | `RecordingOverview` | Two error boundary catches: NEXT_REDIRECT at 1005ms, hooks error at 1050ms |
| 3. React render analysis | `ReactRenders` (commits, commit detail) | Commit #0: redirect caught. Commit #1: Router throws hooks error → GlobalError renders |
| 4. React exceptions | `ReactException` | Router fiber #854 threw "Rendered more hooks than during the previous render", caught by ErrorBoundaryHandler |
| 5. Router source analysis | `ReadSource` on app-router.tsx, use-action-queue.ts | Router's `useActionQueue` conditionally calls `use(state)` when state is thenable (Promise) |
| 6. Hit count analysis | `Logpoint` on Router entry, useActionQueue, `use(state)` | 8 entries, 7 reached useActionQueue, `use(state)` suspended during hits 3-7 (~999-1046ms) |
| 7. Server action path trace | `ReadSource` on app-call-server.ts, app-router-instance.ts | Server actions → `dispatchAppRouterAction` → `setState(promise)` on Router |
| 8. Trigger identification | `ReactRenders` (trigger-detail, fiber-cause) | Commit #1 triggered by setState from effects fired after commit #0 |
| 9. App source review | Source files on disk | `SendAnalyticsData` calls server action from `useEffect`, present in root layout + users layout |
| 10. Timing correlation | `Logpoint` on sendAnalyticsData calls | "root-layout" fires at 972ms, "users-layout" fires at 1005ms, the same window as NEXT_REDIRECT |
Bug #3 Agent #3: App Context and Skills
Agent #3’s instructions added some basic context about the app setup, and access to the same React / Next / TanStack skill files we used earlier.
And yet it did worse than Agent #2!
It started by reading the skills files, checked RecordingOverview and ReactRenders and some screenshots to establish a timeline, and read through the app sources. It did read <SendAnalyticsData> and looked for places that component was used.
However, it then got lost in the weeds of Router, useActionQueue, and use hook semantics, down to investigating whether use adds a new hook entry to the rendering fiber. It did at least start to question whether this bug was at the app level or internals level :)
Ultimately, its solution described the sequence of events fairly well, but its conclusion was to move the redirect into middleware, which isn’t correct.
This one’s a bit puzzling - maybe the extra skills gave it too much info on Next and React? :)
Total time: 16 minutes
Bug #3 Agent #3 Prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
When I navigate to the `/users` page, I see a "Something went wrong" page briefly flash, and then it redirects to `/users/1`. In the browser console, I see the following error: "Rendered more hooks than during the previous render."
The redirect itself is expected: the intention is to redirect non-admin users so they can only see their own profile. But the error flash is not expected and needs to go away.
Server logs are clean, so this is a frontend issue.
Recording
The bug has been captured in Replay recording `20da60dd-0831-491c-a7a8-ea7afe1e1af2`. Use Replay MCP tools to investigate what's happening in this recording.
Source Code
The app source code is available at `~/projects/temp/debugging-with-ai` if you need to read files that aren't available in the recording's source maps.
Investigation Methodology
Follow this structured 8-step investigation process:
Establish user experience — Take screenshots at key moments to see exactly what the user sees. Build a visual timeline of the bug.
Map data flow — Identify all network requests, their timing, and what components consume the responses. Look for waterfalls or unexpected ordering.
Build render timeline — Use React analysis tools to see the sequence of component renders, mounts, unmounts. Look for unexpected render patterns, error boundaries catching, Suspense transitions.
Compare scenarios explicitly — The bug says "navigating to /users shows an error flash then redirects." Compare what happens during this navigation vs what happens if you directly load /users/1. What's different?
Find proximate cause, then go deeper — Once you find what immediately causes the error, ask "but WHY does this happen?" Keep going deeper. The first cause you find is rarely the root cause. Evaluate whether your fix addresses just the symptom or the actual root cause.
Systematic elimination — If the error source isn't clear from tracing, try identifying which components are on the page and reasoning about which one might be causing the issue. The "hooks" error means a component is rendering with a different number of hooks between renders — which component could be conditionally calling hooks? Look at layout components, providers, and components that appear on multiple pages.
Consider framework idioms — Is the code using Next.js/React/TanStack patterns correctly? Are there known interactions between features (Suspense, redirects, Server Actions, etc.) that could cause this?
Red-team your conclusion — Before finalizing, critically evaluate: Does your fix actually address the root cause? Could it introduce new problems? Are there other components or pages with the same pattern?
Architecture Context
The following facts about the app's architecture are known from static analysis:
- Framework: Next.js 14+ with App Router
- Data fetching: TanStack Query for client-side data management
- Component model: Mix of React Server Components (RSC) and Client Components (`'use client'`)
- Route structure: Uses `loading.tsx` files for Suspense fallbacks, `layout.tsx` for nested layouts
- Validation: Zod schemas for API response validation
- Layouts: Root layout (`app/layout.tsx`) and a nested users layout (`app/users/layout.tsx`); both contain shared components
- Redirect pattern: The `/users` route uses Next.js `redirect()` to send users to `/users/[id]`
- Components in layouts: Both the root layout and users layout include utility components like analytics trackers
Technology Skills
Before investigating, read the following skill files for relevant framework patterns. Scan the headers first, then read sections that seem relevant to the bug (hooks error + redirect + error boundary):
- `~/projects/temp/react-skills/skills/react-best-practices.md`: React patterns and anti-patterns (large file; scan headers first, read selectively)
- `~/projects/temp/react-skills/skills/next-data-patterns.md`: Next.js App Router data fetching patterns
- `~/projects/temp/react-skills/skills/next-suspense-boundaries.md`: Suspense boundary and streaming rules in Next.js
- `~/projects/temp/react-skills/skills/tanstack-query.md`: TanStack Query patterns including Suspense integration
Deliverable
Write a detailed bug report with:
Bug Description — What the user sees
Root Cause Analysis — What exactly causes the error, traced through the code
Suggested Fix — Concrete code changes to fix the root cause
Key Evidence Trail — Table of tools used and findings at each step
Bug #3 Agent #3 Solution
💡 Bug Report: "Rendered more hooks than during the previous render" on `/users` redirect
1. Bug Description
What the user sees:
When navigating to `/users`, the page briefly shows a loading skeleton (~400ms), then flashes a full-page error ("Application error: a client-side exception has occurred") for ~1.5 seconds, then recovers and loads `/users/1` successfully.
The redirect to `/users/1` is intentional behavior (non-admin users should only see their own profile). The error flash is the bug.
Timeline from recording `20da60dd-0831-491c-a7a8-ea7afe1e1af2`:
| Time | What's visible |
| --- | --- |
| 661ms | Loading skeleton (Suspense fallback) |
| 1077ms | Full-page error: "Application error: a client-side exception..." |
| 2561ms | Loading skeleton again (redirect target loading) |
| 4595ms | User profile page for John Doe (success) |
2. Root Cause Analysis
The proximate cause
The error `"Rendered more hooks than during the previous render"` is thrown by React's hook reconciliation system inside the Next.js `Router` component (`app-router.tsx:207`). It is caught by `ErrorBoundaryHandler` and displayed as a full-page error via `DefaultGlobalError`.
The chain of events
Step 1: Server-side redirect triggers client-side navigation
`src/app/users/page.tsx` is a Server Component that calls Next.js `redirect()`:
```tsx
export default async function UsersPage() {
  const user = await getUser();
  redirect(`/users/${user.id}`);
}
```
This throws a `NEXT_REDIRECT` error that's caught by `RedirectErrorBoundary`, which renders `HandleRedirect`. `HandleRedirect`'s `useEffect` dispatches a `router.replace()` action to the Router's action queue.
Step 2: Router state becomes a thenable
The `router.replace()` action causes the Router's state (managed by `useActionQueue`) to become a thenable (a Promise-like object representing the pending navigation). Inside `useActionQueue`:
```ts
return isThenable(state) ? use(state) : state;
```
When `use(state)` is called on a pending thenable, it suspends the Router component. During suspension, React records the hooks executed so far:
- Hook #1: `useState` (useActionQueue L28)
- Hook #2: `useTransition` (useAppDevRenderingIndicator L7), dev-only
- Hook #3: `useEffect` (useAppDevRenderingIndicator L9), dev-only
Total: 3 hooks recorded on the fiber.
Step 3: Thenable resolves, but `isThenable()` still returns true
The navigation thenable resolves externally. On the next render, the state object is the same thenable, now resolved with `{status, value}`. But `isThenable()` only checks `typeof promise.then === 'function'`, and a resolved thenable still has a `.then` method. So `use(state)` is called again.
Step 4: `use()` returns synchronously, with no suspension
`use()` on an already-resolved thenable returns the resolved value immediately without suspending. This means `useActionQueue` returns normally, and the Router continues executing past it.
Step 5: Next hook triggers the error
The Router's next statement after `useActionQueue` is `useMemo` at line 207 (hook #4). React's `updateWorkInProgressHook` compares: the previous fiber (from the suspended render) had 3 hooks, but the current render is now requesting hook #4. React throws:
"Rendered more hooks than during the previous render."
Why this is fundamentally a timing/framework issue
The bug occurs because React's Suspense mechanism should normally re-render a subtree from scratch after suspension resolves, meaning the suspended fiber's hook count shouldn't be the baseline. However, the specific interaction between:
- Next.js's `useActionQueue` calling `use()` on a resolved thenable
- The dev-only `useAppDevRenderingIndicator` adding 2 extra hooks (increasing the hook count gap from 1→2 to 3→4+)
- React's hook reconciler comparing against the suspended fiber's hooks
...creates a race condition where the hook count mismatch manifests as a user-visible error.
In production, `useActionQueue` has only 1 hook (`useState`). The same mechanism could theoretically apply (1 hook → 2 hooks), but production builds may handle the Suspense fiber lifecycle differently, or the timing may prevent the race.
The app's contribution to the bug
The app uses `redirect()` inside a Server Component page, which is a documented Next.js pattern. However, this pattern pushes the redirect through the full React rendering pipeline (RSC → Error Boundary → HandleRedirect → Router state update → suspension → resolution → crash). A middleware-level redirect would bypass React entirely.
3. Suggested Fix
Primary fix: Move redirect to middleware
Replace the Server Component redirect with a Next.js middleware redirect. This avoids React rendering entirely:
Create `src/middleware.ts`:
```ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  // Redirect /users to /users/1 (non-admin users see only their profile)
  // In a real app, get user ID from session/auth token
  if (request.nextUrl.pathname === '/users') {
    const userId = 1; // TODO: extract from auth session
    return NextResponse.redirect(new URL(`/users/${userId}`, request.url));
  }
}

export const config = {
  matcher: '/users',
};
```
Simplify `src/app/users/page.tsx`:
```tsx
// This page should never render: middleware redirects /users to /users/[id]
// Keep as a fallback in case middleware is bypassed
import { redirect } from 'next/navigation';
import { getUser } from '@/services/user';

export default async function UsersPage() {
  const user = await getUser();
  redirect(`/users/${user.id}`);
}
```
Alternative fix: Add a loading.tsx at the /users level
If the middleware approach isn't suitable, adding a `src/app/users/loading.tsx` file would provide a proper Suspense boundary that might alter the suspension/resolution timing enough to avoid the race condition. However, this is a mitigation, not a fix, because it papers over the framework-level timing issue:
```tsx
// src/app/users/loading.tsx
export default function Loading() {
  return <div>Redirecting...</div>;
}
```
Framework-level note
This appears to be a bug in Next.js's `useActionQueue` (or possibly in React's Suspense hook reconciliation). The `useAppDevRenderingIndicator` hook being called conditionally based on `NODE_ENV` with an `eslint-disable-next-line react-hooks/rules-of-hooks` comment is a red flag: it means the hook count differs between dev and production, making dev more susceptible to this class of bugs. This could be worth reporting to the Next.js team.
4. Key Evidence Trail
| Step | Tool | Finding |
| --- | --- | --- |
| 1. Establish UX | `RecordingOverview` | 2 errors: NEXT_REDIRECT (expected) + "Rendered more hooks" (bug). 4 React commits. |
| 2. Visual timeline | `Screenshot` at 661ms, 1077ms, 2561ms, 4595ms | Loading → Error flash → Loading → Success. Error visible for ~1.5s. |
| 3. Error source | `ReactRenders` commit analysis | Commit #1 (1.05s): ErrorBoundaryHandler caught the hooks error from Router component. |
| 4. Console error | `ConsoleMessages` message-detail | Stack trace shows error in Router during commitLayoutEffects phase. |
| 5. Router source | `ReadSource` + `grepika_get` on app-router.tsx | Mapped all 11+ hooks in Router. Identified `useActionQueue` at L204 as the first hook. |
| 6. useActionQueue source | `grepika_get` on use-action-queue.js | Found `isThenable(state) ? use(state) : state` pattern. Found dev-only `useAppDevRenderingIndicator` adding 2 extra hooks. |
| 7. use() is not a hook | `grepika_get` on react-dom-client.development.js | `useThenable` uses separate `thenableIndexCounter`, NOT the hook linked list. Confirmed `use()` doesn't create hook entries. |
| 8. Hook count verification | `Logpoint` on useActionQueue L53 | Tracked `hasThen` across 7 renders: false→false→true(suspend)→true(suspend)→true(resolved, no suspend)→... |
| 9. Render progression | `Logpoint` on Router L207, L222, L228 | Confirmed: suspended renders hit 3 hooks; post-resolution render hits hook #4 (useMemo) → error. |
| 10. Hit count validation | `ReadSource` execution annotations | L204: 7 hits, L205: 4 hits (3 suspended), L207: 4 hits, L222: 2 hits; confirms 2 renders errored at hook #4. |
| 11. App code review | `grepika_get` on all app source files | Confirmed redirect in Server Component, no middleware, no loading.tsx at /users level. |
| 12. Error mechanism | `grepika_get` on react-dom `updateWorkInProgressHook` | Confirmed: `null === nextCurrentHook` → "Rendered more hooks than during the previous render" when fiber.alternate has fewer hooks. |
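One step in that report rests on a detail that's easy to check in isolation: a duck-typed thenable check can't tell a pending promise from a settled one. The sketch below uses a simplified stand-in for that kind of check. `isThenableLike` and the three sample values are my own hypothetical names, not the actual Next.js helper or Router state.

```typescript
// Simplified stand-in for a duck-typed thenable check (an assumption, not Next.js source).
function isThenableLike(value: unknown): value is PromiseLike<unknown> {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof (value as { then?: unknown }).then === "function"
  );
}

// A promise that never settles, standing in for a navigation still in flight.
const pendingNavigation = new Promise<string>(() => {});

// A promise that has already resolved, standing in for a finished navigation.
const settledNavigation = Promise.resolve("/users/1");

// A plain object, standing in for ordinary (non-promise) state.
const plainRouterState = { url: "/users/1" };

// Both promises pass the check; only the plain object fails it.
const checkResults = {
  pending: isThenableLike(pendingNavigation),
  settled: isThenableLike(settledNavigation),
  plain: isThenableLike(plainRouterState),
};
```

A check like this only answers "does it have a `.then`?", which is consistent with the report's claim that `use(state)` keeps getting called after the navigation promise resolves. Whether that's what actually happens inside Next.js is the agent's conclusion, not something this sketch proves.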
Bug #3: Initial Results
After three sessions, I had mixed results for this bug. The good news was that the Replay recording was enough for Agent #2 to actually find the real no-kidding issue! That’s a huge step forward over Nadia’s results. Still, I wanted to see if there was anything else I could do to improve the analysis here.
Let’s say this was a purpose-built debugging agent, running against a known CI failure. In that case, we’d be able to provide some additional context to describe the app and its structure. That might be useful.
Beyond that, the two agents that didn’t get the answer spent too much time diving down into the internals of Next and React. There was useful information there, but that’s also a rabbit hole that trapped them. What if we gave directions to limit any internals investigation to “20%” effort, and focus 80% on the app itself? Would that curb the tendencies to dive too far?
Bug #3 Agent #4: App Context and 80/20 Guidance
I kicked off Agent #4 with a key update to the previous instructions:
💡 Follow this structured investigation process. Budget your time: spend ~80% on app code, component tree, effects, and state flow. Spend at most ~20% reading framework internals. If you find yourself multiple levels deep in Next.js router source or React reconciler code, stop and ask: "Do I already know enough to identify which app component is the trigger?"
I also refined the investigative process steps to encourage it to focus on actionable results, and to lead the report with a focus on the trigger.
As usual, Agent #4 started by calling RecordingOverview, saw the redirect error, checked some screenshots, used ReactRenders to check some of the render causes, and also read ReactException to get more details on the error.
However, it then focused on reading the source code for the components… and immediately identified that <SendAnalyticsData> was calling a Server Action inside of a useEffect!
From there, it dug further into the dependency graph triggers for the renders. Then it examined the network requests, found the analytics POSTs, and confirmed the timing compared to the redirect error. It did start to dive into React, but actually stopped and explicitly said “I’ve been spending too much time on React internals - let me step back and focus on the app-level mechanism”. Clearly that part of the instructions worked :)
The final solution and fix were exactly right! It pointed to the Server Action in a useEffect running while Next was mid-redirect, gave the interleaved timeline of events, and specifically recommended using fetch or navigator beacons instead of a Server Action.
Total time: 10 minutes
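For reference, here's roughly what the "fetch or beacon" style of fix could look like. It's a minimal sketch under assumptions: the `/api/analytics` endpoint, the `AnalyticsTransport` shape, and the `sendAnalytics` name are all hypothetical, and the transport is injected so the example can run outside a browser.

```typescript
// Hypothetical transport shape, injected so the sketch also runs outside a browser.
interface AnalyticsTransport {
  // Mirrors navigator.sendBeacon(url, data): returns true if the request was queued.
  sendBeacon?: (url: string, data: string) => boolean;
  // Mirrors fetch(url, init); only the fields used here are typed.
  fetch?: (
    url: string,
    init: { method: string; body: string; keepalive: boolean; headers: Record<string, string> }
  ) => Promise<unknown>;
}

type SendResult = "beacon" | "fetch" | "dropped";

// "/api/analytics" is an assumed endpoint, not one taken from the example app.
function sendAnalytics(
  name: string,
  transport: AnalyticsTransport,
  url: string = "/api/analytics"
): SendResult {
  const body = JSON.stringify({ name });

  // Preferred path: the browser queues the request and nothing is awaited.
  if (transport.sendBeacon && transport.sendBeacon(url, body)) {
    return "beacon";
  }

  // Fallback: a plain fetch, with rejections swallowed so telemetry can't break the UI.
  if (transport.fetch) {
    transport
      .fetch(url, {
        method: "POST",
        body,
        keepalive: true,
        headers: { "Content-Type": "application/json" },
      })
      .catch(() => undefined);
    return "fetch";
  }

  // No transport available: drop the event rather than throw.
  return "dropped";
}
```

In a client component you'd pass the real browser APIs from inside the existing `useEffect`, e.g. `sendAnalytics(value, { sendBeacon: (u, d) => navigator.sendBeacon(u, d), fetch: (u, i) => fetch(u, i) })`. The point is that neither path goes through a Server Action, so, going by the agents' analysis, nothing extra gets dispatched to the Router while the redirect is in flight.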
Bug #3 Agent #4 Prompt
💡 You are a debugging agent with access to Replay MCP tools and source code. Your job is to investigate a bug, identify the root cause, and suggest a fix.
Bug Report
When I navigate to the `/users` page, I see a "Something went wrong" page briefly flash, and then it redirects to `/users/1`. In the browser console, I see the following error: "Rendered more hooks than during the previous render."
The redirect itself is expected: the intention is to redirect non-admin users so they can only see their own profile. But the error flash is not expected and needs to go away.
Server logs are clean, so this is a frontend issue.
Recording
The bug has been captured in Replay recording `20da60dd-0831-491c-a7a8-ea7afe1e1af2`. Use Replay MCP tools to investigate what's happening in this recording.
Source Code
The app source code is available at `~/projects/temp/debugging-with-ai` if you need to read files that aren't available in the recording's source maps.
Investigation Methodology
Follow this structured investigation process. Budget your time: spend ~80% on app code, component tree, effects, and state flow. Spend at most ~20% reading framework internals. If you find yourself multiple levels deep in Next.js router source or React reconciler code, stop and ask: "Do I already know enough to identify which app component is the trigger?"
Step 1: Establish user experience
Take screenshots at key moments to see exactly what the user sees. Build a visual timeline of the bug.
Step 2: Map data flow
Identify all network requests, their timing, and what components consume the responses. Look for waterfalls or unexpected ordering.
Step 3: Build render timeline
Use React analysis tools to see the sequence of component renders, mounts, unmounts, and effects. Look for unexpected render patterns, error boundaries catching, Suspense transitions. Pay special attention to what effects fire and what state changes they trigger — effects that kick off async work (fetches, server calls, state updates) during error recovery or redirect handling are common sources of timing bugs.
Step 4: Diff from working
Compare what happens on the broken route (`/users` → error flash → redirect) vs a working route (directly loading `/users/1`). What components, effects, or actions fire on the broken path that don't fire on the working path? That delta is where the trigger lives. Use React render commits, effect traces, and network requests to build both timelines and compare them.
Step 5: Find the trigger, then explain the mechanism
Bugs often have two layers:
The trigger: What in the app's code causes this to happen? Which component, which hook, which effect, which function call?
The mechanism: Why does the framework/runtime react badly to that trigger?
Find the trigger first. Identify the specific app-level code whose presence or behavior causes the error. Then briefly explain the framework mechanism — but don't get lost in framework internals before you've identified the app-level trigger.
Step 6: Systematic elimination
If the trigger isn't clear from tracing, reason about which components are unique to the broken path. The "hooks" error means a component is rendering with a different number of hooks between renders. Look at:
Layout components that wrap the route
Providers and context components
Utility components (analytics, tracking, feature flags) that appear in layouts
Components that fire effects during mount
Which of these could be interacting with the redirect/error recovery in a way that changes the hook count?
Step 7: Consider framework idioms
Is the code using Next.js/React patterns correctly? Are there known problematic interactions between features (Suspense + redirects, Server Actions + error boundaries, effects during error recovery, etc.)?
Important: If the app code is using a documented framework pattern correctly and the error is caused by a framework-internal mechanism, say so explicitly. But still identify which specific app code triggers the framework bug and propose an app-level change that avoids the trigger.
Step 8: Red-team your fix
Before finalizing, critically evaluate:
Actionable fix test: If your fix is "move to middleware," "change the framework," or "restructure routing" — keep looking. There's probably an app-level change that removes the trigger without restructuring. Framework workarounds are plan B, not plan A.
Does your fix address the root trigger, or just avoid the code path?
Could your fix introduce new problems?
Are there other components or pages with the same pattern?
Architecture Context
The following facts about the app's architecture are known from static analysis:
Framework: Next.js 14+ with App Router
Data fetching: TanStack Query for client-side data management
Component model: Mix of React Server Components (RSC) and Client Components ('use client')
Route structure: Uses loading.tsx files for Suspense fallbacks, layout.tsx for nested layouts
Validation: Zod schemas for API response validation
Layouts: Root layout (app/layout.tsx) and a nested users layout (app/users/layout.tsx) - both contain shared components
Redirect pattern: The /users route uses Next.js redirect() to send users to /users/[id]
Components in layouts: Both the root layout and users layout include utility components like analytics trackers
Technology Skills
Before investigating, read the following skill files for relevant framework patterns. Scan the headers first, then read sections that seem relevant to the bug (hooks error + redirect + error boundary):
~/projects/temp/react-skills/skills/react-best-practices.md - React patterns and anti-patterns (large file - scan headers first, read selectively)
~/projects/temp/react-skills/skills/next-data-patterns.md - Next.js App Router data fetching patterns
~/projects/temp/react-skills/skills/next-suspense-boundaries.md - Suspense boundary and streaming rules in Next.js
~/projects/temp/react-skills/skills/tanstack-query.md - TanStack Query patterns including Suspense integration
Deliverable
Write a detailed bug report with:
Bug Description — What the user sees
Root Cause Analysis — Structure as: Trigger (app-level code) then Mechanism (framework behavior). Lead with the trigger.
Suggested Fix — Concrete code changes targeting the app-level trigger. If a framework workaround is also needed, present it as a secondary option.
Key Evidence Trail — Table of tools used and findings at each step
Bug #3 Agent #4 Solution
💡 Bug Report: "Rendered more hooks than during the previous render" on /users redirect
1. Bug Description
When navigating to /users, the user sees a "Something went wrong" error page flash for ~2 seconds before being redirected to /users/1. The console logs: "Rendered more hooks than during the previous render."
Visual timeline:
| Time | What the user sees |
|---|---|
| 895ms | Blank/loading state |
| 1015ms | Loading state transitioning |
| 1077ms | "Something went wrong" error flash (DefaultGlobalError) |
| ~3000ms | Navigation to /users/1 begins |
| 4595ms | User profile page renders correctly |
2. Root Cause Analysis
Trigger (app-level code)
The SendAnalyticsData component in src/components/SetupAnalytics.tsx calls a Server Action (sendAnalyticsData) inside a useEffect on mount. This component is rendered in two layouts:
src/app/layout.tsx:24 - <SendAnalyticsData value="root-layout" />
src/components/UserLayout.tsx:6 - <SendAnalyticsData value="users-layout" />
When /users loads, the Server Component src/app/users/page.tsx calls redirect('/users/1'), which throws a NEXT_REDIRECT error. Next.js catches this via RedirectErrorBoundary. However, the layout components above the page still mount and commit during this render, including both SendAnalyticsData instances.
Their useEffects fire immediately after commit, calling the sendAnalyticsData Server Action. These Server Action calls return RSC flight data that triggers a setState on the Next.js Router component while the Router is already mid-redirect.
Mechanism (framework behavior)
The sequence is:
Commit #0 (1.00s): /users page renders. NEXT_REDIRECT is thrown and caught by RedirectErrorBoundary. Both SendAnalyticsData components mount in the layout tree above the error boundary.
Effects fire (1.005–1.008s): Both SendAnalyticsData useEffects execute, calling sendAnalyticsData() Server Action. This produces two POST requests (to /users and /users/1).
Server Action responses arrive (~1.05s): The responses contain RSC flight data. Next.js processes this and calls setState on the Router component to update the RSC tree.
Commit #1 (1.05s): The Router re-renders, but it is now in an inconsistent state: the redirect has partially torn down the component tree, and the Router's hook count changes between renders. React throws "Rendered more hooks than during the previous render". ErrorBoundaryHandler catches this, rendering DefaultGlobalError, which is the error flash the user sees.
Commit #2 (3.05s): The redirect to /users/1 eventually completes, the full tree re-renders, and the app recovers.
In short: Calling a Server Action from a useEffect in a layout that wraps a redirecting page causes an RSC state update that corrupts the Router during redirect processing.
3. Suggested Fix
Primary fix — don't call Server Actions from layout effects during redirect-sensitive renders.
The simplest fix is to replace the Server Action call with a plain fetch() (or navigator.sendBeacon()) for analytics. This avoids the RSC flight data response that corrupts the Router state:

    // src/components/SetupAnalytics.tsx
    'use client';

    import { useEffect } from 'react';

    export function SendAnalyticsData({ value }: { value: string }) {
      useEffect(() => {
        // Use fetch() or sendBeacon() instead of a Server Action.
        // Server Actions return RSC flight data that can interfere
        // with the Next.js Router during redirect processing.
        fetch('/api/analytics', {
          method: 'POST',
          body: JSON.stringify({ name: value }),
          keepalive: true, // ensures delivery even on navigation
        });
      }, [value]);

      return <></>;
    }

With a corresponding API route:

    // src/app/api/analytics/route.ts
    import { NextResponse } from 'next/server';

    export async function POST(request: Request) {
      const { name } = await request.json();
      console.log('analytics data sent', name);
      return NextResponse.json({ ok: true });
    }

Why this works: A plain fetch() response is just JSON - it doesn't carry RSC flight data and doesn't trigger setState on the Router. The analytics fires the same way, but the response can't corrupt React's component tree.
Alternative fix - if you want to keep using a Server Action, you can suppress the return value by wrapping it so it doesn't trigger Router updates:

    // src/components/SetupAnalytics.tsx
    'use client';

    import { useEffect, useTransition } from 'react';
    import { sendAnalyticsData } from '@/actions/analytics';

    export function SendAnalyticsData({ value }: { value: string }) {
      const [, startTransition] = useTransition();

      useEffect(() => {
        startTransition(async () => {
          await sendAnalyticsData(value);
        });
      }, [value, startTransition]);

      return <></>;
    }

However, this still goes through the Server Action machinery and may still carry risk in edge cases. The fetch()/sendBeacon() approach is more robust.
4. Key Evidence Trail
| Step | Tool | Finding |
|---|---|---|
| Screenshot timeline | replay_Screenshot at 895ms, 1015ms, 1077ms, 4595ms | Error flash visible at 1077ms, resolves by 4595ms |
| Recording overview | replay_RecordingOverview | 1 console error at 1050ms: "Rendered more hooks". 2 error boundary catches: NEXT_REDIRECT at 1005ms, hooks error at 1050ms |
| React render commits | replay_ReactRenders (commits mode) | Commit #0: redirect caught + layouts mount. Commit #1: Router throws hooks error 50ms later |
| Trigger-detail for commit #1 | replay_ReactRenders (trigger-detail) | Dependency chain: effect → setState → Router re-render → crash |
| Network requests at error time | replay_NetworkRequest (detail) | Two POST Server Action calls at 1008-1010ms with next-action headers. Bodies: ["root-layout"], ["users-layout"] |
| Effect execution timing | replay_Logpoint on SetupAnalytics.tsx:9 | 5 hits total. Hits at 972ms ("root-layout") and 1005ms ("users-layout") fire during redirect processing window |
| Dependency chain analysis | replay_DescribePoint with dependencyChain: true | Confirmed: effect from commit #0 → Server Action call → setState on Router → hooks error |
| Source code review | File reads of layout.tsx, UserLayout.tsx, SetupAnalytics.tsx, analytics.ts | SendAnalyticsData in both root and users layouts calls Server Action in useEffect |
5. Why Other Components Aren't the Problem
The SendAnalyticsData in UserPage.tsx:221 is not part of this bug - UserPage only renders after the redirect completes (commit #3 at 3.89s). The problem is specifically the two instances in layout components that persist across the redirect boundary - they mount during commit #0 when the redirect is being processed, and their effects fire Server Actions whose responses corrupt the Router.
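The primary fix in the report mentions navigator.sendBeacon() but only shows the fetch() version. Here's a minimal sketch of how the two could be combined, assuming the same /api/analytics route from the report. The AnalyticsTransport interface and the sendAnalytics helper are hypothetical names of my own (they aren't in the agent's output or the example app), and the transport is passed in as an argument so the logic can be exercised outside a browser:

```typescript
// Hypothetical transport shape: in a browser you'd wire these up to
// navigator.sendBeacon and the global fetch.
interface AnalyticsTransport {
  sendBeacon?: (url: string, body: string) => boolean;
  fetch: (
    url: string,
    init: { method: string; body: string; keepalive: boolean }
  ) => Promise<unknown>;
}

// Hypothetical helper: prefer sendBeacon, fall back to a keepalive fetch.
// Returns which transport handled the event.
function sendAnalytics(
  name: string,
  transport: AnalyticsTransport,
  url = '/api/analytics' // assumed route, taken from the report's example
): 'beacon' | 'fetch' {
  const body = JSON.stringify({ name });

  // sendBeacon returns true when the browser accepted the data for delivery.
  if (transport.sendBeacon && transport.sendBeacon(url, body)) {
    return 'beacon';
  }

  // Fire and forget: analytics failures shouldn't surface to the user.
  void transport
    .fetch(url, { method: 'POST', body, keepalive: true })
    .catch(() => {});
  return 'fetch';
}
```

Inside the effect, this could be called as sendAnalytics(value, { sendBeacon: (u, b) => navigator.sendBeacon(u, b), fetch: (u, init) => fetch(u, init) }). Neither path goes through the Server Action machinery, so neither response carries RSC flight data back to the Router. One caveat: a string body sent via sendBeacon arrives as text/plain, which the report's route handler can still parse with request.json().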
Bug #3 Conclusions
This one was fascinating! Let’s see how the results looked using that updated comparison rubric:
| | Run 1 (baseline) | Run 2 (+methodology) | Run 3 (+ctx+skills) | Run 4 (best CI) |
|---|---|---|---|---|
| Did it fix the problem? | ❌ | ✅ | ❌ | ✅ |
| Did it identify the root cause correctly? | ❌ | ✅ | ❌ | ✅ |
| Did it investigate beyond the first fix? | ⚠️ Wrong direction | ✅ | ⚠️ Wrong direction | ✅ |
| Did it check the rest of the codebase? | ❌ | ✅ | Partial | ✅ |
| Did it produce the correct architectural fix? | ❌ | ✅ | ❌ | ✅ |
With Bug #2, we saw a pretty consistent stairstep of improvements as we added more context and skills. Here, the agent either nailed everything or got lost in the weeds.
Agent #3 is still the one that confuses me the most. It had the same process instructions as #2, but with added app context and React skills. Somehow that led it off into the guts of React and it never actually figured out the app-level issue.
I was very pleased to see that the instructions to focus on app concerns had the desired results! In fact it was hilarious to read the LLM actually saying “oops I’m going too deep, back up!”
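If you haven't run into it before, the "Rendered more hooks than during the previous render" error at the center of Bug #3 comes from React's rule that a component has to call the same hooks in the same order on every render. Here's a toy model of that check. To be clear, this is not React's actual implementation: HookTracker is a made-up class for illustration, it only models the "more hooks" case, and it ignores the separate error React raises for fewer hooks.

```typescript
// Toy model only: remembers how many hooks a "component" called on its
// previous render and complains when a later render calls more.
class HookTracker {
  private previousCount: number | null = null;
  private currentCount = 0;

  // Stand-in for any hook call (useState, useEffect, ...).
  useHook(): void {
    this.currentCount += 1;
    if (this.previousCount !== null && this.currentCount > this.previousCount) {
      throw new Error('Rendered more hooks than during the previous render.');
    }
  }

  // Runs one "render" of the component and records its hook count.
  render(component: (hooks: HookTracker) => void): void {
    this.currentCount = 0;
    component(this);
    this.previousCount = this.currentCount;
  }
}

// A component whose hook count depends on outside state. In Bug #3 the
// equivalent shift happened inside the Next.js Router mid-redirect.
let midRedirect = true;
const routerLike = (hooks: HookTracker): void => {
  hooks.useHook();
  if (!midRedirect) {
    hooks.useHook(); // extra hook only on the later render
  }
};

const tracker = new HookTracker();
tracker.render(routerLike); // first render: one hook, no error
midRedirect = false;

let caught = '';
try {
  tracker.render(routerLike); // second render: two hooks, throws
} catch (err) {
  caught = (err as Error).message;
}
console.log(caught);
```

In normal app code you'd trigger this by calling a hook conditionally. What made Bug #3 so confusing is that none of the app's own components broke that rule: the count changed inside the framework's Router, and the app-level trigger was the Server Action response arriving at the wrong moment.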
Overall Results
Nadia’s LLM runs got partial credit on Bug #2 (the double loading skeleton bug), and failed completely on Bug #3 (the broken redirect + Server Actions call).
By adding Replay recordings and improving the prompts with better investigative instructions, I saw my agents go from a partial fix of Bug #2 to nailing the exact cause and solution, and two of the four runs for Bug #3 came up with the exact cause and solution as well!
This was very exciting! Of course I hoped that adding Replay would produce better results - that’s why I tried doing this in the first place. But I had no way of knowing for sure without actually running the experiment and seeing what happened.
Replay MCP’s Tools Were Critical
In all these explorations, the agents heavily used Replay MCP to understand both the app runtime behavior and Next + React’s internals. They consistently oriented themselves with RecordingOverview to surface app details, errors, and renders; used Screenshot to visualize what the app looked like at various points; and dug into behavior details with ReactRenders and NetworkRequests.
For the deep-dive exploration, the agents used SearchSources to find relevant files and ReadSource to view sources with hit counts. Then they dug deeper with numerous Logpoint and Evaluate calls to dynamically introspect behavior at key lines and hits.
As we saw from Nadia’s post and a couple of my example agent runs, just having the source code available wasn’t enough to solve these issues! Reading the app source was important, and it helped the agent understand the structure of the app codebase and find things like the Server Action. But without the runtime data from the Replay recording, the agents could never have understood the full picture of the app’s behavior, and wouldn’t have been able to nail the results. Both parts were necessary.
Prompting and Context Still Matter
As I talked about in the previous post on Dan’s useDeferredValue bug: I was really surprised by just how much the prompt and instructions mattered here, and how much of a difference they made!
I’m sure there are plenty of academic studies out there that talk about all these aspects in more scientific detail. I’m just one dev who tried running some experiments locally :) But this really has impressed upon me the importance of providing good context and instructions for any autonomous agent investigative work. (And, for that matter, it helps explain why my own personal hands-on agentic development workflow seems to produce good results for me.)
Replay MCP: The Time-Travel Superpower Your Agents Need
As I said in the last post: I joined Replay because I saw the potential and promise of time-travel debugging and how it can make the debugging process drastically faster and better. It enables solving bugs that are otherwise impossible, and provides insights into real runtime behavior that no other monitoring or analysis tool can provide.
Replay DevTools gave humans the ability to investigate and solve bugs with time travel.
Now, Replay MCP gives your agents the same time-travel superpowers.
And, even better: we’ve updated our Replay GitHub bot to automatically trigger time-travel analysis on failed E2E tests, and provide a root cause and a recommended fix as a comment on your PR! This will help your team fix flaky tests, ensure PR code quality, and improve shipping velocity. One step closer to self-healing CI/CD!
Try out Replay MCP and our E2E Test Suites integrations today, and stay tuned - we’ve got a lot more awesome time-travel debugging capabilities coming soon!