The failure pattern is easy to recognize once you have seen it. A developer writes a component, then writes a test that mirrors it line by line: assert this state variable is true, assert that child received this prop, assert the handler was called with these arguments. Six months later someone replaces useState with a reducer, or splits the component in two. The behavior is identical — no user could tell the difference — but a dozen tests go red. After the third refactor like this, the team starts treating red as noise. That is the moment the suite becomes a liability, because the one failure that matters now drowns in eleven that do not.
The fix is not more tests. It is a rule about what a test is allowed to know. A good test knows two things: what the user sees and does, and what a function returns for a given input. It does not know your state shape, your hook choices, or your file structure. The acid test: if you can rewrite the component from scratch and the old test still passes, the test was measuring the right thing.
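The acid test can be sketched without any framework. The example below assumes a hypothetical `Cart` contract with two interchangeable implementations (none of these names come from a real codebase). The check only touches public behavior, so it should hold for either one.

```typescript
// Hypothetical public contract: the only thing the check is allowed to know.
interface Cart {
  add(name: string, cents: number): void;
  totalCents(): number;
}

// Implementation A keeps every line item in an array.
function createArrayCart(): Cart {
  const items: { name: string; cents: number }[] = [];
  return {
    add: (name, cents) => {
      items.push({ name, cents });
    },
    totalCents: () => items.reduce((sum, item) => sum + item.cents, 0),
  };
}

// Implementation B is a from-scratch rewrite that only keeps a running total.
function createRunningTotalCart(): Cart {
  let total = 0;
  return {
    add: (_name, cents) => {
      total += cents;
    },
    totalCents: () => total,
  };
}

// The behavioral check: inputs in, observable result out, no peeking at state.
function cartAddsUpPrices(createCart: () => Cart): boolean {
  const cart = createCart();
  cart.add("book", 1299);
  cart.add("pen", 250);
  return cart.totalCents() === 1549;
}

console.log(cartAddsUpPrices(createArrayCart)); // same check, implementation A
console.log(cartAddsUpPrices(createRunningTotalCart)); // same check, implementation B
```

A check that inspected the `items` array instead would break the moment implementation B replaced implementation A, even though no caller could tell the two apart.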
The same rule produces the skip list. Some things are simply not yours to test:
- Styling. Asserting class names or colors welds the suite to your CSS. A visual bug is caught by your eyes or a screenshot diff, not by expecting a class name.
- Third-party internals. You do not need to prove that Zod validates or that React renders. Test your schema against your inputs, not the library against its own documentation.
- Implementation state. Whether a value lives in useState, a reducer, or a URL param is a private choice. Tests that assert it will punish every refactor.
- Trivial code. A component that renders a prop into a heading does not need a test. The type checker already covers it.
Warning: The 100% coverage, zero confidence trap. Coverage measures which lines executed, not whether anything was checked. A suite can run every line of your code while asserting nothing a user would notice. Sixty percent coverage of the code that handles money beats one hundred percent of everything.
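To make the trap concrete, here is a small sketch with a hypothetical `totalWithTax` helper that contains a deliberate bug. Both "tests" execute every line of it, so a coverage tool would score them the same, but only one of them checks anything.

```typescript
// Hypothetical checkout helper with a deliberate bug: tax is subtracted, not added.
function totalWithTax(subtotalCents: number, taxPercent: number): number {
  const tax = Math.round(subtotalCents * (taxPercent / 100));
  return subtotalCents - tax; // bug: should be subtotalCents + tax
}

// "Coverage test": executes every line above, checks nothing, always passes.
function coverageOnlyTest(): boolean {
  totalWithTax(1000, 10);
  return true;
}

// Behavioral test: the same single call, compared against what the customer should pay.
function assertingTest(): boolean {
  return totalWithTax(1000, 10) === 1100;
}

console.log(coverageOnlyTest()); // true, although the function is wrong
console.log(assertingTest()); // false, the bug is caught
```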
The testing pyramid predates React, but the idea survives translation: many cheap tests at the bottom, a few expensive ones at the top, confidence rising as you climb. In a current React stack the layers map onto specific tools.
- TypeScript strict mode and ESLint sit underneath the pyramid. They catch typos, null access, and wrong argument shapes before a single test runs — the cheapest bugs you will ever fix.
- Vitest unit tests cover pure logic: money math, validation, data transforms. Each one runs in well under a millisecond, so they run on every save.
- Testing Library component tests render components in jsdom and drive them the way a user would. This is the bulk of your suite, because this is where React bugs actually live — in the wiring between components.
- Playwright end-to-end tests walk a real browser through the few flows that pay the bills. Slow and heavy, so you keep them rare.
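As a small illustration of the layer underneath the pyramid, the sketch below uses a hypothetical `Post` type and `titleFor` helper. Under strict mode the compiler refuses the unguarded property access, so the null-access bug never reaches a test run. The "Untitled" fallback is an assumption chosen for the example.

```typescript
interface Post {
  id: string;
  title: string;
}

// Array.prototype.find returns Post | undefined, and strict mode makes us handle both.
function titleFor(posts: Post[], id: string): string {
  const post = posts.find((p) => p.id === id);
  // return post.title; // rejected under "strict": 'post' is possibly 'undefined'
  return post?.title ?? "Untitled";
}

const posts: Post[] = [{ id: "1", title: "First Post" }];
console.log(titleFor(posts, "1")); // "First Post"
console.log(titleFor(posts, "42")); // "Untitled"
```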
Kent C. Dodds famously redrew the pyramid as a trophy with a wide middle, and for React applications he is right: the component layer gives the best ratio of confidence to cost. The exact shape matters less than the discipline of putting each test at the cheapest layer that can catch the bug. The whole toolchain installs in a minute:
npm install -D vitest @testing-library/react @testing-library/jest-dom
npm install -D @testing-library/user-event jsdom msw
npm install -D @playwright/test
npx playwright install

Testing Library is built on one opinion: a test should find elements the way a person would. A person looking for the submit button does not scan the DOM for a test ID. They look for a button that says Sign In.
screen.getByTestId("submit-btn"); // knows your markup
screen.getByRole("button", { name: "Sign In" }); // knows what the user sees

This is more than taste. getByRole only finds the element if it actually has the button role and an accessible name, which is exactly what a screen reader needs to find it. Every role-based query is a free accessibility check. When the query fails because your clickable div is not a real button, the test is reporting a bug that screen-reader users hit yesterday.
Here is the pattern applied to a login form. Notice what the test never touches: no state, no internals, no mocked children. It types, clicks, and reads the screen.
import { describe, it, expect, vi } from "vitest";
import { render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import LoginForm from "./LoginForm";

describe("LoginForm", () => {
  it("submits the email and password", async () => {
    const user = userEvent.setup();
    // The spy stands in for the parent's submit handler and resolves like a successful login.
    const onSubmit = vi.fn().mockResolvedValue(undefined);
    render(<LoginForm onSubmit={onSubmit} />);

    await user.type(screen.getByLabelText("Email"), "test@example.com");
    await user.type(screen.getByLabelText("Password"), "password123");
    await user.click(screen.getByRole("button", { name: "Sign In" }));

    expect(onSubmit).toHaveBeenCalledWith("test@example.com", "password123");
  });

  it("shows the server error to the user", async () => {
    const user = userEvent.setup();
    const onSubmit = vi.fn().mockRejectedValue(new Error("Invalid credentials"));
    render(<LoginForm onSubmit={onSubmit} />);

    await user.type(screen.getByLabelText("Email"), "test@example.com");
    await user.type(screen.getByLabelText("Password"), "wrong");
    await user.click(screen.getByRole("button", { name: "Sign In" }));

    // findByRole waits for the alert, which renders only after the rejected promise settles.
    expect(await screen.findByRole("alert")).toHaveTextContent("Invalid credentials");
  });
});

Both tests survive any rewrite of LoginForm (useReducer, a form library, React 19 form actions) as long as the form still behaves the same. That is the contract you want: tests pinned to behavior, free to ignore implementation.
Tip: If a component is hard to test this way (you need to mock ten things just to render it), the test is telling you the component does too much. Hard to test is a design smell before it is a testing problem.
A broken layout announces itself; the wrong price does not. Pure functions deserve unit tests in proportion to how quietly their bugs fail: money calculations, date arithmetic, validation rules, the transforms that reshape API data before display. Nobody notices a rounding error in a discount function until the invoices are already wrong.
// money.ts: pure functions that work in integer cents
export function formatPrice(cents: number, currency = "USD"): string {
  return new Intl.NumberFormat("en-US", {
    style: "currency",
    currency,
  }).format(cents / 100);
}

export function applyDiscount(cents: number, percent: number): number {
  if (percent < 0 || percent > 100) {
    throw new Error(`Invalid discount: ${percent}%`);
  }
  // Round once, at the end, so the result is always a whole number of cents.
  return Math.round(cents * (1 - percent / 100));
}

// The matching unit tests
import { describe, it, expect } from "vitest";
import { formatPrice, applyDiscount } from "./money";

describe("applyDiscount", () => {
  it("rounds to whole cents", () => {
    expect(applyDiscount(999, 15)).toBe(849); // 849.15 -> 849
  });

  it("rejects impossible discounts", () => {
    expect(() => applyDiscount(999, 110)).toThrow();
  });
});

describe("formatPrice", () => {
  it("formats cents to dollars", () => {
    expect(formatPrice(1999)).toBe("$19.99");
  });

  it("supports other currencies", () => {
    expect(formatPrice(1999, "EUR")).toBe("€19.99");
  });
});

These tests cost almost nothing (a few minutes to write, microseconds to run), and they guard the exact code where a silent bug turns into a refund email. There is a useful side effect, too: logic that is hard to unit test is usually logic that is tangled inside a component. Extracting it into lib/ improves the component and the test at the same time.
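As a sketch of that extraction, the example below pulls a display transform out of a component into a pure function. The `ApiPost` and `PostSummary` shapes and the `toPostSummary` name are hypothetical, invented for illustration. Once the logic lives outside React, checking it needs nothing but a function call.

```typescript
// Hypothetical shape returned by the API (snake_case, ISO timestamps).
interface ApiPost {
  id: string;
  title: string;
  author_name: string | null;
  published_at: string;
}

// Hypothetical shape the component wants to render.
interface PostSummary {
  id: string;
  title: string;
  byline: string;
  year: number;
}

// Pure transform: no React, no fetch, so it can live in lib/ and be unit tested.
function toPostSummary(post: ApiPost): PostSummary {
  return {
    id: post.id,
    title: post.title.trim(),
    byline: post.author_name ? `by ${post.author_name}` : "by Unknown",
    year: new Date(post.published_at).getUTCFullYear(),
  };
}

const summary = toPostSummary({
  id: "1",
  title: "  First Post ",
  author_name: null,
  published_at: "2024-03-01T12:00:00Z",
});
console.log(summary);
```

The component that used to do this inline shrinks to rendering a `PostSummary`, and the edge cases (missing author, stray whitespace) get cheap unit tests of their own.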
Mocking is where good suites quietly go bad. Reaching for vi.mock on one of your own modules looks harmless, but it couples the test to a file path and a function signature — implementation details again. Rename the module or change how the function builds its request, and the test breaks while the app keeps working. Worse, your fetching code never actually runs, so the error handling and response parsing inside it are never tested at all.
The discipline: mock at the boundary you do not own — the network — and let everything you wrote run for real. Mock Service Worker intercepts requests at that level, so your fetch wrapper, your error handling, and your caching layer all get exercised:
import { http, HttpResponse } from "msw";

export const handlers = [
  // GET /api/posts returns a fixed list, shaped like the real API response.
  http.get("/api/posts", () => {
    return HttpResponse.json([
      { id: "1", title: "First Post", author: "Alice" },
      { id: "2", title: "Second Post", author: "Bob" },
    ]);
  }),
  // POST /api/posts echoes the submitted body back with a new id.
  http.post("/api/posts", async ({ request }) => {
    // request.json() is loosely typed, so narrow it to an object before spreading.
    const body = (await request.json()) as Record<string, unknown>;
    return HttpResponse.json({ id: "3", ...body }, { status: 201 });
  }),
];

The handlers double as documentation of the API contract, and they are shared: the same file can back your Vitest runs and your local dev server. When the real API changes shape, you update one file and every affected test fails honestly.
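Wiring the handlers into Vitest might look like the following setup file. This is a sketch: the file name and the `./handlers` import path are assumptions, while `setupServer`, `listen`, `resetHandlers`, and `close` are the MSW Node API.

```typescript
// vitest.setup.ts (hypothetical file name, listed under `setupFiles` in the Vitest config)
import { beforeAll, afterEach, afterAll } from "vitest";
import { setupServer } from "msw/node";
import { handlers } from "./handlers"; // assumed path to the handlers file

const server = setupServer(...handlers);

// Fail loudly on any request without a handler, so tests cannot reach the real network.
beforeAll(() => server.listen({ onUnhandledRequest: "error" }));

// Drop per-test overrides added with server.use(...) so tests stay independent.
afterEach(() => server.resetHandlers());

afterAll(() => server.close());
```

A single test that needs a failure case can then call `server.use(...)` with a one-off handler that returns a 500, and the reset in `afterEach` keeps that override from leaking into the next test.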
End-to-end tests are the most convincing and the most expensive things in the suite. A Playwright test boots a real browser against a real build of your app — seconds per test instead of milliseconds, plus a new way to be flaky for every moving part. So the entry bar is simple: a flow gets a browser test if its failure costs money or users. Signup, login, checkout, and the one action your product exists to do. For a content site that is the newsletter signup; for a shop it is the path from product page to paid order.
import { test, expect } from "@playwright/test";

test("user can sign up and reach the dashboard", async ({ page }) => {
  await page.goto("/auth/signup");
  await page.getByLabel("Email").fill("newuser@example.com");
  await page.getByLabel("Password").fill("SecurePass123!");
  await page.getByRole("button", { name: "Create Account" }).click();

  await expect(page).toHaveURL("/dashboard");
  await expect(page.getByText("Welcome")).toBeVisible();
});

test("protected routes redirect to login", async ({ page }) => {
  await page.goto("/dashboard");
  await expect(page).toHaveURL(/\/auth\/login/);
});

A healthy end-to-end suite for a small product is a dozen tests, not hundreds. They run in CI before every deploy, and when one fails, someone looks immediately, which only happens if failures are rare and real.
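The relative paths in those tests, such as `/auth/signup`, rely on a configured base URL. A minimal configuration sketch follows. The port, the `e2e` folder, and the npm script names are assumptions, not something the setup above prescribes.

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e", // assumed location of the browser tests
  retries: 0, // a failure should mean something, so no automatic retries
  use: {
    baseURL: "http://localhost:3000", // assumed; lets tests call page.goto("/auth/signup")
  },
  webServer: {
    command: "npm run build && npm run start", // assumed script names
    url: "http://localhost:3000",
    reuseExistingServer: !process.env.CI, // reuse a running dev server locally, start fresh in CI
  },
});
```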
Warning: A flaky test is a broken test. Wiring up automatic retries until it passes trains everyone to ignore red. When a test fails intermittently, find the cause (almost always a missing await, a race with a response, or state shared between tests) and fix it or delete it. A suite people do not trust is worth less than no suite.
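Shared state is the easiest of those causes to reproduce without a test runner. In the sketch below, with hypothetical names throughout, the first check passes only when it happens to run first, while the second builds its own fixture and gives the same answer every time.

```typescript
// Module-level state: survives from one test to the next.
const sharedCart: string[] = [];

// Passes only if it happens to run first; the result depends on test order.
function leakyTest(): boolean {
  sharedCart.push("book");
  return sharedCart.length === 1;
}

// Builds its own fixture, so the result is the same on every run and in any order.
function isolatedTest(): boolean {
  const cart: string[] = [];
  cart.push("book");
  return cart.length === 1;
}

const leakyRuns = [leakyTest(), leakyTest()];
const isolatedRuns = [isolatedTest(), isolatedTest()];
console.log(leakyRuns); // [true, false]
console.log(isolatedRuns); // [true, true]
```

In a real suite the same leak tends to hide in a module-level cache, a reused mock, or a database row that an earlier test left behind.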
The highest-value tests you will ever write are regression tests, because they guard against proven bugs instead of hypothetical ones. The rule costs one sentence: every bug fix ships with the test that would have caught it. Before you fix anything, write the failing test that reproduces the bug. Watch it fail, fix the code, watch it pass, commit both together. Over a year the suite stops being a coverage exercise and becomes a record of every way your application has actually failed — which is precisely the set of ways it is most likely to fail again.
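Here is the loop on a hypothetical bug report: three people split a $1.00 bill and one cent vanishes. The function names and the bug itself are invented for illustration, and the sketch assumes whole cents and at least one person. The check is written first, fails against the old code, and passes against the fix.

```typescript
// The original version, kept only to show the failure: every share is rounded down.
function splitBillBuggy(totalCents: number, people: number): number[] {
  return Array.from({ length: people }, () => Math.floor(totalCents / people));
}

// The fix: hand the leftover cents to the first few shares.
function splitBill(totalCents: number, people: number): number[] {
  const base = Math.floor(totalCents / people);
  const remainder = totalCents % people;
  return Array.from({ length: people }, (_, i) => base + (i < remainder ? 1 : 0));
}

// The regression check, written before the fix: shares must add up to the bill.
function sharesAddUp(split: (totalCents: number, people: number) => number[]): boolean {
  const shares = split(100, 3);
  return shares.reduce((sum, share) => sum + share, 0) === 100;
}

console.log(sharesAddUp(splitBillBuggy)); // false: 33 + 33 + 33 = 99
console.log(sharesAddUp(splitBill)); // true: 34 + 33 + 33 = 100
```

Committed together with the fix, the check documents the incident and fails again if anyone reintroduces per-share rounding.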
This rule also solves the cold-start problem. If you have an app in production with no tests at all, the wrong move is a three-month backfill. It stalls feature work, and tests written against code you assume is correct mostly freeze the current behavior — bugs included. Start where the leverage is instead:
- Spend one afternoon on setup: Vitest, Testing Library, Playwright, and the install commands above. The barrier to writing the first test should be zero.
- Write one Playwright test for the single flow that would hurt most if it broke tonight. One. Run it in CI on every deploy.
- Adopt the regression rule today: from now on, every bug fix carries its test.
- Test new code as you build it — the component test for this week's feature, not for last year's.
- Unit test the scary pure functions: anything touching money, permissions, or dates.
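For the setup step in the list above, a Vitest configuration can be as small as the sketch below. The file names and folder layout are assumptions for illustration and should be adjusted to the project.

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "jsdom", // component tests need a DOM
    setupFiles: ["./vitest.setup.ts"], // assumed file name
    include: ["src/**/*.test.{ts,tsx}"], // assumed layout
    exclude: ["e2e/**", "node_modules/**"], // keep Playwright specs out of Vitest (assumed folder)
  },
});

// The setup file would typically register the jest-dom matchers with one line:
// import "@testing-library/jest-dom/vitest";
```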
In three months you will have a small suite that maps exactly onto the parts of your app that matter and the ways it has really broken — and refactoring will have turned into something you do on a Tuesday without holding your breath. Pick the flow that pays your bills and write its Playwright test this week. The rest of the strategy follows from there.
