Craft·June 1, 2026·5 min read

Why AI Starts Sounding Like AI After 2,000 Words

The first few paragraphs feel right. By page three, something has shifted — it's technically correct but it doesn't quite sound like you. Here's the mechanical reason this happens.

There's a pattern that shows up across AI-assisted writing — in fiction, in newsletters, in ghostwritten content, in business writing — that's hard to name but easy to recognize.

The first few paragraphs of a draft feel right. The voice is there. Then something shifts around page two or three. By page four, the prose is technically correct but it doesn't quite sound like the person who started it. It's cleaner than it should be. More generic. More like a competent writer producing competent text, and less like the specific person with the specific perspective who actually has something to say.

If you've ever read an AI draft and thought "this is fine but it's a little flat" — or handed a draft to someone who asked "did you write this?" with a slight hesitation — you've run into this. It has a mechanical cause.

What every language model defaults to

Every language model was trained on an enormous corpus of text from across the internet and books. The result is a kind of averaged voice — the blend of patterns most common across all that writing. It's articulate, clear, reasonably authoritative. It's also, by construction, nobody in particular.

When you prompt a model with your writing, or with instructions about your voice, the model shifts toward your patterns. For the first few hundred words, with your examples still prominent in the context window, it adjusts. Then the model's training reasserts. As the generation grows, your example paragraphs take up a smaller and smaller share of the context window. The model's defaults — the averaged patterns of everything it was trained on — start winning.

This is why voice drift isn't a prompting problem. "Write in my voice" is an instruction. Instructions dilute. The model's training is billions of parameters baked into the weights. Your instructions are text in the context window. Across a long generation, the weights win.

Why this matters more than people expect

Voice is the thing that makes writing worth reading. Not the ideas alone — ideas can be summarized. Not the structure alone — structure can be copied. Voice is the particular way a specific person thinks on the page: their sentence rhythms, the things they emphasize, the moves they make between a claim and its support, the words they reach for and the words they avoid.

Readers who follow a writer recognize their voice before they're conscious of recognizing it. When it shifts — when it goes from distinctive to generic — they feel it as a vague dissatisfaction they often can't name. "This feels like a lot of words." "Something is off." "This doesn't sound like you."

What actually holds voice across a long generation

The model needs to know your voice rules — not as text it might or might not weight appropriately, but as constraints it's checked against after generation.

Voice rules are explicit and mechanical: sentences average this length. Avoid these specific words. Lead paragraphs with the main point, don't build to it. Contractions throughout. Never use "leverage" as a verb. These aren't stylistic suggestions — they're parameters a tool can check an output against before you see it.

When a generation comes back, the tool runs it against your voice rules and flags violations: third paragraph runs 47 words against a 15–25 target; passive construction in paragraph two; the word "utilize" appears twice. You review the flags and decide what to do. The drift doesn't slip through while you're focused on the ideas.

Rules injected at the system level, checked on every output — that's what keeps voice stable across a long generation. Not better prompting, but structural enforcement that doesn't depend on where your instructions happen to fall in the context window.

The test worth running

If you want to know whether your current tool is drifting on voice, generate something long — 1,500 words or more on a topic you know well. Read the first 300 words, then jump to the last 300. If those sections feel like different writers, you've seen the problem.

The model found your voice at the start and gradually lost it. The fix isn't to start over with a better prompt. The fix is a tool that treats your voice as a persistent constraint, not a contextual suggestion that gets diluted as the prose accumulates.