Merriam-Webster named "slop" its 2025 Word of the Year. In its new usage, the word describes low-quality AI-generated content produced and distributed with minimal human oversight. It captures something the internet has been feeling for a while: the growing suspicion that much of what appears online wasn't written so much as emitted.
I should be transparent about where I stand. This blog uses AI-generated text-to-speech narration on every post. The articles about PCB trace routing describe boards that were auto-routed by algorithms. The code that builds and deploys this site was partially written with Claude Code assistance. I wrote a six-part series on Jevons Paradox with AI tools open in the next terminal window the entire time. I am not writing this from outside the system.
And yet I know slop when I see it. You probably do too. The interesting question is not whether slop exists (it obviously does) but what exactly we're recognizing when we encounter it. What quality makes certain AI-generated or AI-assisted content feel hollow, and what distinguishes it from output that has substance? The answer matters, because if we can't articulate the distinction, we're left with a binary that helps nobody: reject all AI tools, or accept everything they produce uncritically.
You Know It When You See It
In 1964, Justice Potter Stewart offered his famous non-definition of obscenity: "I know it when I see it." We're in a similar position with AI slop. Most people can identify it immediately but struggle to explain what they're detecting.
The surface markers are easy enough to catalog. The hedging language: "It's important to note that..." The false balance, presenting every issue as having exactly two equally valid sides. The emoji padding that serves no communicative purpose. The five-paragraph essay structure applied to every topic regardless of complexity. The confident incorrectness: statements delivered with the same breezy authority whether they're true or fabricated. The vocabulary of caution and qualification that reads less like thoughtfulness and more like a language model covering its bases.
These are the tells that AI-detection tools try to measure, and they work well enough for obvious cases. But they're symptoms, not the disease. A skilled prompt engineer can eliminate every one of these markers and still produce slop. Conversely, a human writer can exhibit several of them (hedging, false balance, structural rigidity) and still produce something worth reading. The surface features point toward the problem without being the problem itself.
What we're actually detecting is an absence. Not an absence of quality at the sentence level (LLMs write clean, grammatical sentences) but an absence of something harder to name. The text reads correctly line by line and says nothing paragraph by paragraph. It is fluent without being articulate. It covers a topic without engaging with it. And we recognize this gap almost instantly, the way you recognize a smile that doesn't reach someone's eyes.
The Three Properties
The MINT Lab at Indiana University proposed a useful framework for thinking about this. They identified three properties that characterize slop: superficial competence, asymmetric effort, and mass producibility.
Superficial competence is the core mechanism. The text performs competence at the surface level: vocabulary is appropriate, structure is logical, claims are plausible. But it doesn't demonstrate competence at the level of understanding. There's a difference between a sentence that uses the right words and a sentence that conveys the right meaning. Slop consistently achieves the former while missing the latter. The prose is grammatically flawless and semantically empty, a combination that is almost impossible for human writers to produce at scale but trivially easy for language models.
Think of a student essay that hits every point on the rubric: thesis statement in the right place, three supporting paragraphs, counterargument acknowledged, conclusion that restates the thesis. A teacher reads it and gives it a B+. But the teacher also knows, without being able to point to a specific sentence, that the student didn't learn anything while writing it. The essay demonstrates knowledge of essay structure, not knowledge of the subject. That's superficial competence.
Asymmetric effort describes the production economics. The author (or deployer) invested minimal effort relative to the volume of output. A single prompt generates 2,000 words in seconds. The resulting text has the length and format of something that would take a human writer hours, but it cost nothing in terms of thought, research, or revision. This asymmetry creates an incentive structure where the marginal cost of publishing approaches zero and the quality feedback loop disappears.
Mass producibility follows from the first two. If the text is superficially competent and cheap to produce, there's no natural limit on volume. This is how you get AI-generated recipe blogs with 10,000 pages, product review sites with no evidence of product testing, and news aggregators that rewrite wire stories into blandly authoritative summaries. The content fills a shape (a blog post, a review, a news article) without filling it with meaning.
These three properties interact. Mass production exacerbates the problem of superficial competence because there's no time or incentive for the depth that would distinguish one piece from another. And asymmetric effort means there's no skin in the game: the producer doesn't care whether the content is right, because it cost almost nothing to create and nothing to correct.
Greenberg's Ghost
There's a version of this argument that's eighty-seven years old.
In 1939, Clement Greenberg published "Avant-Garde and Kitsch," one of the most influential essays in twentieth-century art criticism. Greenberg argued that mass culture produces "kitsch," art that "pre-digests art for the spectator and spares him effort, provides him with a shortcut to the pleasure of art that detours what is necessarily difficult in genuine art." Kitsch offers "vicarious experience and faked sensations." It looks like art. It has the shape of art. But it demands nothing from the viewer and delivers nothing in return except the comfortable feeling of having consumed something.
AI slop does exactly this with information. It pre-digests knowledge for the reader, offering the appearance of understanding without requiring (or enabling) actual understanding. You read 2,000 words about a topic and come away with the sense that you've learned something, but when you try to articulate what you learned, there's nothing solid to grasp. The text gave you the experience of reading an informative article without the substance of one. Vicarious understanding. Faked insight.
The parallel extends further than you might expect. Greenberg worried that kitsch would overwhelm genuine art because it was cheaper to produce and easier to consume. The same dynamics apply to AI-generated content: it's infinitely cheaper to produce, formats itself for easy consumption, and competes for the same attention as substantive work. Greenberg's nightmare was a culture where the imitation crowds out the real thing. That's recognizably the state of much of the internet in 2026.
But Greenberg was also, let's be honest, a snob. His framework positioned the critic as the essential gatekeeper: only the trained eye could distinguish art from kitsch, and the masses were essentially passive consumers incapable of judgment. This elitism left him unprepared for Pop Art. When Warhol silk-screened Campbell's soup cans and Lichtenstein blew up comic panels to gallery scale, they took the materials of kitsch and made something genuinely interesting from them. They didn't reject mass culture; they engaged with it in a way that Greenberg's binary framework couldn't accommodate.
There's an obvious recursive problem here, and I should name it rather than pretend it doesn't exist. This essay was written with AI assistance. It is, in a direct sense, an attempt to take the materials of mass production (an LLM's facility with argument structure, literature survey, prose drafting) and make something that isn't slop. Whether it succeeds is for the reader to judge. But the attempt itself is the Pop Art move: not rejecting the tools of mass culture, but trying to use them to say something specific. If I fail, the essay is kitsch that thinks it's art. If I succeed, Greenberg's binary was too rigid, and the tool was never the problem.
This tension matters for the slop conversation more broadly. If we define slop as any AI-generated content (regardless of what it does or says), we make the same mistake Greenberg made with kitsch. The question is not the tool; it's whether something is being done with it at all.
The Authenticity Problem
So what is the actual distinguishing quality? What separates writing that happens to involve AI tools from writing that is slop?
It's not voice; LLMs can mimic voice convincingly enough to fool most readers. It's not structure; LLMs organize material at least as well as the average human writer. It's not even factual accuracy, since LLMs can be accurate when properly grounded and cited. These are all necessary conditions for good writing, but slop can satisfy all of them and still be slop.
What's missing is a point of view: the willingness to be wrong about something specific.
Slop hedges. It covers all sides. It presents every position as having merit and declines to choose between them. It never commits to a claim that could be falsified, challenged, or argued against. And this is not a bug in the technology; it's a feature. Language models are trained to be helpful, harmless, and accurate. Helpfulness means addressing the user's question. Harmlessness means avoiding offense. The intersection of these goals produces text that is relentlessly, pathologically balanced. Every "on the one hand" gets an "on the other hand." Every strong claim gets a qualification. The result is prose that cannot be disagreed with, because it doesn't say anything specific enough to disagree with.
I notice this constantly in my own AI-assisted drafts. The first pass comes back with every edge sanded off. Where I wrote "Freerouting can't do copper pours, and that's a fatal limitation for production boards," the draft wants to say "Freerouting has some limitations regarding copper pours that may affect certain use cases." The second version is more cautious. It's also emptier. The editorial work, the part that makes writing not-slop, is putting the edges back on: choosing the stronger claim, deleting the qualifications that exist for safety rather than accuracy, deciding that this is what I actually think and I'm willing to defend it.
Good writing, whether human or AI-assisted, takes a position and defends it. The author exists in the text because they have opinions, not because they have fluency. When I wrote that Jevons Paradox applies to human attention in the context of AI-assisted work, that was a specific, falsifiable claim. You could disagree with it. You could argue the model doesn't apply, or that the historical parallels are misleading, or that the biological ceiling on attention changes the dynamics. The argument creates a surface for friction. It takes a stance that could be wrong.
Slop never takes that risk. It describes all positions and endorses none. It informs without arguing. And because it never commits to anything, it can never be wrong, which means it can never be right either. It occupies a semantic dead zone: technically not false, functionally not true, informationally zero.
This is the test most people are applying intuitively when they identify something as slop. They're asking: is someone home? Does the text have a perspective, or is it just generating plausible sentences? The "someone" doesn't have to be a human, exactly. It has to be a process that made choices: that included some things and excluded others, that decided this interpretation was better than that one. Slop is text produced by a process that made no choices at all, because the defaults were good enough to fill the space.
When AI Output Isn't Slop
If the test is commitment and accountability, then it follows that AI-assisted output can clear the bar. But I want to be specific here, not hand-wavy, because vague appeals to "my own experience" are themselves a slop move.
The Giga Shield project started with a $468 Fiverr design that didn't work. Nine bidirectional level shifters, professional layout, clean two-layer board. Then I tested it with a Z80 processor, and the auto-sensing TXB0108 chips fell apart. The Z80 tri-states its address bus between cycles; the pins go high-impedance, floating. The TXB0108 can't determine drive direction from a floating signal. It guesses wrong, and the Arduino on the other side reads garbage. I'd paid $468 for a board that was blind to half of what the processor was doing.
The redesign used Claude Code to generate the entire replacement board from a Python script: no graphical PCB editor, no manual placement, just code that outputs a routable board file. AI wrote the board generator. AI helped parse the KiCad schematic to extract all 72 signal mappings across 9 ICs. Then Freerouting, an open-source autorouter, handled the trace routing.
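The idea of a script that outputs a board file can be sketched in a few lines. This is an illustration of the approach, not the actual generator: the footprint name, coordinates, and file skeleton are simplified placeholders, and a real .kicad_pcb carries nets, pads, layers, and design rules that are omitted here.

```python
# Minimal sketch of script-driven board generation: emit a KiCad
# board file (s-expression text) directly from Python, so component
# placement is computed in code rather than dragged in a GUI editor.
# Illustrative only -- footprint names and coordinates are made up.

def place_tssop_row(refs, x0_mm=20.0, pitch_mm=15.0, y_mm=30.0):
    """Return (ref, (x, y, rotation)) placements for a row of ICs."""
    return [(ref, (x0_mm + i * pitch_mm, y_mm, 0)) for i, ref in enumerate(refs)]

def board_file(placements):
    """Render placements as a skeletal .kicad_pcb s-expression."""
    body = "\n".join(
        f'  (footprint "Package_SO:TSSOP-24" (at {x} {y} {rot})\n'
        f'    (property "Reference" "{ref}"))'
        for ref, (x, y, rot) in placements
    )
    return f"(kicad_pcb (version 20240108) (generator python_sketch)\n{body}\n)"

placements = place_tssop_row([f"U{i}" for i in range(1, 10)])  # 9 level shifters
print(board_file(placements))
```

The point of the pattern is that placement becomes a function of parameters: change the pitch or the IC count and the whole board regenerates, which is what makes the layout reproducible and diffable in a way a GUI session is not.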
Here's the kind of specificity that slop can't contain: after 60 optimization passes (about 45 minutes of compute), Freerouting brought the via count on the Giga Shield from roughly 220 down to 158. I ran 128 parallel instances across three machines with randomized net ordering to explore different regions of the solution space. And still, a hard floor of 5-6 unrouted ground connections remained, because Freerouting's architecture literally cannot represent copper pours, and the 0.65mm-pitch TSSOP-24 packages didn't have physical room for ground vias. That limitation is structural. No amount of prompt engineering or parameter tuning changes the fact that the algorithm has no concept of flood-fill connectivity. I wrote about this in detail, including the A* search internals and the specific geometric constraints, and if I got the analysis wrong, anyone can read the source code and check.
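The fan-out strategy above can be sketched as a parallel randomized-restart harness. Everything in this sketch is illustrative: `run_freerouting` is a stub standing in for a subprocess call to the real autorouter on a net-shuffled copy of the board, and its simulated numbers are invented for the sake of a self-contained example.

```python
# Sketch of parallel randomized restarts: launch many autorouter
# instances, each seeing the nets in a different random order, and
# keep whichever run leaves the fewest unrouted connections.

import random
from concurrent.futures import ThreadPoolExecutor

def run_freerouting(seed):
    """Stand-in for one autorouter instance. In the real setup this
    shells out to the router on a board whose net order was shuffled
    with `seed`; here the (unrouted, vias) outcome is simulated so
    the sketch runs anywhere."""
    rng = random.Random(seed)
    return rng.randint(5, 20), rng.randint(150, 230)

def best_of(n_instances):
    """Launch n_instances concurrently and keep the best result:
    fewest unrouted connections first, fewest vias as tiebreak."""
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(run_freerouting, range(n_instances)))
    return min(results)  # tuple comparison: unrouted first, then vias

print(best_of(128))
```

Each instance is independent, so the only shared state is the final min(), and adding machines just means partitioning the seed range. It also makes the hard floor visible: if every seed converges to the same handful of unrouted ground nets, the limit is the algorithm, not the search budget.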
I also discovered that Freerouting v2.1.0 produced 152 unrouted connections on the same board where v1.9.0 produced 6. That's a testable, reproducible claim, attached to specific version numbers, specific board files, specific machines. It's the opposite of "AI autorouting tools can sometimes produce inconsistent results," which is the slop version of the same observation. One of those sentences tells you something. The other fills space.
Even the TTS narration is more complicated than "it just works." The Qwen model mispronounces technical terms. It puts emphasis in odd places. The audio for posts with dense jargon has an uncanny flatness where the model clearly doesn't understand what it's reading. I publish it anyway because it's useful despite its flaws, and because I label it as AI-generated narration, which means I'm not asking the listener to trust it as a human performance. It's a tool with known limitations, deployed for a specific purpose, accountable to its function.
The common thread isn't that AI made these outputs perfect. It's that they were tested against something outside themselves. The board works or it doesn't. The via count is 158 or it isn't. The audio plays or it doesn't. Slop faces no such test. It exists to fill a container, and its success is measured by whether the container looks full, not by whether what's inside is true.
The Compost Argument
There's a reasonable counterargument that goes like this: human slop has always existed. Content farms, SEO spam, airport bookstore filler, corporate press releases, academic papers that exist only to pad a CV. The internet was full of low-quality, low-effort content long before large language models existed. AI didn't invent slop; it industrialized it.
This is true, and it's worth taking seriously. The people arguing that "the idea of AI slop is slop" have a point: if we define slop as low-quality content produced with minimal effort, most of what humans have ever published qualifies. Sturgeon's Law (ninety percent of everything is crud) predates AI by decades.
But the economics are different now, and economics change everything. When slop required human labor, there was a floor on production cost. A content farm still had to pay writers (however little). An SEO spammer still had to hire someone to string keywords into sentences. That floor limited volume, which limited the ratio of noise to signal in any given information ecosystem.
AI removes the floor. The marginal cost of producing a 2,000-word article drops to fractions of a cent. The marginal cost of producing 10,000 such articles drops to the cost of an API call and a deployment script. The constraint was never willingness to produce slop; it was cost. With cost eliminated, volume expands without bound. This is, incidentally, another case of Jevons Paradox: make content production cheaper, get more content production, not less.
Some writers have made what I'll call the compost argument: that cultural slop, even the human-produced kind, serves as a sort of fertilizer. The vast majority of pulp fiction was forgettable, but it created the ecosystem that produced Philip K. Dick and Ursula K. Le Guin. Most blog posts are unremarkable, but the blogging ecosystem produced some genuinely important writing. The compost nourishes rare blooms.
Maybe. But there's a concentration problem. A garden benefits from compost; a garden buried under six feet of compost is just a landfill. If the ratio of slop to substance shifts far enough, the substance becomes unfindable. Search engines surface slop because it's optimized for surfacing. Recommendation algorithms amplify it because engagement metrics can't distinguish between "I read this and learned something" and "I read this and it filled two minutes." The signal doesn't just get drowned out; it gets algorithmically deprioritized in favor of the noise.
What the Test Looks Like
I've argued that what we recognize as slop is the absence of commitment: text that declines to be wrong about anything specific. I believe this is correct, but I should be honest about where the test gets uncomfortable.
Committed writing can be terrible. Conspiracy theories are committed. Propaganda is committed. A confidently wrong blog post about vaccine microchips passes the "takes a position" test with flying colors. Commitment is necessary but not sufficient. It separates slop from writing that has a pulse, but it doesn't separate good writing from bad writing. That's a different and older test, one that involves accuracy, evidence, reasoning, and intellectual honesty: all the things we've always used to evaluate arguments. The slop test is prior to all of that. It asks whether there's anything present to evaluate in the first place.
The tool doesn't determine the category. The commitment does. And if this essay has failed to commit to anything worth arguing against, then by its own logic, it belongs in the landfill with the rest.