Why Asking ChatGPT to 'Review This as a 30-Something Woman' Misleads You

3 min read · Ilkim Team

Before publishing, many people who want to know how a piece will read now ask ChatGPT something like: "Evaluate this as a 30-something woman would." You get a fast, plausible answer. But there's a structural trap hiding in that prompt.

A general LLM imagines one average person

Ask ChatGPT to "review this as a 30-something woman," and the model conjures a single, average image of the label "30-something woman" and performs it. The problem: real 30-something women are not one person.

Even among people of the same age and gender, occupation, region, education, interests, and free time vary widely. A marketer working in Seoul, a self-employed shop owner in a small city, a parent on childcare leave, and a graduate student all read the same piece differently. Some bounce at the first paragraph; others read to the end and share.

A general LLM's answer flattens all of them into a single average. The average is convenient, but it hides how content actually spreads.

Content lives or dies in the tails of the distribution

Reach and virality are usually decided not by the average reaction but by the extremes — the tails of the distribution.

  • A small group that reacts strongly shares the content, and reach explodes.
  • A specific segment bails at the first sentence, and even a respectable average score can't save real reach.

"On average, not bad" misses both risks. What you actually need before publishing is "who reacts strongly, and who quietly leaves." That never comes from a single average evaluator.

The average gives you the illusion of safety. What actually makes or breaks content is the minority far from the mean.
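
To make the point concrete, here is a minimal sketch with invented numbers. The scores, the two drafts, and the `high`/`low` cutoffs are all assumptions made up for illustration, not measurements. Both drafts have the same average, but only one of them has tails.

```python
from statistics import mean

# Hypothetical reader scores (0-10) for two drafts. Invented for illustration.
draft_a = [6, 6, 6, 6, 6, 6, 6, 6, 6, 6]      # everyone is lukewarm
draft_b = [10, 10, 10, 7, 7, 7, 7, 1, 1, 0]   # strong fans plus early drop-offs


def summarize(scores, high=9, low=2):
    """Return the mean plus the share of readers in each tail.

    `high` and `low` are arbitrary cutoffs chosen for this sketch.
    """
    n = len(scores)
    return {
        "mean": mean(scores),
        "strong_share": sum(s >= high for s in scores) / n,  # likely sharers
        "drop_share": sum(s <= low for s in scores) / n,     # likely bouncers
    }


summary_a = summarize(draft_a)
summary_b = summarize(draft_b)

print(summary_a)
print(summary_b)
```

Judged by the mean alone, the two drafts are indistinguishable. The tail shares are what separate a draft nobody shares from one that a third of readers push hard and another third abandon.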

How to see the distribution, not the average

The fix is simple: instead of one imagined evaluator, build a crowd that mirrors the real population and let each member read.

Ilkim samples multiple synthetic personas that follow the population distribution from KOSIS, the Korean Statistical Information Service run by Statistics Korea. Even among "30-something women," occupation, region, and interests are spread out the way the statistics say they are. Each persona reads your draft from their own vantage point and returns completion or drop-off, a score, and a comment. The result isn't one number; it's a distribution of reactions.

This is built on NVIDIA's Nemotron-Personas-Korea dataset (CC BY 4.0) together with KOSIS distributions. "A statistically grounded crowd" rather than "one imagined person" is the decisive difference from a general LLM.
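
The sampling idea can be sketched in a few lines. This is not Ilkim's actual implementation: the occupation labels, the shares, and the `sample_personas` helper are hypothetical placeholders, and the shares are not real KOSIS figures. The sketch only shows what "a crowd that follows a distribution" means in practice, namely drawing each persona's attributes in proportion to population shares.

```python
import random
from collections import Counter

# Hypothetical occupation shares for one segment. NOT real KOSIS figures.
OCCUPATION_SHARES = {
    "office worker": 0.40,
    "self-employed": 0.15,
    "on childcare leave": 0.15,
    "student": 0.10,
    "other": 0.20,
}


def sample_personas(shares, n, seed=0):
    """Draw n occupation labels in proportion to the given shares.

    A fixed seed keeps the sketch reproducible.
    """
    rng = random.Random(seed)
    labels = list(shares)
    weights = [shares[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n)


crowd = sample_personas(OCCUPATION_SHARES, n=1000)
counts = Counter(crowd)

# Print each occupation with how many sampled personas landed in it.
for label, count in counts.most_common():
    print(f"{label}: {count}")
```

A real system would presumably sample several attributes jointly (region, education, interests) rather than one at a time, but the principle is the same: the crowd's makeup comes from the statistics, not from a single imagined reader.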

When you actually need this

Not every piece needs a distribution analysis. But in these situations, the opinion of one average persona is risky.

  1. Broad-audience content — magazine articles or brand campaigns that must reach many kinds of readers.
  2. Content where drop-off is fatal — landing copy or newsletters where first-paragraph bounce dictates conversion.
  3. Content that's hard to fix after publishing — print, press releases, anything you can't quietly edit once it's out.

For pieces like these, it's safer to confirm "who reads it and how" before publishing — not just "it's fine on average."


In short: asking ChatGPT to role-play a specific reader is fast, but it collapses to one average person. Because reach is decided in the tails of the distribution, pre-publish validation should look at the reactions of a crowd that mirrors the real statistical distribution.