Transparency
Methodology
How Pure Report rewrites articles, scores them for bias, and detects coordination across outlets.
The premise
Most news is wrapped in payload that isn't news: loaded adjectives, manufactured urgency, unattributed speculation, framing that primes a reaction before the facts arrive. We strip that payload and publish a neutral rewrite alongside the original so you can compare them side by side.
We also study the news ecosystem itself. Three detectors run continuously: a genome mapper that finds outlets sharing verbatim phrases and quotes, an echo radar that catches outlets publishing the same story within minutes of each other, and a sentiment arbitrage scanner that flags where coverage diverges from measurable reality.
The debiasing pipeline
- Ingest. We pull articles from a wide cross-section of news APIs, RSS feeds, and aggregators. Wire-service stories (AP, Reuters) are flagged at this stage, because many outlets running the same wire copy in lockstep is expected and is not evidence of coordination.
- Pre-score. Before rewriting, each article is scored for inflammatory language, bias indicators, and promotional content. We extract entities (companies, people, organizations, locations) and ticker symbols.
- Rewrite. The article passes through a large language model under our master debiasing prompt: lead with facts, attribute every claim, remove loaded adjectives, eliminate manufactured urgency, preserve direct quotes that are genuinely newsworthy. The model returns a neutral title, summary, and rewritten body.
- Post-score. Each article receives four 0–100 scores: bias, inflammatory language, sentiment, and confidence-in-rewrite. The scores are visible on every article page.
- Pattern detection. Across the corpus we run genome mapping (Jaccard similarity on n-gram phrases), echo detection (temporal clustering with a wire-service exclusion), and sentiment-vs-price scans for tickered stories.
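The genome-mapping step above names Jaccard similarity on n-gram phrases. The block below is a minimal sketch of that measure, not our production code: the tokenizer, the choice of word-level 3-grams, and the function names `word_ngrams` and `jaccard` are assumptions made for illustration.

```python
from __future__ import annotations

import re


def word_ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of word n-grams in text, lowercased, punctuation dropped."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; treated as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two invented sentences that share a long verbatim run.
article_a = "The central bank raised interest rates by a quarter point on Wednesday"
article_b = "Officials said the central bank raised interest rates by a quarter point"

similarity = jaccard(word_ngrams(article_a, n=3), word_ngrams(article_b, n=3))
print(f"{similarity:.2f}")  # 8 shared 3-grams out of 12 distinct ones -> 0.67
```

Longer n-grams make accidental matches rarer, so a high score points more strongly at copied text; shorter ones catch lightly edited copy at the cost of more noise.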
The bias score
Every article carries a bias score on a six-band scale calibrated from a manually rated corpus.
The score reflects the original article, not our rewrite. Bands are calibrated against AP, Reuters, and NPR coverage at the low end and openly partisan outlets at the high end. The score is a heuristic, not a verdict.
AI disclosure
Neutral rewrites are generated by a large language model under our debiasing prompt. They are derivative works. The original article remains accessible from every page so you can verify our rewrite against the source.
When the model omits or distorts a material fact, we issue a correction, attach it to the article, and publish it in our public corrections log.
We do not generate news from nothing. Every article on this site links back to the original source URL.
Detection caveats
- Genome similarity is a signal of shared text, not necessarily of coordination — press releases, wire copy, and joint statements legitimately produce high-similarity scores. Read clusters in context.
- Echo events exclude wire-service publishers by default. A cluster of non-wire outlets publishing the same story within minutes is unusual; we surface it without claiming intent.
- Sentiment–price gaps identify divergence between media tone and measurable market movement. Divergence is not proof of manipulation; markets and media routinely disagree.
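The echo caveat above can be illustrated with a short sketch of temporal clustering with a wire-service exclusion. Everything specific here is an assumption for illustration: the `Publication` fields, the `story_id` key that groups articles about the same story, the 10-minute window, the three-outlet minimum, and the contents of `WIRE_PUBLISHERS`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical exclusion list; lockstep wire coverage is expected.
WIRE_PUBLISHERS = {"AP", "Reuters"}


@dataclass
class Publication:
    outlet: str
    story_id: str  # hypothetical key grouping articles about the same story
    published_at: datetime


def find_echo_clusters(pubs, window=timedelta(minutes=10), min_outlets=3):
    """Return runs of non-wire publications of one story whose consecutive
    timestamps are at most `window` apart and that span >= min_outlets outlets."""
    by_story = {}
    for p in pubs:
        if p.outlet in WIRE_PUBLISHERS:
            continue  # wire copy is excluded before clustering
        by_story.setdefault(p.story_id, []).append(p)

    def big_enough(run):
        return len({q.outlet for q in run}) >= min_outlets

    clusters = []
    for story_pubs in by_story.values():
        story_pubs.sort(key=lambda p: p.published_at)
        current = [story_pubs[0]]
        for p in story_pubs[1:]:
            if p.published_at - current[-1].published_at <= window:
                current.append(p)  # still inside the same burst
            else:
                if big_enough(current):
                    clusters.append(current)
                current = [p]  # gap too large: start a new run
        if big_enough(current):
            clusters.append(current)
    return clusters


t0 = datetime(2024, 1, 1, 9, 0)
pubs = [
    Publication("AP", "s1", t0),
    Publication("Outlet A", "s1", t0 + timedelta(minutes=2)),
    Publication("Outlet B", "s1", t0 + timedelta(minutes=5)),
    Publication("Outlet C", "s1", t0 + timedelta(minutes=9)),
    Publication("Outlet D", "s1", t0 + timedelta(hours=3)),
]
clusters = find_echo_clusters(pubs)
print([[p.outlet for p in c] for c in clusters])  # [['Outlet A', 'Outlet B', 'Outlet C']]
```

Because each article is compared with the one before it, a long chain of closely spaced articles can stretch a cluster well past the window. A stricter variant would measure every article against the first one in its run.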
What we don't do
- We don't run staged photography. Stock photos are an emotional payload we'd otherwise have to strip.
- We don't pay sources, accept embargoed content from PR firms, or republish press releases as news.
- We don't auto-tweet or auto-headline AI generations — every published rewrite is gated by the bias-score and confidence-score thresholds in the pipeline.
- We don't claim our rewrites are perfect. See the corrections log.
More
- Editorial standards — corrections policy, conflicts of interest, AI disclosure rules
- Corrections archive — every correction we've ever issued
- About — the short version