Populism-LLM

Cross-Validated LLM Annotations of European Party Manifestos

A cross-validated populism corpus for political-economic research

Three independent large language models (Anthropic Claude Sonnet 4.6, OpenAI GPT-4.1-mini, and Google Gemini Flash) score every CMP party manifesto from 1920 to 2025 on twelve dimensions of populism and liberalism. Each score comes with a verbatim quote, a contextual snippet, and a machine-readable evidence status.

5,285 manifestos

67 countries

3 LLM families

12 scored dimensions

What you get

📂 Browse 3,327 manifestos

Open browser → Filter by country, party, and year. Click a manifesto to see every score from every model, with the verbatim evidence each model relied on.

🔀 Cross-model agreement

Open agreement view → Where do the three LLMs agree? Where do they diverge? Per-dimension Pearson and Spearman correlations, plus a full list of outliers flagged for disagreement.

📈 Interactive regressions

Open regression tool → Filter by country and year, choose which models to include, pick within-manifesto and between-model aggregation weights, then run the panel regression in your browser. Powered by WebR.

⬇️ Download & replicate

Get the data and code → Bulk parquet on Hugging Face, full R replication pipeline on GitHub.
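
For readers who want to reproduce the per-dimension agreement statistics on the downloaded data, the following is a minimal sketch of Pearson and Spearman correlation between two models' scores. It uses only the Python standard library. The score vectors and the helper names (`pearson`, `average_ranks`, `spearman`) are illustrative assumptions, not part of the published pipeline, which is written in R.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores from two models on one dimension, five manifestos.
model_a = [1.0, 2.0, 3.0, 4.0, 5.0]
model_b = [1.0, 2.0, 3.0, 4.0, 10.0]

print(round(pearson(model_a, model_b), 3))   # 0.894
print(round(spearman(model_a, model_b), 3))  # 1.0
```

In this toy example the two models rank the manifestos identically, so Spearman is 1.0, while one extreme score pulls Pearson below 1. Reporting both helps separate disagreement about ordering from disagreement about scale.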

Methodology in one paragraph

Each manifesto is split into ≤20,000-token chunks. Each chunk is scored by all three LLMs using an identical strict JSON schema. Models must back every non-null score with a 3–8-word verbatim quote and a surrounding context snippet. Evidence spans are revalidated post-hoc against the input text with a 6-tier fuzzy matcher plus an LLM-as-judge backstop. Final scores are confidence-weighted means across chunks, computed independently per model; the cross-model aggregate is a configurable choice (default: simple mean of the three families). All choices can be inspected and changed in the interactive regression module and visualised as a specification curve.
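
The two aggregation steps in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it assumes each chunk yields a `(score, confidence)` pair, that chunks with a null score are skipped, and that the default cross-model aggregate is the unweighted mean of the per-model scores. The function names and the example numbers are hypothetical.

```python
def manifesto_score(chunks):
    """Confidence-weighted mean of chunk scores for one model.

    `chunks` is a list of (score, confidence) pairs. Chunks with a null
    score or zero confidence are ignored (an assumption of this sketch).
    Returns None when no chunk carries a usable score.
    """
    scored = [(s, c) for s, c in chunks if s is not None and c > 0]
    total_weight = sum(c for _, c in scored)
    if total_weight == 0:
        return None
    return sum(s * c for s, c in scored) / total_weight


def cross_model_mean(per_model):
    """Simple mean over the models that produced a score."""
    available = [v for v in per_model.values() if v is not None]
    return sum(available) / len(available) if available else None


# Hypothetical chunk-level output for one manifesto and one dimension.
chunk_scores = {
    "claude": [(4.0, 0.75), (2.0, 0.25), (None, 0.0)],
    "gpt": [(3.0, 0.5), (3.0, 0.5)],
    "gemini": [(5.0, 1.0), (2.0, 1.0)],
}

# Step 1: aggregate chunks independently per model.
per_model = {name: manifesto_score(c) for name, c in chunk_scores.items()}
# Step 2: combine the per-model scores into one cross-model value.
aggregate = cross_model_mean(per_model)

print(per_model)
print(round(aggregate, 3))
```

Keeping the per-model scores separate until the last step is what lets the cross-model weighting remain a configurable choice.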

→ Full pipeline: methodology.
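
To give a feel for the evidence check described in the methodology paragraph, here is a simplified two-tier sketch: an exact substring match, then a match after lowercasing and whitespace normalisation. The actual matcher has six tiers and an LLM-as-judge backstop, none of which are reproduced here. The function names and status labels are hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quote(quote, chunk_text):
    """Return a status label for one quoted evidence span.

    Status labels are illustrative, not the dataset's evidence codes.
    """
    n_words = len(quote.split())
    if not 3 <= n_words <= 8:
        return "invalid_length"   # quotes must be 3 to 8 words long
    if quote in chunk_text:
        return "exact"            # tier 1: verbatim substring
    if normalize(quote) in normalize(chunk_text):
        return "normalized"       # tier 2: case and whitespace tolerant
    return "not_found"            # would fall through to later tiers


# Hypothetical chunk text with a line break inside the sentence.
chunk = "The corrupt elite has betrayed\nthe ordinary people of this country."

print(check_quote("corrupt elite has betrayed", chunk))    # exact
print(check_quote("betrayed the ordinary people", chunk))  # normalized
print(check_quote("elite betrayed", chunk))                # invalid_length
print(check_quote("the people are sovereign", chunk))      # not_found
```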

Authors

Martin Rode

Department of Economics, Universidad de Navarra · co-author

Sebastian Stöckl

Liechtenstein Business School, University of Liechtenstein · co-author, site maintainer & creator

Paper

The methodology, headline regressions, robustness checks, and the specification-curve appendix are described in the working paper:

Rode, M. & Stöckl, S. (2026). The Milei question, or: How liberal are right-wing populist parties? A measurement approach using large language models. Working paper. [PDF link forthcoming]

Cite the dataset

@misc{rode_stoeckl_2026_populismllm,
  author    = {Rode, Martin and St{\"o}ckl, Sebastian},
  title     = {Populism-LLM v1: A Cross-Validated LLM Annotation of
               European Party Manifestos},
  year      = {2026},
  publisher = {Hugging Face Datasets},
  url       = {https://huggingface.co/datasets/sstoeckl/populism-llm},
  note      = {Dataset DOI: tba}
}

@unpublished{rode_stoeckl_2026_milei,
  author = {Rode, Martin and St{\"o}ckl, Sebastian},
  title  = {The Milei question, or: How liberal are right-wing populist
            parties? A measurement approach using large language models},
  year   = {2026},
  note   = {Working paper}
}