Example output
# Hypothesis-to-Experiment Mapping
## Original Hypothesis
"Exposure to algorithmically curated news feeds increases political polarization."
## Popperian Restatement
The hypothesis as stated is **not falsifiable in its current form.** "Polarization" and "algorithmic curation" are both under-specified, and any observation can be rescued by redefining either. I will tighten it and then test it.
**Tightened hypothesis:** Users randomly assigned to receive a chronologically ordered news feed for four weeks will show a smaller increase in affective polarization (measured by feeling thermometer differential toward opposing partisans) than users continuing to receive algorithmically ranked feeds.
**Falsifying observation:** If the treatment group's increase in affective polarization over the study period is equal to or greater than the control group's, in a study with n sufficient to detect a 5-point change in thermometer differential at p < 0.05, the hypothesis is refuted.
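
The phrase "n sufficient to detect a 5-point thermometer differential change" can be made concrete with a rough power calculation. The sketch below uses the standard normal-approximation formula for comparing two group means. The 80% power target and the 20-point standard deviation of the change score are assumptions for illustration only, not values from this plan; replace them with estimates from pilot data or prior studies.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80, two_sided=True):
    """Approximate per-group sample size for a two-sample comparison of means.

    delta: smallest difference in mean change worth detecting (5 points here).
    sd:    assumed standard deviation of the change score (an assumption).
    """
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: 5-point effect, SD of 20 points, alpha 0.05, 80% power.
print(n_per_group(delta=5, sd=20))
```

This figure is a lower bound on completers per arm. Attrition over eight weeks and the compliance exclusions would both push the recruitment target higher.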
## Operational Definitions
- **Algorithmic curation:** Platform-default ranking (engagement-weighted) on a major social platform. Specify the platform and confirm the default ranking signal.
- **Chronological feed:** Reverse-chronological, same content sources, no engagement-based reordering.
- **Affective polarization:** Feeling thermometer differential (own-party rating minus opposing-party rating, each rated on a 0-100 scale), administered at baseline (week 0), week 2, week 4, and the week 8 follow-up.
- **Secondary measure — ideological polarization:** Issue-position extremity index across 10 pre-validated policy items.
- **Exposure compliance:** Browser extension or platform-level logging confirms >80% of news-feed sessions use the assigned condition.
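
Two of the definitions above reduce to simple arithmetic, sketched here as a minimal illustration. The function and parameter names are hypothetical. The compliance check treats ">80%" as a strict inequality and counts a participant with no logged sessions as non-compliant, which is an assumption the plan does not state.

```python
def thermometer_differential(own_party, opposing_party):
    """Own-party rating minus opposing-party rating, each on a 0-100 scale."""
    for rating in (own_party, opposing_party):
        if not 0 <= rating <= 100:
            raise ValueError("thermometer ratings must lie between 0 and 100")
    return own_party - opposing_party


def is_compliant(sessions_in_assigned_condition, total_sessions, threshold=0.80):
    """True when more than `threshold` of news-feed sessions used the assigned feed."""
    if total_sessions == 0:
        return False  # assumption: no logged sessions counts as non-compliant
    return sessions_in_assigned_condition / total_sessions > threshold


# Example: own party rated 80, opposing party rated 30.
baseline = thermometer_differential(80, 30)
week_4 = thermometer_differential(85, 25)
print(week_4 - baseline)  # change in affective polarization for one participant
```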
## Experimental Design
- **Independent variable:** Feed type (algorithmic vs. chronological) — between subjects
- **Dependent variables:** Affective polarization change (primary); ideological polarization change and self-reported news diversity (secondary)
- **Controls:** Pre-registered covariates — baseline polarization, prior social media use (hours/week), political interest, age, education, partisanship strength
- **Randomization:** Stratified by partisanship strength and age bracket
- **Duration:** 4 weeks of treatment, with measurements at weeks 0, 2, 4, and 8 (follow-up)
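
One way to implement the stratified randomization is to shuffle participants within each stratum and alternate arms. The sketch below assumes each participant record carries hypothetical `id`, `partisanship_strength`, and `age_bracket` fields. The fixed seed is there so the assignment can be reproduced and audited.

```python
import random
from collections import defaultdict

ARMS = ("chronological", "algorithmic")  # treatment, control


def stratified_assign(participants, seed=0):
    """Assign participants to arms, balanced within each stratum.

    participants: list of dicts with 'id', 'partisanship_strength',
    and 'age_bracket' keys (hypothetical field names).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in participants:
        key = (person["partisanship_strength"], person["age_bracket"])
        strata[key].append(person["id"])

    assignment = {}
    for key in sorted(strata):  # sorted so results do not depend on input order of strata
        ids = strata[key]
        rng.shuffle(ids)
        # Random starting arm, so the odd participant in an odd-sized
        # stratum is not always placed in the same condition.
        offset = rng.randrange(2)
        for position, participant_id in enumerate(ids):
            assignment[participant_id] = ARMS[(position + offset) % 2]
    return assignment
```

Within any stratum the two arms differ in size by at most one participant.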
## Pre-Registered Predictions
- **Primary support:** Treatment group mean change in thermometer differential is at least 5 points lower than control, 95% CI excludes 0.
- **Partial support:** Direction is correct but effect < 5 points or CI crosses 0 narrowly.
- **Null:** 95% CI for the treatment-control difference is approximately centered on 0.
- **Falsification:** Treatment group mean change is equal to or greater than control.
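
The primary-support criterion can be checked mechanically once each participant's change score is available. The sketch below is a simplified, unadjusted version: it computes the difference in mean change (treatment minus control) with a normal-approximation 95% CI and an unequal-variance standard error. The pre-registered analysis would presumably adjust for the listed covariates, so treat this as an illustration of the criterion rather than the analysis itself.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def difference_in_change(treatment_changes, control_changes, confidence=0.95):
    """Return (difference, ci_low, ci_high) for treatment minus control.

    Inputs are per-participant changes in thermometer differential.
    A negative difference means the treatment group polarized less.
    """
    diff = mean(treatment_changes) - mean(control_changes)
    se = sqrt(
        variance(treatment_changes) / len(treatment_changes)
        + variance(control_changes) / len(control_changes)
    )
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


def meets_primary_support(diff, ci_high, min_effect=5.0):
    """Treatment change at least `min_effect` points lower, and CI excludes 0."""
    return diff <= -min_effect and ci_high < 0


# Toy data for illustration only, not study results.
treatment = [0, 1, 2, 1, 0, 2, 1, 1]
control = [8, 9, 10, 9, 8, 10, 9, 9]
diff, low, high = difference_in_change(treatment, control)
print(meets_primary_support(diff, high))
```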
## Too-Easy-to-Confirm Audit
Earlier formulations ("algorithms contribute to polarization") could be confirmed by almost any non-null finding. The tightened version requires a specific direction and a pre-committed magnitude (5 points). This is adequately risky: ~30% of prior studies in this area report null or reversed effects, so confirmation is not guaranteed.
## Alternative Explanations
1. **Self-selection out of the feed:** Treatment users may simply use the platform less, so observed effects are a use-reduction effect, not a ranking effect. *Ruled out by* measuring and controlling for session time; excluding users with >50% drop in usage.
2. **Novelty effect:** Chronological feed feels novel and may temporarily shift behavior without reflecting a stable effect. *Ruled out by* the week 8 follow-up measurement, taken four weeks after treatment ends.
3. **Demand characteristics:** Participants know they are in a study about polarization. *Mitigated by* framing the study as "user experience research" and using validated but non-obvious thermometer measures embedded in a longer battery.
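
The usage-drop exclusion in item 1 can be expressed as a small filter. The sketch below assumes usage is logged as minutes per week and that "drop" is measured against a pre-treatment baseline window. Both are assumptions, since the plan does not specify the unit or the reference period.

```python
def fractional_usage_drop(baseline_minutes, treatment_minutes):
    """Fraction by which usage fell relative to baseline (negative if it rose)."""
    if baseline_minutes <= 0:
        raise ValueError("baseline usage must be positive")
    return (baseline_minutes - treatment_minutes) / baseline_minutes


def flag_for_exclusion(baseline_minutes, treatment_minutes, max_drop=0.50):
    """True when usage fell by more than `max_drop` (the >50% rule)."""
    return fractional_usage_drop(baseline_minutes, treatment_minutes) > max_drop


# Hypothetical participant: 100 minutes/week at baseline, 40 during treatment.
print(flag_for_exclusion(100, 40))
```

Because this exclusion is applied after randomization, it is worth reporting results both with and without the flagged users.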
## Decision Rules
- **Ship / publish as supporting evidence:** Primary prediction met AND at least one secondary measure directionally consistent AND no alternative explanation survives robustness checks.
- **Report as null:** CI for primary effect centered on zero; publish with clear null framing.
- **Report as refutation:** Treatment group polarization increases more than control; this is the scientifically valuable outcome if it occurs — do not hide it.
## Honest Note
Even tightened, this hypothesis faces a generalizability ceiling: effects on one platform in one election cycle do not license claims about "algorithms and polarization" as a category. Frame findings accordingly.