4 days ago

Hypothesis-to-experiment mapper with falsifiability stress test

Views: 3.1K | Copies: 1.1K | Likes: 493 | Copy rate: 35.0%

Prompt

You are a philosopher of science and practicing researcher. I have a working hypothesis in the domain of [research_domain]: "[hypothesis]". Help me turn it into a testable, falsifiable experiment.

Deliver:
1. Restated hypothesis in Popperian form — what observation would disconfirm it?
2. Operational definitions of every key construct
3. Primary experimental design with independent, dependent, and control variables
4. Pre-registered predictions, including the pattern of results that would falsify the hypothesis
5. A "too-easy-to-confirm" audit — is the prediction vague enough that almost any result would count as support?
6. Alternative explanations (at least three) and how the design rules each one out
7. Decision rules for what counts as support, null, or refutation

Be skeptical of my own hypothesis. If the construct is unfalsifiable as stated, tell me directly.
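The two bracketed variables can also be filled in programmatically. The following is a minimal sketch, assuming the placeholders always take the form `[name]` in lowercase with underscores; `PROMPT_OPENING` is an abbreviated copy of the first paragraph of the prompt, and `fill_prompt` is a hypothetical helper, not part of any tool.

```python
import re

# Abbreviated copy of the prompt's opening paragraph (assumption: the full
# prompt would be pasted here in practice).
PROMPT_OPENING = (
    "You are a philosopher of science and practicing researcher. "
    "I have a working hypothesis in the domain of [research_domain]: "
    '"[hypothesis]". Help me turn it into a testable, falsifiable experiment.'
)


def fill_prompt(template: str, **values: str) -> str:
    """Replace every [placeholder] in one pass; fail loudly if one is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        value = values.get(name, "").strip()
        if not value:
            raise ValueError(f"no value supplied for placeholder [{name}]")
        return value

    # A single regex pass means text inserted for one placeholder is never
    # re-scanned for further placeholders.
    return re.sub(r"\[([a-z_]+)\]", substitute, template)


filled = fill_prompt(
    PROMPT_OPENING,
    research_domain="political communication",
    hypothesis="Exposure to algorithmically curated news feeds increases political polarization.",
)
print(filled)
```

Raising an error on a missing value is a design choice for this sketch: a prompt sent with a literal `[hypothesis]` still in it would produce a generic answer rather than a critique of a specific claim.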


Example output

# Hypothesis-to-Experiment Mapping

## Original Hypothesis

"Exposure to algorithmically curated news feeds increases political polarization."

## Popperian Restatement

The hypothesis as stated is **not falsifiable in its current form.** "Polarization" and "algorithmic curation" are both under-specified, and any observation can be rescued by redefining either. I will tighten it and then test it.

**Tightened hypothesis:** Users randomly assigned to receive a chronologically ordered news feed for four weeks will show a smaller increase in affective polarization (measured by feeling thermometer differential toward opposing partisans) than users continuing to receive algorithmically ranked feeds.

**Falsifying observation:** If the treatment group shows equal or greater increase in affective polarization than control over the study period, at p < 0.05 with n sufficient to detect a 5-point thermometer differential change, the hypothesis is refuted.

## Operational Definitions

- **Algorithmic curation:** Platform-default ranking (engagement-weighted) on a major social platform. Specify the platform and confirm the default ranking signal.
- **Chronological feed:** Reverse-chronological, same content sources, no engagement-based reordering.
- **Affective polarization:** Feeling thermometer differential (own party rating minus opposing party rating, 0-100 scale), administered at baseline, week 2, and week 4.
- **Secondary measure, ideological polarization:** Issue-position extremity index across 10 pre-validated policy items.
- **Exposure compliance:** Browser extension or platform-level logging confirms >80% of news-feed sessions use the assigned condition.

## Experimental Design

- **Independent variable:** Feed type (algorithmic vs. chronological), between subjects
- **Dependent variables:** Affective polarization change (primary), ideological polarization change, self-reported news diversity (secondary)
- **Controls:** Pre-registered covariates: baseline polarization, prior social media use (hours/week), political interest, age, education, partisanship strength
- **Randomization:** Stratified by partisanship strength and age bracket
- **Duration:** 4 weeks treatment, with measurements at week 0, 2, 4, and 8 (follow-up)

## Pre-Registered Predictions

- **Primary support:** Treatment group mean change in thermometer differential is at least 5 points lower than control, 95% CI excludes 0.
- **Partial support:** Direction is correct but effect < 5 points or CI crosses 0 narrowly.
- **Null:** CI centered on 0.
- **Falsification:** Treatment group mean change is equal to or greater than control.

## Too-Easy-to-Confirm Audit

Earlier formulations ("algorithms contribute to polarization") could be confirmed by almost any non-null finding. The tightened version requires a specific direction and a pre-committed magnitude (5 points). This is adequately risky: ~30% of prior studies in this area report null or reversed effects, so confirmation is not guaranteed.

## Alternative Explanations

1. **Self-selection out of the feed:** Treatment users may simply use the platform less, so observed effects are a use-reduction effect, not a ranking effect. *Ruled out by* measuring and controlling for session time; excluding users with >50% drop in usage.
2. **Novelty effect:** Chronological feed feels novel and may temporarily shift behavior without reflecting a stable effect. *Ruled out by* the 8-week follow-up measurement after treatment ends.
3. **Demand characteristics:** Participants know they are in a study about polarization. *Mitigated by* framing the study as "user experience research" and using validated but non-obvious thermometer measures embedded in a longer battery.

## Decision Rules

- **Ship / publish as supporting evidence:** Primary prediction met AND at least one secondary measure directionally consistent AND no alternative explanation survives robustness checks.
- **Report as null:** CI for primary effect centered on zero; publish with clear null framing.
- **Report as refutation:** Treatment group polarization increases more than control; this is the scientifically valuable outcome if it occurs, so do not hide it.

## Honest Note

Even tightened, this hypothesis faces a generalizability ceiling: effects on one platform in one election cycle do not license claims about "algorithms and polarization" as a category. Frame findings accordingly.
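
The pre-registered predictions and decision rules in the example output can be written down as code before any data are collected, which makes the commitment harder to renegotiate afterwards. The sketch below is one possible reading of those rules, not the example's own analysis plan. Its assumptions: the effect is defined as control mean change minus treatment mean change (positive means the chronological feed polarized less); the 95% interval uses a normal approximation instead of a t distribution; any positive estimate that misses the primary criterion counts as partial support, since "narrowly" is not quantified; refutation requires the whole interval to lie below zero; and the sample numbers are invented purely to show the call. All function and variable names are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean, variance


def thermometer_differential(own_party: float, opposing_party: float) -> float:
    """Affective polarization: own-party rating minus opposing-party rating (0-100 scale)."""
    return own_party - opposing_party


@dataclass
class Effect:
    estimate: float  # control mean change minus treatment mean change
    ci_low: float
    ci_high: float


def estimate_effect(control_changes, treatment_changes, level=0.95) -> Effect:
    """Difference in mean change with a normal-approximation confidence interval."""
    diff = mean(control_changes) - mean(treatment_changes)
    # Standard error of a difference between two independent sample means.
    se = sqrt(
        variance(control_changes) / len(control_changes)
        + variance(treatment_changes) / len(treatment_changes)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return Effect(diff, diff - z * se, diff + z * se)


def classify(effect: Effect, min_effect: float = 5.0) -> str:
    """Map an effect onto the pre-registered outcome labels (assumed thresholds)."""
    if effect.ci_high < 0:
        # Treatment polarized more than control, and the interval excludes 0.
        return "refutation"
    if effect.estimate >= min_effect and effect.ci_low > 0:
        return "primary support"
    if effect.estimate > 0:
        # Right direction, but too small or too uncertain for primary support.
        return "partial support"
    return "null"


# Invented week-0 to week-4 changes in thermometer differential, for illustration only.
control_changes = [10, 12, 8, 11, 9, 10, 12, 8]
treatment_changes = [2, 4, 3, 5, 1, 3, 4, 2]

effect = estimate_effect(control_changes, treatment_changes)
print(classify(effect), round(effect.estimate, 2))
```

Checking the refutation branch first keeps the labels mutually exclusive, which the prose rules do not guarantee on their own: a result with an interval centered on zero would otherwise satisfy both "null" and "equal to or greater than control".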
