Statistical test chooser with visual decision tree
Prompt
You are a biostatistics professor who makes complex statistical concepts accessible. I need to choose the right statistical test for my research.
My study details:
- Research question: [research_question]
- Independent variable(s): [independent_variables]
- Dependent variable(s): [dependent_variables]
- Sample size: [sample_size]
Walk me through a decision tree to select the correct test. At each node, explain in plain language why we're asking that question. Then provide:
1. The recommended primary test
2. An alternative if assumptions are violated
3. Effect size measure to report
4. How to report the result in APA format
5. Common mistakes to avoid with this test
Example output
# Statistical Test Selection: Medication Adherence Study
**Research Question:** Does a mobile app reminder system improve medication adherence compared to standard care in patients with Type 2 diabetes?
**IV:** Group (app reminder vs. standard care) — categorical, 2 levels, between-subjects
**DV:** Medication adherence score (0-100%, continuous) measured at baseline, 3 months, and 6 months
**Sample:** n=120 (60 per group)
---
## Decision Tree Walk-Through
### Node 1: What type of question are you asking?
- Comparing groups → **Yes** (app vs. standard care)
- Predicting an outcome → No
- Measuring association → No
*We go down the "comparing groups" branch.*
### Node 2: How many independent variables?
- One IV (group assignment) → **Yes**
- Multiple IVs → No
*But wait — you also have repeated measurements over time. This means you effectively have two factors: Group (between-subjects) and Time (within-subjects). This is a mixed design.*
### Node 3: Is the design between-subjects, within-subjects, or mixed?
- Between-subjects only → Would be if you measured adherence once
- Within-subjects only → Would be if everyone got both treatments
- **Mixed (split-plot)** → ✅ **Your design** — Group is between, Time is within
### Node 4: What is the measurement level of the DV?
- Continuous (interval/ratio) → ✅ Adherence percentage is continuous
- Ordinal → Would need non-parametric alternatives
- Categorical → Would need chi-square family
### Node 5: Are the assumptions of parametric tests likely met?
For your data, we need to check:
- **Normality** of residuals (Shapiro-Wilk test, Q-Q plots)
- **Sphericity** (Mauchly's test — for repeated measures)
- **Homogeneity of variance** (Levene's test at each time point)
- **No extreme outliers** (check boxplots)
With n=60 per group, the Central Limit Theorem gives you reasonable robustness to normality violations. Adherence scores (0-100%) may be ceiling/floor bounded, so check distributions carefully.
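The checks above can be sketched in base R. This is a minimal illustration on simulated data (the variable names `adherence_data`, `group`, and `adherence_score` mirror the study but the values are made up); Levene's test itself needs the `car` package, so Bartlett's test stands in here as the base-R variance check.

```r
# Illustrative assumption checks on simulated data (hypothetical values).
set.seed(42)
adherence_data <- data.frame(
  group = rep(c("app", "standard"), each = 60),
  adherence_score = c(rnorm(60, 75, 10), rnorm(60, 65, 10))
)

# Normality within each group (Shapiro-Wilk)
by(adherence_data$adherence_score, adherence_data$group, shapiro.test)

# Q-Q plot of the pooled within-group residuals
resid_pooled <- with(adherence_data,
                     adherence_score - ave(adherence_score, group))
qqnorm(resid_pooled); qqline(resid_pooled)

# Homogeneity of variance (Bartlett's test; use car::leveneTest if available)
bartlett.test(adherence_score ~ group, data = adherence_data)

# Outlier screen
boxplot(adherence_score ~ group, data = adherence_data)
```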
---
## Recommended Primary Test
### Mixed-Design ANOVA (Split-Plot ANOVA)
Also called: Two-Way Mixed ANOVA, Repeated Measures ANOVA with Between-Subjects Factor
**What it tests:**
1. **Main effect of Group:** Is there an overall difference between app and standard care (averaging across time points)?
2. **Main effect of Time:** Does adherence change over time (averaging across groups)?
3. **Group × Time interaction:** ✅ **This is your key test** — Does the pattern of change over time differ between groups? (i.e., does the app group improve more than standard care?)
**Software implementation:**
```r
# R (using afex package for correct Type III SS)
library(afex)
model <- aov_ez(
  id = "participant_id",
  dv = "adherence_score",
  data = adherence_data,
  between = "group",
  within = "time_point"
)
summary(model)
# Follow-up: Simple effects of Group at each time point
library(emmeans)
emm <- emmeans(model, ~ group | time_point)
pairs(emm, adjust = "bonferroni")
```
---
## Alternative If Assumptions Are Violated
### If sphericity is violated (common):
- **Greenhouse-Geisser correction** (commonly recommended when ε < 0.75) or **Huynh-Feldt correction** (when ε ≥ 0.75); Greenhouse-Geisser is the more conservative of the two and is always a safe choice
- Both corrections shrink the degrees of freedom, making the F-test more conservative
- Most software reports these corrections automatically
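The correction itself is simple arithmetic: both degrees of freedom are multiplied by ε. For this design (interaction df = (2 − 1)(3 − 1) = 2, error df = (3 − 1)(120 − 2) = 236), an illustrative ε of .78 gives:

```r
# Greenhouse-Geisser correction multiplies both df terms by epsilon.
epsilon   <- 0.78   # illustrative Greenhouse-Geisser estimate
df_effect <- 2      # (groups - 1) * (time points - 1)
df_error  <- 236    # (time points - 1) * (N - groups)

c(df_effect * epsilon, df_error * epsilon)
# corrected df: 1.56 and 184.08
```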
### If normality is seriously violated:
- **Linear Mixed-Effects Model (LMM)** — more robust, handles missing data better, does not require sphericity
```r
library(lmerTest)
model_lmm <- lmer(
  adherence_score ~ group * time_point + (1 | participant_id),
  data = adherence_data
)
anova(model_lmm)
```
- LMM is increasingly preferred over traditional mixed ANOVA even when assumptions are met, because it handles unbalanced designs and missing data gracefully.
### If data is strongly non-normal or ordinal:
- **Aligned Rank Transform ANOVA** (ARTool package in R) — non-parametric alternative that handles interactions
---
## Effect Size Measures
| Effect | Measure | Interpretation |
|--------|---------|---------------|
| Group × Time interaction | **Partial η²** (partial eta-squared) | Small: 0.01, Medium: 0.06, Large: 0.14 |
| Pairwise group differences at each time point | **Cohen's d** | Small: 0.2, Medium: 0.5, Large: 0.8 |
| Overall model | **Generalized η²** (recommended for mixed designs by Bakeman, 2005) | More comparable across studies than partial η² |
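Partial η² can also be recovered directly from a reported F statistic and its degrees of freedom, which is handy when checking published results. A one-line base-R sketch, applied to the interaction reported in the APA template:

```r
# Partial eta-squared from an F statistic and its degrees of freedom:
#   eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
partial_eta_sq <- function(f, df1, df2) (f * df1) / (f * df1 + df2)

# Interaction result from the APA template: F(1.56, 184.08) = 14.32
partial_eta_sq(14.32, 1.56, 184.08)
# ~0.108, consistent with the reported partial eta-squared of .11
```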
---
## APA-Style Reporting Template
> A two-way mixed ANOVA was conducted to examine the effect of intervention type (mobile app reminder vs. standard care) and time (baseline, 3 months, 6 months) on medication adherence scores. Mauchly's test indicated that the assumption of sphericity was violated for the main effect of time, χ²(2) = 12.45, p = .002; therefore, Greenhouse-Geisser corrected results are reported (ε = .78).
> There was a statistically significant interaction between group and time on adherence scores, F(1.56, 184.08) = 14.32, p < .001, partial η² = .11. Simple effects analysis with Bonferroni correction revealed no significant difference between groups at baseline (p = .82, d = 0.04), a significant difference at 3 months (p = .003, d = 0.58), and a larger significant difference at 6 months (p < .001, d = 0.89). The app reminder group showed a 23-percentage-point increase in adherence from baseline to 6 months (M = 62.3 to M = 85.1), while the standard care group showed a 6-percentage-point increase (M = 61.8 to M = 67.9).
---
## Common Mistakes to Avoid
1. **Ignoring the interaction and only reporting main effects.** The interaction is almost always the most important result in a mixed design. A significant main effect of Group is misleading if the groups only differ at some time points.
2. **Running separate t-tests at each time point instead of the omnibus ANOVA.** This inflates Type I error and ignores the repeated-measures structure. Always run the mixed ANOVA first, then follow up with simple effects if the interaction is significant.
3. **Not checking or correcting for sphericity.** Violated sphericity inflates the F-statistic, leading to false positives. Always report Mauchly's test and apply corrections.
4. **Forgetting to report effect sizes.** p-values tell you whether an effect exists; effect sizes tell you whether it matters. With n=120, even trivial effects can be "significant."
5. **Listwise deletion of missing data.** If participants miss a time point, mixed ANOVA drops them entirely. Switch to LMM if you have >5% missing data — it uses all available observations.
6. **Not specifying Type III sums of squares.** R's default `aov()` uses Type I (sequential) SS, which is order-dependent in unbalanced designs. Use `afex::aov_ez()`, which applies Type III by default, or `car::Anova(model, type = 3)` with appropriate contrasts.
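The order dependence of Type I SS is easy to demonstrate in base R with unbalanced (hypothetical) data: fitting the same two-factor model with the terms entered in opposite orders yields different sums of squares for the same factor.

```r
# Type I (sequential) SS depend on term order when the design is unbalanced.
set.seed(1)
d <- data.frame(
  a = factor(rep(c("x", "y"), times = c(30, 50))),   # unbalanced factor
  b = factor(sample(c("p", "q"), 80, replace = TRUE))
)
d$y <- rnorm(80) + (d$a == "y") * 0.8 + (d$b == "q") * 0.5

ss_ab <- anova(lm(y ~ a + b, data = d))["a", "Sum Sq"]  # a entered first
ss_ba <- anova(lm(y ~ b + a, data = d))["a", "Sum Sq"]  # a entered second
c(ss_ab, ss_ba)  # the two sums of squares for `a` differ
```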