How to read experiment results without fooling yourself. Confidence intervals, p-values, multiple testing, sequential testing, CUPED, heterogeneous treatment effects, ratio metrics, network effects, dashboard reconciliation, and the interpretation failures that produce confidently wrong shipping decisions.
A data-team-mentor's playbook for interpreting experiment results without fooling yourself.
The result panel is the moment of truth for an experiment. The numbers on it determine whether you ship, kill, or iterate. They also expose every shortcut taken in the design phase: an underpowered test produces wide confidence intervals; a peeked test produces a p-value that overstates the evidence; a ratio metric without delta-method correction produces overconfident lift estimates. Most ship-the-wrong-thing decisions trace back to misreading the result panel.
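To make the ratio-metric point concrete, here is a minimal sketch comparing a naive standard error with a delta-method one for a click-through rate. It assumes randomization happens per user while the metric is clicks per page view; the function names and the per-user data are hypothetical and chosen only for illustration, not taken from any particular platform.

```python
import math
from statistics import mean


def naive_se(clicks, views):
    """Binomial SE that treats every page view as an independent trial."""
    p = sum(clicks) / sum(views)
    return math.sqrt(p * (1 - p) / sum(views))


def delta_method_se(clicks, views):
    """Delta-method SE for sum(clicks) / sum(views), with users as the
    independent unit (first-order Taylor approximation of the ratio)."""
    n = len(clicks)
    mx, my = mean(clicks), mean(views)
    r = mx / my
    # Sample variances and covariance of the per-user numerator and denominator.
    var_x = sum((x - mx) ** 2 for x in clicks) / (n - 1)
    var_y = sum((y - my) ** 2 for y in views) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(clicks, views)) / (n - 1)
    var_r = (var_x - 2 * r * cov_xy + r ** 2 * var_y) / (n * my ** 2)
    return math.sqrt(var_r)


# Hypothetical per-user data: two heavy users account for most of the clicks.
clicks = [0, 0, 1, 0, 9, 0, 1, 0, 8, 1]
views = [5, 4, 6, 5, 10, 5, 6, 4, 10, 5]

se_naive = naive_se(clicks, views)
se_delta = delta_method_se(clicks, views)
print(f"naive SE: {se_naive:.4f}   delta-method SE: {se_delta:.4f}")
```

On this toy data the delta-method standard error is wider than the naive one, because page views from the same user are correlated and the naive formula counts them as independent evidence. A panel that reports the naive figure would show a confidence interval that is too tight.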
This skill is the discipline that prevents misreading. It assumes the experiment was designed well (see the experiment-design skill). It assumes the platform's results panel is technically correct (most modern platforms are; some older ones are not). It assumes you can read a number off a screen. The hard part is knowing what each number actually means and what it does not, and that is what this skill covers.
When to use this skill: any time you are reading an experiment result panel and about to make a ship, kill, or iterate decision.
npx skills add rampstackco/claude-skills --skill experimentation-analytics