How to Run an A/B Test?

1 Pre-Experiment & Preparation

1.1 Define Clear Objective & Metrics

Move beyond a vague claim such as "it affects the final results." State exactly which part of the algorithm you are changing (e.g., scoring weights, match distance, ETA prediction model, dispatching logic) and which metric you expect to move as a result.

1.2 Unit of Diversion & Randomization Unit

1.3 Hypothesis Formulation

  • Null Hypothesis (H0): The new matching algorithm does not change the mean of our primary metric (e.g., Total Completed Trips per day per city) compared to the old algorithm.

  • Alternative Hypothesis (H1): The new matching algorithm does change the mean of our primary metric. This can be two-tailed ("is different") or one-tailed ("increases", if you have a strong directional belief).

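  • Sample size: Use power analysis to determine how many units each group needs, given the target power (1 - β, typically 80%), the significance level (α, typically 5%), and the minimum effect you want to detect.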

  • Duration: Run long enough to capture full weekly cycles (whole weeks rather than partial ones), so that weekday and weekend behavior are both represented.
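
The power analysis above can be sketched as a sample-size calculation. This is a minimal example that assumes a two-sided, two-sample z-test on means with equal group sizes and a known, shared standard deviation. The function name, its parameters, and the numbers in the usage line are illustrative, not taken from a real experiment.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sigma, mde, alpha=0.05, power=0.80):
    """Units needed in EACH group for a two-sided, two-sample z-test on means.

    sigma: assumed standard deviation of the metric (same in both groups)
    mde:   minimum detectable effect, in the metric's own units
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / mde ** 2
    return math.ceil(n)


# Hypothetical inputs: metric sd of 120 trips/day, smallest change worth detecting is 10.
n = sample_size_per_group(sigma=120.0, mde=10.0)
print(f"units per group: {n}")
```

Halving the minimum detectable effect roughly quadruples the required sample size, which is usually what drives the experiment duration.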

 

2 Experiment Execution & Monitoring

  • Start with a small smoke test (e.g., 1% traffic) to check for critical bugs/crashes.

  • Ramp up gradually (5% → 10% → 50%) while monitoring core system health metrics (latency, error rates).

  • Use holdbacks if possible: keep a small portion of users (e.g., 1%) permanently in the control group to measure long-term effects and novelty biases.

  • Real-Time Monitoring
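
The ramp-up and holdback steps above are often implemented with deterministic hashing of the randomization unit. The sketch below is one possible way to do it, not a prescribed implementation: `assign_variant`, `ramp_pct`, `holdback_pct`, and the IDs are hypothetical names chosen for illustration.

```python
import hashlib


def assign_variant(unit_id, experiment, ramp_pct, holdback_pct=1.0):
    """Deterministically map a randomization unit to a group.

    Returns 'holdback', 'treatment', or 'control'.
    ramp_pct and holdback_pct are percentages of all units (0-100).
    """
    key = f"{experiment}:{unit_id}".encode("utf-8")
    # Stable bucket in [0, 10000): the same unit always lands in the same bucket.
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10_000
    pct = bucket / 100.0  # position in [0, 100)

    if pct < holdback_pct:
        return "holdback"    # permanently kept on the old algorithm
    if pct < holdback_pct + ramp_pct:
        return "treatment"   # exposed to the new algorithm
    return "control"


# The same unit is re-evaluated as the ramp widens from 1% to 50%.
for ramp in (1, 5, 10, 50):
    print(ramp, assign_variant("rider_12345", "matching_v2", ramp_pct=ramp))
```

Because the bucket depends only on the experiment name and the unit ID, widening the ramp only moves units from control into treatment. Units already in treatment stay there, and the holdback group never changes.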

 

3 Analysis & Hypothesis Testing

This is the phase where variance must be evaluated. If a metric such as CTR increases but its variance is high, the observed lift may not be statistically significant, so the experiment cannot be called conclusive.
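
To make the role of variance concrete, here is a minimal sketch using only the Python standard library. It assumes samples large enough for a normal approximation and uses a two-sided z-test with unpooled variances. The data is synthetic and deterministic, and `two_sample_z_test` and `synthetic` are illustrative names.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z_test(control, treatment):
    """Two-sided large-sample z-test for a difference in means (unpooled variances)."""
    diff = mean(treatment) - mean(control)
    se = math.sqrt(variance(control) / len(control)
                   + variance(treatment) / len(treatment))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, p_value


def synthetic(center, spread, n=500):
    """Deterministic fake data: values cycle around `center`, scaled by `spread`."""
    return [center + ((i % 5) - 2) * spread for i in range(n)]


# Same lift (+0.5) in both cases; only the spread of the metric differs.
diff_low, se_low, p_low = two_sample_z_test(synthetic(10.0, 1.0), synthetic(10.5, 1.0))
diff_high, se_high, p_high = two_sample_z_test(synthetic(10.0, 10.0), synthetic(10.5, 10.0))

print(f"low variance : diff={diff_low:.2f} se={se_low:.3f} p={p_low:.4f}")
print(f"high variance: diff={diff_high:.2f} se={se_high:.3f} p={p_high:.4f}")
```

The lift is identical in both runs, but only the low-variance case clears the 5% significance level. With high variance, the same lift is indistinguishable from noise at this sample size.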

  • Delta Method 
  • Bootstrap (small samples)
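
Both techniques can be sketched for a ratio metric such as CTR, where the randomization unit (user) differs from the unit of analysis (view). The example assumes per-user click and view totals and uses synthetic data. The function names and the data generator are illustrative assumptions.

```python
import math
import random
from statistics import mean, stdev, variance


def ratio_and_delta_se(clicks, views):
    """CTR = sum(clicks) / sum(views), with a delta-method standard error.

    clicks[i] and views[i] are totals for user i (the randomization unit).
    """
    n = len(clicks)
    mc, mv = mean(clicks), mean(views)
    r = mc / mv
    cov = sum((c - mc) * (v - mv) for c, v in zip(clicks, views)) / (n - 1)
    # First-order Taylor expansion of the ratio of two sample means.
    var_r = (variance(clicks) - 2 * r * cov + r ** 2 * variance(views)) / (n * mv ** 2)
    return r, math.sqrt(var_r)


def bootstrap_se(clicks, views, n_boot=1000, seed=0):
    """Standard error of CTR from resampling users with replacement."""
    rng = random.Random(seed)
    n = len(clicks)
    ratios = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ratios.append(sum(clicks[i] for i in idx) / sum(views[i] for i in idx))
    return stdev(ratios)


# Synthetic users: 1 to 20 views each, each view clicked with probability 0.1.
rng = random.Random(42)
views = [rng.randint(1, 20) for _ in range(500)]
clicks = [sum(rng.random() < 0.1 for _ in range(v)) for v in views]

ctr, delta_se = ratio_and_delta_se(clicks, views)
boot_se = bootstrap_se(clicks, views)
print(f"CTR={ctr:.4f} delta SE={delta_se:.5f} bootstrap SE={boot_se:.5f}")
```

The two standard errors should land close to each other. The delta method is a closed-form approximation and cheap to compute, while the bootstrap trades computation for fewer distributional assumptions.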

 

posted @ 2026-01-18 18:45  ylxn