Most real-world evidence studies do not fail in the analysis. They fail in the design — quietly, weeks before anyone opens a statistics package. The treatment groups were defined at different moments in time. The comparator includes people who could never have received the treatment. Follow-up starts before the decision it is supposed to inform. By the time the forest plot exists, the question has already been lost.

Target trial emulation is the discipline that prevents this. It is not a statistical method. It is a protocol habit: before touching the data, specify the randomized trial you would run if you could — then emulate each element of that protocol, explicitly, in observational data.

The core move

The framework, developed by Miguel Hernán and James Robins, asks one organizing question: what is the target trial? If you cannot describe the hypothetical randomized trial that would answer your question, no amount of claims data will answer it either.

Writing the target trial protocol forces seven decisions that observational studies routinely leave implicit:

  1. Eligibility criteria. Who would qualify for this trial, assessed using only information available at enrollment?
  2. Treatment strategies. What exactly are the strategies being compared — initiation, dose, duration, discontinuation rules? "Exposed vs. unexposed" is not a strategy.
  3. Assignment procedures. In the trial, assignment is randomized. In the emulation, you must identify and adjust for the baseline factors that influenced both the treatment choice and the outcome in the real world.
  4. Time zero. The moment eligibility is met and a strategy is assigned — the single most consequential decision in the design.
  5. Follow-up period. When it starts, when it ends, and what censors it.
  6. Outcomes. Defined and measurable identically across strategies.
  7. Causal contrast. Intention-to-treat effect or per-protocol effect — named, not implied.

None of this requires advanced methods. It requires the willingness to write the protocol before the analysis, and to defend each element the way a trialist would.
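One way to make the seven decisions hard to skip is to write them into a structure that will not accept a blank. The sketch below is an illustration, not a standard tool: the class name TargetTrialProtocol, its field names, the list of "deferred" phrases, and the example study content are all assumptions invented here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TargetTrialProtocol:
    """Hypothetical container for the seven protocol elements."""
    eligibility: str
    treatment_strategies: str
    assignment: str
    time_zero: str
    follow_up: str
    outcomes: str
    causal_contrast: str

    def unspecified(self) -> list[str]:
        """Return the names of elements left blank or explicitly deferred."""
        deferred = ("", "tbd", "it depends", "handle in analysis")
        return [
            f.name
            for f in fields(self)
            if getattr(self, f.name).strip().lower() in deferred
        ]


# Invented example content, for illustration only.
draft = TargetTrialProtocol(
    eligibility="Adults with a new diagnosis, no prior use of either drug",
    treatment_strategies="Initiate drug A vs. initiate drug B at time zero",
    assignment="Emulated by adjusting for measured baseline confounders",
    time_zero="TBD",
    follow_up="From time zero until outcome, death, disenrollment, or 2 years",
    outcomes="First hospitalization for the condition",
    causal_contrast="",
)

print(draft.unspecified())  # ['time_zero', 'causal_contrast']
```

A check like this says nothing about whether an element is well chosen. It only shows which decisions are still implicit.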

Time zero is where studies die

Most of the catastrophic biases in observational research are time-zero failures wearing different costumes.

Immortal time bias appears when the treated group's clock starts at eligibility, but treatment is defined by something that happens later. Everyone classified as treated must, by construction, have survived the gap, and that event-free time is credited to the treatment. Prevalent-user bias appears when "current users" are compared with non-users; everyone harmed early by the drug has already left the user pool, leaving survivors who make the treatment look protective.

The target trial framing dissolves both, because a real trial cannot enroll someone yesterday based on what they will do next month. Align three things at the same moment — eligibility met, strategy assigned, follow-up started — and immortal time has nowhere to hide.
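A small simulation can make the size of the problem concrete. The sketch below assumes a treatment with no effect at all: everyone has the same constant death rate, and the time to first dose is independent of it. Every parameter value and name is invented for illustration. The "aligned" analysis is a simple person-time correction that starts the treated clock when treatment starts, not a full emulation.

```python
import random


def rate_ratios(n=20000, death_rate=0.10, start_rate=0.20, horizon=10.0, seed=1):
    """Return (naive, aligned) treated-vs-untreated death rate ratios.

    Treatment has NO effect here: everyone shares the same death rate.
    All parameter values are invented for illustration.
    """
    rng = random.Random(seed)
    # Each cell holds [deaths, person-years].
    naive = {"treated": [0, 0.0], "untreated": [0, 0.0]}
    aligned = {"treated": [0, 0.0], "untreated": [0, 0.0]}

    for _ in range(n):
        death = rng.expovariate(death_rate)   # years from eligibility to death
        start = rng.expovariate(start_rate)   # years from eligibility to first dose
        end = min(death, horizon)             # end of follow-up
        died = int(death <= horizon)
        ever_treated = start < end            # must be alive to start treatment

        # Naive: classify by eventual treatment, clock starts at eligibility.
        group = "treated" if ever_treated else "untreated"
        naive[group][0] += died
        naive[group][1] += end

        # Aligned: time before the first dose counts as untreated time.
        if ever_treated:
            aligned["untreated"][1] += start
            aligned["treated"][0] += died
            aligned["treated"][1] += end - start
        else:
            aligned["untreated"][0] += died
            aligned["untreated"][1] += end

    def ratio(cells):
        treated_rate = cells["treated"][0] / cells["treated"][1]
        untreated_rate = cells["untreated"][0] / cells["untreated"][1]
        return treated_rate / untreated_rate

    return ratio(naive), ratio(aligned)


naive_rr, aligned_rr = rate_ratios()
print(f"naive rate ratio:   {naive_rr:.2f}")    # well below 1: spurious protection
print(f"aligned rate ratio: {aligned_rr:.2f}")  # close to 1: the true null
```

Under these assumptions the naive comparison reports a strongly protective effect for a drug that does nothing. The only thing that changed between the two analyses is where the treated group's clock starts.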

A real trial cannot enroll a patient based on what happens after enrollment. Neither can your emulation.

What it buys you in the room

A target trial protocol changes the conversation with every audience that matters.

With reviewers, it converts a methods argument into a checklist: here is the trial, here is how each element was emulated, here is where the emulation falls short. Disagreement becomes specific and resolvable. With regulators, the protocol speaks their native language: FDA's real-world evidence framework and subsequent guidance emphasize the features that make a design look like a trial, including pre-specified protocols, an explicit time zero, and named estimands. With internal decision-makers, the protocol forces the question that matters commercially: what decision will this evidence support, and against what standard will it be judged?

The framework is also honest about its own limits. Emulation handles the design biases: selection, immortal time, misaligned eligibility. It does not manufacture exchangeability. If treatment assignment depended on something your data cannot see, no protocol fixes that. The protocol can only make the assumption explicit, so that sensitivity analysis and quantitative bias analysis can ask how strong the unmeasured confounding would have to be to change the conclusion. That visibility is the point. Evidence that names its assumptions survives cross-examination. Evidence that hides them does not.
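As one concrete instance of quantitative bias analysis, the sketch below computes an E-value, a measure proposed by VanderWeele and Ding. The choice of the E-value is an assumption of this example, since the text above does not name a specific method, and the function name e_value is invented here. For an observed risk ratio, the E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the estimate.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio point estimate.

    Formula: RR + sqrt(RR * (RR - 1)), applied to 1/RR when RR < 1.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1))


# Invented estimates, for illustration only.
for observed in (0.5, 0.8, 1.0):
    print(f"RR = {observed:.2f} -> E-value = {e_value(observed):.2f}")
# RR = 0.50 -> E-value = 3.41
# RR = 0.80 -> E-value = 1.81
# RR = 1.00 -> E-value = 1.00
```

The formula applies to risk ratios. Other effect measures need conversions that are not shown here, and an E-value describes how fragile an estimate is without saying whether such a confounder actually exists.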

The operator checklist

Before your next observational protocol leaves the building, it should answer:

  • Can the target trial be written on one page — population, strategies, assignment, time zero, follow-up, outcomes, contrast?
  • Is eligibility assessed using only pre-time-zero information?
  • Do eligibility, assignment, and the start of follow-up coincide?
  • Are the treatment strategies ones a clinician could actually follow?
  • Is the causal contrast named — and does the analysis plan match it?
  • Which confounders does assignment plausibly depend on, and what happens to the estimate when the unmeasured ones are stress-tested?

If any answer is "it depends" or "we'll handle it in the analysis," the design is not done.
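Two of the checklist items, alignment of time zero and pre-time-zero eligibility, can be checked mechanically once the design dates exist for each person. The sketch below assumes a hypothetical record layout: the class PersonDesign, its field names, the flag wording, and the example dates are all invented for illustration. It also assumes that information dated on time zero itself is allowed, which is a design decision rather than a rule.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PersonDesign:
    """Hypothetical per-person design dates; field names are assumptions."""
    person_id: str
    eligibility_met: date
    strategy_assigned: date
    follow_up_start: date
    latest_eligibility_input: date  # most recent record used to judge eligibility


def design_flags(p: PersonDesign) -> list[str]:
    """Return the design problems detected for one person."""
    flags = []
    # Eligibility, assignment, and start of follow-up must coincide.
    if not (p.eligibility_met == p.strategy_assigned == p.follow_up_start):
        flags.append("time zero misaligned")
    # Eligibility may not rely on records dated after time zero.
    if p.latest_eligibility_input > p.follow_up_start:
        flags.append("eligibility uses post-time-zero information")
    return flags


# Invented cohort, for illustration only.
cohort = [
    PersonDesign("p1", date(2024, 1, 10), date(2024, 1, 10),
                 date(2024, 1, 10), date(2024, 1, 10)),
    # Follow-up starts at eligibility, but the strategy is set by a later fill.
    PersonDesign("p2", date(2024, 1, 10), date(2024, 3, 10),
                 date(2024, 1, 10), date(2024, 1, 5)),
    # Dates align, but eligibility was judged using a later lab result.
    PersonDesign("p3", date(2024, 2, 1), date(2024, 2, 1),
                 date(2024, 2, 1), date(2024, 2, 20)),
]

problems = {p.person_id: design_flags(p) for p in cohort if design_flags(p)}
print(problems)
# {'p2': ['time zero misaligned'],
#  'p3': ['eligibility uses post-time-zero information']}
```

A cohort that passes both checks can still be badly designed. The remaining checklist questions, such as whether a clinician could follow the strategies, need judgment rather than code.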

Trials remain the benchmark for causal evidence. But most questions that matter — coverage, label expansion, safety in populations trials never enrolled — will be answered, if they are answered at all, with real-world data. The choice is not between a trial and an observational study. It is between an observational study designed like a trial and one designed like a data pull. Only one of them holds up when the decision is on the line.


References

  1. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–764. doi.org/10.1093/aje/kwv254
  2. Hernán MA, Wang W, Leaf DE. Target trial emulation: a framework for causal inference from observational data. JAMA. 2022;328(24):2446–2447. doi.org/10.1001/jama.2022.21383
  3. Suissa S. Immortal time bias in pharmacoepidemiology. Am J Epidemiol. 2008;167(4):492–499. doi.org/10.1093/aje/kwm324
  4. US Food and Drug Administration. Framework for FDA's Real-World Evidence Program. December 2018. fda.gov
  5. Hoffman SR, Gangan N, Chen X, Smith JL, et al. A step-by-step guide to causal study design using real-world data. Health Serv Outcomes Res Method. 2024. doi.org/10.1007/s10742-024-00333-6