Published on 21/12/2025
ADaM Derivations You Can Defend: Versioning Discipline, Unit Tests That Catch Drift, and Rationale You Can Read in Court
Outcome-first ADaM: derivations that survive questions, re-cuts, and inspection sprints
What “defensible” means in practice
Defensible ADaM derivations are those that a new reviewer can trace, reproduce, and explain without calling the programmer. That requires three things: (1) explicit lineage from SDTM to analysis variables; (2) clear and versioned business rules tied to a SAP/estimand reference; and (3) automated unit tests that fail loudly when inputs, algorithms, or thresholds change. If any of these are missing, re-cuts become fragile and inspection time turns into archaeology.
State one compliance backbone—once
Anchor your analysis environment in a single, portable paragraph and reuse it across shells, SAP, standards, and CSR appendices: inspection expectations reference FDA BIMO; electronic records and signatures follow 21 CFR Part 11 and map to Annex 11; GCP oversight and roles align to ICH E6(R3); safety data exchange and narratives acknowledge ICH E2B(R3); public transparency aligns to ClinicalTrials.gov and EU postings under EU-CTR via CTIS; privacy follows HIPAA. Every change leaves a searchable audit trail; systemic issues route through CAPA; risk is tracked with QTLs and key risk indicators.
Define the outcomes before you write a single line of code
Set three measurable outcomes for your derivation work: (1) Traceability—every analysis variable includes a one-line provenance token (domains, keys, and algorithms) and a link to a test; (2) Reproducibility—a saved parameter file and environment hash can recreate results byte-identically for the same cut; (3) Retrievability—a reviewer can open the derivation spec, program, and associated unit tests in under two clicks from a portfolio tile. If you can demonstrate all three on a stopwatch drill, you are inspection-ready.
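The reproducibility outcome above can be sketched mechanically: fingerprint each run from its saved parameter file plus the pinned environment, so two "identical" re-cuts can be compared before anyone trusts them. This is a minimal illustration; `run_fingerprint` and the parameter/version names are hypothetical, not a prescribed tool.

```python
import hashlib
import json

def run_fingerprint(params: dict, package_versions: dict) -> str:
    """Deterministic fingerprint of a run: canonical JSON of the saved
    parameter file plus the pinned environment, hashed with SHA-256.
    Two runs claiming to be identical should share a fingerprint."""
    canonical = json.dumps(
        {"params": params, "env": package_versions},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Re-cuts compare fingerprints before asserting byte-identical results.
cut1 = run_fingerprint({"baseline_window": [-7, 0]}, {"pandas": "2.2.1"})
cut2 = run_fingerprint({"baseline_window": [-7, 0]}, {"pandas": "2.2.1"})
assert cut1 == cut2  # same inputs, same fingerprint
```

Storing the hex digest next to the run log gives reviewers a one-line equality check instead of a dataset-by-dataset comparison.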
Regulatory mapping: US-first clarity that ports cleanly to EU/UK review styles
US (FDA) angle—event → evidence in minutes
US assessors frequently select an analysis number and drill: where is the rule, what data feed it, what are the intercurrent-event assumptions, and how would the number change if a sensitivity rule applied? Your derivations must surface that story without a scavenger hunt. Titles, footnotes, and derivation notes should name the estimand, identify analysis sets, and point to Define.xml, ADRG, and the unit tests that guard the variable. When a reviewer asks “why is this value here?” you should be able to open the program, show the spec, run the test, and move on in minutes.
EU/UK (EMA/MHRA) angle—identical truths, different wrappers
EMA/MHRA reviewers ask the same questions but often emphasize estimand clarity, protocol deviation handling, and consistency with registry narratives. If US-first derivation notes use literal labels and your lineage is explicit, the same package translates with minimal edits. Keep a label cheat sheet (“IRB → REC/HRA; IND safety alignment → regional CTA safety language”) in your programming standards so everyone speaks the same truth with local words.
| Dimension | US (FDA) | EU/UK (EMA/MHRA) |
|---|---|---|
| Electronic records | Part 11 validation & role attribution | Annex 11 controls; supplier qualification |
| Transparency | Consistency with registry wording | EU-CTR status via CTIS; UK registry alignment |
| Privacy | Minimum necessary & de-identification | GDPR/UK GDPR minimization/residency |
| Traceability set | Define.xml + ADRG/SDRG drill-through | Same, with emphasis on estimand clarity |
| Inspection lens | Event→evidence speed; unit test presence | Completeness & portability of rationale |
Process & evidence: a derivation spec that actually prevents rework
The eight-line derivation template that scales
Use a compact, mandatory block for each analysis variable: (1) Name/Label; (2) Purpose (link to SAP/estimand); (3) Source lineage (SDTM domains, keys); (4) Algorithm (pseudo-code with thresholds and tie-breakers); (5) Missingness (imputation, censoring); (6) Time windows (visits, allowable drift); (7) Sensitivity (alternative rules); (8) Unit tests (inputs/expected outputs). This short form makes rules readable and testable and keeps writers, statisticians, and programmers synchronized.
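The eight-line block is easy to enforce mechanically. A minimal sketch, assuming specs are held as structured records (the field names and the `CHG` example below are illustrative, not a mandated schema):

```python
# The eight mandatory fields of the derivation template, in order.
REQUIRED_FIELDS = (
    "name_label", "purpose", "source_lineage", "algorithm",
    "missingness", "time_windows", "sensitivity", "unit_tests",
)

def validate_spec(spec: dict) -> list:
    """Return the template fields missing or empty in a derivation spec
    block. An empty list means the eight-line block is complete."""
    return [f for f in REQUIRED_FIELDS if not spec.get(f)]

chg_spec = {
    "name_label": "CHG / Change from Baseline",
    "purpose": "SAP section ref; estimand E1 (treatment policy)",
    "source_lineage": "SDTM LB (USUBJID, LBDTC, LBTESTCD) -> ADLB",
    "algorithm": "CHG = AVAL - BASE; BASE = last non-missing pre-dose AVAL",
    "missingness": "CHG missing if BASE missing; impute per SAP",
    "time_windows": "Baseline window [-7, 0] days",
    "sensitivity": "Per-protocol window [-3, 0]",
    "unit_tests": ["test_chg_basic", "test_chg_missing_base"],
}
assert validate_spec(chg_spec) == []  # complete block
```

Running this check in CI keeps incomplete spec blocks from ever reaching a programmer.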
Make lineage explicit and mechanical
List SDTM domains and keys explicitly—e.g., AE (USUBJID, AESTDTC/AETERM) → ADAE (ADY, AESER, AESDTH). If a variable is derived across domains, document the join logic (join keys, timing rules). Ambiguity here is the #1 cause of late-stage rework because different programmers resolve gaps differently. A one-line lineage token in the program header prevents drift.
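A lineage token only prevents drift if it is machine-checkable. A minimal sketch, assuming a hypothetical token grammar `SOURCE(keys)->TARGET(vars):ALGORITHM-ID` (the grammar and `ALG-017` are illustrative):

```python
import re

# Hypothetical grammar: SOURCE(keys)->TARGET(vars):ALGORITHM-ID
TOKEN_RE = re.compile(
    r"^(?P<src>\w+)\((?P<keys>[\w,]+)\)->"
    r"(?P<tgt>\w+)\((?P<vars>[\w,]+)\):(?P<alg>[\w-]+)$"
)

def parse_lineage_token(token: str) -> dict:
    """Parse a header lineage token; raise loudly on malformed tokens
    so drift is caught at build time, not during inspection."""
    m = TOKEN_RE.match(token)
    if m is None:
        raise ValueError(f"malformed lineage token: {token!r}")
    d = m.groupdict()
    d["keys"] = d["keys"].split(",")
    d["vars"] = d["vars"].split(",")
    return d

tok = parse_lineage_token("AE(USUBJID,AESTDTC)->ADAE(ADY,AESER):ALG-017")
assert tok["src"] == "AE" and tok["alg"] == "ALG-017"
```

A pre-commit hook that parses every header token turns "lineage exists" into a verifiable build property.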
- Enforce the eight-line derivation template in specs and program headers.
- Require lineage tokens for every analysis variable (domains, keys, algorithm ID).
- Map each rule to a SAP clause and estimand label (E9(R1) language).
- Declare windowing/visit rules and how partial dates are handled.
- Predefine sensitivity variants; don’t bolt them on later.
- Create unit tests per variable with named edge cases and expected values.
- Save parameters and environment hashes for reproducible reruns.
- Drill from portfolio tiles → shell/spec → code/tests → artifacts in two clicks.
- Version everything; tie changes to governance minutes and change summaries.
- File derivation specs, tests, and run logs to the TMF with cross-references.
Decision Matrix: choose derivation strategies that won’t unravel during review
| Scenario | Option | When to choose | Proof required | Risk if wrong |
|---|---|---|---|---|
| Baseline value missing or out-of-window | Pre-specified hunt rule (last non-missing pre-dose) | SAP allows single pre-dose window | Window spec; unit test with edge cases | Hidden imputation; inconsistent baselines |
| Multiple records per visit (duplicates/partials) | Tie-breaker chain (chronology → quality flag → mean) | When duplicates are common | Algorithm note; reproducible selection | Reviewer suspicion of cherry-picking |
| Time-to-event with heavy censoring | Explicit censoring rules + sensitivity | Dropout/administrative censoring high | Traceable lineage; ADTTE rules; tests | Bias claims; rerun churn late |
| Intercurrent events common (rescue, switch) | Treatment-policy primary + hypothetical sensitivity | E9(R1) estimand strategy declared | SAP excerpt; parallel shells | Estimand drift; mixed interpretations |
| Non-inferiority endpoint | Margin & scale stated in variable metadata | Primary or key secondary NI | Margin source; CI computation unit tests | Ambiguous claims; queries |
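The tie-breaker chain from the duplicates row above can be made reproducible in a few lines. A sketch, assuming each record carries a collection datetime, a quality flag, and a value (the record shape and function name are illustrative):

```python
from statistics import mean

def resolve_duplicates(records):
    """Tie-breaker chain for multiple records at one visit:
    chronology -> quality flag -> mean of remaining.
    Each record: (collection_datetime_iso, quality_flag, value)."""
    if len(records) == 1:
        return records[0][2]
    earliest = min(r[0] for r in records)            # 1) chronology
    at_earliest = [r for r in records if r[0] == earliest]
    if len(at_earliest) == 1:
        return at_earliest[0][2]
    flagged = [r for r in at_earliest if r[1]]       # 2) quality flag
    if len(flagged) == 1:
        return flagged[0][2]
    pool = flagged or at_earliest                    # 3) mean of remaining
    return mean(r[2] for r in pool)

# Earliest record wins outright when timestamps differ.
assert resolve_duplicates([("2025-01-02T08:00", True, 5.0),
                           ("2025-01-02T09:00", True, 7.0)]) == 5.0
```

Because the chain is a pure function, the same duplicates always resolve the same way—the reproducible selection the matrix demands as proof.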
Document the “why” where reviewers will actually look
Maintain a Derivation Decision Log: question → option → rationale → artifacts (SAP clause, spec snippet, unit test ID) → owner → date → effectiveness (e.g., query reduction). File in Sponsor Quality and cross-link from the spec and code so the path from a number to a decision is obvious.
QC / Evidence Pack: the minimum, complete set that proves your derivations are under control
- Derivation specs (versioned) with lineage, rules, sensitivity, and unit tests referenced.
- Define.xml pointers and reviewer guides (ADRG/SDRG) aligned to variable metadata.
- Program headers with lineage tokens, change summaries, and run parameters.
- Automated unit test suite with coverage report and named edge cases.
- Environment lock files/hashes; rerun instructions that reproduce byte-identical results.
- Change-control minutes linking rule edits to SAP amendments and shells.
- Visual diffs of outputs pre/post change; threshold rules for acceptable drift.
- Portfolio drill-through maps (tiles → spec → code/tests → artifact locations).
- Governance minutes tying recurring defects to CAPA with effectiveness checks.
- TMF cross-references so inspectors can open everything without helpdesk tickets.
Vendor oversight & privacy
Qualify external programming teams against your standards; enforce least-privilege access; store interface logs and incident reports near the codebase. Where subject-level listings are tested, apply data minimization and de-identification consistent with privacy and jurisdictional rules.
Versioning discipline: prevent drift with simple, humane rules
Semantic versions plus change summaries
Use semantic versioning for specs and code (MAJOR.MINOR.PATCH). Every change must carry a top-of-file summary that states what changed, why (SAP clause/governance), and how to retest. Small cost now, huge savings later when a reviewer asks why Week 24 changed on a re-cut.
Freeze tokens and naming
Freeze dataset and variable names early. Late renames create invisible fractures across shells, CSR text, and validation macros. If you must rename, deprecate with an alias period and unit tests that fail if both appear simultaneously to avoid shadow variables.
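The alias-period guard above is a one-function test. A sketch, assuming renames are tracked as an old→new map (`AETRTEM` is a hypothetical deprecated name; `TRTEMFL` stands in for its replacement):

```python
def assert_no_shadow_names(columns, renames):
    """Fail loudly if a deprecated name and its replacement coexist in
    one dataset -- the 'shadow variable' failure mode. `renames` maps
    old name -> new name."""
    cols = set(columns)
    clashes = [(old, new) for old, new in renames.items()
               if old in cols and new in cols]
    if clashes:
        raise AssertionError(
            f"old/new name pairs present together: {clashes}")

# During the alias period, only one of the pair may appear per dataset.
assert_no_shadow_names(["USUBJID", "TRTEMFL"], {"AETRTEM": "TRTEMFL"})
try:
    assert_no_shadow_names(["AETRTEM", "TRTEMFL"], {"AETRTEM": "TRTEMFL"})
except AssertionError:
    pass  # the guard fired, as intended
```

Run it over every dataset's column list in CI; the alias period ends when the old name can be deleted from the map without anything failing.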
Parameterize time and windows
Put time windows, censoring rules, and reference dates in a parameters file checked into version control. It prevents “magic numbers” in code and lets re-cuts use the right windows without manual edits. Unit tests should load parameters so a changed window forces test updates, not silent drift.
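A minimal sketch of such a parameters file, assuming JSON under version control (the file name and parameter keys are illustrative):

```python
import json
import pathlib
import tempfile

# Windows, censoring dates, and drift allowances live here,
# not as magic numbers inside derivation code.
PARAMS = {
    "baseline_window_days": [-7, 0],
    "censor_at": "2025-06-30",
    "visit_drift_days": 3,
}

def load_params(path):
    """Load the version-controlled parameters file for a run."""
    with open(path) as fh:
        return json.load(fh)

# Write and reload, as a re-cut would.
tmp = pathlib.Path(tempfile.mkdtemp()) / "derivation_params.json"
tmp.write_text(json.dumps(PARAMS, indent=2))
params = load_params(tmp)
assert params["baseline_window_days"] == [-7, 0]
```

Unit tests that load this same file make a changed window a failing test, not silent drift.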
Unit tests that matter: what to test and how to keep tests ahead of change
Test the rules you argue about
Focus tests on edge cases that trigger debate: partial dates, overlapping visits, duplicate IDs, ties in “first” events, and censoring at lock. Encode one or two examples per edge and assert exact expected values. When an algorithm changes, tests should fail where your conversation would have started anyway.
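Partial dates are the archetypal debated edge. A sketch of one such test, assuming an illustrative imputation convention (missing day → first of month, missing month → January)—the real convention must come from the SAP, not this example:

```python
def impute_partial_date(iso_partial: str) -> str:
    """Complete a partial ISO 8601 date using an illustrative
    convention: missing day -> '01', missing month -> '01'.
    The actual rule belongs in the SAP, not here."""
    parts = iso_partial.split("-")
    year = parts[0]
    month = parts[1] if len(parts) > 1 and parts[1] else "01"
    day = parts[2] if len(parts) > 2 and parts[2] else "01"
    return f"{year}-{month}-{day}"

# Named edge cases with exact expected values.
assert impute_partial_date("2025-03") == "2025-03-01"     # missing day
assert impute_partial_date("2025") == "2025-01-01"        # year only
assert impute_partial_date("2025-03-15") == "2025-03-15"  # complete date
```

If the SAP later switches the convention (say, to mid-month), exactly these assertions fail—the conversation starts where it should.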
Golden records and minimal fixtures
Create tiny, named fixtures that cover each derivation pattern. Avoid giant “real” datasets that hide signal; use synthetic rows with clear intent. Keep golden outputs in version control; diffs show exactly what changed and why, and reviewers can read them like a storyboard.
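Comparing against golden outputs can itself be a tiny function. A sketch, assuming derived values are keyed by (subject, variable); the subject IDs and values are synthetic:

```python
def diff_against_golden(actual: dict, golden: dict) -> dict:
    """Readable diff of derived values vs. version-controlled golden
    outputs, keyed by (subject, variable). Empty dict = no drift.
    Each entry maps key -> (golden_value, actual_value)."""
    keys = set(actual) | set(golden)
    return {k: (golden.get(k), actual.get(k))
            for k in keys if actual.get(k) != golden.get(k)}

golden = {("SUBJ-001", "CHG"): -4.0, ("SUBJ-002", "CHG"): 1.5}
actual = {("SUBJ-001", "CHG"): -4.0, ("SUBJ-002", "CHG"): 2.5}
drift = diff_against_golden(actual, golden)
assert drift == {("SUBJ-002", "CHG"): (1.5, 2.5)}
```

The diff is the storyboard: one subject, one variable, old value, new value—readable without opening any code.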
Coverage that means something
Report code coverage but don’t chase 100%—chase rule coverage. Every business rule in your spec should have at least one test. Include failure-path tests that assert correct error messages when assumptions break (e.g., missing keys, illegal window values).
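A failure-path test looks like this in practice. A sketch using a study-day derivation (day of reference date = day 1, no day 0—a common convention, but confirm against your own spec):

```python
from datetime import date

def derive_ady(refdt: str, adt: str) -> int:
    """Relative study day (no day 0). Fails loudly with a clear
    message when a required date is missing, per the failure-path
    guidance: assumptions break noisily, never silently."""
    if not refdt or not adt:
        raise ValueError("derive_ady: REFDT and ADT are both required")
    delta = (date.fromisoformat(adt) - date.fromisoformat(refdt)).days
    return delta + 1 if delta >= 0 else delta

assert derive_ady("2025-01-01", "2025-01-01") == 1    # reference day
assert derive_ady("2025-01-01", "2024-12-31") == -1   # day before
try:
    derive_ady("2025-01-01", "")
except ValueError as err:
    assert "required" in str(err)  # assert the message, not just the raise
```

Asserting on the error message guarantees the program tells the next reviewer *why* it refused, not merely that it did.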
Templates reviewers appreciate: paste-ready tokens, footnotes, and rationale language
Spec tokens for fast comprehension
Purpose: “Supports estimand E1 (treatment policy) for primary endpoint.”
Lineage: “SDTM LB (USUBJID, LBDTC, LBTESTCD) → ADLB (ADT, AVISIT, AVAL).”
Algorithm: “Baseline = last non-missing pre-dose AVAL within [−7,0]; change = AVAL – baseline; if missing baseline, impute per SAP §[ref].”
Sensitivity: “Per-protocol window [−3,0]; tipping point ±[X] sensitivity.”
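The algorithm token above translates directly into testable code. A minimal sketch, assuming records are (relative study day, AVAL) pairs and that in-window day-0 records are pre-dose; the record shape is illustrative:

```python
def derive_baseline(records, window=(-7, 0)):
    """Baseline = last non-missing pre-dose AVAL within the window
    (days relative to first dose). Returns None when no candidate
    exists, handing off to the SAP's imputation rule."""
    candidates = [(day, aval) for day, aval in records
                  if aval is not None and window[0] <= day <= window[1]]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[0])[1]  # last = latest day

obs = [(-10, 98.0), (-5, 101.0), (-1, None), (0, 99.5), (7, 104.0)]
base = derive_baseline(obs)
assert base == 99.5                            # latest in-window value
assert derive_baseline([(-9, 97.0)]) is None   # outside window -> SAP rule
# Change from baseline follows the same token: AVAL - baseline.
assert round(104.0 - base, 1) == 4.5
```

Swapping `window=(-3, 0)` reproduces the per-protocol sensitivity variant with no code change—exactly the parameterization argued for above.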
CSR-ready footnotes
“Baseline defined as the last non-missing, pre-dose value within the pre-specified window; if multiple candidate records exist, the earliest value within the window is used. Censoring rules are applied per SAP §[ref], with administrative censoring at database lock. Intercurrent events follow the treatment-policy strategy; a hypothetical sensitivity is provided in Table S[ref].”
Rationale sentences that quell queries
“The tie-breaker chain (chronology → quality flag → mean of remaining) minimizes bias when multiple records exist and reflects clinical practice where earlier, higher-quality measurements dominate. Sensitivity analyses demonstrate effect stability across window definitions.”
FAQs
How detailed should an ADaM derivation spec be?
Short and specific. Use an eight-line template covering purpose, lineage, algorithm, missingness, windows, sensitivity, and unit tests. The goal is that a reviewer can forecast the output’s behavior without reading code, and a programmer can implement without guessing.
Where should we store derivation rationale so inspectors can find it?
In three places: the spec (short form), the program header (summary and links), and the decision log (why this rule). Cross-link all three and file to the TMF. During inspection, open the decision log first to show intent, then the spec and code to show execution.
What makes a good unit test for ADaM variables?
Named edge cases with minimal fixtures and explicit expected values. Tests should assert both numeric results and the presence of required flags (e.g., imputation indicators). Include failure-path tests that prove the program rejects illegal inputs with clear messages.
How do we handle multiple registry or public narrative wordings?
Keep derivation text literal and map public wording via a label cheat sheet in your standards. If you change a public narrative, open a change control ticket and verify no estimand or analysis definitions drifted as a side effect.
How do we prevent variable name drift across deliverables?
Freeze names early, use aliases temporarily when renaming, and add tests that fail on simultaneous presence of old/new names. Update shells, CSR templates, and macros from a single dictionary to keep words and numbers synchronized.
What evidence convinces reviewers that our derivations are stable across re-cuts?
Byte-identical rebuilds for the same data cut, environment hashes, parameter files, and visual diffs of outputs pre/post change with thresholds. File stopwatch drills showing you can open spec, code, and tests in under two clicks and reproduce results on demand.
