1. Observations and Variables
The SDTM is built around the concept of observations collected about subjects who participated in a clinical study. Each observation can be described by a series of variables, corresponding to a row in a dataset.
Variables can be classified into 5 major roles:
Role | Description |
---|---|
Identifier | identify the study, subject, domain, and sequence number of the record |
Topic | specify the focus of the observation (e.g., the name of a lab test) |
Timing | describe the timing of the observation (e.g., start date and end date) |
Qualifier | include additional illustrative text or numeric values that describe the results or additional traits of the observation (e.g., units, descriptive adjectives) |
Rule | describe the condition to start, end, branch, or loop in the Trial Design Model |
Qualifier variables can be further categorized into 5 subclasses:
Qualifiers | Purpose | Examples |
---|---|---|
Grouping Qualifiers | group together a collection of observations within the same domain | --CAT, --SCAT |
Result Qualifiers | describe the specific results associated with the topic variable in a Findings dataset; they answer the question raised by the topic variable | --ORRES, --STRESC, --STRESN |
Synonym Qualifiers | specify an alternative name for a particular variable in an observation | --MODIFY, --DECOD |
Record Qualifiers | define additional attributes of the observation record as a whole (rather than describing a particular variable within a record) | --REASND; AESLIFE in AE; AGE, SEX, and RACE in DM |
Variable Qualifiers | further modify or describe a specific variable within an observation and are only meaningful in the context of the variable they qualify | --ORRESU, --ORNRHI, and --ORNRLO are Variable Qualifiers of --ORRES |
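The `--` in the variable names above is a placeholder. The short sketch below assumes it stands for the 2-character domain code and shows how the generic names could be expanded for one domain. The function name `expand_variable` is hypothetical and chosen only for illustration.

```python
def expand_variable(template: str, domain: str) -> str:
    """Replace a leading '--' placeholder with a 2-character domain code."""
    if len(domain) != 2:
        raise ValueError("domain code must be exactly 2 characters")
    if template.startswith("--"):
        return domain.upper() + template[2:]
    # Names without the placeholder (e.g. AGE, SEX) are returned unchanged.
    return template


# Expand the result variable and its Variable Qualifiers for the LB domain.
lb_names = [
    expand_variable(t, "LB")
    for t in ("--ORRES", "--ORRESU", "--ORNRLO", "--ORNRHI")
]
print(lb_names)  # ['LBORRES', 'LBORRESU', 'LBORNRLO', 'LBORNRHI']
```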
Example: "Subject 101 had mild nausea starting on study day 6." The roles in this observation are:
- Identifier: 101
- Topic: “Nausea”
- Timing: study day 6
- Record Qualifier: mild
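The same example can be sketched as one record in Python. The variable names (`USUBJID`, `AETERM`, `AESTDY`, `AESEV`) are assumed from common AE domain naming and the values are simplified. Treat this as an illustration of the role concept rather than a complete AE record.

```python
# Hypothetical AE record for "Subject 101 had mild nausea starting on study day 6".
# Variable names are assumed from common AE naming; values are simplified.
record = {
    "USUBJID": "101",
    "AETERM": "Nausea",
    "AESTDY": 6,
    "AESEV": "MILD",
}

# Role assigned to each variable in this illustration.
roles = {
    "USUBJID": "Identifier",
    "AETERM": "Topic",
    "AESTDY": "Timing",
    "AESEV": "Record Qualifier",
}


def by_role(record, roles):
    """Group the values of one record by the role of each variable."""
    grouped = {}
    for name, value in record.items():
        grouped.setdefault(roles[name], {})[name] = value
    return grouped


grouped = by_role(record, roles)
for role, values in grouped.items():
    print(role, values)
```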
2. Datasets and Domains
Observations about study subjects are normally collected for all subjects in a series of domains. A domain is defined as a collection of logically related observations with a common topic. Each domain is represented by a single dataset.
All datasets are structured as flat files with rows representing observations and columns representing variables. Each dataset is described by metadata definitions that provide information about the variables used in the dataset. The metadata are described in a data definition document (i.e., a Define-XML document) that is submitted with the data to regulatory authorities.
Define-XML specifies seven distinct metadata attributes to describe SDTM data:
- The Variable Name (limited to 8 characters for compatibility with the SAS Transport format)
- A descriptive Variable Label, using up to 40 characters, which should be unique for each variable in the dataset
- The data Type (e.g., whether the variable value is a character or numeric)
- The set of controlled terminology for the value or the presentation format of the variable (Controlled Terms, Codelist, or Format). Controlled terminology (CT) is represented in one of four ways:
  • A single asterisk when there is no specific CT available at the current time, but the SDS Team expects that sponsors may have their own CT and/or the CDISC Controlled Terminology Team may be developing CT
  • A list of controlled terms for the variable when values are not yet maintained externally
  • The name of an external codelist whose values can be found either via the hyperlinks in the domain or by accessing the CDISC Controlled Terminology as outlined in Appendix C – Controlled Terminology
  • A common format such as ISO 8601
- The Origin of each variable
- The Role of the variable, which determines how the variable is used in the dataset (e.g., Identifier, Topic, Timing, or one of the five types of Qualifiers)
- Comments or other relevant information about the variable or its data, included by the sponsor as necessary to communicate information about the variable or its contents to a regulatory agency
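As a rough sketch, the seven attributes can be modelled as a small Python data class with checks for the two length limits and the label-uniqueness rule mentioned above. The class name, field names, and sample values are hypothetical and are not Define-XML element names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class VariableMetadata:
    """One variable's metadata; field names are illustrative only."""
    name: str
    label: str
    data_type: str
    controlled_terms: str
    origin: str
    role: str
    comments: str = ""


def check_variables(variables):
    """Return a list of problems found in a dataset's variable metadata."""
    problems = []
    label_counts = Counter(v.label for v in variables)
    for v in variables:
        if len(v.name) > 8:
            problems.append(f"{v.name}: name longer than 8 characters")
        if len(v.label) > 40:
            problems.append(f"{v.name}: label longer than 40 characters")
        if label_counts[v.label] > 1:
            problems.append(f"{v.name}: label is not unique in the dataset")
    return problems


# Sample values are assumptions made for this example.
variables = [
    VariableMetadata("AETERM", "Reported Term for the Adverse Event",
                     "Char", "", "CRF", "Topic"),
    VariableMetadata("AESEVERITY", "Severity/Intensity",
                     "Char", "(AESEV)", "CRF", "Record Qualifier"),
]
problems = check_variables(variables)
print(problems)  # ['AESEVERITY: name longer than 8 characters']
```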
Each domain dataset is distinguished by a unique, 2-character code that should be used consistently throughout the submission. This code is used in 4 ways:
- as the dataset name
- as the value of the DOMAIN variable in that dataset
- as a prefix for most variable names in that dataset
- as a value in the RDOMAIN variable in relationship tables.
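The first three uses can be checked mechanically. The sketch below assumes a dataset is held as a list of dictionaries, one per row. Both function names are hypothetical, and the RDOMAIN use is left out because it needs the relationship tables. Since only most variable names carry the prefix, the second helper lists prefixed variables rather than flagging the others as errors.

```python
def check_domain_code(dataset_name, rows):
    """Return a list of problems with how the domain code is used."""
    code = dataset_name.upper()
    problems = []
    if len(code) != 2:
        problems.append(f"dataset name {dataset_name!r} is not a 2-character code")
    for i, row in enumerate(rows):
        if row.get("DOMAIN") != code:
            problems.append(
                f"row {i}: DOMAIN is {row.get('DOMAIN')!r}, expected {code!r}"
            )
    return problems


def prefixed_variables(dataset_name, rows):
    """List the variable names that start with the domain code."""
    code = dataset_name.upper()
    names = rows[0].keys() if rows else []
    return [n for n in names if n.startswith(code)]


# Made-up rows; the second row carries the wrong DOMAIN value on purpose.
vs_rows = [
    {"STUDYID": "ABC", "DOMAIN": "VS", "USUBJID": "101",
     "VSTESTCD": "SYSBP", "VSORRES": "120"},
    {"STUDYID": "ABC", "DOMAIN": "LB", "USUBJID": "102",
     "VSTESTCD": "SYSBP", "VSORRES": "118"},
]
problems = check_domain_code("vs", vs_rows)
prefixed = prefixed_variables("vs", vs_rows)
print(problems)  # ["row 1: DOMAIN is 'LB', expected 'VS'"]
print(prefixed)  # ['VSTESTCD', 'VSORRES']
```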
2.1 The General Observation Classes
Most subject-level observations collected during the study should be represented according to 1 of the 3 SDTM general observation classes:
Class | Purpose | Examples |
---|---|---|
Interventions | investigational, therapeutic, and other treatments that are administered to the subject | exposure to study drug; concomitant medications; use of alcohol, tobacco, or caffeine |
Events | planned protocol milestones; occurrences, conditions, or incidents independent of planned study evaluations | randomization, study completion, adverse events, medical history |
Findings | the observations resulting from planned evaluations to address specific tests or questions | laboratory tests, ECG testing, questions listed on questionnaires |
2.1.1 Interventions
Procedure Agents (AG)
Concomitant/Prior Medications (CM)
Exposure (EX)
Exposure as Collected (EC)
Meal Data (ML)
Procedures (PR)
Substance Use (SU)
2.1.2 Events
Adverse Events (AE)
Biospecimen Events (BE)
Clinical Events (CE)
Disposition (DS)
Healthcare Encounters (HO)
Medical History (MH)
Protocol Deviations (DV)
2.1.3 Findings
Product Accountability (DA)
Death Details (DD)
ECG Test Results (EG)
Inclusion/Exclusion Criteria Not Met (IE)
Specimen-based Findings Domain
Biospecimen Findings (BS)
Cell Phenotype Findings (CP)
Genomics Findings (GF)
Immunogenicity Specimen Assessments (IS)
Laboratory Test Results (LB)
Microbiology Domains
Microscopic Findings (MI)
Pharmacokinetics Domains
Morphology (MO)
Morphology/Physiology Domains
Generic Morphology/Physiology Specification
Cardiovascular System Findings (CV)
Musculoskeletal System Findings (MK)
Nervous System Findings (NV)
Ophthalmic Examinations (OE)
Reproductive System Findings (RP)
Respiratory System Findings (RE)
Urinary System Findings (UR)
Physical Examination (PE)
Questionnaires, Ratings, and Scales (QRS) Domains (FT, QS, RS)
Functional Tests (FT)
Questionnaires (QS)
Disease Response and Clin Classification (RS)
Subject Characteristics (SC)
Subject Status (SS)
Tumor/Lesion Domains
Tumor/Lesion Identification (TU)
Tumor/Lesion Results (TR)
Vital Signs (VS)
Findings About Events or Interventions
Findings About Events or Interventions (FA)
Skin Response (SR)
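The domain lists above can be turned into a simple lookup from domain code to general observation class. The sketch below uses only the codes listed in this section. The names `DOMAINS_BY_CLASS` and `observation_class` are hypothetical.

```python
# Domain codes copied from the lists in this section, grouped by class.
DOMAINS_BY_CLASS = {
    "Interventions": ["AG", "CM", "EX", "EC", "ML", "PR", "SU"],
    "Events": ["AE", "BE", "CE", "DS", "HO", "MH", "DV"],
    "Findings": [
        "DA", "DD", "EG", "IE", "BS", "CP", "GF", "IS", "LB", "MI",
        "MO", "CV", "MK", "NV", "OE", "RP", "RE", "UR", "PE", "FT",
        "QS", "RS", "SC", "SS", "TU", "TR", "VS", "FA", "SR",
    ],
}

# Invert the mapping so that each domain code points to its class.
CLASS_BY_DOMAIN = {
    code: cls for cls, codes in DOMAINS_BY_CLASS.items() for code in codes
}


def observation_class(domain_code):
    """Return the general observation class, or None if the code is not listed."""
    return CLASS_BY_DOMAIN.get(domain_code.upper())


print(observation_class("AE"))  # Events
print(observation_class("lb"))  # Findings
print(observation_class("DM"))  # None (DM is not a general observation class domain)
```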
2.2 Datasets Other than General Observation Class Domains
The SDTM includes 4 types of datasets other than those based on the general observation classes:
Type | Purpose | Examples |
---|---|---|
Special purpose domains | subject-level data that do not conform to 1 of the 3 general observation classes | Demographics (DM), Comments (CO), Subject Elements (SE), Subject Visits (SV) |
Trial Design Model (TDM) datasets | represent information about the study design but do not contain subject data | Trial Arms (TA), Trial Elements (TE) |
Relationship datasets | represent relationships among datasets or records | the RELREC and SUPP-- datasets |
Study Reference datasets | provide structures for representing study-specific terminology used in subject data | Device Identifiers (DI), Non-host Organism Identifiers (OI) |