Project Summary
This project involved the end-to-end analysis of a Phase III, randomized, double-blind clinical trial for CardioX, a novel antihypertensive drug. The dataset contained records for 997 subjects across 4 international sites (USA, France, Canada, Australia).
The primary objective was to validate the drug’s efficacy in lowering Systolic Blood Pressure (SBP) and assess its safety profile (liver toxicity and adverse events) to support a hypothetical regulatory submission (FDA/EMA).
Disclaimer:
All datasets used in this project are fully simulated and do not contain any real patient, site, or sponsor information. The dataset was intentionally designed to mimic real-world clinical trial operations, data structures, and performance trends while ensuring full compliance with GCP, HIPAA, GDPR, and clinical research confidentiality standards.
Problem Statement
Clinical trial data is rarely “analysis-ready” upon extraction. The raw dataset for the CardioX trial contained several critical issues that risked compromising the study’s integrity:
- Data Quality Issues: Missing Subject IDs, missing lab values (ALT/Creatinine), and inconsistent naming conventions (“Drug X” vs “CardioX”).
- Compliance Variability: A significant portion of participants had adherence rates below 80% or above 120%, requiring rigorous population segmentation.
- Safety Signals: Early anecdotal reports suggested potential liver toxicity, requiring a focused analysis of lab outliers that were not visible in the raw averages.
The stakeholder (Clinical Operations & Medical Monitor) required a clean, GCP-compliant dataset and a statistical report to determine if the trial met its primary endpoints.
Tools Used
- Microsoft Excel (Power Query, Pivot Tables, Interactive Slicers, Custom KPIs)
- Excel Formulas
- Conditional Formatting
- Data Modeling within Excel
Methodology (Step-by-Step)
I employed a rigorous Clinical Data Management (CDM) workflow:
- Data Ingestion & Profiling: Loaded raw CSV extracts into Excel to identify nulls, outliers, and formatting inconsistencies.
- Data Cleaning & Transformation:
  - Standardized categorical variables (e.g., converting “Yes/No” to binary “1/0” for calculation).
  - Imputed missing safety lab data using group medians to preserve sample size.
  - Renamed “Drug X” to the trade name “CardioX” for final reporting.
- Population Flagging (Cohort Definitions):
  - Intent-to-Treat (ITT) Population, used for the safety analysis: all randomized subjects (N=997).
  - Per-Protocol (PP) Population, used for the efficacy analysis: subjects with 80-120% compliance who did not withdraw (N=481).
- Statistical Analysis: Used Pivot Tables to calculate Mean Change from Baseline, Responder Rates, and Adverse Event frequencies.
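The project itself was carried out in Excel. As a rough illustration only, the cleaning and flagging steps above could be sketched in Python as follows. The function names, the sample records, and the choice of treatment group as the grouping for median imputation are assumptions made for this sketch; the source does not say which groups were used.

```python
from statistics import median

YES_NO = {"yes": 1, "no": 0}
FLAG_COLUMNS = ("adverse_event", "serious_AE", "responder", "withdrawn")


def clean_records(records):
    """Return cleaned copies of the raw records (a list of dicts)."""
    cleaned = []
    for rec in records:
        row = dict(rec)  # copy so the raw extract is left untouched
        # Harmonize the treatment label used in the raw extract.
        if row.get("treatment_group") == "Drug X":
            row["treatment_group"] = "CardioX"
        # Convert Yes/No flags to 1/0 so they can be summed and averaged.
        for col in FLAG_COLUMNS:
            value = row.get(col)
            if isinstance(value, str) and value.strip().lower() in YES_NO:
                row[col] = YES_NO[value.strip().lower()]
        cleaned.append(row)
    return cleaned


def impute_group_median(records, column, group_col="treatment_group"):
    """Fill None values in `column` with the median of the same group (in place).

    Grouping by treatment arm is an assumption of this sketch.
    """
    observed = {}
    for row in records:
        if row[column] is not None:
            observed.setdefault(row[group_col], []).append(row[column])
    medians = {group: median(values) for group, values in observed.items()}
    for row in records:
        if row[column] is None and row[group_col] in medians:
            row[column] = medians[row[group_col]]


def is_per_protocol(row, low=80.0, high=120.0):
    """Per-Protocol rule from the text: 80-120% compliance and no withdrawal."""
    return low <= row["compliance_%"] <= high and row["withdrawn"] == 0


# Hypothetical sample records, not taken from the trial dataset.
raw = [
    {"subject_id": "S001", "treatment_group": "Drug X", "ALT": 30.0,
     "compliance_%": 95.0, "withdrawn": "No"},
    {"subject_id": "S002", "treatment_group": "Drug X", "ALT": None,
     "compliance_%": 70.0, "withdrawn": "No"},
    {"subject_id": "S003", "treatment_group": "CardioX", "ALT": 50.0,
     "compliance_%": 110.0, "withdrawn": "Yes"},
    {"subject_id": "S004", "treatment_group": "Placebo", "ALT": 28.0,
     "compliance_%": 120.0, "withdrawn": "No"},
]

rows = clean_records(raw)
impute_group_median(rows, "ALT")
for row in rows:
    row["per_protocol"] = is_per_protocol(row)
```

In this toy data, S002 receives the CardioX-arm median ALT, and only S001 and S004 satisfy the Per-Protocol rule. The compliance bounds are treated as inclusive here, which is one reading of "80-120%".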
Dataset Description
CardioX Trial Dataset (subject_id, country, site_id, age, sex, BMI, treatment_group, dose_mg, ALT, creatinine, compliance_%, baseline_SBP, baseline_DBP, month12_SBP, month12_DBP, adverse_event, serious_AE, BP_reduction, responder, withdrawn).
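For readers who want to work with a table of this shape outside Excel, the following is a minimal sketch of loading it with Python's standard `csv` module. It assumes the CSV header uses the column names listed above, that blank cells mean "missing", and that `BP_reduction` is defined as `baseline_SBP - month12_SBP`; the source does not state that definition, so the consistency check is illustrative only. The helper names and sample rows are hypothetical.

```python
import csv
import io

NUMERIC_COLUMNS = (
    "age", "BMI", "dose_mg", "ALT", "creatinine", "compliance_%",
    "baseline_SBP", "baseline_DBP", "month12_SBP", "month12_DBP",
    "BP_reduction",
)


def parse_row(raw_row):
    """Convert numeric columns to float; blank or absent cells become None."""
    row = dict(raw_row)
    for col in NUMERIC_COLUMNS:
        text = (row.get(col) or "").strip()
        row[col] = float(text) if text else None
    return row


def reduction_mismatch(row, tolerance=0.01):
    """True when BP_reduction disagrees with baseline_SBP - month12_SBP.

    The subtraction is an assumed definition, not one confirmed by the source.
    """
    baseline, month12, reported = (
        row["baseline_SBP"], row["month12_SBP"], row["BP_reduction"]
    )
    if baseline is None or month12 is None or reported is None:
        return False  # cannot check rows with missing inputs
    return abs((baseline - month12) - reported) > tolerance


# Hypothetical two-row extract with only a few of the columns present.
sample = io.StringIO(
    "subject_id,ALT,baseline_SBP,month12_SBP,BP_reduction\n"
    "S001,31.5,150,135,15\n"
    "S002,,148,140,10\n"
)
rows = [parse_row(r) for r in csv.DictReader(sample)]
flags = [reduction_mismatch(r) for r in rows]
```

Here the second row has a missing ALT value (parsed as `None`) and a `BP_reduction` that does not match the assumed subtraction, so it is the only row flagged.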
Executive Summary
The Phase III clinical trial for CardioX (an investigational antihypertensive) evaluated 997 randomized subjects. The study met its primary efficacy endpoint.
- Efficacy: In the Per-Protocol population, CardioX produced a larger mean reduction in Systolic Blood Pressure (SBP) than Placebo (15.1 mmHg vs. 6.3 mmHg).
- Safety: Fewer Adverse Events (AEs) were recorded in the CardioX group than in the Placebo group (67 vs. 82 events, reported as 6.7% vs. 8.2%), though Serious Adverse Events (SAEs) were slightly more frequent in the active arm (8 vs. 5).
Recommendation: The drug demonstrates strong clinical efficacy with a manageable safety profile, supporting progression to regulatory submission, provided that monitoring for liver enzyme elevations continues.
Analytical Findings
A. Efficacy Analysis (Per-Protocol Population)
This analysis was restricted to compliant subjects (n=481) to assess the drug’s effect among participants who followed the protocol.
- Primary Endpoint Met: CardioX demonstrated a robust reduction in blood pressure.
  - CardioX Mean Reduction: 15.12 mmHg
  - Placebo Mean Reduction: 6.29 mmHg
  - Net Benefit: 8.83 mmHg greater reduction than Placebo (15.12 - 6.29).
- Responder Rate: 48.2% of participants achieved the target clinical reduction.
- Subgroup Robustness: Efficacy remained stable across all BMI categories analyzed, including obese subjects (BMI 31-35), suggesting that the effect does not depend on BMI within the range studied.
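The efficacy figures above come from Pivot Tables. As an illustration, the same arithmetic (mean reduction per arm, net benefit, and responder rate) could be sketched in Python. The function names and sample rows are hypothetical, and the `responder` column is taken as given because the source does not state the target reduction that defines a responder.

```python
from statistics import mean


def mean_reduction_by_arm(rows):
    """Mean BP_reduction for each treatment arm."""
    by_arm = {}
    for row in rows:
        by_arm.setdefault(row["treatment_group"], []).append(row["BP_reduction"])
    return {arm: mean(values) for arm, values in by_arm.items()}


def responder_rate(rows):
    """Percentage of rows whose responder flag is 1."""
    return 100.0 * sum(row["responder"] for row in rows) / len(rows)


# Hypothetical Per-Protocol rows, not taken from the trial dataset.
pp_rows = [
    {"treatment_group": "CardioX", "BP_reduction": 16.0, "responder": 1},
    {"treatment_group": "CardioX", "BP_reduction": 14.0, "responder": 1},
    {"treatment_group": "Placebo", "BP_reduction": 7.0, "responder": 0},
    {"treatment_group": "Placebo", "BP_reduction": 5.0, "responder": 0},
]

means = mean_reduction_by_arm(pp_rows)
net_benefit = means["CardioX"] - means["Placebo"]

# The same subtraction applied to the means reported in this section.
reported_net = round(15.12 - 6.29, 2)
```

With the reported means, the subtraction reproduces the 8.83 mmHg net benefit quoted above. A formal submission would normally pair this point estimate with a confidence interval or hypothesis test, which this sketch does not attempt.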
B. Safety Analysis (Intent-to-Treat Population)
This analysis covered all 997 subjects so that every recorded safety event was included.
- Adverse Events (AE):
  - Total AEs were higher in the Placebo group (82 events) than in the CardioX group (67 events), suggesting the drug is well tolerated.
- Serious Adverse Events (SAE):
  - A slight imbalance was observed: 8 SAEs in CardioX vs. 5 in Placebo. While low, this requires medical review.
- Liver Toxicity (ALT Levels):
  - The mean ALT level was within the normal range (34.4 U/L).
  - However, outlier analysis using the ALT>40 flag showed that the SAE cases in the CardioX group were associated with transient enzyme elevation.
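The outlier analysis described above amounts to cross-tabulating an ALT>40 flag against SAE status within each arm. A minimal sketch of that cross-tabulation in Python follows. The 40 U/L cut-off comes from the text; the function name and sample rows are hypothetical.

```python
from collections import Counter

ALT_THRESHOLD = 40.0  # U/L, the cut-off behind the ALT>40 flag


def alt_sae_crosstab(rows):
    """Count subjects by (arm, ALT above threshold, had a serious AE)."""
    counts = Counter()
    for row in rows:
        alt_high = row["ALT"] > ALT_THRESHOLD
        had_sae = bool(row["serious_AE"])
        counts[(row["treatment_group"], alt_high, had_sae)] += 1
    return counts


# Hypothetical ITT rows, not taken from the trial dataset.
itt_rows = [
    {"treatment_group": "CardioX", "ALT": 62.0, "serious_AE": 1},
    {"treatment_group": "CardioX", "ALT": 33.0, "serious_AE": 0},
    {"treatment_group": "CardioX", "ALT": 45.0, "serious_AE": 0},
    {"treatment_group": "Placebo", "ALT": 30.0, "serious_AE": 1},
    {"treatment_group": "Placebo", "ALT": 29.0, "serious_AE": 0},
]

table = alt_sae_crosstab(itt_rows)
sae_with_high_alt = table[("CardioX", True, True)]
```

One caveat when reading such a table: ALT values imputed with a group median will fall below the threshold whenever that median is in the normal range, so imputed rows may be worth excluding or marking before flagging outliers.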
Conclusion and Recommendations
The analysis confirms that CardioX is a highly effective antihypertensive agent with a manageable safety profile.
- Approval Recommendation: The data supports moving forward with regulatory submission, as the primary efficacy endpoint was met by a clear margin (8.83 mmHg greater mean SBP reduction than Placebo).
- Labeling Note: Because the slight excess of Serious Adverse Events was associated with liver enzyme elevations, I recommend including routine liver function monitoring in the product label.
- Operational Success: The imputation and cleaning strategies retained 99% of the dataset for safety analysis, helping to preserve statistical power.