001 — Healthcare Analytics

NY Medicaid Provider Coverage Gap

End-to-end analytics pipeline processing 18.6M rows of federal healthcare data across three government datasets — identifying where New York's 5.8 million Medicaid members face the greatest provider shortages by specialty.

ETL Pipeline · Gap Analysis · Python · SQL · DuckDB · Pandas · Power BI · CMS Data · HRSA

2026

Data Analyst

Solo Project

Healthcare

NPPES · CMS · HRSA HPSA

18.6M Rows Processed
3 Federal Datasets Joined
70 Shortage Areas Flagged
5.8M Medicaid Members
Process
Workflow
01
Data Ingestion
Downloaded NPPES, CMS Medicaid Enrollment, and HRSA HPSA datasets — 18.6M rows of raw federal government data.
02
Cleaning
Standardized provider taxonomy codes, removed duplicates, filtered active Medicaid-enrolled providers using Python and Pandas.
03
ETL Pipeline
Joined all three datasets on NPI and county codes using DuckDB — engineered an Access Gap Index scoring each specialty.
04
Validate
Cross-validated gap scores against HRSA HPSA shortage designations to confirm findings were real — not model artifacts.
05
Reporting
Delivered an HTML & Power BI dashboard surfacing gap tiers across 12 specialties for non-technical stakeholders.
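The cleaning step (02) in the workflow above can be sketched in Pandas. This is a minimal illustration, not the project's actual code: the column names (`npi`, `taxonomy_code`, `state`, `deactivation_date`, `medicaid_enrolled`), the sample rows, and the `clean_providers` helper are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical NPPES-style extract; columns and rows are illustrative only.
raw = pd.DataFrame({
    "npi": ["1000000001", "1000000001", "1000000002", "1000000003", "1000000004"],
    "taxonomy_code": [" 364s00000x", "364S00000X", "208D00000X ", "208d00000x", "364S00000X"],
    "state": ["NY", "NY", "NY", "NY", "NJ"],
    "deactivation_date": [None, None, None, "2023-05-01", None],
    "medicaid_enrolled": [True, True, True, True, True],
})


def clean_providers(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize taxonomy codes, drop duplicate NPIs, keep active NY providers."""
    out = df.copy()
    # Standardize taxonomy codes: trim whitespace and uppercase.
    out["taxonomy_code"] = out["taxonomy_code"].str.strip().str.upper()
    # Remove duplicates: one row per NPI (keeps the first occurrence).
    out = out.drop_duplicates(subset="npi", keep="first")
    # Filter: no deactivation date, Medicaid-enrolled, located in NY.
    active = out["deactivation_date"].isna()
    mask = active & out["medicaid_enrolled"] & (out["state"] == "NY")
    return out.loc[mask].reset_index(drop=True)


providers = clean_providers(raw)
print(providers[["npi", "taxonomy_code"]])
```

Standardizing before deduplicating matters here: the first two rows differ only in case and whitespace, so they would not look like duplicates on the raw taxonomy values.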
Visualization
Provider Gap Dashboard
NY Medicaid — Provider Coverage Gap Analysis
Source: NPPES · CMS Medicaid Enrollment · HRSA HPSA · Data as of 2024
Medicaid Members: 5.8M
Total Providers: 145,190
Shortage Areas: 70
Avg HPSA Score: 15.5
Chart: members per provider by specialty, with a critical threshold at 3,000 members per provider
Legend: Critical (HPSA confirmed) · Moderate (HPSA flagged)
Gap tier breakdown (12 specialties): Critical 2 · Moderate 10
Key findings
Clinical Nurse Specialists: 16,341 members per provider, more than 5× the critical threshold
General Practice: 5,217 members per provider, critically undersupplied
70 shortage areas confirmed: average HPSA score 15.5 / 25, a federally validated severe shortage
What I Found
Key Findings
Critical Gap
Clinical Nurse Specialists
5× Over Threshold
16,341 Medicaid members per provider: the single most undersupplied specialty in the entire dataset, at more than five times the 3,000-member critical threshold.
16,341 members per provider
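The comparison behind this finding is simple arithmetic. A minimal sketch, assuming the index is members divided by providers and that a specialty counts as critical when the ratio exceeds the 3,000 threshold shown on the dashboard. The function names are hypothetical, and the actual tiering rule may also depend on HPSA confirmation.

```python
CRITICAL_THRESHOLD = 3_000  # members per provider, from the dashboard legend


def members_per_provider(members: int, providers: int) -> float:
    """Ratio of Medicaid members to providers (assumed form of the index)."""
    if providers <= 0:
        raise ValueError("providers must be positive")
    return members / providers


def threshold_multiple(ratio: float) -> float:
    """How many times the critical threshold a ratio represents."""
    return ratio / CRITICAL_THRESHOLD


def is_critical(ratio: float) -> bool:
    """Assumed rule: critical when the ratio exceeds the threshold."""
    return ratio > CRITICAL_THRESHOLD


# Ratios reported on this page.
reported = [("Clinical Nurse Specialists", 16_341), ("General Practice", 5_217)]
for name, ratio in reported:
    print(f"{name}: {threshold_multiple(ratio):.2f}x threshold, critical={is_critical(ratio)}")
```

Under these assumptions, both reported specialties land above the threshold, which matches the two Critical entries in the gap tier breakdown.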
Federal Validation
80% of Gaps Confirmed by HRSA Designations
Cross-validated the Access Gap Index against HRSA HPSA data: 80% of critical gaps were independently confirmed by federal shortage designations, with an average HPSA score of 15.5 out of 25.
15.5 average HPSA score out of 25
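One way to express this kind of cross-validation is as a set overlap: the share of flagged gaps that also carry a federal designation. The sketch below is an assumption about the method, not the project's code. The `confirmed_share` helper and the placeholder area labels are hypothetical, and the sample data is made up and does not reproduce the 80% result.

```python
def confirmed_share(flagged: set, designated: set) -> float:
    """Fraction of flagged gaps that also appear in the designated set."""
    if not flagged:
        raise ValueError("no flagged gaps to validate")
    return len(flagged & designated) / len(flagged)


# Placeholder labels standing in for whatever key the real data uses
# (for example county codes); these values are invented for illustration.
flagged_gaps = {"county_A", "county_B", "county_C", "county_D"}
hpsa_designated = {"county_A", "county_B", "county_D", "county_E"}

share = confirmed_share(flagged_gaps, hpsa_designated)
unconfirmed = sorted(flagged_gaps - hpsa_designated)

print(f"confirmed share: {share:.0%}")
print(f"flagged but not designated: {unconfirmed}")
```

Listing the unconfirmed gaps alongside the share is useful in practice, since those are the cases worth inspecting for modeling artifacts.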
Scale
5.8M Members Across 70 Shortage Areas
All 5.8M Medicaid members mapped against 70 federally designated shortage areas — surfacing which counties and specialties need the most urgent intervention.
70 federally designated shortage areas
Stack
Tools & Data
Data Sources
  • CMS — Centers for Medicare & Medicaid Services
  • HRSA — Health Resources & Services Administration
  • NPPES — National Plan & Provider Enumeration System
  • 18.6M rows of raw federal government data
Software Used
  • Python (Pandas)
  • SQL + DuckDB
  • Power BI
  • Excel
  • Chart.js
  • HTML / CSS
Project Lifecycle
From Problem to Delivery
01
Problem
Define the Question
Where are New York's 5.8 million Medicaid members most underserved — and which specialties are most critically short of providers?
02
Collect
Federal Data Acquisition
Downloaded and unified three federal datasets — NPPES, CMS Medicaid Enrollment, and HRSA HPSA — totaling 18.6M rows of raw government data.
03
Build
ETL Pipeline & Gap Index
Cleaned, filtered, and joined all three datasets using Python and DuckDB. Engineered an Access Gap Index scoring every specialty by members-per-provider ratio across all NY counties.
04
Validate
HRSA Cross-Validation
Cross-checked computed gap scores against federal HRSA shortage designations to confirm the findings were real rather than artifacts of the model. 80% of critical gaps were independently confirmed.
05
Ship
Interactive Dashboard
Delivered an interactive Power BI dashboard surfacing gap tiers across 12 specialties.
Power BI Dashboard: NY Medicaid Provider Coverage Gap