

Datasets Available to IHPI Members
The IHPI Data & Methods Hub (DMH) maintains a repository of high-value datasets that IHPI members can use to advance their research. Many of these datasets are available at no cost to IHPI members. If you have questions about any of the datasets listed below, please contact our team at [email protected]. We will be happy to talk with you and help determine which data resource is right for you.
Electronic Health Record Data
Truveta's data portal includes EHR data collected from clinician notes, images, and structured data from more than 30 health systems, linked with claims, social determinants of health, and mortality data. The daily-updated data represent more than 100 million de-identified patients with all types of insurance, from more than 800 hospitals and 20,000 clinics across the U.S.
To view data documentation, log in to https://studio.truveta.com with your @umich.edu email address. Once logged in, click the question mark icon in the upper right corner to find links to various documents.
Administrative Claims Databases
View HCUP data availability here
HCUP State Inpatient Databases
The State Inpatient Databases (SID) include inpatient discharge records from community hospitals in a particular state. The SID files encompass all patients, regardless of payer, providing a unique view of inpatient care in a defined market or state over time.
HCUP State Ambulatory Surgery and Services Databases
The SASD are state-specific files that include data for ambulatory surgery and other outpatient services from hospital-owned facilities. In addition, some States provide ambulatory surgery and outpatient services from nonhospital-owned facilities. The uniform format of the SASD helps facilitate cross-state comparisons. The SASD are well suited for research that requires a complete enumeration of hospital-based ambulatory surgeries within geographic areas or States.
HCUP State Emergency Department Databases
The State Emergency Department Databases are a set of longitudinal State-specific emergency department (ED) databases that capture discharge information on all emergency department visits that do not result in an admission. Information on patients seen in the emergency room and then admitted to the hospital is included in the State Inpatient Databases (SID).
HCUP Nationwide Inpatient Sample
The Nationwide Inpatient Sample is the largest publicly available all-payer inpatient healthcare database designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from around 7 million hospital stays each year. Weighted, it estimates around 35 million hospitalizations nationally.
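The gap between the unweighted and weighted figures comes from sampling weights: each sampled stay stands in for several stays nationally, and 35 million divided by 7 million implies an average weight of roughly 5. The following is a minimal sketch of that arithmetic using made-up records; the field names and weight values are hypothetical, and the actual weight variable and estimation guidance are described in the HCUP documentation.

```python
# Toy illustration of unweighted counts vs. weighted national estimates.
# The records, field names, and weight values are hypothetical.
sampled_stays = [
    {"stay_id": 1, "weight": 5.0},
    {"stay_id": 2, "weight": 4.5},
    {"stay_id": 3, "weight": 5.5},
    {"stay_id": 4, "weight": 5.0},
]


def weighted_total(records):
    """Sum the sampling weights to estimate the national number of stays."""
    return sum(record["weight"] for record in records)


unweighted_count = len(sampled_stays)               # stays actually in the sample
national_estimate = weighted_total(sampled_stays)   # stays they represent nationally

print(unweighted_count, national_estimate)
```

The same idea applies to any weighted statistic: the weight is applied to each record before summing or averaging, rather than scaling the final unweighted result.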
HCUP Nationwide Emergency Department Sample
The Nationwide Emergency Department Sample is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-owned ED visits. Unweighted, it contains data from over 28 million ED visits each year. Weighted, it estimates roughly 123 million ED visits.
HCUP Nationwide Readmissions Database
The Nationwide Readmissions Database is designed to support various types of analyses of national readmissions for all patients, regardless of the expected payer for the hospital stay. This database addresses a large gap in healthcare data: the lack of nationally representative information on hospital readmissions for all ages.
CMS Research Identifiable Files (RIFs)
View CMS RIF data availability here
RIFs contain beneficiary-level protected health information (PHI). Requests for RIF data require a Data Use Agreement (DUA) and are reviewed by CMS’s Privacy Board to ensure that beneficiaries’ privacy is protected and that only the minimum data necessary are requested and justified. Details on the various programs and data files can be found on the ResDAC site here.
CMS RIF: Medicaid MAX/TAF
The CMS Medicaid Analytic eXtract (MAX) and T-MSIS Analytic File (TAF) data include enrollment information for all Medicaid enrollees within a state, regardless of whether the beneficiary receives all services under fee for service (FFS) or is enrolled in a managed care organization (MCO). The MAX and TAF files also include all FFS utilization and any reported MCO utilization. There are some known data quality concerns with the TAF data, which vary by state; details are available here.
CMS RIF: Medicare FFS
Data from approximately 35 million enrollees, including enrollment, demographic, inpatient, outpatient, skilled nursing facility, home health, durable medical equipment, and professional claims. Includes dates of service, diagnosis and procedure codes, provider, and cost information for services billed to Medicare Parts A and B.
CMS RIF: Medicare Encounter
Data from approximately 35 million enrollees, including enrollment, demographic, inpatient, outpatient, skilled nursing facility, home health, durable medical equipment, and professional claims. Includes dates of service, diagnosis and procedure codes, and provider information for services covered under Medicare Part C. The Encounter data do not include cost information.
CMS RIF: Part D Event
Data from prescription fills for approximately 50 million Medicare FFS and Medicare Advantage enrollees. The PDE file includes all transactions covered by the Medicare prescription drug plan for both Prescription Drug Plans (PDPs) and Medicare Advantage Prescription Drug Plans (MA-PDs). Of note, prescriptions written but not filled will not generate a record in the PDE file.
Merative MarketScan
Commercial and Medicare Advantage data from ~140 million employees and dependents covered by the health benefit programs of large employers. These claims data are collected from several hundred insurance providers, Blue Cross Blue Shield plans, and third-party administrators.
• Commercial and MA Data Dictionary
• Lab Results Data Dictionary
• Commercial and MA User Guide
Merative Multi-State Medicaid
Contains the pooled healthcare experience of Medicaid enrollees from multiple states. It includes inpatient services and prescription drug claims, as well as information on enrollment, long-term care, and other medical care.
• Multi-State Medicaid Data Dictionary
• Multi-State Medicaid User Manual
Supplemental Databases
American Hospital Association Annual Survey
A collection of key metrics from a cross-sectional survey of 6,500 U.S. hospitals, including bed size, physician arrangements, IT indicators, and community health partnerships. The data represent the most credible, consistent, and comprehensive data provided by nearly 6,300 hospitals and more than 400 health care systems.
IHPI would like to thank the following center and program partners who help make our data available to the U-M research community through financial contributions:
• Ann Arbor VA Center for Clinical Management Research
• Center for Healthcare Outcomes & Policy
• Center for Evaluating Health Reform
• Center for Eye Policy & Innovation
• Dow Division for Urologic Health Services Research
• Michigan Opioid Prescribing Engagement Network
• School of Public Health Department of Health Management and Policy
• Susan B. Meister Child Health Evaluation and Research Center