Datasets Available to IHPI Members
The IHPI Data & Methods Hub (DMH) maintains a repository of high-value datasets that IHPI members can use to advance their research. Many of these datasets are free for IHPI members. If you have questions about any of the datasets listed below, please contact our team at ihpi-data@umich.edu. We will be happy to talk with you and help determine which data resource is right for you.
IHPI would like to thank the following center and program partners who help make our data available to the U-M research community through financial contributions:
- Ann Arbor VA Center for Clinical Management Research
- Center for Healthcare Outcomes & Policy
- Center for Evaluating Health Reform
- Center for Eye Policy & Innovation
- Dow Division for Urologic Health Services Research
- Kidney Epidemiology and Cost Center
- Michigan Medicine Department of Neurology
- Michigan Opioid Prescribing Engagement Network
- School of Public Health Department of Health Management and Policy
- Susan B. Meister Child Health Evaluation and Research Center
IHPI Data Assets
Administrative Claims Databases
View all HCUP data availability here
HCUP State Inpatient Databases
The SID includes inpatient discharge records from community hospitals in a particular state. The SID files encompass all patients, regardless of payer, providing a unique view of inpatient care in a defined market or state over time.
HCUP State Ambulatory Surgery and Services Databases
The SASD are state-specific files that include data for ambulatory surgery and other outpatient services from hospital-owned facilities. In addition, some States provide data on ambulatory surgery and outpatient services from nonhospital-owned facilities. The uniform format of the SASD helps facilitate cross-state comparisons. The SASD are well suited for research that requires a complete enumeration of hospital-based ambulatory surgeries within geographic areas or States.
HCUP State Emergency Department Databases
The SEDD are a set of longitudinal State-specific emergency department (ED) databases that capture discharge information on all emergency department visits that do not result in an admission. Information on patients seen in the emergency room and then admitted to the hospital is included in the State Inpatient Databases (SID).
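Because ED visits that end in admission sit in the SID rather than the SEDD, counting every ED visit for a state and year means combining the two sources. The sketch below illustrates the idea with invented toy records. The `came_through_ed` field and the `all_ed_visits` helper are hypothetical names, not HCUP variables; check the HCUP documentation for the actual ED indicator in the SID.

```python
def all_ed_visits(sedd_records, sid_records, ed_flag="came_through_ed"):
    """Combine treat-and-release visits (SEDD) with ED visits that ended in admission (SID)."""
    # Keep only inpatient stays flagged as having started in the ED.
    # `ed_flag` is a hypothetical field name used for illustration.
    admitted_via_ed = [r for r in sid_records if r.get(ed_flag)]
    return list(sedd_records) + admitted_via_ed


# Toy records, invented for illustration only (not HCUP data).
sedd = [{"visit": "A"}, {"visit": "B"}]
sid = [
    {"visit": "C", "came_through_ed": True},
    {"visit": "D", "came_through_ed": False},
]

combined = all_ed_visits(sedd, sid)
print([r["visit"] for r in combined])
```

In practice both inputs would need to cover the same state and year before being combined.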
HCUP Nationwide Inpatient Sample
The NIS is the largest publicly available all-payer inpatient healthcare database designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from around 7 million hospital stays each year. Weighted, it estimates around 35 million hospitalizations nationally.
HCUP Nationwide Emergency Department Sample
The NEDS is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-owned ED visits. Unweighted, it contains data from over 28 million ED visits each year. Weighted, it estimates roughly 123 million ED visits.
HCUP Nationwide Readmissions Database
The NRD is a unique and powerful database designed to support various types of analyses of national readmissions for all patients, regardless of the expected payer for the hospital stay. This database addresses a large gap in healthcare data - the lack of nationally representative information on hospital readmissions for all ages. Unweighted, the 2020 NRD contains data from approximately 17 million discharges. Weighted, it estimates roughly 32 million discharges.
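The weighted and unweighted figures quoted for the NIS, NEDS, and NRD imply an average weight per sampled record. The sketch below uses the rounded figures above to compute that implied average, and shows that a national estimate is the sum of per-record weights. It assumes each record carries a record-level weight; the `weight` field and the toy values are invented for illustration and are not HCUP data.

```python
# Approximate record counts quoted above, in millions (rounded figures).
QUOTED = {
    "NIS": {"unweighted": 7, "weighted": 35},
    "NEDS": {"unweighted": 28, "weighted": 123},
    "NRD": {"unweighted": 17, "weighted": 32},
}


def implied_average_weight(unweighted, weighted):
    """Average number of national records that each sampled record represents."""
    return weighted / unweighted


def weighted_estimate(records, weight_key="weight"):
    """National estimate = sum of the per-record weights."""
    return sum(r[weight_key] for r in records)


# Toy sample with hypothetical weights (illustration only).
toy_sample = [
    {"id": 1, "weight": 5.0},
    {"id": 2, "weight": 4.5},
    {"id": 3, "weight": 5.5},
]

for name, counts in QUOTED.items():
    avg = implied_average_weight(counts["unweighted"], counts["weighted"])
    print(f"{name}: each sampled record represents about {avg:.1f} records")

print("Toy national estimate:", weighted_estimate(toy_sample))
```

Real analyses should also account for the sampling design when computing variances, which a plain sum of weights does not do.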
Medicaid MAX/TAF (various years)
The CMS Medicaid Analytic eXtract (MAX) and T-MSIS Analytic File (TAF) data include enrollment information for all Medicaid enrollees within a state, regardless of whether the beneficiary receives all services under fee for service (FFS) or is enrolled in managed care organizations (MCOs). All FFS utilization and any reported MCO utilization are also included in both the MAX and TAF files.
Medicare (various years)
Data from 55 million enrollees, including enrollment data for all members and inpatient/nursing facility data, as well as Part D prescription-related data for a large sample across multiple years. Data include:
- 100% sample—Medicare Provider Analysis and Review (MedPAR—Inpatient and Skilled Nursing Facility claims)
- 100% sample—Master Beneficiary Summary File (MBSF—Base enrollment)
- 20% sample—Carrier File (Professional claims)
- 20% sample—Outpatient File
- 20% sample—Home Health Agency
- 20% sample—Hospice
- 20% sample—Part D Event and Drug Characteristics files
- Cohorts for various conditions and procedures
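Because the files above come at different sampling rates, an analysis that links several of them is limited by the smallest sample involved. The sketch below records the sampling fractions listed above and reports the limiting fraction for a chosen set of files. The short file keys and the `limiting_fraction` helper are hypothetical names, and the sketch assumes the 20% files cover the same sampled beneficiaries, which should be confirmed with the Data & Methods Hub.

```python
# Sampling fractions as listed above; the short keys are hypothetical labels.
SAMPLE_FRACTION = {
    "medpar": 1.00,       # Inpatient and Skilled Nursing Facility claims
    "mbsf": 1.00,         # Base enrollment
    "carrier": 0.20,      # Professional claims
    "outpatient": 0.20,
    "home_health": 0.20,
    "hospice": 0.20,
    "part_d": 0.20,       # Part D Event and Drug Characteristics files
}


def limiting_fraction(files):
    """Return the smallest sampling fraction among the requested files."""
    if not files:
        raise ValueError("Provide at least one file name")
    unknown = [f for f in files if f not in SAMPLE_FRACTION]
    if unknown:
        raise KeyError(f"Unknown file(s): {unknown}")
    return min(SAMPLE_FRACTION[f] for f in files)


print(limiting_fraction(["medpar", "mbsf"]))
print(limiting_fraction(["medpar", "carrier", "part_d"]))
```

A project that needs only MedPAR and the MBSF can use all enrollees, while adding professional or Part D claims restricts it to the 20% sample.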
Merative MarketScan (2009 through 2022)
Commercial and Medicare Advantage data from ~140 million employees and dependents covered by the health benefit programs of large employers. These claims data are collected from several hundred insurance providers, Blue Cross Blue Shield plans, and third-party administrators.
• Commercial and MA Data Dictionary
• Lab Results Data Dictionary
• Commercial and MA User Guide
Merative Multi-State Medicaid (2009 through 2022)
Contains the pooled healthcare experience of Medicaid enrollees from multiple states. It includes inpatient services and prescription drug claims, as well as information on enrollment, long-term care, and other medical care.
• Multi-State Medicaid Data Dictionary
• Multi-State Medicaid User Manual
Supplemental Databases
American Hospital Association Annual Survey (2000 through 2019)
A collection of key metrics from a cross-sectional study of 6,500 U.S. hospitals, including bed size, physician arrangements, IT indicators, and community health partnerships. The data represent the most credible, consistent, and comprehensive information provided by nearly 6,300 hospitals and more than 400 health care systems.
EHR Data
The Data & Methods Hub staff have also worked with internal Michigan Medicine EHR data and can help you determine if medical record data is the correct choice for your project.
How do the administrative claims databases differ?
Choosing the right data source can be difficult, especially when several sources offer similar types of data. This is particularly true for administrative claims, where the variables offered by different databases often overlap. Here is some information to help you better understand two of our most popular datasets: Medicare and Merative MarketScan.
Differences between the views will be noted in the table. Please remember that we are always available to meet with you to discuss any of our available datasets.
| Data Characteristic | Public Insurance: Medicare FFS | Private Insurance: Merative MarketScan |
| --- | --- | --- |
| Basic Info | | |
| Number of covered lives | 55M | 140M |
| Longitudinal linkage | ✔ | ✔ |
| Dental claims | - | ✔ |
| Demographics | | |
| Date of Birth | ✔ | Birth Year |
| Ages Covered | 65+ or disabled | 0-100+ |
| Race | ✔ | - |
| Employment Status | - | ✔ |
| Additional Socio Economic Info | - | - |
| Death Date | ✔ | - |
| Cause of Death | ✔ (through 2016 only) | - |
| Geographic Information | | |
| Census Region | ✔ | ✔ |
| State | ✔ | ✔ |
| County | ✔ | - |
| ZIP Code | ✔ | ✔ (only 3 digit) |
| MSA | ✔ | ✔ |
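The comparison above can also be expressed as a small lookup, which makes it easy to check which dataset offers every variable a project needs. The sketch below encodes a subset of the rows; the key names and the `datasets_with` helper are hypothetical, and `full_zip_code` reflects the table's note that MarketScan provides only 3-digit ZIP codes (that Medicare's ZIP is unrestricted is inferred from the contrast).

```python
# Subset of the comparison table; True means the table shows a check mark.
FEATURES = {
    "Medicare FFS": {
        "dental_claims": False,
        "race": True,
        "employment_status": False,
        "death_date": True,
        "county": True,
        "full_zip_code": True,   # inferred from the table's contrast
    },
    "Merative MarketScan": {
        "dental_claims": True,
        "race": False,
        "employment_status": True,
        "death_date": False,
        "county": False,
        "full_zip_code": False,  # only 3-digit ZIP codes
    },
}


def datasets_with(*required):
    """Return the datasets that offer every requested feature."""
    return [
        name
        for name, feats in FEATURES.items()
        if all(feats[r] for r in required)
    ]


print(datasets_with("race", "county"))
print(datasets_with("dental_claims", "employment_status"))
print(datasets_with("race", "dental_claims"))
```

An empty result, as in the last call, means neither dataset alone covers the requested variables.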