Data Sources

CoDES houses a range of data resources that CoDES members can access, subject to the restrictions laid out in licensing and data use agreements. For details on data access procedures and fees, please contact the CoDES director, Dr. Almut Winterstein, at almut@ufl.edu. An overview of the data sources is provided below.

IBM® Marketscan® Research Databases

The IBM® Marketscan® Research Databases contain individual-level, de-identified healthcare claims information from employers, health plans, and Medicaid programs. CoDES has active licenses for the IBM® Marketscan® Commercial Database, IBM® Marketscan® Medicare Supplemental Database, and IBM® Marketscan® Health Risk Assessment Database.

The IBM® Marketscan® Commercial Database includes 2005-2018 health insurance claims for inpatient, outpatient, and outpatient pharmacy encounters, as well as enrollment data from large employers and health plans across the United States that provide healthcare coverage for employees, their spouses, and dependents. The current dataset includes >180 million lives.

The IBM® Marketscan® Medicare Supplemental Database includes 2005-2018 enrollment records along with inpatient, outpatient, ancillary, and drug claims for 12.5 million retirees in the United States with Medicare supplemental coverage through privately insured fee-for-service, point-of-service, or capitated health plans.

The IBM® Marketscan® Health Risk Assessment (HRA) Database includes 2012-2018 self-reported biometric and health-related behavioral data obtained through surveys of employees of large US corporations and health plans. The HRA data are linked to medical, pharmacy, and enrollment data for these employees in the IBM® Marketscan® Commercial Database and can be used to examine the relationships between health behaviors or risks and health outcomes or medical expenditures. Linked data are available for about 5% of beneficiaries.

Medicaid Analytic eXtract (MAX) and T-MSIS Analytic Files (TAF)

MAX and TAF data contain claims for medical care and drug benefits received by beneficiaries covered by Medicaid, the state-run programs for low-income and categorically eligible individuals and families. CoDES has in-house MAX data for >120 million beneficiaries residing in the 29 most populous states from 1999-2010 (AL, AR, CA, FL, GA, IA, ID, IL, IN, KS, KY, LA, MA, MN, MO, MS, NC, NE, NJ, NM, NY, OH, SC, TN, TX, VA, WA, WI, WV) and national data (all 50 states plus the District of Columbia) from 2011-2014. The 29 states included in the 1999-2010 MAX data represent 85% of all Medicaid beneficiaries. National data for 2015-2016 are currently being curated.

Birth Certificate Records

Medicaid data have been linked to birth certificates from the Florida Department of Health (1999-2014), Texas Department of State Health Services (1999-2012), and New Jersey Department of Health (1999-2010). The entire national Medicaid data set includes validated mother-infant linkages.

Medicare Fee-for-Service Claims Data

Medicare is a federal health insurance program that provides coverage to people aged 65 years or older and those with disabilities or end-stage renal disease. Annual Medicare enrollment has exceeded 50 million since 2012. Data include claims for inpatient, skilled nursing facility, and hospice care (Part A) as well as outpatient care (Part B) and prescription drugs (Part D). CoDES has in-house a 5% national Medicare sample for the years 2011 through 2015, plus 1 million beneficiaries in Florida who were oversampled from individuals residing in the UF Health catchment area, and a 15% national sample of Medicare beneficiaries plus the entire state of Florida for 2016-2018, totaling >8 million lives.

Enrollment tables for the claims data sets listed above by year can be found here.