Does Amazon health and wellness spending follow the demographic lines you'd expect? My assumption going in: wealthier, more educated, older buyers would dominate supplement and medication purchases. The data had other ideas.
This analysis uses the Harvard Dataverse Amazon Purchases Dataset — 1,850,717 purchases from 5,027 households collected between 2018 and 2023, linked to a detailed demographic survey covering income, education, age, household size, and self-reported health conditions.
I expected a steep income gradient in health spending. Instead, the lowest and highest earners sit just 0.2 percentage points apart. Households making under $25,000 per year dedicate 7.9% of their Amazon orders to health products. Households making $150,000 or more: 7.7%. Amazon's health aisle is, against all expectation, remarkably democratic.
People with some high school or less dedicate 9.5% of their Amazon orders to health and wellness products. Graduate degree holders — doctors, lawyers, MBAs — come in last at 7.7%. The more education a household has, the smaller the share of spending that goes to health products. This is the opposite of what most consumer behavior research would predict.
The explanation likely isn't that educated buyers are less health-conscious. It's that graduate-degree households buy vastly more of everything on Amazon, so health products become a smaller slice of a much larger pie.
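
To make that denominator effect concrete, here is a minimal Python sketch. The order counts are invented purely for illustration and are not taken from the dataset; they are chosen only so that the shares match the 9.5% and 7.7% reported above.

```python
# ASSUMPTION: hypothetical per-household order counts, picked only to
# reproduce the reported shares (9.5% and 7.7%). Not dataset figures.
groups = {
    "some_high_school_or_less": {"health_orders": 19, "total_orders": 200},
    "graduate_degree": {"health_orders": 77, "total_orders": 1000},
}


def health_share(group):
    """Health orders as a percentage of all orders for one group."""
    return 100 * group["health_orders"] / group["total_orders"]


for name, g in groups.items():
    print(f"{name}: {g['health_orders']} health orders, "
          f"{health_share(g):.1f}% of all orders")
```

In this made-up example the graduate-degree household places more health orders in absolute terms, yet its health share is lower because its total order count is much larger. Whether the real data follows this pattern depends on the per-group order volumes, which the summary above does not list.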
Households that self-reported having diabetes buy health products at 8.048% of total orders. Households without diabetes: 8.049%. The difference is essentially zero. On the surface, a chronic health condition predicts nothing about health purchasing behavior.
But break down which health categories diabetic buyers choose, and a sharper picture emerges. Diabetic households over-index heavily on PROFESSIONAL_HEALTHCARE (20.9% of purchases in that category come from diabetic buyers), OTC_MEDICATION (19.4%), and VITAMIN (14.4%). They are not buying more health products overall; they are buying more medically serious ones. The signal is hidden because the health category is dominated by supplements and skincare bought by people without chronic conditions.
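
One common way to quantify over-indexing is to divide a group's share of a category by its share of all orders. The sketch below uses the three category shares quoted above. The baseline (the diabetic share of all orders) is not stated in this write-up, so `baseline_share` is a labeled placeholder that should be replaced with the value computed from the dataset.

```python
# Percent of each category's purchases that come from diabetic households,
# as reported in the text above.
category_share = {
    "PROFESSIONAL_HEALTHCARE": 20.9,
    "OTC_MEDICATION": 19.4,
    "VITAMIN": 14.4,
}


def over_index(share_in_category, baseline_share):
    """Return an index where 100 means 'in line with the group's overall share'."""
    return 100 * share_in_category / baseline_share


# ASSUMPTION: placeholder for the diabetic share of ALL orders (percent).
# This number is hypothetical; compute the real one from the data.
baseline_share = 12.0

for category, share in category_share.items():
    print(f"{category}: index {over_index(share, baseline_share):.0f}")
```

An index above 100 means diabetic households account for more of that category than their overall order share would suggest. The ranking of the three categories does not depend on the placeholder, but the index values do.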
Young buyers (18–24) and older buyers (55–64) shop the Amazon health aisle at broadly comparable overall rates (8.0% vs 10.1%). But they are buying entirely different things. Older buyers lead in MEDICATION, VITAMIN, and HERBAL_SUPPLEMENT. Young buyers actually beat older buyers in SKIN_MOISTURIZER (1,571 vs 1,322 orders) and dominate BEAUTY. Amazon's health aisle serves two completely different customers who happen to shop in the same place.
| Metric | Value |
|---|---|
| Total purchases analyzed | 1,850,717 |
| Unique households | 5,027 |
| Date range | Jan 2018 – Mar 2023 |
| Total spend in dataset | $44,053,633 |
| Avg spend per household | ~$8,765 over 5 years |
| Health & wellness orders | ~160,000 (~8.6% of total) |
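
The derived rows in the table can be cross-checked from the raw totals with a few lines of arithmetic. This is only a sanity check on the published figures; the variable names are mine, and the health order count is the approximate value from the table.

```python
# Totals copied from the summary table above.
total_purchases = 1_850_717
households = 5_027
total_spend = 44_053_633   # USD
health_orders = 160_000    # approximate, as reported in the table

avg_spend = total_spend / households
health_pct = 100 * health_orders / total_purchases
orders_per_household = total_purchases / households

print(f"Average spend per household: ${avg_spend:,.0f}")
print(f"Health share of orders: {health_pct:.1f}%")
print(f"Orders per household: {orders_per_household:.0f}")
```

The computed average lands within a few dollars of the rounded ~$8,765 in the table, and the health share agrees with the ~8.6% shown.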
Requirements: VS Code + the Malloy extension (bundles DuckDB, no setup needed)

    git clone https://github.com/YOUR_USERNAME/amazon-malloy
    cd amazon-malloy

- Download the dataset from [Harvard Dataverse](https://doi.org/10.7910/DVN/YGLYDY) and place `amazon-purchases.csv` and `survey.csv` in the `data/` folder
- Open the project folder in VS Code
- Open `notebooks/amazon-health-story.malloynb` to follow the story
- Open `models/amazon.malloy` to explore or write your own queries

    amazon-malloy/
      data/
        survey.csv
        fields.csv
      models/
        amazon.malloy                  ← source definitions, joins, measures
        hello.malloy                   ← initial sanity check query
      notebooks/
        amazon-health-story.malloynb   ← the narrative notebook
      README.md

Note: amazon-purchases.csv (299MB) exceeds GitHub's file limit and is not
included. Download it directly from Harvard Dataverse using the link above.
These findings matter most to three audiences.
Consumer brands and health product marketers should rethink demographic targeting. If income and education don't predict health spending on Amazon, then campaigns built on those proxies are misfiring. The better signal appears to be age — but only if you're willing to accept that "health" means skincare to one age group and medications to another, requiring entirely different product strategies and creative approaches.
Healthcare researchers and insurers should note the diabetes finding. Diabetic households buy health products at the same rate as households without diabetes, but shift toward medications and professional healthcare equipment. That suggests Amazon purchase data could be a meaningful supplementary signal for identifying unmet health needs, without requiring any clinical data at all.
Amazon itself sits on a segmentation goldmine. The platform currently presents a unified "Health & Wellness" category to all buyers. This data suggests that category is actually four or five distinct markets layered on top of each other, each driven by different demographics and different needs. A more granular category structure — or personalization that accounts for these splits — could meaningfully improve both discovery and conversion.
Dataset: Berke, A., et al. (2023). Amazon Purchase Histories. Harvard Dataverse. https://doi.org/10.7910/DVN/YGLYDY



