
Myths vs. Data: Does an Apple a Day Keep the Doctor Away? – Towards Data Science


Introduction

“Money can’t buy happiness.” “You can’t judge a book by its cover.” “An apple a day keeps the doctor away.”

You’ve probably heard these sayings many times, but do they actually hold up when we look at the data? In this article series, I take popular myths and sayings and put them to the test using real-world data.

We might confirm some unexpected truths, or debunk some popular beliefs. Hopefully, in either case we will gain new insights into the world around us.

The hypothesis

“An apple a day keeps the doctor away”: is there any real evidence to support this?

If the myth is true, we should expect a negative correlation between apple consumption per capita and doctor visits per capita. In other words, the more apples a country consumes, the fewer doctor visits its people should need.

Let’s look into the data and see what the numbers really say.

Testing the relationship between apple consumption and doctor visits

Let’s start with a simple correlation check between apple consumption per capita and doctor visits per capita.
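A correlation check of this kind can be sketched in a few lines. The example below is only an illustration: the country values are invented placeholders rather than the article’s dataset, and `pearson_r` is a hypothetical helper name. A value near -1 would support the myth, while a value near 0 would not.

```python
from math import sqrt
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-country values (NOT the article's data).
apples_kg = [5.0, 12.0, 20.0, 31.0, 44.0]   # kg of apples per capita per year
visits = [6.1, 7.4, 5.2, 6.8, 6.0]          # doctor visits per capita per year

r = pearson_r(apples_kg, visits)
print(f"Pearson r = {r:.3f}")
```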

Data sources

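The data comes from the FAO (apple consumption per capita, processed by Our World in Data) and the OECD (doctor consultations per capita). Full citations are listed under Data Sources at the end of the article.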

Since data availability varies by year, 2017 was selected because it offered the most complete coverage in terms of number of countries. However, the results are consistent across other years.

The United States had the highest apple consumption per capita, exceeding 55 kg per year, while Lithuania had the lowest, consuming just under 1 kg per year.
South Korea had the highest number of doctor visits per capita, at more than 18 visits per year, while Colombia had the lowest, with just above 2 visits per year.

Visualizing the relationship

To visualize whether higher apple consumption is associated with fewer doctor visits, we start by looking at a scatter plot with a regression line.

The regression plot shows a very slight negative slope, meaning that in countries where people eat more apples, there is a barely noticeable tendency toward fewer doctor visits.
Unfortunately, the trend is so weak that it cannot be considered meaningful.

OLS regression

To test this relationship statistically, we run a linear regression (OLS), where doctor visits per capita is the dependent variable and apple consumption per capita is the independent variable.

The results confirm what the scatterplot suggested:

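  • The coefficient for apple consumption is -0.0107, meaning that each additional kilogram of apples per capita is associated with only about 0.01 fewer doctor visits per capita. Even if there is an effect, it is very small.
  • The p-value is 0.860 (86%), far above the standard significance threshold of 5%.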
  • The R² value is almost zero, meaning apple consumption explains virtually none of the variation in doctor visits.

This doesn’t strictly mean that there is no relationship, but rather that we cannot prove one with the available data. It’s possible that any real effect is too small to detect, that other factors we didn’t include play a larger role, or that the data simply doesn’t reflect the relationship well.
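To make the three quantities above concrete, here is a minimal sketch that fits a one-variable least-squares line and reports the slope, the intercept and R². The data are invented placeholders, not the article’s dataset, and `simple_ols` and `permutation_p_value` are hypothetical helper names. For the p-value, the sketch swaps in a permutation test instead of the t-test that an OLS summary normally reports, because it needs only the standard library. The two answer the same question but will not match to the decimal.

```python
import random
from statistics import fmean


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x; return (slope, intercept, r_squared)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles whose |slope| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(simple_ols(xs, ys)[0])
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(simple_ols(xs, shuffled)[0]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-country values (NOT the article's data).
apples_kg = [3.0, 8.0, 12.0, 18.0, 25.0, 33.0, 41.0, 52.0]
visits = [6.5, 4.2, 9.8, 5.1, 7.7, 3.9, 8.4, 6.0]

slope, intercept, r_squared = simple_ols(apples_kg, visits)
p_value = permutation_p_value(apples_kg, visits)
print(f"slope={slope:.4f}  intercept={intercept:.2f}  R^2={r_squared:.3f}  p={p_value:.3f}")
```

A slope close to zero combined with a large p-value and a tiny R² is the pattern described in the bullets above: the fitted line is nearly flat, and shuffled data produce slopes of similar size most of the time.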

Controlling for confounders

Are we done? Not quite. So far, we’ve only checked for a direct relationship between apple consumption and doctor visits. 

As already mentioned, many other factors could be influencing both variables, potentially hiding a true relationship or creating an artificial one.

If we consider this causal graph:

We are assuming that apple consumption directly affects doctor visits. However, other hidden factors might be at play. If we don’t account for them, we risk failing to detect a real relationship if one exists.

A well-known example of confounding comes from a study by Messerli (2012), which found an interesting correlation between chocolate consumption per capita and the number of Nobel laureates.

So, would starting to eat a lot of chocolate help us win a Nobel Prize? Probably not. The likely explanation was that GDP per capita was a confounder. That means that richer countries tend to have both higher chocolate consumption and more Nobel Prize winners. The observed relationship wasn’t causal but rather due to a hidden (confounding) factor.

The same thing could be happening in our case. There might be confounding variables that influence both apple consumption and doctor visits, making it difficult to see a real relationship if one exists. 
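The chocolate example can be reproduced with a small simulation. The sketch below is purely illustrative and rests on stated assumptions: all numbers are randomly generated, wealth is made to drive both chocolate consumption and Nobel prizes, and chocolate is given no effect on prizes at all. The helper names `pearson_r` and `residuals` are hypothetical. The raw correlation comes out strong, but it largely disappears once the linear effect of wealth is removed from both variables (a partial correlation).

```python
import random
from math import sqrt
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def residuals(ys, zs):
    """What is left of ys after removing its least-squares linear fit on zs."""
    mz, my = fmean(zs), fmean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]


rng = random.Random(42)
n = 2000

# Simulated world (ASSUMPTION): wealth drives both variables,
# and chocolate has NO direct effect on prizes.
wealth = [rng.gauss(0, 1) for _ in range(n)]
chocolate = [2 * w + rng.gauss(0, 1) for w in wealth]
prizes = [2 * w + rng.gauss(0, 1) for w in wealth]

raw_r = pearson_r(chocolate, prizes)
partial_r = pearson_r(residuals(chocolate, wealth), residuals(prizes, wealth))
print(f"raw correlation:           {raw_r:.2f}")
print(f"after removing wealth:     {partial_r:.2f}")
```

The same mechanism can work in the other direction: a confounder can also mask a real relationship, which is why it is worth controlling for plausible confounders before drawing conclusions.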

Two key confounders to consider are GDP per capita and median age. Wealthier countries tend to have better healthcare systems and different dietary patterns, while older populations tend to visit doctors more often and may have different eating habits.

To control for this, we change our model by introducing these confounders:

Data sources

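The data comes from the World Bank (GDP per capita) and the UN World Population Prospects (median age), both processed by Our World in Data. Full citations are listed under Data Sources at the end of the article.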

Luxembourg had the highest GDP per capita, exceeding 115K USD, while Colombia had the lowest, at 14.3K USD.
Japan had the highest median age, at over 46 years, while Mexico had the lowest, at under 27 years.

OLS regression (with confounders)

After controlling for GDP per capita and median age, we run a multiple regression to test whether apple consumption has any meaningful effect on doctor visits.

The results confirm what we observed earlier:

  • The coefficient for apple consumption remains very small (-0.0100), meaning any potential effect is negligible.
  • The p-value (85.5%) is still extremely high, far from statistical significance.
  • We still cannot reject the null hypothesis, meaning we have no strong evidence to support the idea that eating more apples leads to fewer doctor visits.

Same as before, this does not necessarily mean that no relationship exists, but rather that we cannot prove one using the available data. It could still be possible that the real effect is too small to detect or that there are yet other factors we didn’t include.
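For readers curious about what “controlling for” means mechanically, the sketch below fits a regression with several predictors by solving the normal equations. It is an illustration under stated assumptions, not the article’s actual model code: `fit_ols` is a hypothetical helper, the country rows are invented, and the doctor-visit values are constructed so that apples have exactly zero effect while GDP and median age do matter. It uses only the standard library; a statistics package would normally also report standard errors and p-values.

```python
def fit_ols(rows, ys):
    """Least-squares coefficients for y = b0 + b1*x1 + b2*x2 + ...

    rows: one tuple of predictor values per observation (no constant column).
    Returns [b0, b1, b2, ...], where b0 is the intercept.
    """
    X = [[1.0, *row] for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y] of the normal equations.
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(X, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical rows: (apples kg per capita, GDP per capita in 1,000 USD, median age).
rows = [
    (5.0, 30.0, 28.0), (12.0, 45.0, 39.0), (20.0, 22.0, 31.0),
    (31.0, 60.0, 44.0), (44.0, 38.0, 36.0), (9.0, 52.0, 47.0),
]
# Constructed outcome (ASSUMPTION): apples contribute nothing to doctor visits.
visits = [1.0 + 0.0 * a + 0.01 * g + 0.15 * m for a, g, m in rows]

intercept, b_apples, b_gdp, b_age = fit_ols(rows, visits)
print(f"apples: {b_apples:.4f}  GDP: {b_gdp:.4f}  median age: {b_age:.4f}")
```

Because each coefficient is estimated while the other predictors are held fixed, the apple coefficient here comes back as essentially zero, and the variation is attributed to GDP and median age instead.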

One interesting observation, however, is that GDP per capita also shows no significant relationship with doctor visits: its p-value is 0.668 (66.8%), so the data give no evidence that wealth explains variations in healthcare usage.

On the other hand, median age appears to be strongly associated with doctor visits, with a p-value of 0.001 (0.1%) and a positive coefficient (0.4952). This suggests that older populations tend to visit doctors more frequently, which is actually not really surprising if we think about it!

So while we find no support for the apple myth, the data does reveal an interesting relationship between aging and healthcare usage.

Median age → Doctor visits

The results from the OLS regression showed a strong relationship between median age and doctor visits, and the visualization below confirms this trend.

There is a clear upward trend, indicating that countries with older populations tend to have more doctor visits per capita.

Since we are only looking at median age and doctor visits here, one could argue that GDP per capita might be a confounder, influencing both. However, the previous OLS regression demonstrated that even when GDP was included in the model, this relationship remained strong and statistically significant.

This suggests that median age is a key factor in explaining differences in doctor visits across countries, independent of GDP.

GDP → Apple consumption

While not directly related to doctor visits, an interesting secondary finding emerges when looking at the relationship between GDP per capita and apple consumption.

One possible explanation is that wealthier countries have better access to fresh products. Another possibility is that climate and geography play a role, so it could be that many high-GDP countries are located in regions with strong apple production, making apples more available and affordable. 

Of course, other factors could be influencing this relationship, but we won’t dig deeper here.

The scatterplot shows a positive correlation: as GDP per capita increases, apple consumption also tends to rise. However, compared to median age and doctor visits, this trend is weaker, with more variation in the data.

The OLS confirms the relationship: with a 0.2257 coefficient for GDP per capita, we can estimate an increase of around 0.23 kg in apple consumption per capita for each increase of $1,000 in GDP per capita.
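The arithmetic behind this reading of the coefficient is simple enough to spell out. In the small sketch below, the 0.2257 slope is the value reported above, while the function name and the $10,000 gap are illustrative assumptions. For that gap, the model would predict roughly 2.26 kg more apples per capita.

```python
# Slope reported in the article: kg of apples per capita per 1,000 USD of GDP per capita.
GDP_COEF_KG_PER_1000_USD = 0.2257


def expected_apple_difference(gdp_gap_usd):
    """Predicted difference in apple consumption (kg per capita) for a given GDP gap."""
    return GDP_COEF_KG_PER_1000_USD * gdp_gap_usd / 1000


# Illustrative gap (ASSUMPTION): two countries 10,000 USD apart in GDP per capita.
print(f"{expected_apple_difference(10_000):.2f} kg")
```

This is an average association across countries, not a prediction for any single country.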

With a p-value of 3.8%, below the 5% threshold, we can reject the null hypothesis, so the relationship is statistically significant. However, the R² value (0.145) is relatively low: GDP explains only about 14.5% of the variation in apple consumption, and many other factors likely contribute.

Conclusion

The saying goes:

“An apple a day keeps the doctor away,”

But after putting this myth to the test with real-world data, the results do not support it. Across multiple years, the picture was consistent: no meaningful relationship between apple consumption and doctor visits emerged, even after controlling for confounders. It seems that apples alone aren’t enough to keep the doctor away.

However, this doesn’t completely disprove the idea that eating more apples could reduce doctor visits. Observational data, no matter how well we control for confounders, can never fully prove or disprove causality. 

To get a more reliable answer, and to rule out confounders at a level of granularity that could be actionable for an individual, we would need to conduct an A/B test (a randomized controlled experiment).
In such an experiment, participants would be randomly assigned to two groups, for example one eating a fixed amount of apples daily and the other avoiding apples. By comparing doctor visits over time between these two groups, we could determine whether any difference arises, providing stronger evidence of a causal effect.
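The logic of such an experiment can be sketched with simulated data. Everything below is hypothetical: the participant IDs, the group sizes, and the visit counts, which are generated under the assumption that apples have no effect at all. The helper `permutation_p_value` is an illustrative name, and the permutation test is just one reasonable way to compare the two groups.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-deal the outcomes into two pseudo-groups
        if abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


rng = random.Random(1)

# Step 1: randomly assign hypothetical participant IDs to two arms.
participants = list(range(200))
rng.shuffle(participants)
apple_arm, control_arm = participants[:100], participants[100:]


# Step 2: simulate yearly doctor visits. ASSUMPTION: apples have no effect,
# so both arms draw from the same distribution.
def simulated_visits():
    return max(0, round(rng.gauss(6, 2)))


apple_visits = [simulated_visits() for _ in apple_arm]
control_visits = [simulated_visits() for _ in control_arm]

# Step 3: compare the two arms.
difference = fmean(apple_visits) - fmean(control_visits)
p_value = permutation_p_value(apple_visits, control_visits)
print(f"difference in mean visits: {difference:+.2f}, p-value: {p_value:.3f}")
```

Random assignment is the key design choice: because chance alone decides who eats apples, age, income and every other confounder are balanced across the two groups on average, so a persistent difference can be attributed to the apples.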

For obvious reasons, I chose not to go that route. Hiring a bunch of participants would be expensive, and forcing people to avoid apples for science is ethically questionable.

However, we did find some interesting patterns. The strongest predictor of doctor visits wasn’t apple consumption, but median age: the older a country’s population, the more often people see a doctor.

Meanwhile, GDP showed a mild connection to apple consumption, possibly because wealthier countries have better access to fresh produce, or because apple-growing regions tend to be more developed.

So, while we can’t confirm the original myth, we can offer a less poetic, but data-backed version:

“A young age keeps the doctor away.”

If you enjoyed this analysis and want to connect, you can find me on LinkedIn.

The full analysis is available in this notebook on GitHub.


Data Sources

Fruit Consumption: Food and Agriculture Organization of the United Nations (2023) — with major processing by Our World in Data. “Per capita consumption of apples — FAO” [dataset]. Food and Agriculture Organization of the United Nations, “Food Balances: Food Balances (-2013, old methodology and population)”; Food and Agriculture Organization of the United Nations, “Food Balances: Food Balances (2010-)” [original data]. Licensed under CC BY 4.0.

Doctor Visits: OECD (2024), Consultations, URL (accessed on January 22, 2025). Licensed under CC BY 4.0.

GDP per Capita: World Bank (2025) — with minor processing by Our World in Data. “GDP per capita — World Bank — In constant 2021 international $” [dataset]. World Bank, “World Bank World Development Indicators” [original data]. Retrieved January 31, 2025 from https://ourworldindata.org/grapher/gdp-per-capita-worldbank. Licensed under CC BY 4.0.

Median Age: UN, World Population Prospects (2024) — processed by Our World in Data. “Median age, medium projection — UN WPP” [dataset]. United Nations, “World Population Prospects” [original data]. Licensed under CC BY 4.0.


All images, unless otherwise noted, are by the author.


