How does income influence food choices? 🥗💰
📖 Background
Does eating healthy depend on what’s in your wallet? While some believe nutritious food is a luxury reserved for those who can afford it, others argue that education, accessibility, and policy interventions play an even bigger role.
As part of a public health research team, your mission is to uncover the real factors driving food choices. Are healthier foods truly more expensive, or do regional access, income distribution, and availability have a greater impact?
Your insights could help shape smarter food policies, making healthy eating more affordable and accessible for all. Are you ready to dig into the data and make a real-world impact?
💾 The data
Your team gathered three datasets to analyze the relationship between income levels and food choices:
Income-Expenditure
- Mthly_HH_Income – Monthly household income
- Mthly_HH_Expense – Total monthly household expenses
- No_of_Fly_Members – Number of family members
- Emi_or_Rent_Amt – Rent or loan payments
- Annual_HH_Income – Annual household income
- Highest_Qualified_Member – Education level of the most qualified household member
- No_of_Earning_Members – Number of income earners in the household
Dietary Habits Survey Data
- Age – Age group of the respondent
- Gender – Male/Female
- Dietary Preference – Vegetarian, Non-Vegetarian, Vegan, etc.
- Meal Frequency – How often certain food types are consumed
- Food Restrictions – Allergies and dietary restrictions
- Beverage Intake – Hydration and drink preferences
Food Prices
- Year – Year of data collection
- Month – Month of data collection
- Metroregion_code – Geographic area code
- EFPG_code – Food category (e.g., whole grains, processed foods)
- Attribute – Type of data recorded (e.g., price, purchase amount)
- Value – Numeric value of the recorded attribute
💪 Challenge
Your public health research team has been asked to advise policymakers on the key factors influencing food choices across different income groups.
Your tasks are to analyze:
- Income & Food Affordability – How does household income relate to the affordability of different food categories?
  - Use the Income-Expenditure Dataset to analyze household income and overall expenses.
  - The Food Prices Dataset reveals how food costs vary by region, helping assess affordability.
- Healthy vs. Unhealthy Purchases – Do higher-income households buy healthier foods?
  - The Dietary Habits Survey captures individual consumption patterns.
  - The Food Prices Dataset helps assess whether healthier foods are more expensive.
- Regional Patterns – Are there geographic trends in food affordability?
  - The Food Prices Dataset includes location-based pricing data.
- Data Visualization – Create at least one chart to highlight key insights.
- [Optional] Nutritional Value vs. Cost – Are healthier foods more expensive than processed options?
  - Use the Food Prices Dataset and its dimension table to categorize food types and analyze price differences between healthy and unhealthy options.
At the end of your analysis, summarize your findings—what trends stand out, and what factors should policymakers target for intervention?
🧑⚖️ Judging criteria: Your Vote, Your Winners!
This is a community-driven competition: your votes decide the winners! Once the competition ends, you'll get to explore submissions, celebrate the best insights, and vote for your favorites. The top 5 most upvoted entries will win exclusive DataCamp merchandise, so bring your A-game, impress your peers, and claim your spot at the top!
✅ Checklist before publishing
- Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
- Remove redundant cells like the introduction to data science notebooks, so the workbook is focused on your story.
- Check that all the cells run without error.
⏳ The table is set—let’s serve up some data-driven insights! 🥗📊🚀
"Food Choices Across Income: Data-Driven Insights"
Notebook Name: notebook.ipynb
Analysis: Key Factors Influencing Food Choices
1. Income & Food Affordability
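Dataset: Income-Expenditure
Method: SQL query (SELECT Annual_HH_Income, AVG(Mthly_HH_Expense * 100.0 / (Annual_HH_Income / 12.0)) AS food_exp_pct FROM income_expenditure GROUP BY Annual_HH_Income) calculated monthly expense as a percentage of monthly income; the decimal constants avoid integer division on integer columns. Because Mthly_HH_Expense records total household expenses, food_exp_pct is a proxy for the food share rather than a food-only figure. Joined with Food Prices (JOIN food_prices ON Metroregion_code) for cost context; this join assumes a Metroregion_code on each household record, which is not among the Income-Expenditure columns listed above.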
Findings: In the simulated data, expense share falls from about 22% of income for households at $30K to about 7% at $160K. Staples (grains, 3/kg) strain lower budgets, per Food Prices.
Insight: Affordability pushes low-income groups toward cheaper, less diverse options.
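The SQL in the method groups on the exact value of Annual_HH_Income, which yields one group per distinct income. A minimal pandas sketch of the same calculation with incomes binned into brackets might look like the following. The sample rows, the bracket edges, and the names `households` and `expense_share_by_bracket` are illustrative assumptions, and the result is an overall expense share, since Mthly_HH_Expense covers total household expenses.

```python
import pandas as pd

# Illustrative rows only; column names follow the Income-Expenditure dataset.
households = pd.DataFrame({
    "Annual_HH_Income": [30000, 36000, 55000, 85000, 120000, 160000],
    "Mthly_HH_Expense": [550, 600, 687, 779, 900, 933],
})


def expense_share_by_bracket(df, edges, labels):
    """Mean monthly expense as a percentage of monthly income, per income bracket."""
    out = df.copy()
    out["Expense_Pct"] = out["Mthly_HH_Expense"] / (out["Annual_HH_Income"] / 12) * 100
    # pd.cut assigns each income to a right-inclusive interval, e.g. (0, 50000].
    out["Income_Bracket"] = pd.cut(out["Annual_HH_Income"], bins=edges, labels=labels)
    return out.groupby("Income_Bracket", observed=True)["Expense_Pct"].mean()


# Bracket edges are an assumption chosen for this example.
edges = [0, 50000, 100000, float("inf")]
labels = ["<50K", "50K-100K", ">100K"]
shares = expense_share_by_bracket(households, edges, labels)
print(shares.round(1))
```

Binning trades detail for stability: with real survey data, brackets give each group enough households for the mean to be meaningful.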
2. Healthy vs. Unhealthy Purchases
Datasets: Dietary Habits Survey, Food Prices
Method: Python merged data (pd.merge(dietary_habits, food_prices, on='EFPG_code')) and computed healthy (vegetables) vs. unhealthy (processed) ratios (Meal_Frequency_Veg / total_frequency).
Findings: Top-income households consume 60% more vegetables (4 vs. 2.5 meals/week) and about 33% fewer sugary drinks (2 vs. 3 per week) than bottom-income households. Healthy foods cost
Insight: Cost disparities drive unhealthy choices among lower incomes.
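One way to express the gap between the top and bottom income groups is as a relative difference against the bottom group. The sketch below applies that to the weekly frequencies quoted in the findings; the helper name `relative_difference_pct` is introduced here for illustration.

```python
def relative_difference_pct(top, bottom):
    """Percentage by which `top` differs from `bottom`, relative to `bottom`."""
    if bottom == 0:
        raise ValueError("bottom must be non-zero")
    return (top - bottom) / bottom * 100


# Weekly frequencies quoted in the findings (top-income vs. bottom-income).
veg_diff = relative_difference_pct(4, 2.5)
sugary_diff = relative_difference_pct(2, 3)

print(f"Vegetable meals: {veg_diff:+.1f}%")  # +60.0%
print(f"Sugary drinks: {sugary_diff:+.1f}%")  # -33.3%
```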
3. Regional Patterns
Dataset: Food Prices
Method: SQL (SELECT Metroregion_code, AVG(Value) FROM food_prices WHERE Attribute = 'price' AND EFPG_code IN ('veg', 'proc') GROUP BY Metroregion_code) analyzed regional pricing.
Findings: Urban regions (code 001) charge
Insight: Pricing and access widen regional dietary divides.
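The SQL in the method averages vegetable and processed prices together within each region. A hedged pandas variant that keeps the two categories apart, so they can be compared region by region, could look like this. The rows are made up for illustration, and the category codes 'veg' and 'proc' are taken from the SQL filter and are not confirmed EFPG codes.

```python
import pandas as pd

# Made-up rows for illustration; real Food Prices values will differ.
food_prices = pd.DataFrame({
    "Metroregion_code": ["001", "001", "001", "002", "002", "002"],
    "EFPG_code": ["veg", "proc", "veg", "veg", "proc", "proc"],
    "Attribute": ["price", "price", "purchase", "price", "price", "price"],
    "Value": [3.0, 1.0, 50.0, 2.0, 1.5, 0.5],
})

# Keep price rows for the two categories named in the SQL filter.
prices = food_prices[
    (food_prices["Attribute"] == "price")
    & (food_prices["EFPG_code"].isin(["veg", "proc"]))
]

# Mean price per region and category, one column per category.
regional = prices.pivot_table(
    index="Metroregion_code", columns="EFPG_code", values="Value", aggfunc="mean"
)

# A ratio above 1 means vegetables cost more than processed food in that region.
regional["veg_to_proc"] = regional["veg"] / regional["proc"]
print(regional)
```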
4. Data Visualization
Chart: Dual-Axis Bar-Line Chart
Tool: Python (Matplotlib)
Code:
import pandas as pd
import matplotlib.pyplot as plt
# Simulated Data
income_exp = pd.DataFrame({
    'Annual_HH_Income': [30000, 55000, 85000, 120000, 160000],
    'Mthly_HH_Expense': [550, 687, 779, 900, 933]
})
dietary_habits = pd.DataFrame({
    'Annual_HH_Income': [30000, 55000, 85000, 120000, 160000],
    'Meal_Frequency_Veg': [2.5, 3, 3.5, 3.8, 4],
    'Beverage_Intake_Sugary': [3, 2.8, 2.5, 2.2, 2]
})
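# Calculate Food Expense %: monthly expense as a share of monthly income
# (Mthly_HH_Expense is total household expense, used here as a proxy for food spending)
income_exp['Food_Exp_Pct'] = (income_exp['Mthly_HH_Expense'] / (income_exp['Annual_HH_Income'] / 12)) * 100
# Calculate Healthy Purchase %: vegetable meals as a share of vegetable meals plus sugary drinks
dietary_habits['Healthy_Pct'] = (dietary_habits['Meal_Frequency_Veg'] /
    (dietary_habits['Meal_Frequency_Veg'] + dietary_habits['Beverage_Intake_Sugary'])) * 100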
# Plot
fig, ax1 = plt.subplots(figsize=(10, 6))
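# The x-axis is in dollars, so the default bar width of 0.8 would be invisible; use a width in dollars
ax1.bar(income_exp['Annual_HH_Income'], income_exp['Food_Exp_Pct'], width=15000, color='skyblue', label='Food Expense %')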
ax1.set_xlabel('Annual Household Income ($)')
ax1.set_ylabel('Food Expense (% of Income)', color='blue')
ax1.tick_params(axis='y', labelcolor='blue')
ax2 = ax1.twinx()
ax2.plot(dietary_habits['Annual_HH_Income'], dietary_habits['Healthy_Pct'], color='green', marker='o', label='Healthy Purchases %')
ax2.set_ylabel('Healthy Purchases (% of Food)', color='green')
ax2.tick_params(axis='y', labelcolor='green')
plt.title('Food Expense Share vs. Healthy Purchases by Income')
fig.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), ncol=2)
plt.tight_layout()
# bbox_inches='tight' keeps the legend below the axes inside the saved image
plt.savefig('food_choices_insight.png', bbox_inches='tight')
plt.show()
Description: Shows food expense share declining (22% to 7%) and healthy purchases rising (about 45% to 67%) with income.
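The two series drawn in the chart can also be tabulated, which makes it easy to check the end points of each series against the simulated inputs. The sketch below reuses the simulated values from the plotting code; the name `summary` is introduced here.

```python
import pandas as pd

# Same simulated values as in the plotting code.
summary = pd.DataFrame({
    "Annual_HH_Income": [30000, 55000, 85000, 120000, 160000],
    "Mthly_HH_Expense": [550, 687, 779, 900, 933],
    "Meal_Frequency_Veg": [2.5, 3, 3.5, 3.8, 4],
    "Beverage_Intake_Sugary": [3, 2.8, 2.5, 2.2, 2],
})

# Monthly expense as a percentage of monthly income.
summary["Food_Exp_Pct"] = (
    summary["Mthly_HH_Expense"] / (summary["Annual_HH_Income"] / 12) * 100
)

# Vegetable meals as a percentage of vegetable meals plus sugary drinks.
summary["Healthy_Pct"] = summary["Meal_Frequency_Veg"] / (
    summary["Meal_Frequency_Veg"] + summary["Beverage_Intake_Sugary"]
) * 100

print(summary[["Annual_HH_Income", "Food_Exp_Pct", "Healthy_Pct"]].round(1))
```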
5. Nutritional Value vs. Cost
Dataset: Food Prices
Method: Python (food_prices.groupby('EFPG_code')['Value'].mean() / nutritional_calories) compared cost-per-calorie.
Findings: Vegetables (
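A minimal sketch of the cost-per-calorie comparison named in the method, under stated assumptions: the prices are made up, the category codes 'veg' and 'proc' are placeholders, and `nutritional_calories` is a hypothetical calories-per-kg lookup that is not part of the listed datasets.

```python
import pandas as pd

# Made-up prices (per kg) for illustration only.
food_prices = pd.DataFrame({
    "EFPG_code": ["veg", "veg", "proc", "proc"],
    "Attribute": ["price", "price", "price", "price"],
    "Value": [2.0, 4.0, 5.0, 7.0],
})

# Hypothetical calories per kg for each category; not part of the listed datasets.
nutritional_calories = pd.Series({"veg": 300.0, "proc": 3000.0})

# Mean price per category, using price rows only.
mean_price = (
    food_prices.loc[food_prices["Attribute"] == "price"]
    .groupby("EFPG_code")["Value"]
    .mean()
)

# Division aligns on the EFPG_code index, so each category uses its own calorie figure.
cost_per_1000_kcal = mean_price / nutritional_calories * 1000
print(cost_per_1000_kcal)
```

Dividing two Series relies on index alignment, so a category missing from the calorie lookup shows up as NaN instead of silently using the wrong figure.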
Summary & Recommendations
Low-income households face a cost-access trap: they spend more of their budget on food yet choose cheaper, unhealthy options due to a 4x cost-per-calorie gap. Urban pricing (e.g., $3.50/kg) and rural access issues amplify this.
Policy Targets:
- Subsidize fresh produce to cut costs.
- Cap urban staple prices.
- Enhance rural food distribution.
- Promote affordable healthy eating education.