
Multilevel Modeling: A Comprehensive Guide for Data Scientists

Discover the importance of multilevel modeling in analyzing hierarchical data structures. Learn how to account for variability within and between groups using fixed and random effects. Apply these concepts to uncover deeper insights in fields like education, healthcare, and social sciences.
Jan 22, 2025  · 15 min read

When we encounter nested or hierarchical data, such as students grouped within classrooms, patients nested in hospitals, or repeated measurements taken from the same individual over time, we often reach for traditional linear models. These standard statistical models cannot capture the relationships within such nested structures, leading to biased insights. In such cases, multilevel modeling (MLM), also called hierarchical or mixed-effects modeling, handles the hierarchy by accounting for the influence of group-level characteristics on individual outcomes.

MLM finds applications in fields from education and healthcare to psychology and the social sciences, as it enables a nuanced understanding of data that goes beyond what single-level models can offer. By modeling both fixed and random effects, MLM captures the variability within and between groups, providing a richer, more accurate representation of real-world phenomena. In this article, we’ll explore the fundamentals of multilevel modeling, its applications, and the benefits it brings to complex data analysis. When you are done with this article, check out our full course on Hierarchical and Mixed Effects Models in R.

What is Multilevel Modeling?

In fields such as the social sciences, education, and epidemiology, where data often have natural hierarchical structures, MLM is particularly suitable. In these disciplines, data are typically clustered into groups: students within classrooms, patients within hospitals, or survey responses within neighborhoods.

Consider education research, where MLM can be used to study how individual student performance is influenced not only by student-specific factors (such as study habits) but also by classroom- or school-level variables (such as teacher experience or school resources). Similarly, in epidemiology, multilevel models analyze how individual health outcomes are affected by both personal characteristics and the environments in which individuals live, such as neighborhoods or cities. In the social sciences, researchers often employ MLM to examine how personal attitudes are shaped by both individual beliefs and group-level norms.

While single-level models assume each observation is independent, MLM acknowledges the correlation within clusters by including both fixed and random effects. Fixed effects capture systematic, average relationships across the dataset, while random effects account for variability within clusters—such as classrooms, hospitals, or communities. This dual approach not only controls for dependencies within clusters but also provides insight into how higher-level variables interact with individual characteristics.

When Is Multilevel Modeling Necessary?

Traditional regression models assume that the value of one observation is not influenced by the value of another, also known as the independence assumption. This assumption may hold when the data is collected from one homogenous group where observations do not affect each other. However, this assumption is violated in hierarchical data because individuals within clusters (like classrooms or hospitals) are likely to be more similar to each other than to individuals from other clusters.

For instance, in educational research, students within the same school share similar resources, teaching quality, and local policies, creating dependencies in their performance. In healthcare studies, patients treated within the same hospital might experience similar levels of care or respond similarly to treatments, introducing clustering in health outcomes.

When the independence assumption is ignored in hierarchical data, several issues arise:

  • Underestimated Standard Errors: Because traditional models assume independence, they often produce smaller standard errors in clustered data, leading to overly optimistic conclusions about statistical significance.
  • Misestimated Coefficients: Failing to account for clustering can bias coefficient estimates, making it difficult to isolate the effects of individual-level and group-level variables.
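The underestimation of standard errors is easy to demonstrate by simulation: generate clustered data, fit an intercept-only model that ignores the clustering, and compare the standard error it reports to the actual sampling variability of the estimate. A base-R sketch (all simulation settings are made up for illustration):

```r
set.seed(3)
n_groups <- 20; n_per <- 10

one_draw <- function() {
  g <- rep(1:n_groups, each = n_per)
  # group-level noise shared by everyone in a cluster, plus individual noise
  y <- rnorm(n_groups, sd = 2)[g] + rnorm(n_groups * n_per, sd = 1)
  fit <- lm(y ~ 1)  # naive model that ignores the clustering
  c(est = unname(coef(fit)),
    se  = summary(fit)$coefficients[1, "Std. Error"])
}

sims <- replicate(2000, one_draw())
sd(sims["est", ])   # actual sampling variability of the estimated mean
mean(sims["se", ])  # the (much smaller) standard error the naive model reports
```

With these settings, the naive standard error is well under half of the true sampling variability, which is exactly the overconfidence described above.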

How to identify the need for a multilevel model

A multilevel model would be suitable when the data shows one or more of the following characteristics:

  • Observations are grouped within larger units (e.g., students within classes), and each unit can influence individual outcomes.
  • Observations within clusters are likely to be similar to one another. You can often check this by calculating intraclass correlation coefficients (ICCs), which measure the proportion of variance due to group-level differences.
  • If we want to understand the impact of both individual-level and group-level variables on an outcome (e.g., student test scores affected by both student effort and school resources), an MLM is a suitable choice.
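Before fitting any multilevel model, the ICC can be approximated from a one-way ANOVA decomposition. A base-R sketch on simulated data (the school-effect and noise standard deviations are invented for illustration):

```r
set.seed(42)
n_schools <- 5; n_per <- 30
school <- factor(rep(1:n_schools, each = n_per))
school_effect <- rnorm(n_schools, mean = 0, sd = 3)  # shared within-school shift
score <- 70 + school_effect[school] + rnorm(n_schools * n_per, sd = 5)

# one-way ANOVA estimator of the ICC:
# ICC = (MSB - MSW) / (MSB + (n_per - 1) * MSW)
ms <- anova(lm(score ~ school))[["Mean Sq"]]
icc <- (ms[1] - ms[2]) / (ms[1] + (n_per - 1) * ms[2])
icc  # the generating process implies a true ICC of 3^2 / (3^2 + 5^2), about 0.26
```

A clearly positive estimate here is the signal that group membership matters and an MLM is worth fitting.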

Let’s understand this by an example. Imagine a study examining student academic performance. Here, a single-level model would treat each student as an independent observation, disregarding the fact that students within the same school may be more similar to each other than to those in other schools. Using MLM, we can model both individual student-level variables (such as hours spent studying) and school-level variables (like funding per student or teacher-student ratios) to capture influences at both levels accurately.

Here’s how MLM is applied in a variety of real-world scenarios:

  • In education, MLM is used to analyze student performance by accounting for student-level (socioeconomic status, study habits) and school-level factors (funding, teacher experience), revealing if school resources impact student outcomes beyond individual differences.
  • In healthcare, MLM allows researchers to examine patient outcomes by factoring in patient-level (age, health status) and hospital-level variables (facility quality, staff expertise), identifying if hospital characteristics (like nurse-patient ratios) influence recovery rates.
  • In longitudinal studies, MLM is ideal for tracking individual changes over time (e.g., mental health during therapy), by including time-specific and individual factors, helping researchers discern both session-level and overall progress effects.
  • In public health, MLM aids in understanding disease spread by considering individual (vaccination status) and community-level factors (population density, interventions), clarifying which community strategies (e.g., awareness campaigns) effectively reduce transmission.
  • In group therapy research, MLM accounts for individual (self-esteem, therapy commitment) and group-level dynamics (cohesion, leader experience), showing how much improvement is due to personal versus group influences, informing better therapy structures.

Across these fields, MLM captures dependencies that single-level models may miss, leading to interventions tailored to both individual needs and broader structural factors.

Key Concepts in Multilevel Modeling

The major components of MLMs are fixed and random effects, centering techniques, covariance structures, and data structures. Let’s go through each of them, with guidance on their application and interpretation.

Fixed vs. random effects

In MLMs, fixed effects estimate relationships that are assumed to be consistent across all units of analysis. These coefficients are interpreted similarly to those in traditional regression models and apply universally across all groups or clusters. For example, if we’re examining how study hours impact test scores across schools, a fixed effect for study hours would assume that the impact of study hours is the same for all schools.

On the other hand, random effects allow for variability between groups or clusters by estimating parameters that can vary across these units. They capture group-level deviations from the overall fixed effects, such as the fact that some schools naturally have higher or lower average scores than others. Just as linear regression has intercepts and slopes, MLM random effects are parameterized as random intercepts and random slopes.

Random intercepts model the variation in the baseline (intercept) between clusters. For instance, a random intercept would allow each school to have a unique average test score, reflecting differences in baseline performance across schools.

Random slopes capture variation in the relationship between an independent variable and the dependent variable across clusters. If the effect of study hours on test scores varies from one school to another, this can be modeled with random slopes, where each school has its own relationship between study hours and scores.
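One way to see what random intercepts and slopes summarize is to fit a separate regression per school and compare the coefficients. A base-R sketch with made-up data (an MLM would replace these independent fits with a shared distribution over intercepts and slopes, partially pooling them):

```r
set.seed(1)
school <- rep(c("A", "B", "C"), each = 10)
hours  <- runif(30, min = 1, max = 9)
# each school gets its own baseline (intercept) and study-hours effect (slope)
baseline <- c(A = 55, B = 62, C = 50)[school]
effect   <- c(A = 4.0, B = 2.5, C = 5.5)[school]
score <- baseline + effect * hours + rnorm(30, sd = 3)

# separate OLS fit per school; the spread of these coefficients is what
# random intercepts and random slopes model as group-level variation
fits <- sapply(split(data.frame(hours, score), school),
               function(d) coef(lm(score ~ hours, data = d)))
fits  # 2 x 3 matrix: intercept and slope for schools A, B, C
```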

When to use fixed and random effects

Fixed effects are typically used when we assume a uniform relationship across all clusters, while random effects are useful when we expect variation across groups. Random slopes, in particular, are valuable when there’s evidence that the relationship between predictors and outcomes changes across clusters.

Grand-mean vs. group-mean centering

Centering is a technique used to adjust predictors and improve the interpretability of multilevel models.

  • Grand-Mean Centering: With grand-mean centering, each predictor variable is centered around the overall mean (mean across all clusters). This method helps in interpreting the intercept as the predicted outcome for a group at the average level of the predictor.
    Example: Suppose we’re studying the effect of study hours on test scores across schools. By centering study hours around the grand mean, we interpret the fixed effect of study hours in terms of the average effect across all schools.
  • Group-Mean Centering: In group-mean centering, each predictor is centered around the mean within its respective group or cluster. This approach helps distinguish between the effects of predictors within and across clusters, making it useful when we’re interested in how within-cluster variation affects the outcome.
    Example: Using group-mean centering for study hours allows us to interpret the effect of an individual’s study hours relative to their school’s average, helping to separate within-school effects from between-school differences.
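In base R, both centerings are one-liners; `ave()` computes per-group means. A minimal sketch with made-up study-hours data:

```r
df <- data.frame(
  school = c("A", "A", "A", "B", "B", "B"),
  hours  = c(2, 4, 6, 5, 7, 9)
)
df$hours_grand <- df$hours - mean(df$hours)            # grand-mean centered
df$hours_group <- df$hours - ave(df$hours, df$school)  # group-mean centered
df
```

After grand-mean centering, the overall mean of the predictor is zero; after group-mean centering, the mean within each school is zero.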

Choosing between grand-mean and group-mean centering

Grand-mean centering is appropriate when we are interested in the effect of predictors relative to the overall population mean, whereas group-mean centering is useful when separating within-group effects from between-group effects is essential. For instance, if we’re interested in comparing students’ study hours within the context of their school’s average study time, group-mean centering clarifies these intra-group comparisons.

Covariance matrices

Covariance matrices in MLMs are essential for understanding variability within and between clusters. They are key to interpreting the structure of random effects and residuals:

  • The residual covariance matrix describes the correlation among observations within clusters that is not captured by the fixed or random effects.
  • The random effects covariance matrix describes the variability in the random effects (random intercepts and slopes) and the covariance between them. For example, in a model with random slopes, this matrix reveals how much the slope varies from one cluster to another and whether clusters with higher intercepts tend to have steeper or flatter slopes.

Importance of covariance in MLM

The covariance structure in MLM allows the model to correctly estimate standard errors and coefficients, accounting for the dependencies within clustered data. Specifying an appropriate covariance structure helps ensure that the model accurately reflects the relationships within and across clusters, leading to more reliable inferences.

Nested vs. cross-classified structures

Identifying the data structure, whether nested or cross-classified, is a key part of building an accurate MLM:

  • Nested structures are those where one level of data is fully contained within another (parent) level, such as students within schools.
  • Cross-classified structures do not fit neatly into a single hierarchy, as lower-level units may belong to multiple higher-level units. This structure requires more complex modeling. A good example would be from educational research where students may belong to multiple classifications, such as neighborhood and school district. Each student has a unique combination of neighborhood and school, leading to a cross-classified structure.
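The distinction is visible in a cross-tabulation of the two grouping factors. In a strictly nested design each row would map to exactly one column; in a cross-classified design it does not. A toy sketch with hypothetical IDs:

```r
students <- data.frame(
  id           = 1:6,
  school       = c("S1", "S1", "S2", "S2", "S1", "S2"),
  neighborhood = c("N1", "N2", "N1", "N2", "N2", "N1")
)
crosstab <- table(students$neighborhood, students$school)
crosstab
# each neighborhood sends students to more than one school, so
# neighborhoods are NOT nested within schools: the factors are crossed
rowSums(crosstab > 0)
```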

Determining the correct structure

To determine the structure, assess whether each unit at the lower level belongs to a single higher-level unit (nested) or multiple units (cross-classified). The correct structure helps ensure that the MLM captures the real-world data dependencies and provides meaningful insights.

Implementing Multilevel Models

Now we will implement an MLM in R using the lme4 package, following the step-by-step guide below.

Step 1: Setting up data

Let’s create synthetic data that is nested. Here, students are nested within schools, each row represents an individual student, and there is a grouping variable for schools.

# Create the data frame
our_multilevel_data <- data.frame(
  StudentID = 1:20,
  SchoolID = c("A", "A", "B", "B", "A", "A", "B", "B", "A", "B", 
               "A", "C", "C", "C", "C", "A", "B", "C", "B", "C"),
  StudyHours = c(5, 3, 4, 6, 2, 7, 5, 8, 6, 3, 
                 4, 5, 6, 2, 7, 9, 2, 8, 4, 3),
  TestScore = c(80, 70, 85, 90, 60, 95, 88, 92, 85, 75, 
                72, 78, 83, 65, 89, 96, 67, 91, 79, 68)
)

# Display the first 5 rows of the data frame
head(our_multilevel_data, 5)
  StudentID SchoolID StudyHours TestScore
1         1        A          5        80
2         2        A          3        70
3         3        B          4        85
4         4        B          6        90
5         5        A          2        60

Step 2: Installing and loading the lme4 library

The lme4 library is widely used for multilevel modeling in R. Install and load it as follows:

# Install lme4 if you haven't already
install.packages("lme4")

# Load the library
library(lme4)

Step 3: Fitting a two-level random intercept model

In this example, we’ll model TestScore as the outcome variable, with StudyHours as a predictor, while accounting for variation between schools (random intercept). This model estimates an intercept for each school, allowing each to have a unique baseline score.

The formula syntax in lme4 uses (1 | SchoolID) to specify a random intercept for the grouping variable SchoolID.

# Fit a two-level random intercept model
our_multilevel_model <- lmer(TestScore ~ StudyHours + (1 | SchoolID), data = our_multilevel_data)

This model includes:

  • Fixed Effect for StudyHours: Estimates the average effect of study hours on test scores across all schools.
  • Random Intercept for SchoolID: Allows the intercept (baseline test score) to vary by school.

Step 4: Summary and model outputs

After fitting the model, inspect the results with the summary() function, which provides estimates for fixed effects, variance components, and standard errors.

# Summary of the model
summary(our_multilevel_model)
Linear mixed model fit by REML ['lmerMod']
Formula: TestScore ~ StudyHours + (1 | SchoolID)
   Data: our_multilevel_data

REML criterion at convergence: 105.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.81620 -0.59553  0.03922  0.27094  1.86051 

Random effects:
 Groups   Name        Variance Std.Dev.
 SchoolID (Intercept)  9.41    3.068   
 Residual             11.17    3.343   
Number of obs: 20, groups:  SchoolID, 3

Fixed effects:
            Estimate Std. Error t value
(Intercept)  56.7194     2.6280   21.58
StudyHours    4.7643     0.3612   13.19

Correlation of Fixed Effects:
           (Intr)
StudyHours -0.682

This output provides:

  • Fixed Effects: Coefficients (estimates) for StudyHours, showing the average effect of study hours on test scores.

  • Random Effects: Variance components for the intercept, capturing between-school variance.

  • Residual Variance: Within-school variance (error term).

Step 5: Interpreting key results

Now, to understand the results:

Fixed effect estimates

The fixed effects in the summary tell us how StudyHours affects TestScore on average across schools.

  • Intercept: The average test score for students with zero study hours.
  • Slope (StudyHours): The average increase in test scores for each additional hour spent studying. This estimate applies to all schools.

Random effect variance components

Random effects represent the variability at the group level (e.g., between schools). In the output, the random intercept variance for SchoolID of 9.41 (an SD of about 3.07) indicates how much schools vary in their average test scores, capturing differences in baseline performance across schools.

Residual variance

The residual variance of 11.17 represents the within-group variance (e.g., differences among students within the same school).

Intraclass correlation coefficient (ICC)

The ICC quantifies the proportion of variance that exists between groups, helping to assess the necessity of MLM. A high ICC suggests that a significant portion of the outcome’s variability is due to differences across groups.

Let’s calculate the ICC as follows:

# Extract variance components
school_variance <- as.numeric(VarCorr(our_multilevel_model)$SchoolID[1])
residual_variance <- attr(VarCorr(our_multilevel_model), "sc")^2

# Calculate ICC
ICC <- school_variance / (school_variance + residual_variance)
ICC
[1] 0.4571434

An ICC close to 1 indicates that most of the variability is at the group level, while a value closer to 0 suggests that within-group variability dominates. ICCs above 0.1 or 0.2 generally indicate the need for a multilevel approach. Here we get an ICC of ~0.46, which supports the use of an MLM.

Step 6: Additional model adjustments

We can then make some model adjustments.

Adding random slopes

If the effect of StudyHours varies across schools (some schools may show a stronger or weaker relationship), we add a random slope for StudyHours:

# Model with random slope for StudyHours
model_slope <- lmer(TestScore ~ StudyHours + (StudyHours | SchoolID), data = our_multilevel_data)

This model estimates a unique slope for each school, allowing the relationship between study hours and test scores to vary across schools.

Model comparison

Let’s use the anova() function to compare models with and without random slopes to test if adding complexity improves the model fit.

# Compare models
anova(our_multilevel_model, model_slope)
Data: our_multilevel_data
Models:
our_multilevel_model: TestScore ~ StudyHours + (1 | SchoolID)
model_slope: TestScore ~ StudyHours + (StudyHours | SchoolID)
                     npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
our_multilevel_model    4 116.44 120.43 -54.221   108.44                     
model_slope             6 119.31 125.28 -53.655   107.31 1.1331  2     0.5675

A significant difference would indicate that the random slope improves the model. Here the likelihood ratio test is not significant (p ≈ 0.57), and both AIC and BIC increase with the random slope, so the simpler random-intercept model is preferred.

Advanced Topics in Multilevel Modeling

Using Bayesian approaches and advanced data structures in MLM can provide better insights, especially when dealing with nuanced data patterns. Here’s an overview of advanced MLM techniques that extend beyond standard hierarchical structures.

1. Random slopes and cross-level interactions

Cross-level interactions

Cross-level interactions allow us to explore how relationships at one level (e.g., individual) vary according to factors at a higher level (e.g., groups). These interactions matter when the effect of a lower-level predictor depends on a higher-level characteristic.

Example: Suppose we’re examining the relationship between study hours and test scores among students across schools. A cross-level interaction could help us understand if the effect of study hours on test scores changes depending on school-level variables, like school funding or average teacher experience.

In this case:

  • A random slope for study hours would allow the effect of study hours to vary from one school to another.
  • A cross-level interaction would help model whether the relationship between study hours and test scores is stronger in schools with higher funding.

Cross-level interactions are particularly useful when group contexts (like schools or regions) might influence individual-level behaviors, offering insights into how group characteristics amplify or diminish relationships.
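In lme4 syntax, a cross-level interaction is an ordinary interaction term between a level-1 and a level-2 predictor, combined with a random slope. A formula sketch (Funding is a hypothetical school-level variable, not part of the earlier example data):

```r
# Funding varies by school (level 2); StudyHours varies by student (level 1).
# StudyHours:Funding is the cross-level interaction, and (StudyHours | SchoolID)
# lets the study-hours slope vary by school.
f <- TestScore ~ StudyHours * Funding + (StudyHours | SchoolID)
all.vars(f)  # the variables such a model would need in the data
# fit with: lmer(f, data = your_data)  -- requires the lme4 package
```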

Using random slopes to capture interactions

Random slopes allow us to model variability in the relationship between predictors and outcomes across groups. This technique is useful when we suspect that the effect of a predictor (e.g., study hours) is not uniform across groups (e.g., schools). By specifying a random slope, the model can capture these group-specific variations.

Use random slopes when:

  • There’s evidence that the relationship between a predictor and outcome varies significantly across groups.
  • You’re interested in understanding how individual-level effects differ across groups and whether group-level variables moderate these effects.

2. Multilevel models beyond hierarchical structures

Traditional MLMs assume a strict hierarchy, but data often involve more complex structures where individuals may belong to multiple groups that don’t fit into a simple hierarchy.

Cross-classified models

Cross-classified models are designed for situations where lower-level units are simultaneously nested within two or more higher-level groups. Unlike strictly hierarchical structures, these models account for individuals belonging to multiple classifications, allowing each classification to have its own influence on the outcome.

For example, in educational research, students might be nested within both neighborhoods and schools. Some students may attend different schools than their neighbors, creating a structure where individuals are not nested within just one group but instead span two cross-classified groupings.

In such cases, the model treats neighborhoods and schools as separate, yet cross-classified, sources of variance, allowing researchers to estimate the effects of both classifications on the outcome. This approach is common in studies where people interact with more than one social or geographic grouping.

When to use cross-classified models

Cross-classified models are suitable when:

  • Individuals or lower-level units are influenced by multiple higher-level units (e.g., students impacted by both school and neighborhood contexts).
  • We aim to understand how each classification contributes to the variability in the outcome, especially when these classifications do not form a strict hierarchy.
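In lme4, a cross-classified model is specified simply by listing a separate random intercept for each non-nested grouping factor; crossing is detected from the data, not the formula. A sketch (NeighborhoodID is a hypothetical grouping variable):

```r
# two crossed grouping factors, each contributing its own random intercept
f_cc <- TestScore ~ StudyHours + (1 | SchoolID) + (1 | NeighborhoodID)
all.vars(f_cc)
# fit with: lmer(f_cc, data = your_data)  -- requires the lme4 package
```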

3. Bayesian methods for multilevel models

Bayesian multilevel modeling is a probabilistic approach to MLMs, particularly useful in complex or small datasets.

Benefits of Bayesian multilevel modeling

Bayesian MLM provides several benefits over frequentist approaches:

  • Improved Estimates in Small Samples: Bayesian methods can produce more reliable estimates when sample sizes are limited or data is sparse, using prior distributions to inform the model.
  • Flexibility with Complex Models: Bayesian approaches handle more complex models, such as those with multiple random effects or complex covariance structures, with greater ease.
  • Uncertainty Estimation: Bayesian methods provide full posterior distributions for parameters, which allows for a richer interpretation of uncertainty around estimates.

In Bayesian MLM, rather than point estimates, the model provides distributions of possible values, allowing us to express uncertainty and make probabilistic statements about parameter values. For example, we can say that there’s a 95% probability that the true effect of study hours on test scores falls within a specified range.
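Statements like the one above come straight from posterior draws. A base-R illustration with simulated draws standing in for real MCMC output (in practice the draws would come from Stan via brms or rstanarm; the mean and SD below are invented):

```r
set.seed(7)
# pretend these are 4,000 posterior draws of the StudyHours slope
draws <- rnorm(4000, mean = 4.8, sd = 0.4)

quantile(draws, c(0.025, 0.975))  # 95% credible interval for the slope
mean(draws > 0)                   # posterior probability the effect is positive
```

The credible interval and tail probability are just summaries of the draws, which is why Bayesian output supports direct probability statements about parameters.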

Tools for Bayesian multilevel modeling

Popular tools for Bayesian MLM include:

  • Stan: Stan is a powerful tool for Bayesian modeling, often used through R interfaces such as rstanarm and brms, which facilitate Bayesian MLM by automating much of the model setup.

  • PyMC3: In Python, PyMC3 is widely used for Bayesian modeling and offers flexibility for building custom Bayesian MLMs.

Both Stan and PyMC3 use Markov chain Monte Carlo (MCMC) methods to sample from posterior distributions, which is computationally intensive but provides precise parameter estimates, especially useful in multilevel settings.

Example use case: Bayesian MLM in healthcare

Consider a Bayesian MLM applied to patient recovery times across multiple hospitals. If we’re examining factors such as treatment type and doctor experience, Bayesian MLM can provide a probabilistic range for how much each hospital’s treatment approach affects recovery. This approach allows researchers to quantify uncertainty and create credible intervals for recovery time estimates across hospitals, which is particularly valuable in medical and psychological studies where high confidence in estimates is critical.

Common Pitfalls and Best Practices

Keeping a few considerations in mind goes a long way when working with MLMs:

  • Small Sample Sizes at Higher Levels: MLM requires enough groups or clusters to estimate between-group variability accurately. With fewer than 30 clusters, estimates of random effects and standard errors can be unreliable. If clusters are limited, consider simpler models or Bayesian MLM, which can handle small sample sizes better by incorporating prior information.
  • Misinterpreting Random Effects: Random effects reflect variation across groups, not individual-specific outcomes. Interpreting them as individual-level effects can lead to incorrect conclusions. Focus on understanding whether variability among clusters is meaningful for your study’s context.
  • Incorrect Covariance Structure Specification: In models with random slopes, failing to specify an appropriate covariance structure may lead to biased estimates. Start with simpler structures and explore more complex ones as needed. Tools like likelihood ratio tests can help determine whether the additional complexity is warranted.
  • Guidelines on Centering: Grand-mean centering of predictors is usually recommended to improve interpretability, especially in models with cross-level interactions. Group-mean centering can clarify within-group relationships but should be used only when those relationships are of specific interest.
  • Interpreting Random Intercepts: Random intercepts capture between-group differences in the baseline level of the outcome variable. They allow for group-specific baselines, but overly complex random intercept structures can lead to overfitting. Interpret random intercepts cautiously, focusing on whether they reveal meaningful variance.

MLM might not be necessary if the intraclass correlation coefficient (ICC) is very low (close to zero), indicating minimal between-group variance. Similarly, if you have very few clusters (e.g., under 10), the added complexity of MLM may not provide benefits over simpler approaches.

Conclusion

So far, we have learned that multilevel models offer a useful framework for analyzing hierarchical data, accommodating the complexities that arise when data points are grouped within larger units. By capturing both individual and group-level effects, MLMs enable researchers to account for nested structures often missed by single-level models. The fixed and random effects, variance components, and intraclass correlation coefficients allow us to quantify variability at different levels and assess how relationships vary.

As a next step, take our comprehensive Hierarchical and Mixed Effects Models in R course. You will see that real data frequently involves complex relationships across multiple levels, and you will come to appreciate how MLMs handle clustered structures, whether strictly hierarchical or cross-classified, adding depth and accuracy to your analysis. This is why multilevel models are valuable tools for informed decision-making and policy development. I also recommend our Statistician in R career track.


Author
Vidhi Chugh

I am an AI Strategist and Ethicist working at the intersection of data science, product, and engineering to build scalable machine learning systems. Listed as one of the "Top 200 Business and Technology Innovators" in the world, I am on a mission to democratize machine learning and break the jargon for everyone to be a part of this transformation.

Multilevel Modeling FAQs

What is multilevel modeling (MLM), and why is it used?

Multilevel modeling (MLM), also known as hierarchical or mixed-effects modeling, is a statistical technique designed to analyze data with nested or hierarchical structures. It is used to account for dependencies in clustered data (e.g., students within schools, patients within hospitals) by modeling both group-level and individual-level variations, providing more accurate insights than single-level models.

When should I consider using a multilevel model over a standard regression model?

MLM is recommended when your data has a hierarchical or clustered structure and the intraclass correlation coefficient (ICC) indicates substantial group-level variance. For example, MLM is useful when analyzing data where individual observations are nested within larger units, such as students within classrooms or employees within companies, to account for dependencies within groups.

What is the difference between fixed and random effects in multilevel modeling?

Fixed effects estimate relationships that are assumed to be consistent across all groups, while random effects capture variation between groups. For instance, a fixed effect might measure the average effect of study hours on test scores across all schools, whereas a random effect allows each school to have a unique baseline score or a unique relationship between study hours and scores.

What are cross-level interactions, and why are they important in MLM?

Cross-level interactions occur when the relationship between a lower-level predictor (e.g., study hours) and an outcome (e.g., test scores) varies according to a higher-level characteristic (e.g., school funding). These interactions help capture how group-level factors influence individual-level relationships, offering a more nuanced understanding of the data.

What tools and software can I use to implement MLM, especially for complex or Bayesian models?

For frequentist MLMs, the lme4 package in R is popular for its ease of use in fitting multilevel models. For Bayesian MLMs, tools like Stan (used through rstanarm or brms in R) and PyMC3 in Python are recommended, as they offer flexibility and the ability to estimate posterior distributions, making them ideal for complex models or data with smaller sample sizes.
