Tidymodels and machine learning

Modified

February 19, 2024

The {tidymodels} concept (Kuhn and Silge 2022) is a group of packages in support of modeling and machine learning. In the last section we learned how to manipulate a basic linear model though a combination of the base-R lm() function and the tidyverse {broom} package along with the nest() function. However, modeling can be much more involved. This basic overview introduces tidymodels, a conceptual approach to integrating tidyverse principles with modeling, machine learning, feature selection and tuning.

Beyond the core of integrating machine learning and modeling with the tidyverse, tidymodels supports a variety of useful analytical and computational approaches. A short list of examples includes statistical analysis (e.g. bootstrapping, hypothesis testing, k-means clustering, logistic regression, etc.), robust modeling (e.g. classification, least squares, resampling), creating performance metrics, tuning, clustering, classification, text analysis, neural networks, and more.

Get started

Modelers and ML coders can approach tidymodels by

  1. Engaging with the five-step tutorial (build a model, use recipes to pre-process data, evaluate with resampling, tune, and predict.

  2. Dive deeper to find articles that help apply the tidymodels approach to your needs.

References

Kuhn, Max, and Julia Silge. 2022. Tidy Modeling with r : A Framework for Modeling in the Tidyverse. Sebastopol, CA : O’Reilly Media, 2022.

Reuse