Cleaning Data in R
Learn to clean data as quickly and accurately as possible to help your business move from raw data to awesome insights.
4 hours, 13 videos, 44 exercises
Course Description
Overcome Common Data Problems Like Removing Duplicates in R
It's commonly said that data scientists spend 80% of their time cleaning and manipulating data and only 20% of their time analyzing it. The time spent cleaning is vital since analyzing dirty data can lead you to draw inaccurate conclusions.
In this course, you’ll learn a variety of techniques to help you clean dirty data using R. You’ll start by converting data types, applying range constraints, and dealing with full and partial duplicates to avoid double-counting.
Delve into Advanced Data Challenges
Once you’ve practiced working on common data issues, you’ll move on to more advanced challenges such as ensuring consistency in measurements and dealing with missing data. After every new concept, you’ll have the chance to complete a hands-on exercise to cement your knowledge and build your experience.
Learn to Use Record Linkage During Data Cleaning
Record linkage is used to merge datasets when the values have issues such as typos or different spellings. You’ll explore this useful technique in the final chapter and practice applying it by joining two restaurant review datasets into a single dataset.
In the following tracks
Importing and Cleaning Data with R
- 1
Common Data Problems
Free
In this chapter, you'll learn how to overcome some of the most common dirty data problems. You'll convert data types, apply range constraints to remove future data points, and remove duplicated data points to avoid double-counting.
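The three steps named above (type conversion, a range constraint on dates, and duplicate removal) can be sketched in a few lines. This is an illustrative Python sketch using only the standard library, not the course's R code; the `raw_rides` records, the field names, and the `clean_rides` helper are all hypothetical.

```python
from datetime import date

# Hypothetical raw records: every value arrives as text.
raw_rides = [
    {"ride_id": "1", "duration_min": "12", "ride_date": "2020-03-01"},
    {"ride_id": "2", "duration_min": "7", "ride_date": "2020-03-02"},
    {"ride_id": "2", "duration_min": "7", "ride_date": "2020-03-02"},  # full duplicate
    {"ride_id": "3", "duration_min": "30", "ride_date": "2999-01-01"},  # future date
]


def clean_rides(rows, today):
    """Convert types, drop rows dated after `today`, and drop full duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # Step 1: convert text fields to proper types.
        record = (
            int(row["ride_id"]),
            int(row["duration_min"]),
            date.fromisoformat(row["ride_date"]),
        )
        # Step 2: range constraint, a ride cannot happen in the future.
        if record[2] > today:
            continue
        # Step 3: skip rows identical to one already kept.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(record)
    return cleaned


rides = clean_rides(raw_rides, today=date(2020, 12, 31))
print(rides)
```

Passing `today` as an argument, rather than reading the system clock inside the function, keeps the range check reproducible.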
- 2
Categorical and Text Data
Categorical and text data can often be some of the messiest parts of a dataset due to their unstructured nature. In this chapter, you’ll learn how to fix whitespace and capitalization inconsistencies in category labels, collapse multiple categories into one, and reformat strings for consistency.
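As a rough illustration of the fixes described above, the following Python sketch trims whitespace, lowercases labels, and collapses several categories into one. It is not taken from the course; the sample labels, the `COLLAPSE` mapping, and `normalize_label` are assumptions made for the example.

```python
# Hypothetical category labels with whitespace and capitalization problems.
raw_labels = ["  Dog", "dog ", "CAT", "cat", "Puppy", " kitten "]

# Hypothetical mapping that collapses narrow categories into broader ones.
COLLAPSE = {"puppy": "dog", "kitten": "cat"}


def normalize_label(label):
    """Trim surrounding whitespace, lowercase, then collapse known variants."""
    tidy = label.strip().lower()
    return COLLAPSE.get(tidy, tidy)


labels = [normalize_label(label) for label in raw_labels]
print(sorted(set(labels)))
```

Normalizing before the lookup means the mapping only needs one lowercase key per variant.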
- 3
Advanced Data Problems
In this chapter, you’ll dive into more advanced data cleaning problems, such as ensuring that weights are all written in kilograms instead of pounds. You’ll also gain invaluable skills that will help you verify that values have been added correctly and that missing values don’t negatively impact your analyses.
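A minimal sketch of the unit-consistency and missing-value checks mentioned above, written in Python and not drawn from the course itself. The `measurements` rows, their field names, and the `to_kilograms` helper are hypothetical; the pound-to-kilogram factor is the standard definition of the international pound.

```python
LB_TO_KG = 0.45359237  # one international pound in kilograms

# Hypothetical measurements recorded in mixed units, one of them missing.
measurements = [
    {"weight": 70.0, "unit": "kg"},
    {"weight": 154.0, "unit": "lb"},
    {"weight": None, "unit": "kg"},
]


def to_kilograms(row):
    """Return the weight in kilograms, or None when the value is missing."""
    if row["weight"] is None:
        return None
    if row["unit"] == "lb":
        return row["weight"] * LB_TO_KG
    return row["weight"]


weights_kg = [to_kilograms(row) for row in measurements]

# Count missing values explicitly and exclude them from the summary,
# so they neither crash the calculation nor silently count as zero.
known = [w for w in weights_kg if w is not None]
missing_count = len(weights_kg) - len(known)
mean_kg = sum(known) / len(known)
print(missing_count, round(mean_kg, 2))
```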
- 4
Record Linkage
Record linkage is a powerful technique for merging multiple datasets when values have typos or different spellings. In this chapter, you'll learn how to link records by calculating the similarity between strings. You’ll then use your new skills to join two restaurant review datasets into one clean master dataset.
Comparing strings: 50 xp
Calculating distance: 50 xp
Small distance, small difference: 100 xp
Fixing typos with string distance: 100 xp
Generating and comparing pairs: 50 xp
Link or join?: 100 xp
Pair blocking: 100 xp
Comparing pairs: 100 xp
Scoring and linking: 50 xp
Score then select or select then score?: 100 xp
Putting it together: 100 xp
Congratulations!: 50 xp
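To make the idea of linking records by string similarity concrete, here is a small Python sketch that is not part of the course material. It uses Levenshtein edit distance as one possible similarity measure and links each name to its closest candidate when the distance is small enough. The restaurant names, the `link_records` helper, and the `max_distance` threshold are assumptions for illustration, and the sketch omits the pair blocking and scoring steps listed above.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def link_records(left, right, max_distance=2):
    """Pair each left name with its closest right name if it is close enough."""
    links = []
    for name in left:
        best = min(right, key=lambda other: edit_distance(name.lower(), other.lower()))
        if edit_distance(name.lower(), best.lower()) <= max_distance:
            links.append((name, best))
    return links


# Hypothetical restaurant names from two review sources.
reviews_a = ["Cafe Bizou", "Hotel Bel Air", "Spago"]
reviews_b = ["Caffe Bizou", "Hotel Bel-Air", "Campanile"]

links = link_records(reviews_a, reviews_b)
print(links)
```

Comparing every name against every candidate is quadratic in the number of records, which is why larger datasets typically restrict comparisons to pairs that share a blocking key.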
Prerequisites
Joining Data with dplyr
Maggie Matsui
Curriculum Manager at DataCamp