Course

Cluster Analysis in Python

Skill level: Intermediate

Updated 04/2026

Learn about unsupervised learning through techniques such as hierarchical and k-means clustering using the SciPy library.

Python, Machine Learning

4 hours

14 videos

46 exercises

3,650 XP

65,105

Statement of Accomplishment

Course description

You have probably come across Google News, which automatically groups similar news articles under a topic. Have you ever wondered what process runs in the background to arrive at these groups? In this course, you will be introduced to unsupervised learning through clustering using the SciPy library in Python. The course covers data pre-processing and the application of hierarchical and k-means clustering. Along the way, you will explore player statistics from a popular football video game, FIFA 18. After completing the course, you will be able to quickly apply various clustering algorithms to data, visualize the clusters formed, and analyze the results.

Prerequisites

Intermediate Python

1

Introduction to Clustering

Before you can cluster news articles, you need the basics of clustering. This chapter introduces unsupervised learning, a class of machine learning algorithms that work on unlabeled data, and then clustering, one of its most common tasks. You will learn about two popular clustering techniques: hierarchical clustering and k-means clustering. The chapter concludes with the basic pre-processing steps to apply before clustering data.

Unsupervised learning: basics

Unsupervised learning in real world

Pokémon sightings

Basics of cluster analysis

Pokémon sightings: hierarchical clustering

Pokémon sightings: k-means clustering

Data preparation for cluster analysis

Normalize basic list data

Visualize normalized data

Normalization of small numbers

FIFA 18: Normalize data
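The normalization lessons listed above can be illustrated with a minimal sketch. It assumes that normalization means rescaling each feature to unit standard deviation, which is what `scipy.cluster.vq.whiten` does. The sample values are hypothetical and are not taken from the course data.

```python
import numpy as np
from scipy.cluster.vq import whiten

# Hypothetical observations: two features on very different scales
data = np.array([
    [5.0, 200.0],
    [3.0, 180.0],
    [4.0, 220.0],
    [6.0, 240.0],
])

# whiten divides each column by its (population) standard deviation,
# so no single feature dominates the distance calculations
scaled = whiten(data)

# After whitening, every column has a standard deviation of 1
print(scaled.std(axis=0))
```

Without this step, the second feature would dominate any Euclidean distance simply because its values are larger.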

2

Hierarchical Clustering

This chapter focuses on a popular clustering algorithm, hierarchical clustering, and its implementation in SciPy. Beyond the procedure itself, it helps you answer an important question: how many clusters are present in your data? The chapter concludes with the limitations of hierarchical clustering and the points to consider when using it.

Basics of hierarchical clustering

Hierarchical clustering: ward method

Hierarchical clustering: single method

Hierarchical clustering: complete method

Visualize clusters

Visualize clusters with matplotlib

Visualize clusters with seaborn

How many clusters?

Create a dendrogram

How many clusters in comic con data?

Limitations of hierarchical clustering

Timing run of hierarchical clustering

FIFA 18: exploring defenders
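As a rough sketch of the workflow this chapter outlines, the example below builds a linkage matrix with the ward method and then cuts it into a fixed number of clusters. The two synthetic blobs and all variable names are assumptions made for illustration; the course exercises use their own datasets.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import whiten

# Hypothetical data: two well-separated blobs of 20 points each
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0, 0], scale=0.5, size=(20, 2))
blob_b = rng.normal(loc=[10, 10], scale=0.5, size=(20, 2))
points = whiten(np.vstack([blob_a, blob_b]))

# Each row of the linkage matrix records one merge step:
# the two clusters merged, their distance, and the new cluster size
linkage_matrix = linkage(points, method="ward", metric="euclidean")

# Cut the hierarchy so that at most 2 flat clusters remain;
# fcluster labels start at 1
labels = fcluster(linkage_matrix, 2, criterion="maxclust")
print(labels)
```

Swapping `method="ward"` for `"single"` or `"complete"` changes how the distance between clusters is measured. Passing the same linkage matrix to `scipy.cluster.hierarchy.dendrogram` draws the merge tree, which is one way to judge how many clusters the data contains.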

3

K-Means Clustering

This chapter introduces a different clustering algorithm, k-means clustering, and its implementation in SciPy. K-means clustering overcomes the biggest drawback of hierarchical clustering discussed in the previous chapter. Because dendrograms are specific to hierarchical clustering, this chapter covers one method for choosing the number of clusters before running k-means. The chapter concludes with the limitations of k-means clustering and the points to consider when using it.

Basics of k-means clustering

K-means clustering: first exercise

Runtime of k-means clustering

How many clusters?

Elbow method on distinct clusters

Elbow method on uniform data

Limitations of k-means clustering

Impact of seeds on distinct clusters

Uniform clustering patterns

FIFA 18: defenders revisited
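The elbow method named in the lessons above can be sketched as follows: run k-means for several values of k, record the distortion for each, and look for the point where the curve stops dropping sharply. The three synthetic blobs are an assumption for illustration only, and because `kmeans` starts from random centroids, the exact numbers vary between runs.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

# Hypothetical data: three well-separated blobs of 30 points each
rng = np.random.default_rng(0)
centers = ([0, 0], [10, 0], [5, 10])
blobs = [rng.normal(loc=c, scale=0.5, size=(30, 2)) for c in centers]
points = whiten(np.vstack(blobs))

# Distortion is the mean distance from each point to its nearest centroid
distortions = {}
for k in range(1, 6):
    _, distortion = kmeans(points, k)
    distortions[k] = distortion

for k, distortion in distortions.items():
    print(k, round(float(distortion), 3))

# Fit again with the chosen k and assign every point to its nearest centroid
centroids, _ = kmeans(points, 3)
labels, _ = vq(points, centroids)
```

On data with distinct clusters, the distortion typically falls steeply up to the true number of clusters and flattens afterwards. On uniform data there is no clear bend, which is one of the limitations this chapter discusses.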

4

Clustering in Real World

Now that you are familiar with two of the most popular clustering techniques, this chapter helps you apply them to real-world problems. It first covers finding the dominant colors in an image, then moves on to the problem raised in the introduction: clustering news articles. The chapter concludes with clustering on many variables, where it becomes difficult to visualize all the data at once.

Dominant colors in images

Extract RGB values from image

How many dominant colors?

Display dominant colors

Document clustering

TF-IDF of movie plots

Top terms in movie clusters

Clustering with multiple features

Clustering with many features

Basic checks on clusters

FIFA 18: what makes a complete player?
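One way to approach the dominant-colors problem is to treat every pixel as a point in RGB space and run k-means on those points. The sketch below assumes a small synthetic image instead of a file on disk, and it is not necessarily the exact procedure the course follows.

```python
import numpy as np
from scipy.cluster.vq import kmeans, whiten

# Hypothetical 20x20 image: left half reddish, right half bluish, plus noise
rng = np.random.default_rng(0)
height, width = 20, 20
image = np.zeros((height, width, 3))
image[:, : width // 2] = [200, 30, 30]
image[:, width // 2 :] = [30, 30, 200]
image += rng.normal(scale=5.0, size=image.shape)

# Flatten to one row per pixel with columns R, G, B
pixels = image.reshape(-1, 3)

# Keep the per-channel standard deviations so the scaling can be undone
channel_std = pixels.std(axis=0)
scaled = whiten(pixels)

# The cluster centers are the dominant colors, still in scaled units
centroids, _ = kmeans(scaled, 2)

# Multiply back by the standard deviations to recover RGB values
dominant_colors = centroids * channel_std
print(dominant_colors.round())
```

The number of clusters plays the role of the number of dominant colors, so the elbow method from the previous chapter can guide that choice.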
