Recent Posts

In this entry we are going to perform a statistical text analysis of the Star Wars scripts from the original trilogy (Episodes IV, V and VI), using word clouds to show the most frequent words.

In this entry we are going to analyze a data set containing the location and circumstances of every field goal Kobe Bryant attempted during his 20-year career.

k-means is an unsupervised machine learning algorithm used to find groups of observations (clusters) that share similar characteristics. We are going to use the algorithm to cluster wines according to their similarity.
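
To make the idea concrete, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm) in plain Python. It is not the code from the post: the `kmeans` helper, the toy `wines` measurements and the choice of starting centroids are all made up for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Illustrative Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # empty cluster: leave it where it is
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, labels


# Made-up toy data: two hypothetical measurements per wine.
wines = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
centroids, labels = kmeans(wines, centroids=[wines[0], wines[3]])
print(labels)  # one cluster index per wine
```

Starting centroids are passed in explicitly so the sketch stays deterministic; real analyses usually pick them at random and rerun the algorithm several times.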

This data set covers 721 Pokémon (through the sixth generation), with each one's number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense and Speed.

How should we deal with messy data? In this entry we are going to learn how to clean and manipulate data so that it can be analyzed correctly, using the principles of tidy data.