Exploratory Data Analysis and Visualization

This expository course presents the fundamental concepts of exploratory data analysis, covering the main mathematical and statistical tools used to understand and analyze data sets, and the main concepts of data visualization, with the aim of training students to build good graphics, both exploratory and explanatory/communicative. The course covers the following topics: types of variables; main measures of centrality and dispersion; data cleaning, missing data, and outliers; descriptive statistics; hypothesis testing; clustering (k-means and hierarchical clustering); types of graphs; interactive graphics; design principles and presentation of results. Tools: R (ggplot2, ggthemes, esquisse); Python (matplotlib, plotly, seaborn, and bokeh); Tableau and Power BI.

Basic Information

Workload
60 hours
Requirements
None

Mandatory: 

  • Tukey, J. Exploratory Data Analysis. Pearson, 1977.
  • Wes McKinney. Python for Data Analysis. O'Reilly, 2017.
  • Cole Nussbaumer Knaflic. Storytelling with Data: A Data Visualization Guide for Business Professionals. Wiley, 2015.

Complementary: 

  • Philipp K. Janert. Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists. O'Reilly, 2011.
  • Osvaldo Martin. Bayesian Analysis with Python. Packt, 2016.
  • Shai Vaingast. Beginning Python Visualization: Crafting Visual Transformation Scripts. Apress, 2014.
  • Petrou, Theodore. Pandas Cookbook: Recipes for Scientific Computing, Time Series Analysis and Data Visualization.
  • Rossant, Cyrille. IPython Interactive Computing and Visualization Cookbook.