Ultimate Guide to Python Data Visualisation Libraries

Data visualisation is the practice of presenting information in visual form. It helps make sense of data that would otherwise look meaningless. Many Python libraries can be used for visualising data, among them Matplotlib, Pandas, Seaborn, ggplot and Plotly.

The first step in data visualisation is to install the library we will be using and then import that library into our workflow.

Installing Library

All of these libraries can be installed using pip, Python's package management system. For example, Matplotlib can be installed with the command 'pip install matplotlib'; the other libraries can be installed in the same way.

Importing Library into workflow

After installing the library, the second step is to import it into the workflow, which gives the programmer access to the library's functions for making visuals.
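As a quick sanity check before importing, the sketch below reports which of the libraries used in this guide can be imported in the current environment. The helper name available_libraries is hypothetical, and only the standard library's importlib is used, so the check itself runs even when nothing has been installed yet.

```python
import importlib.util


def available_libraries(names):
    """Return a dict mapping each module name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = available_libraries(["matplotlib", "pandas", "seaborn"])
for name, found in status.items():
    hint = "installed" if found else f"missing (try: pip install {name})"
    print(f"{name}: {hint}")

# Once a library is reported as installed, it is imported in the usual way,
# conventionally under a short alias:
#   import matplotlib.pyplot as plt
#   import pandas as pd
#   import seaborn as sns
```

The aliases plt, pd and sns in the comments are the ones the code in the rest of this guide relies on.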

This is a general overview of how Python's data visualisation libraries can be installed and imported into the workflow. For a thorough walkthrough of this process, see Installing Python Modules.

Importing Data sets

One of the strengths of Python is how well its libraries work together: a single library can import a dataset into the workflow, and the loaded data can then be passed to many different visualisation libraries. Pandas is typically used for importing datasets. It also provides an API for exploratory data analysis, which helps in understanding what is inside a dataset, such as its rows and columns.

Importing the famous iris dataset into the workflow:

import pandas as pd
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

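Note: the iris dataset is not bundled with the Pandas library. The code above expects a file named iris.csv in the working directory, so the dataset has to be downloaded first (it is available, for example, from the UCI Machine Learning Repository). The column names are passed explicitly through the names argument.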
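To illustrate the kind of exploratory checks Pandas offers once a dataset is loaded, here is a minimal sketch. It reads a handful of illustrative rows from an in-memory string laid out like iris.csv, so it runs without the file; the variable names sample_csv and sample are assumptions made for this example.

```python
import io

import pandas as pd

# a few illustrative rows in the same layout as iris.csv (no header row)
sample_csv = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']
sample = pd.read_csv(sample_csv, names=columns)

print(sample.shape)                    # (number of rows, number of columns)
print(sample.dtypes)                   # data type of each column
print(sample['class'].value_counts())  # how many rows belong to each class
print(sample.describe())               # summary statistics for the numeric columns
```

The same four calls work unchanged on the full iris DataFrame loaded from iris.csv.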

Using Libraries for Visualisation

Matplotlib

Matplotlib is the most popular Python plotting library. It is a low-level library with a MATLAB-like interface, which offers a lot of freedom at the cost of having to write more code.

Scatter Plot

import matplotlib.pyplot as plt
import pandas as pd

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

# create a figure and axis
fig, ax = plt.subplots()

# scatter the sepal_length against the sepal_width
ax.scatter(iris['sepal_length'], iris['sepal_width'])
# set a title and labels
ax.set_title('Iris Dataset')
ax.set_xlabel('sepal_length')
ax.set_ylabel('sepal_width')
# display the figure
plt.show()
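Because Matplotlib is low-level, extras such as colouring the points by class have to be written out by hand. The following is one possible sketch, not the only way to do it: it draws one scatter series per class on a small illustrative DataFrame (named sample here as an assumption) and selects the non-interactive Agg backend so that it also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# a few illustrative rows with the same column names as the iris DataFrame
sample = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    'sepal_width': [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    'class': ['Iris-setosa', 'Iris-setosa', 'Iris-versicolor',
              'Iris-versicolor', 'Iris-virginica', 'Iris-virginica'],
})

fig, ax = plt.subplots()
# one scatter call per class, so each class gets its own colour and legend entry
for name, group in sample.groupby('class'):
    ax.scatter(group['sepal_length'], group['sepal_width'], label=name)
ax.set_title('Iris Dataset')
ax.set_xlabel('sepal_length')
ax.set_ylabel('sepal_width')
ax.legend()
```

Replacing sample with the full iris DataFrame should give the same plot for the whole dataset.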

Line Chart

import matplotlib.pyplot as plt
import pandas as pd

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

# get columns to plot (every column except the class label)
columns = iris.columns.drop(['class'])
# create x data: one position per row
x_data = range(0, iris.shape[0])
# create figure and axis
fig, ax = plt.subplots()
# plot each column as its own line
for column in columns:
    ax.plot(x_data, iris[column], label=column)
# set title and legend
ax.set_title('Iris Dataset')
ax.legend()
# display the figure
plt.show()
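
Histogram

import matplotlib.pyplot as plt

# wine_reviews is assumed to be a pandas DataFrame of wine reviews with a
# numeric 'points' column, loaded beforehand (for example with pd.read_csv)
# create figure and axis
fig, ax = plt.subplots()
# plot histogram
ax.hist(wine_reviews['points'])
# set title and labels
ax.set_title('Wine Review Scores')
ax.set_xlabel('Points')
ax.set_ylabel('Frequency')
# display the figure
plt.show()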

Bar Chart

import matplotlib.pyplot as plt

# wine_reviews is assumed to be the same DataFrame as in the histogram example
# create a figure and axis
fig, ax = plt.subplots()
# count the occurrence of each score
data = wine_reviews['points'].value_counts()
# get x and y data
points = data.index
frequency = data.values
# create bar chart
ax.bar(points, frequency)
# set title and labels
ax.set_title('Wine Review Scores')
ax.set_xlabel('Points')
ax.set_ylabel('Frequency')
# display the figure
plt.show()


Pandas

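Scatter Plot

import matplotlib.pyplot as plt
import pandas as pd

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

# Pandas draws the plot through Matplotlib, so Matplotlib must be installed
iris.plot.scatter(x='sepal_length', y='sepal_width', title='Iris Dataset')
plt.show()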
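
Line Chart

import matplotlib.pyplot as plt
import pandas as pd

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

# drop the non-numeric class column and draw one line per remaining column
iris.drop(['class'], axis=1).plot.line(title='Iris Dataset')
plt.show()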
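
Bar Chart

# wine_reviews is assumed to be a pandas DataFrame with a numeric 'points' column
# count each score, order the counts by score, and draw them as bars
wine_reviews['points'].value_counts().sort_index().plot.bar()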
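The wine_reviews DataFrame is never loaded in this guide, so the sketch below uses a small made-up Series of scores in its place to show what each step of the chained call does. The values and the names points, counts and ordered are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# made-up scores standing in for wine_reviews['points']
points = pd.Series([88, 90, 88, 87, 90, 88], name='points')

counts = points.value_counts()   # ordered by frequency, most common score first
ordered = counts.sort_index()    # reordered by score so the x-axis ascends
print(ordered)

ax = ordered.plot.bar(title='Wine Review Scores')
ax.set_xlabel('Points')
ax.set_ylabel('Frequency')
```

Without sort_index() the bars would appear in order of frequency rather than in order of score.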

Check out this video to learn more about how Pandas can be used for making data visualisations:

https://www.youtube.com/watch?v=h-qM82lKb3U

Seaborn

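Scatter Plot

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

sns.scatterplot(x='sepal_length', y='sepal_width', data=iris)
plt.show()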

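Line Chart

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# load the iris data, supplying the column names explicitly
iris = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])
print(iris.head())

# drop the non-numeric class column and draw one line per remaining column
sns.lineplot(data=iris.drop(['class'], axis=1))
plt.show()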
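Seaborn can also colour points by a categorical column through the hue parameter of scatterplot. A minimal sketch follows, run on a few illustrative rows rather than the full file; the DataFrame name sample is an assumption, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import pandas as pd
import seaborn as sns

# a few illustrative rows with the same column names as the iris DataFrame
sample = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    'sepal_width': [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    'class': ['Iris-setosa', 'Iris-setosa', 'Iris-versicolor',
              'Iris-versicolor', 'Iris-virginica', 'Iris-virginica'],
})

# hue='class' assigns one colour per class and adds a legend
ax = sns.scatterplot(x='sepal_length', y='sepal_width', hue='class', data=sample)
ax.set_title('Iris Dataset')
```

scatterplot returns the Matplotlib axes it drew on, so titles and labels can be adjusted with the same methods used in the Matplotlib section.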

This is how easily data visualisation can be done with Python's libraries. If you want to go one step further towards becoming an expert data engineer or data scientist, read Data Visualisation With Python: Create An Impact With Meaningful Data Insights Using Interactive And Engaging Visuals by Mario Dobler.

On my own journey towards becoming an expert data scientist, CS Dojo has helped me a lot, especially his videos on data analysis.


Josh

Hi, I'm Josh, a Computer Science graduate from California State University, Sacramento. Since finishing my Master's, I've worked with multiple startups across the US and the UK, primarily as a Python developer. On this website I share my knowledge of Python. If you want to ask me anything about Python, feel free to reach out; I would be happy to help.
