Friday, August 14, 2020

Covid-19 Data Visualization across the World using Choropleth map

Introduction

This project visualizes Covid-19 data (total cases, deaths, and recoveries) for countries across the world as of 12th August, 2020. A GeoJSON file of the world's countries is used, and the Python library Folium generates the choropleth maps, with that GeoJSON passed as the geo_data value.

The libraries imported are:

Data description:

Covid-19 data for countries across the world was scraped from Wikipedia.

Click here to go to the Wikipedia page.

A single line of code is enough to scrape the tables from the Wikipedia page. pd.read_html returns a list of every table it finds, and index [1] selects the one we need. We store the scraped data in a dataframe called 'df'.

df = pd.read_html('https://en.wikipedia.org/wiki/COVID-19_pandemic_by_country_and_territory')[1]

View of original data:

Data Wrangling/Cleaning:

Step 1: 

Selecting only the required columns from the data above into a new dataframe, df1.
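A minimal sketch of this selection, using a small hypothetical stand-in for the scraped table. The header names ('Location', 'Ref.') and all values are assumptions for illustration, not the actual Wikipedia layout:

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: a two-level header plus an unneeded column.
columns = pd.MultiIndex.from_tuples([
    ("Location", "Location"),
    ("Cases", "Cases"),
    ("Deaths", "Deaths"),
    ("Recovered", "Recovered"),
    ("Ref.", "Ref."),
])
df = pd.DataFrame(
    [["United States[a]", 5000000, 160000, 2600000, "[1]"],
     ["Nepal", 24000, 90, 16000, "[2]"]],
    columns=columns,
)

# Keep only the four required columns (matched on the top header level).
# .copy() gives an independent dataframe, so later edits do not touch df.
df1 = df[["Location", "Cases", "Deaths", "Recovered"]].copy()
print(df1.shape)
```

Taking a copy here also avoids chained-assignment warnings when df1 is modified in the later steps.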



Step 2:

Converting the multi-index columns into single-index columns.
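One possible way to flatten the header is to keep only its first level and then rename the columns to the names used in the later steps. The starting names below are hypothetical, as is the data row:

```python
import pandas as pd

# Hypothetical two-level header, as produced when a scraped table repeats its header row.
df1 = pd.DataFrame(
    [["Nepal", 24000, 90, 16000]],
    columns=pd.MultiIndex.from_tuples([
        ("Location", "Location"), ("Cases", "Cases"),
        ("Deaths", "Deaths"), ("Recovered", "Recovered"),
    ]),
)

# Keep only the first header level, then rename to the names used in the later steps.
df1.columns = df1.columns.get_level_values(0)
df1 = df1.rename(columns={"Location": "Countries"})
print(list(df1.columns))  # ['Countries', 'Cases', 'Deaths', 'Recovered']
```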

Step 3:

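Removing the bracketed footnote markers attached to the end of each country name in the dataframe above. Passing regex=True is needed in recent pandas versions, where str.replace treats the pattern as a literal string by default.

df1['Countries'] = df1['Countries'].str.replace(r'\[.*?\]$', '', regex=True)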

Step 4:

Changing the country name 'United States' to 'United States of America' to match the name used in the GeoJSON file.

df1['Countries'] = df1['Countries'].replace('United States', 'United States of America')

Step 5:

The last 3 rows of the dataframe are not required, so we drop them.

df1 = df1.iloc[:-3]

Step 6:

Replacing the value 'No data' with 0 (zero) in each column.

df1 = df1.replace('No data', '0')


Step 7:

Changing the data type of columns Cases, Recovered and Deaths to integer.
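A minimal sketch of the conversion, assuming the counts are still stored as strings after the previous step. The rows below are hypothetical, and the comma stripping is a precaution in case thousands separators survived the scrape:

```python
import pandas as pd

# Hypothetical cleaned frame; counts are strings because 'No data' was present earlier.
df1 = pd.DataFrame({
    "Countries": ["Nepal", "Exampleland"],
    "Cases": ["24000", "1,200"],
    "Deaths": ["90", "15"],
    "Recovered": ["16000", "0"],
})

count_columns = ["Cases", "Deaths", "Recovered"]
# Strip any thousands separators, then cast each count column to integers.
for col in count_columns:
    df1[col] = df1[col].astype(str).str.replace(",", "", regex=False).astype(int)

print(df1.dtypes)
```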

After changing the datatypes:

Visualizing the data across the world:

For cases:


The same can be done for the Recovered and Deaths columns.

For recovered:

For deaths:


Get the GitHub link here.
