Tutorial: how to easily create a bubble map with Plotly

3 minute read

Creating engaging plots is a useful skill for presenting the data behind your work. In this entry we will use the Plotly library and data on the spread of coronavirus to emulate some of the plots we are seeing these days.

Data import and libraries

We start by importing the usual libraries for data wrangling and the Plotly libraries we will use.

import numpy as np
import pandas as pd
# Plotly libraries
# pip install plotly  <-- in case you need to install Plotly
import plotly.offline as py
import plotly.graph_objects as go  # plays the role that "plt" plays for matplotlib
import plotly.tools as tl

Next we load a DataFrame that contains tidy information about cases, tests and deaths for several countries around the world.

df = pd.read_csv('https://raw.githubusercontent.com/nikkisharma536/streamlit_app/master/covid.csv')
df.head()
Country Latitude Longitude date total_cases new_cases total_deaths new_deaths total_cases_per_million new_cases_per_million total_deaths_per_million new_deaths_per_million total_tests new_tests total_tests_per_thousand new_tests_per_thousand
0 Argentina -34.0 -64.0 9/4/2020 1795 80 65 5 39.716 1.770 1.438 0.111 14850 1520 0.329 0.034
1 Argentina -34.0 -64.0 10/4/2020 1894 99 79 14 41.907 2.190 1.748 0.310 16379 1529 0.362 0.034
2 Argentina -34.0 -64.0 11/4/2020 1975 81 82 3 43.699 1.792 1.814 0.066 18027 1648 0.399 0.036
3 Argentina -34.0 -64.0 14/4/2020 2272 69 98 3 50.270 1.527 2.168 0.066 22805 3047 0.505 0.067
4 Argentina -34.0 -64.0 15/4/2020 2432 160 105 7 53.810 3.540 2.323 0.155 24374 1569 0.539 0.035
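The `date` column in the preview looks like day/month/year text (9/4/2020 is followed by 10/4/2020 and 14/4/2020). Here is a minimal sketch of converting it to real dates, assuming that format holds for the whole file. The small `sample` frame is a stand-in built from a few values in the preview, not the full dataset.

```python
import pandas as pd

# Stand-in for the real DataFrame: a few values copied from the preview above.
sample = pd.DataFrame({
    "Country": ["Argentina"] * 3,
    "date": ["14/4/2020", "9/4/2020", "10/4/2020"],
    "total_cases": [2272, 1795, 1894],
})

# Assumption: dates are day/month/year, as the preview suggests.
sample["date"] = pd.to_datetime(sample["date"], format="%d/%m/%Y")

# With real datetimes, sorting is chronological rather than alphabetical.
sample = sample.sort_values("date").reset_index(drop=True)
print(sample["date"].dt.strftime("%Y-%m-%d").tolist())
```

The same conversion should apply to `df['date']` if the rest of the file follows the format seen in the preview.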

Grouping info and plot

Before jumping into the plot details, let’s group the data by country and keep only the relevant features. Recall from the table above that there is one row per country and date, so we average each feature over the available dates.

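cols = ['total_cases', 'Latitude', 'Longitude', 'total_deaths_per_million']
# Group first, then select only the numeric columns, so that mean() is never
# applied to the text column 'Country' (recent pandas versions raise an error there)
df_plot = df.groupby('Country')[cols].mean().reset_index()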
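Since `total_cases` is a running total, averaging over dates gives a value somewhere between the first and last counts. If you would rather plot the most recent figure for each country, one possible alternative (a sketch, not what the rest of this post uses) is to keep only the latest row per country. The `sample` frame and the country "Examplia" are made up for illustration; the Argentina values come from the preview table.

```python
import pandas as pd

# Hypothetical stand-in data using the same column names as the tutorial's df.
sample = pd.DataFrame({
    "Country": ["Argentina", "Argentina", "Examplia", "Examplia"],
    "Latitude": [-34.0, -34.0, 10.0, 10.0],
    "Longitude": [-64.0, -64.0, 20.0, 20.0],
    "date": ["9/4/2020", "15/4/2020", "9/4/2020", "15/4/2020"],
    "total_cases": [1795, 2432, 100, 250],  # Examplia values are invented
})

# Assumption: dates are day/month/year, as in the preview table.
sample["date"] = pd.to_datetime(sample["date"], format="%d/%m/%Y")

# Sort chronologically, then keep the last (most recent) row of each country.
latest = (
    sample.sort_values("date")
          .groupby("Country")
          .tail(1)
          .reset_index(drop=True)
)
print(latest[["Country", "date", "total_cases"]])
```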

Now everything is set up to build the map. To make it easier, we will divide it into three parts.

1. Marker text and color

In this section we define the hover text for each marker and a couple of styling variables. As you can see below, the text shows the total cases and the total deaths per million. The scale variable lets us adjust the size of the bubbles.

df_plot['text'] = df_plot['Country'] + '<br>Cases: ' + (df_plot['total_cases'].astype('int')).astype(str)\
+ '<br>Total Deaths per million: ' + (df_plot['total_deaths_per_million'].astype('int')).astype(str) 
colors = "#981E32"
scale = 200
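A fixed `scale` works for this dataset, but it has to be re-tuned by hand whenever the numbers change. Plotly's bubble chart documentation suggests a different approach for `sizemode='area'`: leave the sizes untouched and pass a `sizeref` computed as `2 * max(size) / (desired maximum marker size ** 2)`. The helper name `area_sizeref`, the 40 px target and the case counts below are all illustrative assumptions.

```python
def area_sizeref(sizes, max_marker_px=40):
    """Return a sizeref so the largest value is drawn roughly max_marker_px across.

    Follows the formula suggested in Plotly's bubble chart docs for
    sizemode='area': 2 * max(sizes) / max_marker_px**2.
    """
    largest = max(sizes)
    if largest <= 0:
        raise ValueError("sizes must contain at least one positive value")
    return 2.0 * largest / (max_marker_px ** 2)


# Hypothetical case counts, only to show the arithmetic.
cases = [1795, 25000, 600000]
sizeref = area_sizeref(cases, max_marker_px=40)
print(sizeref)
```

The result would then go into the marker definition, for example `marker=dict(size=cases, sizemode='area', sizeref=sizeref)`, instead of dividing the sizes by `scale`.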

2. Figure instantiation and definition

In this section we create a figure and add a Scattergeo trace to it, which lets us render the bubbles on a full map.

fig = go.Figure()

fig.add_trace(go.Scattergeo(
        locationmode = 'country names',
        lon = df_plot['Longitude'],
        lat = df_plot['Latitude'],
        text = df_plot['text'],  #<--- the text for the marker that we already defined
        mode = 'markers',
        marker = dict(
            size = df_plot['total_cases']/scale, #<-- size of the bubble (normalized by scale)
            color = colors, #<--- color of the bubble
            line_color='rgb(200,200,200)', #<--- color for the bubble line
            line_width=0.5,
            sizemode = 'area' )))

3. Layout configuration

Finally, we configure the layout and define the base map.

fig.update_layout(
        showlegend = False,
        geo = dict(
            scope = 'world', #<--- define the base map (other scopes include 'usa', 'europe', 'asia' and 'africa')
            landcolor = 'rgb(220, 220, 220)', #<--- you can change the color of the land
        )
    )

fig.show()

And voilà!

Plotly offers a wide range of modes and configuration options, and you can experiment with the full list of parameters to get better results for your project. The idea behind this post was to get you working on a simple map and build up from there. With that in mind, here is the same map as before, this time rendered as a 3D globe 🌎

fig = go.Figure()
fig.add_trace(go.Scattergeo(
        locationmode = 'country names',
        lon = df_plot['Longitude'],
        lat = df_plot['Latitude'],
        text = df_plot['text'],
        mode = 'markers',
        marker = dict(
            size = df_plot['total_cases']/scale,
            color = "#981E32",
            line_color='rgb(200,200,200)',
            line_width=0.5,
            sizemode = 'area' )))
       
fig.update_layout(
        showlegend = False,
        width = 800,
        height = 500,
        margin = dict(l=20, r=20, b=20, t=0, pad=4),
        geo = dict(
            scope = 'world',
            landcolor = '#fffdd0',
            showland = True,
            showocean = True,
            oceancolor = "LightBlue",
            showcountries = True
        )
    )

fig.update_geos(projection_type="orthographic")

fig.show()

Tip: if you want to embed the map, follow these steps:

  • Export the figure as an HTML file on your local drive:
    with open('plotly_graph.html', 'w') as f:
      f.write(fig.to_html(include_plotlyjs='cdn'))
    
  • Open the file in a text editor and copy the <div> section
  • Paste it into the source of the page where the map should appear

Concluding remarks

I hope this tutorial comes in handy as you start improving the plots and presentations in your projects, and that it keeps your mind busy while we wait for a vaccine to arrive 💉

Extras

  • Repo of the project with more plots and analysis of data.
  • Two great videos by 3Blue1Brown on epidemics and exponential growth.