
Part 2: Beginner Friendly Data Visualization using Python, Pandas and Plotly

Almas Myrzatay
Sep 14, 2022


This is Part 2 of an article on data visualization with open-source tools. In this part, we show how to visualize data dynamically (i.e. the graph changes as we update the source code). We also show how to make an API call so that the graph displays real-world data.

Feel free to skip around depending on your comfort level, or go straight to GitHub to copy and paste the source code. You can also refer to Part 1 for the setup and a simple example.

Table of Contents

  1. Part 1: Introduction & Motivation
  2. Part 1: Set up Python environment & Python Library Installation
  3. Part 1: Simple Example — Creating Simple Visualization
  4. Part 2: Simple Example — Creating Dynamic Visualization
  5. Part 2: More Advanced Example — Fetching data via API and visualizing dynamically
  6. Part 2: Conclusion & References

4. Simple Example — Creating Dynamic Visualization

In the first part, we installed a framework called Dash but never used or discussed it. In this part, we will use Dash, a web application framework built on top of Plotly.js and React.js.

  1. Create a new, empty Python file:
touch dynamic_simple_ex.py

Add the following code. To copy and paste it, see the source code on GitHub.

Full source code of the example

Let’s walk through the code in more detail.

Lines 1–3: import statements

Line 5: we create a new variable, app, that stores our application (the graph and all the relevant information for the web page).

Line 7: we set up a Pandas dataframe with some dummy data [1].

Line 13: we create a bar graph, specifying graph parameters such as the title, axis names, etc.

Lines 15–26: this part of the code requires some familiarity with HTML. The idea is to create a list of elements that serves as the skeleton of the web page (i.e. only the content and its structure, without styling or behavior).

The list defines the order in which elements such as a header, paragraph, or graph appear on screen for the end user. For example, when you open a website, you might see a navigation bar at the top, followed by the main title of the page, followed by a picture. In that case, the list contains, in order: a navigation bar element (e.g. the <nav> tag in HTML), a header element (e.g. the <h1> tag) and an image element (e.g. the <img> tag). Each website has a different “skeleton”, but the idea of stacking elements in order stays the same. Learn more about HTML here.

In our code, we create a parent div element that contains three children. Below is a brief overview of the tags we use; for more information, refer to the HTML reference list here.

a) Line 16: child html.H1 is a section heading, equivalent to the <h1> tag in HTML5.

b) Lines 18–20: child html.P is a paragraph element, equivalent to the <p> tag in HTML5.

c) Lines 22–25: child dcc.Graph is a graph component from the dcc module of the Dash library.

Lines 28–30: we run the application (server) with app.run_server(debug=True). The debug=True parameter enables debug mode, which eases development by refreshing the page automatically whenever the code changes.

Web page rendering example

Run the following command in the terminal of your source-code editor (e.g. VS Code):

python3 dynamic_simple_ex.py

Open your web browser and enter the following URL in the address bar:

http://127.0.0.1:8050/

Now you should be able to edit your graph (or any other part of the page) and see the result rendered (nearly) instantaneously.

5. More Advanced Example — Fetching data via API and visualizing dynamically

Now let’s add an API data-fetching step and build a web page whose graph updates dynamically.

Fetch data using API call

We will use a public API to keep the example simple. You can explore this collected list of free public APIs on GitHub here. I picked the vehicle data API provided by NHTSA (to stay with the car theme), which doesn’t require an authentication key and is open for anyone to call [2].

Here is my source code from GitHub if you want to copy/paste.

You can try the API call by entering the following link in your browser’s address bar:

https://vpic.nhtsa.dot.gov/api/vehicles/GetAllManufacturers?format=json&page=2

The browser will show the JSON-formatted API result, which lists manufacturers registered with NHTSA. Let’s make this API call from Python using the requests module.

Create a new, empty Python file inside the src folder:

touch dynamic_api_call_ex.py

Open the dynamic_api_call_ex.py file and insert the following code:

Let’s break down the code snippet:

Lines 1–4: import statements

Lines 5–6: we define constants: the file name (FILE_NAME) used to save the API results, and the API URL. In more complex (and more realistic) cases, you will have an API key along with various parameters that select the data you need. This public API can be called with a single URL.

Lines 9–12: helper function read_data() opens the file where we stored the API results and reads its content.

Lines 14–18: helper function check_if_data_is_downloaded() verifies whether the file exists at a specific location.

Lines 20–22: helper function store_data() saves the API data into a JSON file.

Note: auxiliary functions such as check_if_data_is_downloaded(), read_data() and store_data() let us store the data after the first API call. In our case, I didn’t want to make an API call every time, since the manufacturer list doesn’t get updated often. In general, optimizing API requests is an important subject in the realm of DevOps, and it is not addressed in this article.
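The download-once-then-cache pattern described in the note can be sketched as below. This is an illustration rather than the article's actual file: the value of FILE_NAME, the download_data() and load_or_download() helpers, and the function signatures are assumptions. The sketch also swaps in urllib.request from the standard library so it has no third-party dependencies; the requests module used in the article would do the same job.

```python
import json
import urllib.request
from pathlib import Path

# Assumed values: the article's constants are not shown in the text.
FILE_NAME = "manufacturers.json"
# URL taken from the browser example earlier in the article.
API_URL = "https://vpic.nhtsa.dot.gov/api/vehicles/GetAllManufacturers?format=json&page=2"


def check_if_data_is_downloaded(path=FILE_NAME):
    """Return True if a cached copy of the API result exists on disk."""
    return Path(path).is_file()


def store_data(data, path=FILE_NAME):
    """Save the API result to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def read_data(path=FILE_NAME):
    """Load the cached API result from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def download_data(url=API_URL):
    """Call the API and return the decoded JSON body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def load_or_download(path=FILE_NAME, download=download_data):
    """Use the cached file when present; otherwise call the API once and cache it."""
    if check_if_data_is_downloaded(path):
        return read_data(path)
    data = download()
    store_data(data, path)
    return data
```

Passing the download step in as a parameter is a design choice made for this sketch: it lets you exercise the caching logic with a fake downloader, without touching the network.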

Data Pre-Processing

We are still in dynamic_api_call_ex.py, where we will create a main method as the entry point that uses the auxiliary functions above. We will also process the API data to remove unnecessary information and format it for our use case.

Let’s break down the above code snippet:

Lines 24–41: once we have read the API data, we keep only the country and manufacturer name (Mfr_Name) columns from the dataset, as we don’t need anything else.

Lines 41–43: we reshape our dataframe (df) to count how many manufacturers there are per country, and return the result.

Lines 48–58: this is the main entry point of this Python file. We call the fetch_data() function, which makes an API call if the file doesn’t exist locally, performs the pre-processing, and returns the results.

At this point we are done with API fetching and data pre-processing. We will now focus on visualizing the data.
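The column selection and per-country count described above might look like the following in Pandas. Treat it as a sketch: the sample records, the Extra field, the Manufacturer_Count column name and the function name are all invented for illustration, and the column name Country is assumed from the description of the dataset.

```python
import pandas as pd

# Placeholder records standing in for the entries read from the API result.
# Names and countries are made up; "Extra" represents fields we do not need.
sample_results = [
    {"Country": "COUNTRY A", "Mfr_Name": "MAKER ONE", "Extra": "ignored"},
    {"Country": "COUNTRY A", "Mfr_Name": "MAKER TWO", "Extra": "ignored"},
    {"Country": "COUNTRY B", "Mfr_Name": "MAKER THREE", "Extra": "ignored"},
]


def count_manufacturers_per_country(results):
    """Keep only Country and Mfr_Name, then count manufacturers per country."""
    df = pd.DataFrame(results)[["Country", "Mfr_Name"]]
    return (
        df.groupby("Country")["Mfr_Name"]
        .count()
        .reset_index(name="Manufacturer_Count")
    )


counts = count_manufacturers_per_country(sample_results)
print(counts)
```

Note that groupby drops rows whose Country is missing by default, so records without a country would not appear in the result.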

Visualize and filter data

In this step we will display the pre-processed data and add a user input that lets us filter the graph by the selected country.

Create a new, empty Python file inside the src folder:

touch dynamic_advanced_ex.py

In this file, we import the fetch_data() function from dynamic_api_call_ex.py. It loads the data from the local copy if one exists, or makes an API call otherwise. fetch_data() also performs the pre-processing needed before the data goes into the graph.

Add the code below to the file. Here is the GitHub source code for copy/paste. Then run the following command and navigate to http://127.0.0.1:8050/

python3 dynamic_advanced_ex.py

Let’s break the code down line by line:

Lines 1–4: import statements. Notably, on line 4 we import the fetch_data() function, which downloads and pre-processes the data.

Lines 6 & 8: we set up the app variable, which holds the Dash object and the web page layout (including the graph and all the relevant content) to render.

Lines 10–18: we add elements to the web page, including the page header, the graph and some description. Pay close attention to the id parameters of the Graph and Dropdown objects, as they are the entry points used to display and subsequently update the data on the page dynamically.

Lines 20–32: this is where the dynamic update happens. The function has the app.callback decorator, which acts as a listener bound to the app variable we set up on line 6. When the input changes, the function updates the data and returns an output. In this case, the input and output are identified by the id parameters of the Dropdown and Graph (respectively). In other words, when the value selected in the Dropdown changes, the function filters the data we imported and passes the result to the Graph element with the matching id.

The final result is the following web page, which shows the count of manufacturers per country, depending on which country is selected from the dropdown menu.

Example of the dashboard before and after filtering

6. Conclusion & References

In Part 2 of the article, we learned:

  1. How to create a dynamic graph that updates as we change the source code. We also created a simple filtering feature that updates the graph as the input changes.
  2. How to make an API call to fetch data and display it. Nowadays, making API calls is almost a given if you work in a software-related field, which is why I thought it was important to demonstrate.

This two-part introductory article is just the beginning of what can be done. As next steps, you could focus on deployment to Azure, AWS or Google Cloud Platform, for example. You can also create much more advanced graphs and build a full-fledged analytics tool suited to your needs. Hope you liked it & good luck!

References:

[1] E. Edwards. The Biggest Car Manufacturers in the USA (2020), URL: https://www.thomasnet.com/articles/top-suppliers/car-manufacturers-in-usa/

[2] National Highway Traffic Safety Administration. Product Information Catalog Vehicle Listing (vPIC) (2022), URL: https://vpic.nhtsa.dot.gov/api/

Thanks for reading my article!
