
    Heatmaps for Time Series | Towards Data Science

    By FinanceStarGate | March 12, 2025 | 8 Mins Read


    In 2015, the Wall Street Journal (WSJ) published a highly effective series of heatmaps illustrating the impact of vaccines on infectious diseases in the United States. These visualizations showcased the power of blanket policies to drive widespread change. You can view the heatmaps here.

    Heatmaps are a versatile tool for data analysis. Their ability to facilitate comparative analysis, highlight temporal trends, and enable pattern recognition makes them invaluable for communicating complex information.

    In this Quick Success Data Science project, we’ll use Python’s Matplotlib graphing library to recreate the WSJ’s measles chart, demonstrating how to leverage heatmaps and carefully designed colorbars to influence data storytelling.

    The data

    The disease data comes from the University of Pittsburgh’s Project Tycho. This organization works with national and global health institutes and researchers to make data easier to use to improve global health. The measles data is available under a Creative Commons Attribution 4.0 International Public License.

    For convenience, I’ve downloaded the data from Project Tycho’s data portal to a CSV file and stored it in this Gist. Later, we’ll access it programmatically through the code.

    The measles heatmap

    We’ll use the Matplotlib pcolormesh() function to construct a close facsimile of the WSJ measles heatmap. While other libraries, such as Seaborn, Plotly Express, and hvplot, include dedicated heatmap functions, these are built for ease of use, with most of the design decisions abstracted away. This makes it difficult to force their results to match the WSJ heatmap.

    Besides pcolormesh(), Matplotlib’s imshow() function (for “image show”) can also produce heatmaps. The pcolormesh() function, however, better aligns gridlines with cell edges.
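
    To make the difference concrete, here is a minimal sketch that draws the same toy grid with both functions. The 3x4 array and the variable names are made up for illustration and are not part of the measles project; the point is only that imshow() centers cells on integer coordinates, while pcolormesh() treats integers as cell edges.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Toy 3x4 grid standing in for a (states x years) table.
data = np.arange(12).reshape(3, 4)

fig, (ax_img, ax_mesh) = plt.subplots(1, 2, figsize=(8, 3))

# imshow() centers each cell on integer coordinates, so the axis
# runs from -0.5 to 3.5 for four columns. Row 0 is drawn at the top.
ax_img.imshow(data, aspect="auto")
ax_img.set_title("imshow()")

# pcolormesh() uses integer coordinates as cell edges, so white edge
# lines land exactly on cell boundaries. Row 0 is drawn at the bottom.
ax_mesh.pcolormesh(data, edgecolors="w", linewidth=0.5)
ax_mesh.set_title("pcolormesh()")

img_xlim = ax_img.get_xlim()
mesh_xlim = ax_mesh.get_xlim()
print("imshow x-limits:    ", img_xlim)
print("pcolormesh x-limits:", mesh_xlim)
```

    Because pcolormesh() places edges on whole numbers, cell centers sit at half-integers, which is why tick positions for it are usually offset by 0.5.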

    Here’s an example of a heatmap made with imshow() that you can compare to the pcolormesh() results later. The main difference is the lack of gridlines.

    Measles incidence heatmap built with Matplotlib’s imshow() function (by the author)

    In 1963, the measles vaccine was licensed and released across America with widespread uptake. Within five years, the incidence of the disease was greatly reduced. By 2000, measles had been declared eliminated in the United States, with any new cases arriving from outside the country. Notice how well the visualization conveys this “big picture” while preserving the state-level details. This is due in no small part to the choice of colorbar.

    The colors used in the visualization are biased. More than 80% of the colorbar is composed of warm colors, and (light) blue is reserved for the smallest values. This makes it easy to demarcate the pre- and post-vaccination periods. White cells denote missing data, represented by NaN (Not a Number) values.
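
    The “more than 80%” figure can be checked against the anchor positions used in the colormap code later in this article. The sketch below is only a back-of-the-envelope calculation; treating everything from the yellow anchor upward as “warm” is my assumption, not a definition from the WSJ.

```python
# Anchor positions copied from the custom colormap cell in this article.
positions = [0, 0.02, 0.03, 0.09, 0.1, 0.15, 0.25, 0.4, 0.5, 1]

# Indices into the color list: 3 is dark blue, 5 is yellow.
dark_blue_anchor = positions[3]
yellow_anchor = positions[5]

# Assumption: "warm" means the stretch from the yellow anchor to the top.
warm_fraction = 1 - yellow_anchor
blue_fraction = dark_blue_anchor

print(f"warm: {warm_fraction:.0%}, blue: {blue_fraction:.0%}")
# prints "warm: 85%, blue: 9%"
```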

    Compare the previous heatmap to one built with a more balanced colorbar:

    The heatmap using a more balanced colorbar (by the author)

    The darker blue color not only overpowers the plot, it’s hard on the eyes. And while it’s still possible to see the effect of the vaccine, the visual impact is far more subtle than in the plot with the biased colorbar. On the other hand, it’s easier to parse higher values, but at the expense of the overall theme.

    The code

    The following code was written in JupyterLab and is presented by cell.

    Importing libraries

    The first cell imports the libraries we’ll need to complete the project. An online search for the library names will lead you to the installation instructions.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.colors import LinearSegmentedColormap, Normalize
    from matplotlib.cm import ScalarMappable
    import pandas as pd

    Creating the custom colormap

    The following code closely reproduces the colormap used by the WSJ. I used the online Image Color Picker tool to identify the key colors from a screenshot of their measles heatmap and adjusted these based on colors chosen for a similar tutorial built for R.

    # Key colors as hex codes, ordered from low to high values:
    colors = ['#e7f0fa',  # lightest blue
              '#c9e2f6',  # light blue
              '#95cbee',  # blue
              '#0099dc',  # dark blue
              '#4ab04a',  # green
              '#ffd73e',  # yellow
              '#eec73a',  # yellow brown
              '#e29421',  # dark tan
              '#f05336',  # orange
              '#ce472e']  # red
    
    # Create a list of positions (0 to 1) for each color in the colormap:
    positions = [0, 0.02, 0.03, 0.09, 0.1, 0.15, 0.25, 0.4, 0.5, 1]
    
    # Create a LinearSegmentedColormap (continuous colors):
    custom_cmap = LinearSegmentedColormap.from_list('custom_colormap', 
                                                    list(zip(positions, 
                                                             colors)))
    
    # Display a colorbar with the custom colormap:
    fig, ax = plt.subplots(figsize=(6, 1))
    
    ax.imshow([list(range(256))],
              cmap=custom_cmap, 
              aspect='auto', 
              vmin=0, vmax=255)
    
    # Hide the tick marks on both axes:
    ax.set_xticks([])
    ax.set_yticks([])
    plt.show()

    Here’s the generic colorbar produced by the code:

    The colorbar based on the WSJ measles heatmap (by author)

    This code makes a continuous colormap using Matplotlib’s LinearSegmentedColormap() class. This class specifies colormaps using anchor points between which RGB(A) values are interpolated. That is, it generates colormap objects based on lookup tables using linear segments. It creates the lookup table using linear interpolation for each primary color, with the 0–1 domain divided into any number of segments. For more details, see this short tutorial on making custom colormaps with Matplotlib.
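
    The interpolation is easy to verify on a small example. The sketch below builds a hypothetical three-anchor colormap (the name demo_cmap and the anchor layout are illustrative, though the hex codes are borrowed from the list above) and samples it at an anchor and halfway between two anchors.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, to_rgb

# Hypothetical three-anchor colormap: blue at 0, yellow at 0.1, red at 1.
anchors = [(0.0, '#0099dc'), (0.1, '#ffd73e'), (1.0, '#ce472e')]
demo_cmap = LinearSegmentedColormap.from_list('demo', anchors)

# Sampling at an anchor position returns that anchor's color.
start_rgb = np.array(demo_cmap(0.0)[:3])

# Sampling halfway between the yellow (0.1) and red (1.0) anchors
# returns roughly the average of the two colors.
mid_rgb = np.array(demo_cmap(0.55)[:3])
expected_mid = (np.array(to_rgb('#ffd73e')) + np.array(to_rgb('#ce472e'))) / 2

print("start:", start_rgb.round(3))
print("mid:  ", mid_rgb.round(3), "expected about", expected_mid.round(3))
```

    The match at the midpoint is approximate rather than exact because the colormap is stored as a 256-entry lookup table.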

    Loading and prepping the disease data

    Next, we load the CSV file into pandas and prep it for plotting. This file contains the incidence of measles (as the number of cases per 100,000 people) for each state (and the District of Columbia) by week from 1928 to 2003. We’ll need to convert the values to a numeric data type, aggregate the data by year, and reshape the DataFrame for plotting.

    # Read the CSV file into a DataFrame:
    url = 'https://bit.ly/3F47ejX'
    df_raw = pd.read_csv(url)
    
    # Convert the state columns to numeric (dashes become NaN).
    # Assigning by column label rather than .iloc replaces the columns,
    # so the numeric dtype is kept in recent pandas versions:
    state_cols = df_raw.columns[2:]
    df_raw[state_cols] = df_raw[state_cols].apply(pd.to_numeric, 
                                                  errors='coerce')
    
    # Aggregate by year; min_count=1 keeps all-NaN years as NaN, not 0:
    df = (df_raw.groupby('YEAR', as_index=False)
            .sum(min_count=1, numeric_only=True)
            .drop(columns=['WEEK']))
    
    # Reshape the data for plotting:
    df_melted = df.melt(id_vars='YEAR',
                        var_name='State',
                        value_name='Incidence')
    
    df_pivot = df_melted.pivot_table(index='State',
                                     columns='YEAR',
                                     values='Incidence')
    
    # Reverse the state order for plotting:
    df_pivot = df_pivot[::-1]

    Here’s how the initial (raw) DataFrame looks, showing the first five rows and ten columns:

    Part of the head of the df_raw DataFrame (by author)

    NaN values are represented by a dash (-).
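
    To see how the dashes are handled during the numeric conversion and yearly aggregation, here is a small sketch on a made-up table. The values and the names toy_raw and toy_yearly are hypothetical; only the column layout (YEAR, WEEK, then one column per state) mirrors the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the raw table: two weeks in each of two years.
toy_raw = pd.DataFrame({
    'YEAR': [1928, 1928, 1929, 1929],
    'WEEK': [1, 2, 1, 2],
    'ALABAMA': ['3.5', '1.5', '-', '-'],
    'ALASKA': ['-', '2.0', '0', '0'],
})

# errors='coerce' turns anything non-numeric (here, the dashes) into NaN.
state_cols = ['ALABAMA', 'ALASKA']
toy_raw[state_cols] = toy_raw[state_cols].apply(pd.to_numeric,
                                                errors='coerce')

# min_count=1 means a year needs at least one real value to get a sum;
# a year with only NaN stays NaN instead of collapsing to 0.
toy_yearly = (toy_raw.groupby('YEAR', as_index=False)
                     .sum(min_count=1, numeric_only=True)
                     .drop(columns=['WEEK']))

print(toy_yearly)
```

    In this toy case, Alabama in 1929 stays NaN (no data), while Alaska in 1929 is a true 0.0. Keeping those two cases apart is what later lets the heatmap show missing data as white cells.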

    The final df_pivot DataFrame is in wide format, where each column represents a variable, and rows represent unique entities:

    Part of the head of the df_pivot DataFrame (by author)

    While plotting is generally performed using long format data, as in the df_melted DataFrame, pcolormesh() prefers wide format when making heatmaps. This is because heatmaps are inherently designed to display a 2D matrix-like structure, where rows and columns represent distinct categories. In this case, the final plot will look much like the DataFrame, with states along the y-axis and years along the x-axis. Each cell of the heatmap will be colored based on the numerical values.
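
    As a quick illustration of the long-to-wide reshaping, the sketch below pivots a tiny long-format table into the states-by-years grid that pcolormesh() expects. The numbers and the names toy_long and toy_wide are invented for the example.

```python
import pandas as pd

# Hypothetical long-format data: one row per (state, year) observation.
toy_long = pd.DataFrame({
    'YEAR': [1928, 1929, 1928, 1929],
    'State': ['ALABAMA', 'ALABAMA', 'ALASKA', 'ALASKA'],
    'Incidence': [5.0, 7.0, 2.0, 0.0],
})

# Wide format: one row per state, one column per year.
toy_wide = toy_long.pivot_table(index='State',
                                columns='YEAR',
                                values='Incidence')

# pcolormesh() draws row 0 at the bottom of the plot, so reversing the
# rows puts the states in alphabetical order from top to bottom.
toy_wide = toy_wide[::-1]

print(toy_wide)
```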

    Handling missing values

    The dataset contains a lot of missing values. We’ll want to distinguish these from 0 values in the heatmap by making a mask to identify and store these NaN values. Before applying this mask with NumPy, we’ll use Matplotlib’s Normalize() class to normalize the data. This way, we can directly compare the heatmap colors across states.

    # Create a mask for NaN values:
    nan_mask = df_pivot.isna()
    
    # Normalize the data for a shared colormap:
    norm = Normalize(df_pivot.min().min(), df_pivot.max().max())
    
    # Apply normalization before masking:
    normalized_data = norm(df_pivot)
    
    # Create masked array from normalized data:
    masked_data = np.ma.masked_array(normalized_data, mask=nan_mask)
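
    The same normalize-then-mask pattern can be tried on a 2x2 array to confirm that a true zero and a missing value end up in different states. This is a sketch with made-up numbers, not the measles data.

```python
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical grid with one true zero and one missing value.
toy = np.array([[0.0, 50.0],
                [np.nan, 100.0]])

# Scale the valid range (0 to 100) onto 0 to 1, ignoring the NaN.
toy_norm = Normalize(vmin=np.nanmin(toy), vmax=np.nanmax(toy))

# Mask the NaN cell so plotting functions skip it instead of coloring it.
toy_masked = np.ma.masked_array(toy_norm(toy), mask=np.isnan(toy))

print(toy_masked)
print(toy_masked.mask)
```

    The zero maps to 0.0 and receives the lowest color in the colormap, while the masked cell is left undrawn, so the white figure background shows through.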

    Plotting the heatmap

    The following code creates the heatmap. The heart of it consists of the single line calling the pcolormesh() function. Most of the rest embellishes the plot so that it looks like the WSJ heatmap (with the exception of the x, y, and colorbar labels, which are greatly improved in our version).

    # Plot the data using pcolormesh with a masked array:
    multiplier = 0.22  # Changes figure aspect ratio
    fig, ax = plt.subplots(figsize=(11, len(df_pivot.index) * multiplier))
    
    states = df_pivot.index
    years = df_pivot.columns
    
    im = ax.pcolormesh(masked_data, cmap=custom_cmap, 
                       edgecolors='w', linewidth=0.5)
    
    ax.set_title('Measles Incidence by State (1928-2002)', fontsize=16)
    
    # Adjust x-axis ticks and labels to be centered:
    every_other_year_indices = np.arange(0, len(years), 2) + 0.5
    ax.set_xticks(every_other_year_indices)
    ax.set_xticklabels(years[::2], rotation='vertical', fontsize=10)
    
    # Adjust labels on y-axis:
    ax.set_yticks(np.arange(len(states)) + 0.5)  # Center ticks in cells
    ax.set_yticklabels(states, fontsize=9)
    
    # Add vertical line and label for vaccine date:
    vaccine_year_index = list(years).index(1963)
    ax.axvline(x=vaccine_year_index, linestyle='--', 
               linewidth=1, color='k')
    alaska_index = states.get_loc('ALASKA')
    ax.text(vaccine_year_index, alaska_index, ' Vaccine', 
            ha='left', va='center', fontweight='bold')
    
    # Add a colorbar:
    cbar = fig.colorbar(ScalarMappable(norm=norm, cmap=custom_cmap), 
                        ax=ax, orientation='horizontal', pad=0.1, 
                        label='Cases per 100,000')
    cbar.ax.xaxis.set_ticks_position('bottom')
    
    plt.savefig('measles_pcolormesh_nan.png', dpi=600, bbox_inches='tight')
    plt.show()

    Here’s the result:

    Measles incidence heatmap built with Matplotlib’s pcolormesh() function (by the author)

    This is a close approximation of the WSJ heatmap, with what I consider more legible labels and better separation of 0 and NaN (missing data) values.
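
    One detail of the plotting cell worth spelling out is the 0.5 offset on the tick positions. Since pcolormesh() puts cell edges on whole numbers, a label is centered on a cell only if its tick sits at a half-integer. The sketch below works through that arithmetic for a hypothetical ten-year span; the names are illustrative.

```python
import numpy as np

# Hypothetical ten-year span standing in for the real column labels.
years = np.arange(1928, 1938)

# Column i spans x = i to x = i + 1, so its center is at i + 0.5.
# Labeling every other year keeps the axis readable.
tick_positions = np.arange(0, len(years), 2) + 0.5
tick_labels = years[::2]

for pos, label in zip(tick_positions, tick_labels):
    print(f"x = {pos:>4}: {label}")

# A vertical line drawn at a whole-number x value, as done for the
# vaccine year, lands on the left edge of that year's column.
left_edge_1930 = list(years).index(1930)
```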

    Uses for heatmaps

    Heatmaps are highly effective at demonstrating how a blanket policy or action impacts multiple geographic regions over time. Thanks to their versatility, they can be adapted for other purposes, such as tracking:

    • Air quality index levels in different cities before and after the Clean Air Act
    • Change in test scores for schools or districts after policies like No Child Left Behind
    • Unemployment rates for different regions after economic stimulus packages
    • Product sales performance by region after local or nationwide ad campaigns

    Among the advantages of heatmaps is that they promote multiple analysis techniques. These include:

    Comparative Analysis: easily compare trends across different categories (states, schools, regions, etc.).

    Temporal Trends: elegantly show how values change over time.

    Pattern Recognition: identify patterns and anomalies in the data at a glance.

    Communication: provide a clear and concise way to communicate complex data.

    Heatmaps are a great way to present a big-picture overview while preserving the data’s fine-scale granularity.


