python – Creating a for loop or function to create multiple heatmaps from a dataframe

Here is one way to create all the charts in a loop. It isn’t the cleanest, as there will be repeated charts (but with the axes swapped).

My data:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

snow_data = pd.DataFrame(data={"temp": np.random.randint(20, 40, 50), "dewpoint": np.random.randint(15, 40, 50),
                               "wind": np.random.randint(0, 20, 50), "precip_rate_hr": np.random.random(50),
                               "total_snow": np.random.random(50)*10})

Creating the categories (this is not in a loop as the bins are all different):

snow_data['temp_bin'] = pd.cut(snow_data['temp'], [0, 10.4, 23.5, 28.75, 37], labels=['<10.4', '10.4-23.5', '23.5-28.75', '>28.75'])
snow_data['dewpt_bin'] = pd.cut(snow_data['dewpoint'], [0, 4.1, 15, 19.75, 33], labels=['<4.1', '4.1-15', '15-19.75', '>19.75'])
snow_data['wind_bin'] = pd.cut(snow_data['wind'], [0, 5, 10, 15, 20], labels=['<5', '5-10', '10-15', '15-20'])
snow_data['precip_rate_hr_bin'] = pd.cut(snow_data['precip_rate_hr'], [0, 0.25, 0.5, 0.75, 1], labels=['<0.25', '0.25-0.5', '0.5-0.75', '>0.75'])

The loop:

# List of all _bin columns to loop through
bin_cols = ['temp_bin', 'dewpt_bin', 'wind_bin', 'precip_rate_hr_bin']

# First factor
for i in bin_cols:
    # Second factor
    for j in bin_cols:
        # Need to ensure you aren't grouping the data by the same column twice!
        if j != i:
            # Average now mean for bin groups
            avg_snow = snow_data.groupby([i, j], as_index=False)['total_snow'].mean()
            # Title for plot
            title="Average Snow Total on Days that Met Specific " + i[: -4] + ' and ' + j[: -4] + ' Criteria'
            # Pivot table
            data_fp = avg_snow.pivot_table(index=i, columns=j, values="total_snow")
            # Plot
            sns.set(font_scale=1.2)
            f, ax = plt.subplots(figsize=(25, 25))
            sns.set(font_scale=2.0)
            sns.heatmap(data_fp, annot=True, fmt="g", linewidth=0.5)
            ax.set_title(title, fontsize=20)
            plt.show()

I agree with Parfait’s comment about not needing the np.percentile, unless you are using these to find appropriate categories for the _bin columns.

Read more here: Source link