The script below requests data from a PostgreSQL database and draws a bar chart. The returned data is a table with four columns (ID, Object, Percentage, Color).

The data:

result = [
    (1, 'Apple', 10, 'Red'),
    (2, 'Blueberry', 40, 'Blue'),
    (3, 'Cherry', 94, 'Red'), 
    (4, 'Orange', 68, 'Orange')
]

import pandas as pd
from matplotlib import pyplot as plt
import psycopg2
conn = psycopg2.connect(
    host="localhost",
    port="5432",
    database="db",
    user="user",
    password="123")
cur = conn.cursor()
cur.callproc("test_stored_procedure")
result = cur.fetchall()
cur.close()
conn.close()
print(result)

result = pd.DataFrame(result, columns=['ID', 'Object', 'Percentage', 'Color'])
fruits = result.Object
counts = result.Percentage
labels = result.Color
s = 'tab:'
bar_colors = [s + x for x in result.Color]

fig, ax = plt.subplots()

for x, y, c, lb in zip(fruits, counts, bar_colors, labels):
    ax.bar(x, y, color=c, label=lb)

ax.set_ylabel('fruit supply')
ax.set_title('Fruit supply by kind and color')
ax.legend(title='Fruit color', loc='upper left')

plt.show()

Result: the legend shows the "Red" label twice.

I tried several suggested fixes, but none of them worked. For example:

handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels)

2 Answers

Either build a dictionary from the legend labels and handles, so that each label is kept only once:

hls = {l:h for h,l in zip(*ax.get_legend_handles_labels())}
ax.legend(hls.values(), hls.keys(), loc='upper left', title='Fruit color')
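
As a self-contained sketch of the dictionary approach (assuming the sample rows from the question instead of a database call, and the non-interactive Agg backend so it runs without a display), the snippet below shows the duplicated labels before the fix and the unique labels after it. The names rows, unique and legend_labels are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt

# sample rows from the question, standing in for the database result
rows = [
    (1, 'Apple', 10, 'Red'),
    (2, 'Blueberry', 40, 'Blue'),
    (3, 'Cherry', 94, 'Red'),
    (4, 'Orange', 68, 'Orange'),
]

fig, ax = plt.subplots()
for _id, fruit, pct, color in rows:
    # one ax.bar call per row, so every row registers its own legend entry
    ax.bar(fruit, pct, color='tab:' + color.lower(), label=color)

handles, labels = ax.get_legend_handles_labels()
# at this point labels still contains 'Red' twice

# dict keys are unique, so each label survives once (first-seen order is kept)
unique = dict(zip(labels, handles))
legend = ax.legend(list(unique.values()), list(unique.keys()),
                   title='Fruit color', loc='upper left')

legend_labels = [text.get_text() for text in legend.get_texts()]
```

This also suggests why calling ax.legend(handles, labels) with the unmodified lists changes nothing: the lists returned by get_legend_handles_labels already contain the duplicate.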


Or avoid the duplicates up front by plotting with DataFrame.plot and drawing the legend manually:

fig, ax = plt.subplots()

result.plot(x="Object", y="Percentage", kind="bar", rot=0,
                 title="Fruit supply by kind and color",
                 color=result["Color"].radd("tab:"), ax=ax)

labels = result["Color"].unique()

# use the same "tab:" colors as the bars so the legend swatches match them
handles = [plt.Rectangle((0, 0), 0, 0, color=f"tab:{c.lower()}") for c in labels]

ax.legend(
    handles, labels, ncol=1,
    handleheight=2, handlelength=3,
    loc="upper left", title="Fruit color"
)

ax.set_xlabel(None)
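
The same idea can be sketched with matplotlib alone, which may help if the pandas plotting layer is not wanted. This is only an illustration under assumptions: it uses the sample rows from the question, the Agg backend for a headless run, and matplotlib.patches.Patch as the proxy legend handle in place of plt.Rectangle. The names rows, proxies and unique_colors are not from the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# sample rows from the question, standing in for the database result
rows = [
    (1, 'Apple', 10, 'Red'),
    (2, 'Blueberry', 40, 'Blue'),
    (3, 'Cherry', 94, 'Red'),
    (4, 'Orange', 68, 'Orange'),
]
fruits = [r[1] for r in rows]
values = [r[2] for r in rows]
colors = [r[3] for r in rows]

fig, ax = plt.subplots()
# a single bar call without labels, so no automatic legend entries are created
ax.bar(fruits, values, color=['tab:' + c.lower() for c in colors])

# dict.fromkeys drops duplicates while keeping first-seen order
unique_colors = list(dict.fromkeys(colors))

# proxy patches exist only to populate the legend
proxies = [Patch(facecolor='tab:' + c.lower(), label=c) for c in unique_colors]
legend = ax.legend(handles=proxies, title='Fruit color', loc='upper left')

legend_labels = [text.get_text() for text in legend.get_texts()]
```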
  • There are two easy ways to accomplish this:
    1. Use seaborn.barplot to plot the long-form data
    2. pivot the data to wide-form, and plot with pandas.DataFrame.plot and kind='bar'.
    • Both options will use a dict to define the color associated with the category.
    • The primary goal is to use the plotting API, given a specific shape for the DataFrame, to manage the legend, which results in fewer lines of code.
  • Use pandas.DataFrame.rename to update column names, and pandas.Series.map to map categories to new names.
    • It's always better to clean the data first, because the plot API uses this information for labels.
  • seaborn is a high-level API for matplotlib, and pandas.DataFrame.plot uses matplotlib as the default backend.
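
A minimal sketch of the cleaning step mentioned above, assuming the column names from the question. The display names in name_map are hypothetical and only illustrate pandas.Series.map; they are not part of the original data.

```python
import pandas as pd

# sample rows from the question, with the original column names
raw = pd.DataFrame(
    [(1, 'Apple', 10, 'Red'),
     (2, 'Blueberry', 40, 'Blue'),
     (3, 'Cherry', 94, 'Red'),
     (4, 'Orange', 68, 'Orange')],
    columns=['ID', 'Object', 'Percentage', 'Color'],
)

# rename the column so the legend title reads 'Fruit Color'
clean = raw.rename(columns={'Color': 'Fruit Color'})

# hypothetical display names, used only to demonstrate Series.map
name_map = {'Red': 'Red fruit', 'Blue': 'Blue fruit', 'Orange': 'Orange fruit'}
clean['Fruit Color'] = clean['Fruit Color'].map(name_map)

print(clean)
```

Because the plotting APIs take legend titles and entries from the column name and its values, renaming here avoids having to patch the legend afterwards.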

Imports and Data

import pandas as pd
import seaborn as sns

data = [(1, 'Apple', 10, 'Red'),
        (2, 'Blueberry', 40, 'Blue'),
        (3, 'Cherry', 94, 'Red'),
        (4, 'Orange', 68, 'Orange')]

# long-form for seaborn.barplot
df = pd.DataFrame(data, columns=['ID', 'Object', 'Percentage', 'Fruit Color'])

# convert df to wide-form for pandas.DataFrame.plot
dfp = df.pivot(index='Object', columns='Fruit Color', values='Percentage')

# create the color dict for either option
colors = df['Fruit Color'].unique()
palette = dict(zip(colors, map(str.lower, colors)))

sns.barplot

ax = sns.barplot(data=df, x='Object', y='Percentage', hue='Fruit Color', palette=palette, dodge=False)

pandas.DataFrame.plot

  • stacked=True can be used because there is only a single value for each 'Object' per 'Fruit Color'.

ax = dfp.plot(kind='bar', stacked=True, width=0.8, color=palette, ylabel='Percentage', rot=0)
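
The claim that stacking is safe here can be checked before plotting. The sketch below, which rebuilds df and dfp from the sample data, counts the non-NaN cells in each row of the wide-form frame; the names values_per_object and bar_heights are illustrative only.

```python
import pandas as pd

data = [(1, 'Apple', 10, 'Red'),
        (2, 'Blueberry', 40, 'Blue'),
        (3, 'Cherry', 94, 'Red'),
        (4, 'Orange', 68, 'Orange')]

df = pd.DataFrame(data, columns=['ID', 'Object', 'Percentage', 'Fruit Color'])
dfp = df.pivot(index='Object', columns='Fruit Color', values='Percentage')

# number of non-NaN cells per row: 1 means each bar has exactly one segment
values_per_object = dfp.notna().sum(axis=1)

# with a single segment per row, the stacked height equals the original value
bar_heights = dfp.sum(axis=1)

print(values_per_object)
print(bar_heights)
```

If any row had more than one value, the stacked bar would show the sum of its segments, and dodged bars would likely be the better choice.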

Plot Result for Both (image omitted)


df

   ID     Object  Percentage Fruit Color
0   1      Apple          10         Red
1   2  Blueberry          40        Blue
2   3     Cherry          94         Red
3   4     Orange          68      Orange

dfp

Fruit Color  Blue  Orange   Red
Object                         
Apple         NaN     NaN  10.0
Blueberry    40.0     NaN   NaN
Cherry        NaN     NaN  94.0
Orange        NaN    68.0   NaN

palette

{'Red': 'red', 'Blue': 'blue', 'Orange': 'orange'}