Data Science

Create A Meaningful Count Plot Using Plotly!

The only extensive guide you’ll ever need

Alifia Ghantiwala


Photo by Edward Howell on Unsplash

Index Of Contents
· Introduction
· Drawing a count plot in plotly
· Creating a subplot
· Plotting a mixed subplot
· Conclusion
· References

Introduction:

More often than not, you visualize data in order to share it with others. Plotting a graph that is both meaningful and visually appealing is, in my opinion, an art. In this article, we will use the plotly library in Python to draw a few visualizations and analyze what they show.

Drawing a count plot in plotly:

When starting any analysis, the first thing we often want to check is the distribution of a particular feature. A count plot is helpful here: it shows the number of rows in the dataset that fall into each category.

In seaborn, you can draw a count plot of a feature directly with sns.countplot(). To my knowledge, however, plotly has no built-in function dedicated to count plots.

Instead, we will use the groupby method available in pandas to compute the counts and then plot them.

(The data analyzed in this article is available on Kaggle. It contains information about Nobel Prize winners from 1901 to 2019.)

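import plotly.express as px
# Count the rows in each category and store the result in a "counts" column
nobel_cat = nobel_award.groupby(by=["Category"]).size().reset_index(name="counts")
px.bar(data_frame=nobel_cat, x="Category", y="counts")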
Image: Author

The count plot above shows that Physiology or Medicine is the category with the most Nobel Prize awardees. It also gives a relative sense of how awards are distributed across categories.

Now, let’s check the distribution of awards across categories for female winners.

award_cat = female_awardees.groupby(by=["Category"]).size().reset_index(name="counts")
px.bar(award_cat,x="Category",y="counts")
Image: Author

The plot above tells us that women have received the most Nobel Prizes in the Peace category.

So, using count plots, we were able to identify a meaningful insight: the category in which women received the most awards differs from the category with the most awards overall.

It would be even better to see the two plots next to each other. Let’s do that.
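The article does not show how female_awardees was built, so the following is only a sketch of one way to filter before counting. The column name `Gender` and the value `"female"` are assumptions for illustration, and the rows are made up.

```python
import pandas as pd

# Hypothetical rows; "Gender" and its values are assumed names, not confirmed
# against the Kaggle dataset.
toy_awards = pd.DataFrame(
    {
        "Category": ["Peace", "Physics", "Peace", "Literature", "Physics"],
        "Gender": ["female", "male", "female", "female", "male"],
    }
)

# Keep only the rows for female awardees.
toy_female = toy_awards[toy_awards["Gender"] == "female"]

# Same counting step as before, applied to the filtered rows.
toy_female_counts = (
    toy_female.groupby(by=["Category"]).size().reset_index(name="counts")
)

# Category with the highest count among the filtered rows.
top_category = toy_female_counts.sort_values("counts", ascending=False).iloc[0]["Category"]
```

Note that a category with no matching rows (Physics in this toy data) simply does not appear in the counts table, so it will also be missing from the bar chart.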

Creating a subplot:

We will use the make_subplots function, available in plotly.subplots, to create the subplots.

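import plotly.graph_objects as go
from plotly.subplots import make_subplots

# One row with two columns, one title per subplot
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=("Categories in which women have won", "Categories in which Nobel Prizes have been received")
)
# Left: counts for female awardees; right: counts for all awardees
fig.add_trace(go.Bar(x=award_cat["Category"], y=award_cat["counts"]), row=1, col=1)
fig.add_trace(go.Bar(x=nobel_cat["Category"], y=nobel_cat["counts"]), row=1, col=2)
fig.show()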
Image: Author

The above plot does look much better, doesn't it? Next, we go a bit further and draw a mixed subplot.

Plotting a mixed subplot:

Consider a scenario in which you need to prepare a concise dashboard that provides meaningful insights into the data. You can do that by plotting a mixed subplot, as below.

Image: Author

The bar plot helps us analyze which countries have the most female Nobel Prize winners. Often, the environment a country provides is conducive to research, which can help its citizens excel in their fields.

The pie chart shows the categories with the most female winners, and the line chart in the second row shows how the number of female winners has risen over the years!

The code for generating the above plot is below:

# Mixed subplot: bar and pie charts in the first row, line chart in the second
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=("Countries where women have won", "Categories where women have won", "Number of female awardees over the years"),
    # One spec per grid cell; None leaves the bottom-left cell empty
    specs=[[{"type": "bar"}, {"type": "pie"}],
           [None, {"type": "scatter"}]],
)
# Bar chart of female awardees per country
fig.add_trace(go.Bar(x=award_country["Country"], y=award_country["counts"], marker=dict(color="#DC0AA9"), showlegend=False), row=1, col=1)
# Pie chart of female awardees per category
fig.add_trace(go.Pie(labels=award_cat["Category"], values=award_cat["counts"], text=award_cat["Category"], showlegend=False), row=1, col=2)
# Line chart of female awardees per year
fig.add_trace(go.Scatter(x=award_year["Year"], y=award_year["counts"], mode="lines", showlegend=False), row=2, col=2)
fig.show()
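The code above uses award_country and award_year without showing how they were created. A plausible sketch, assuming they follow the same groupby pattern as the earlier count tables, is given below. The helper `count_by` and the toy rows are hypothetical, and the raw column names `Country` and `Year` are inferred from the plotting code rather than confirmed against the dataset.

```python
import pandas as pd

# Hypothetical rows standing in for the filtered female-awardee data.
toy_female = pd.DataFrame(
    {
        "Country": ["France", "USA", "USA", "Poland"],
        "Year": [2001, 2001, 2005, 2010],
    }
)


def count_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return one row per distinct value of `column` with its row count."""
    return df.groupby(by=[column]).size().reset_index(name="counts")


# Count tables in the shape the mixed subplot expects.
award_country_toy = count_by(toy_female, "Country")
award_year_toy = count_by(toy_female, "Year")
```

Because groupby sorts its keys by default, the year table comes out in chronological order, which is what a line chart over time needs.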

Conclusion:

In this article, we have seen how to use the pandas groupby method to draw count plots in plotly, how to use the make_subplots function to place count plots side by side, and lastly how to combine different plot types to draw meaningful insights from the data.

I hope this article has been of use to you! Have a good day!

The link to my public notebook which I have used in this article:

References:

https://www.kaggle.com/code/residentmario/styling-your-plots/notebook
