How to do groupby in pandas

The pandas.DataFrame.groupby() method splits a DataFrame into groups, applies a function to each group, and returns the combined result. In other words, it gathers the rows that share a value so that a function can be applied to each group separately. Consider the following DataFrame, which records the heights of a population of men and women:


>>> import pandas as pd
>>> df = pd.DataFrame([[175, 'male'], [181, 'male'], [165, 'female'], [179, 'male'], [156, 'female']], columns=['height', 'gender'])
>>> df
   height  gender
0     175    male
1     181    male
2     165  female
3     179    male
4     156  female

The following example groups the rows by gender and computes the average height of each group:

>>> df.groupby('gender')['height'].mean()
gender
female    160.500000
male      178.333333
Name: height, dtype: float64
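
To make the split, apply, and combine steps more concrete, here is a minimal sketch that performs them by hand on the same DataFrame and then compares the result with the one-line groupby call. The variable names (manual, grouped, summary) are illustrative choices, and the last line uses agg() as one possible way to compute several statistics at once.

```python
import pandas as pd

df = pd.DataFrame(
    [[175, 'male'], [181, 'male'], [165, 'female'], [179, 'male'], [156, 'female']],
    columns=['height', 'gender'],
)

# Split: iterating over a groupby yields one (key, sub-DataFrame) pair per gender.
# Apply: compute the mean height of each sub-DataFrame.
means = {gender: group['height'].mean() for gender, group in df.groupby('gender')}

# Combine: gather the per-group results into a single Series.
manual = pd.Series(means, name='height')

# The same computation in a single call.
grouped = df.groupby('gender')['height'].mean()

# Several aggregations at once: one row per gender, one column per statistic.
summary = df.groupby('gender')['height'].agg(['mean', 'min', 'max'])
```

Here manual and grouped hold the same per-gender means, while summary is a DataFrame indexed by gender with the columns mean, min, and max.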
