[1]:

import pandas as pd
pd.set_option("display.max_rows", 5)

Count

The count function counts the number of rows in each group formed by one or more columns. It is equivalent to a group_by followed by a summarize that counts the rows of each group.

[2]:

from siuba import _, group_by, summarize, count
from siuba.data import mtcars

Specifying column to count

[3]:

# longer approach
mtcars >> group_by(_.cyl) >> summarize(n = _.cyl.size)

# shorter approach
mtcars >> count(_.cyl)

[3]:

	cyl	n
0	4	11
1	6	7
2	8	14
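
For readers more familiar with plain pandas, the same idea can be sketched without siuba. This is only a rough equivalent, not how siuba implements count; the `cars` frame below is hypothetical toy data, not the real mtcars dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars (not the real dataset).
cars = pd.DataFrame({"cyl": [4, 6, 4, 8, 8, 8], "gear": [4, 4, 5, 3, 3, 5]})

# Group by cyl, take the number of rows in each group,
# and store it in a column named "n".
counts = cars.groupby("cyl").size().reset_index(name="n")
print(counts)
```

The result has one row per distinct cyl value and an n column holding the group size.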

Counting multiple columns and sorting

[4]:

mtcars >> count(_.cyl, _.gear, sort = True)

[4]:

	cyl	gear	n
0	8	3	12
1	4	4	8
...	...	...	...
6	4	3	1
7	6	5	1

8 rows × 3 columns

Because the groups with the highest counts are often the ones of interest, passing sort = True returns the counts in descending order.
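
A rough plain-pandas sketch of counting over two columns and sorting by the count follows. It assumes hypothetical toy data (`cars`) rather than mtcars, and it makes no claim about how siuba orders tied counts.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars (not the real dataset).
cars = pd.DataFrame({"cyl": [4, 6, 4, 8, 8, 8], "gear": [4, 4, 5, 3, 3, 5]})

counts = (
    cars.groupby(["cyl", "gear"])
    .size()
    .reset_index(name="n")              # one row per (cyl, gear) pair
    .sort_values("n", ascending=False)  # largest groups first
    .reset_index(drop=True)             # renumber rows after sorting
)
print(counts)
```

The most frequent combination ends up in the first row.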

Counting expressions

As is the case with group_by, the count function accepts complex expressions, as long as they are passed as keyword arguments.

[5]:

mtcars >> count(_.cyl, many_gears = _.gear > 3)

[5]:

	cyl	many_gears	n
0	4	False	1
1	4	True	10
...	...	...	...
4	8	False	12
5	8	True	2

6 rows × 3 columns
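
In plain pandas, counting by an expression can be approximated by first materializing the expression as a column and then grouping on it. The sketch below uses hypothetical toy data (`cars`) and is only an approximate equivalent of the siuba call above.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars (not the real dataset).
cars = pd.DataFrame({"cyl": [4, 6, 4, 8, 8, 8], "gear": [4, 4, 5, 3, 3, 5]})

counts = (
    cars.assign(many_gears=cars["gear"] > 3)  # compute the expression as a column
    .groupby(["cyl", "many_gears"])
    .size()
    .reset_index(name="n")
)
print(counts)
```

Only the combinations of cyl and many_gears that actually occur in the data appear in the result.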

Mutating and counting with `add_count`

While count is equivalent to a group by and summarize, add_count is equivalent to a group by and mutate. This means that it keeps every row of the original data and adds a new column of counts.

[6]:

from siuba import add_count

mtcars >> add_count(_.cyl)

[6]:

	mpg	cyl	disp	hp	drat	wt	qsec	vs	am	gear	carb	n
0	21.0	6	160.0	110	3.90	2.620	16.46	0	1	4	4	7
1	21.0	6	160.0	110	3.90	2.875	17.02	0	1	4	4	7
...	...	...	...	...	...	...	...	...	...	...	...	...
30	15.8	8	351.0	264	4.22	3.170	14.50	0	1	5	4	14
31	15.0	8	301.0	335	3.54	3.570	14.60	0	1	5	8	14

32 rows × 12 columns
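
The group-by-and-mutate behavior can be sketched in plain pandas with a grouped transform, which broadcasts each group's size back onto its rows. As before, `cars` is hypothetical toy data and this is an approximate equivalent rather than siuba's implementation.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars (not the real dataset).
cars = pd.DataFrame({"cyl": [4, 6, 4, 8, 8, 8], "gear": [4, 4, 5, 3, 3, 5]})

# transform("size") returns one value per original row: the size of that row's group.
with_n = cars.assign(n=cars.groupby("cyl")["cyl"].transform("size"))
print(with_n)
```

The output keeps all original rows and columns, in their original order, with n appended.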
