11. Aggregation and grouping

The code in this section is based on the tipping dataset.

tips := DataFrame loadTips.

The simplest use of the groupBy: operator is to group the values of one series by the values of another series of the same size.

bill := tips column: #total_bill.
sex := tips column: #sex.

bill groupBy: sex.

The result of this query is a DataSeriesGrouped object, which splits bill into two series, one for each unique value of the sex series: 'Male' and 'Female'.
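
To make the split concrete, here is a minimal sketch of the same idea written with standard Pharo collections only. It is not how DataSeriesGrouped is implemented, and the values and the names amounts, keys, and groups are hypothetical, chosen purely for illustration.

```smalltalk
| amounts keys groups |
"Hypothetical sample values, not rows of the tipping dataset"
amounts := #(10 20 30 40).
keys := #('Female' 'Male' 'Male' 'Female').

"Map each unique key to the collection of values that share it"
groups := Dictionary new.
amounts with: keys do: [ :amount :key |
    (groups at: key ifAbsentPut: [ OrderedCollection new ]) add: amount ].

groups keys. "the two unique keys; their order is not guaranteed"
(groups at: 'Male') asArray. "#(20 30)"
(groups at: 'Female') asArray. "#(10 40)"
```

Each unique key ends up owning the values that appeared at the same positions, which is the mapping the grouped series represents.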

Since most of the time both series are columns of the same data frame, there is a handy shortcut:

tips group: #total_bill by: #sex.

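The result of the groupBy: operator is of little use unless it is combined with an aggregation. The following query is written against Iris-style columns, so df here is assumed to hold the Iris dataset rather than the tipping data. It selects two columns, keeps the rows that satisfy the where: condition, groups them by species, and sums sepal_length within each group: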

df select: #(sepal_length species)
   where: [ :petal_length :petal_width |
      (petal_length < 4.9 and: petal_length > 1.6) and:
      (petal_width < 0.4 or: petal_width > 1.5) ]
   groupBy: #species
   aggregate: #sum.

The result of this query is a data frame with a single column:

            |  sepal_length  
------------+--------------
    setosa  |          15.9  
versicolor  |          18.2  
 virginica  |          17.1
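
Conceptually, the aggregation step reduces every group to a single value. The sketch below shows that reduction for a sum, again using only standard Pharo collections rather than the DataFrame API. The numbers and the names lengths, species, and sums are hypothetical and do not reproduce the table above.

```smalltalk
| lengths species sums |
"Hypothetical values chosen for illustration, not rows of a real dataset"
lengths := #(5 4 6 7 6).
species := #(#setosa #setosa #versicolor #virginica #versicolor).

"Keep one running total per unique key"
sums := Dictionary new.
lengths with: species do: [ :length :key |
    sums at: key put: (sums at: key ifAbsent: [ 0 ]) + length ].

sums at: #setosa. "9"
sums at: #versicolor. "12"
sums at: #virginica. "7"
```

The result has one entry per unique key, which matches the shape of the single-column data frame: one row per species.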
