How to use the aggregate function in R to perform computation on measures that are categorized by some variables in a data frame
In today's fast-paced world, a tremendous amount of data is recorded periodically. Such data may come from sensors that record a measurement together with some categorization, such as the time and the sensor type.
To make sense of such data, many data analysts, myself included, use the R programming language. Apart from being free, R has many nice features that make my data analysis work easier.
This post records how I use the aggregate function in R, which I often rely on to make sense of the large data sets that I get my hands on.
To remember how to use the aggregate function, I recite the following sentence in my head before writing the code:
Aggregate a_measurement_column, by a_type_column on the data with the function a_function.
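As a rough sketch of how each slot in that sentence maps onto an argument of the call: the example below uses the built-in mtcars data frame, and the choice of the hp and gear columns and of median is my own, made purely for illustration.

```r
# Mnemonic: Aggregate a_measurement_column, by a_type_column
#           on the data with the function a_function.
# Template: aggregate(a_measurement_column ~ a_type_column,
#                     data = the_data, FUN = a_function)

# Illustration (columns and function are assumptions for this sketch):
# aggregate hp, by gear, on mtcars, with the function median.
medianHpByGear <- aggregate(hp ~ gear, data = mtcars, FUN = median)
print(medianHpByGear)
```

The result is a data frame with one row per distinct value of the type column and one column holding the aggregated measurement.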
With that sentence in mind, I can easily write R code like the following examples. These examples aggregate the miles-per-gallon (mpg) values in the mtcars data frame, which ships with R's datasets package.
Example to get the average of measurements that are categorized by one column
> averageMpgByCyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = 'mean')
> head(averageMpgByCyl)
  cyl      mpg
1   4 26.66364
2   6 19.74286
3   8 15.10000
Example to get the average of measurements that are categorized by more than one column
> averageMpgByGearAndCyl <- aggregate(mpg ~ gear * cyl, data = mtcars, FUN = 'mean')
> head(averageMpgByGearAndCyl)
  gear cyl    mpg
1    3   4 21.500
2    4   4 26.925
3    5   4 28.200
4    3   6 19.750
5    4   6 19.750
6    5   6 19.700
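A small aside on the formula, as far as I understand it: aggregate uses the right-hand side only to pick the grouping columns, so gear * cyl and the more commonly seen gear + cyl should produce the same table. Also, head shows only the first six rows, so the full result can be longer than what is printed above. The sketch below checks both points; the variable names byStar and byPlus are my own.

```r
# Group by two columns, written with "*" and with "+".
byStar <- aggregate(mpg ~ gear * cyl, data = mtcars, FUN = mean)
byPlus <- aggregate(mpg ~ gear + cyl, data = mtcars, FUN = mean)

# Compare the two results; all.equal() returns TRUE when they match.
print(all.equal(byStar, byPlus))

# Number of gear/cyl combinations actually present in mtcars,
# which can exceed the six rows that head() prints by default.
print(nrow(byPlus))
```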
Example to get the sum of measurements that are categorized by one column
> sumMpgByCyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = 'sum')
> head(sumMpgByCyl)
  cyl   mpg
1   4 293.3
2   6 138.2
3   8 211.4
Example to get the sum of measurements that are categorized by more than one column
> sumMpgByGearAndCyl <- aggregate(mpg ~ gear * cyl, data = mtcars, FUN = 'sum')
> head(sumMpgByGearAndCyl)
  gear cyl   mpg
1    3   4  21.5
2    4   4 215.4
3    5   4  56.4
4    3   6  39.5
5    4   6  79.0
6    5   6  19.7
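The a_function slot is not limited to mean and sum. As a minimal sketch, assuming any function that reduces a vector to a single value will do, the following counts the cars in each cylinder group with length, and uses an anonymous function (my own example, not from the examples above) to get the spread between the best and the worst mpg in each group:

```r
# Count how many cars fall into each cylinder group.
countByCyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = length)
print(countByCyl)

# Spread (max minus min) of mpg within each cylinder group,
# computed with an anonymous function.
mpgSpreadByCyl <- aggregate(mpg ~ cyl, data = mtcars,
                            FUN = function(x) max(x) - min(x))
print(mpgSpreadByCyl)
```

Note that the result column keeps the name of the measurement column (mpg here), even when it holds a count or a spread, so it may be worth renaming it before further use.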