Using .GRP symbol with R data.table package

David Ross

2021/05/12

I am a big fan of the data.table package in R for data manipulation. I use the csv reader function fread all the time and enjoy the concise syntax for many basic queries of the data. There are some special symbols associated with the package that I’ve used before, especially .N and .SD. However, I had never used the .GRP symbol until I faced a particular query for a class assignment recently. I am documenting how I used it in this post.

The data come from the pnwflights14 package, which is modeled on the nycflights13 package.

library(data.table)
data("flights", package = "pnwflights14")
setDT(flights)
# just analyze flights originating from Portland
flights <- flights[origin == "PDX"]

The analysis question is: by month, which routes were added or removed? Only the dest and month columns are needed to answer it. In data.table, the special symbol .GRP is an integer counter for the current group: 1 for the first group, 2 for the second, and so on. I ordered the rows by month and grouped by month, so the groups are numbered in month order. Because every month from 1 to 12 is present in the data, .GRP equals the month number of the current group, and .GRP - 1 identifies the previous month, which allows a comparison with it.
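To see how .GRP numbers groups, here is a minimal sketch on a made-up table. The toy data and the names toy, unsorted, and sorted are hypothetical and not taken from pnwflights14. It shows that .GRP follows the order in which groups first appear, so it only matches the month number when the rows are sorted by month and no month is missing.

```r
library(data.table)

# Toy data (hypothetical, not from pnwflights14); rows are deliberately unsorted
toy <- data.table(
  month = c(2L, 1L, 1L, 3L, 2L),
  dest  = c("SEA", "SFO", "SEA", "LAX", "SFO")
)

# Without sorting, groups are numbered in order of first appearance,
# so month 2 is the first group and gets .GRP = 1
unsorted <- toy[, .(grp = .GRP, n = .N), by = month]

# Sorting by month in i first makes .GRP match the month number,
# provided the months run 1, 2, 3, ... with none missing
sorted <- toy[order(month), .(grp = .GRP, n = .N), by = month]

print(unsorted)
print(sorted)
```

In the unsorted result the grp column does not match month, while in the sorted result the two columns are equal.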

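# routes dropped: destinations served in the previous month (.GRP - 1) but not in the current one
dropped <- flights[order(month), .(dropped = setdiff(flights[month == .GRP - 1, dest], dest)), by = month]
# routes added: destinations served in the current month but not in the previous one;
# January has no previous month, so all of its routes would count as added and it is excluded
added <- flights[order(month), .(added = setdiff(dest, flights[month == .GRP - 1, dest])), by = month][month != 1]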
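The same pattern can be checked on a small made-up table where the answer is easy to verify by hand. This is only a sketch: the toy data and the names toy, toy_dropped, toy_added, and changes are introduced here for illustration, and the final merge is just one possible way to line up the two results by month, not necessarily how the table in this post was built.

```r
library(data.table)

# Toy data (hypothetical): month 1 serves SEA and SFO, month 2 serves SEA and LAX,
# month 3 serves SEA, SFO and LAX. Repeated rows stand in for repeated flights.
toy <- data.table(
  month = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  dest  = c("SEA", "SFO", "SEA", "SEA", "LAX", "SEA", "SFO", "LAX")
)

# Destinations served in the previous month but not in the current one
toy_dropped <- toy[order(month),
                   .(dropped = setdiff(toy[month == .GRP - 1, dest], dest)),
                   by = month]

# Destinations served in the current month but not in the previous one;
# month 1 has no previous month, so it is excluded
toy_added <- toy[order(month),
                 .(added = setdiff(dest, toy[month == .GRP - 1, dest])),
                 by = month][month != 1]

# Collapse to one row per month and combine; months with no change get NA
changes <- merge(
  toy_dropped[, .(dropped = paste(dropped, collapse = ", ")), by = month],
  toy_added[, .(added = paste(added, collapse = ", ")), by = month],
  by = "month", all = TRUE
)

print(changes)
```

By hand: SFO disappears in month 2 while LAX appears, and SFO returns in month 3, which is what the combined table should show.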

I used the excellent gt package for creating a formatted table.

Routes added and dropped
Changes in destinations from previous month [1]

Month  Dropped                  Added
Feb    ABQ                      -
Mar    -                        BOI, ABQ
Apr    -                        PHL
May    KOA, LIH                 -
Jun    -                        FAI, HOU, AUS, BWI, STL
Jul    BOI, RNO, PSP, LMT       -
Aug    -                        -
Sep    HOU, AUS                 -
Oct    RDM, BWI, EUG, FAI, STL  PSP
Nov    PHL                      KOA, LIH
Dec    -                        -

Source: pnwflights R package

[1] For flights departing PDX in the year 2014.