Select and deselect columns in data.table, and what does "with=" mean?
load package
library(data.table)
demo data
input <- if (file.exists("flights14.csv")) {
"flights14.csv"
} else {
"https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
}
flights <- fread(input)
flights
select/subset columns
- flights[ , 1:3]
- flights[ , year:day]. In this case, the argument with=TRUE is the default in data.table.
- flights[ , c('year', 'month', 'day')]
- flights[ , list(year, month, day)]
- flights[ , .(year, month, day)]
- select_cols <- c('year', 'month', 'day') → flights[ , ..select_cols]
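To check that the forms listed above pick the same columns, here is a minimal sketch on a small toy table. The table DT and its values are made up for illustration and stand in for flights, so the example runs without downloading the CSV.

```r
library(data.table)

# Hypothetical toy table standing in for flights (not the real dataset)
DT <- data.table(year = 2014L, month = 1:3, day = c(1L, 15L, 31L),
                 dep_delay = c(14L, -3L, 2L))

by_position <- DT[, 1:3]                        # column positions
by_range    <- DT[, year:day]                   # range of column names
by_names    <- DT[, c("year", "month", "day")]  # character vector
by_list     <- DT[, list(year, month, day)]     # list of columns
by_dot      <- DT[, .(year, month, day)]        # .() is an alias for list()

# Column names stored in a variable: the .. prefix looks it up in the calling scope
select_cols <- c("year", "month", "day")
by_variable <- DT[, ..select_cols]

# Compare every result against the positional selection
results  <- list(by_range, by_names, by_list, by_dot, by_variable)
all_same <- all(vapply(results,
                       function(r) isTRUE(all.equal(r, by_position)),
                       logical(1)))
all_same
```

Each form returns a data.table with the three selected columns, so all_same should come out TRUE.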
deselect columns
- flights[ , -c('year', 'month', 'day')], or flights[ , !c('year', 'month', 'day')]
- flights[ , -(year:day)], or flights[ , !(year:day)]

Again, with=TRUE is the default here.
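The deselect forms can be tried the same way. This is a small sketch on a made-up table (DT and drop_cols are illustrative names, not part of the original example). The last call assumes the names are held in a variable and passes with = FALSE explicitly.

```r
library(data.table)

# Hypothetical toy table standing in for flights (not the real dataset)
DT <- data.table(year = 2014L, month = 1:3, day = c(1L, 15L, 31L),
                 dep_delay = c(14L, -3L, 2L))

drop_minus <- DT[, -c("year", "month", "day")]  # unary minus on names
drop_bang  <- DT[, !c("year", "month", "day")]  # negation on names
drop_range <- DT[, !(year:day)]                 # negation on a name range

# Names stored in a variable: with = FALSE makes j a vector of column names
drop_cols <- c("year", "month", "day")
drop_var  <- DT[, !drop_cols, with = FALSE]

names(drop_var)
```

In every case only the columns that were not named remain in the result.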
say something more about with() in base R and data.table
Suppose we have a data.frame, DF, and we want to subset all rows where x > 1. In base R, we can do this:
DF <- data.frame(x = c(1,1,1,2,2,3,3,3), y = 1:8)
## (1) normal way
DF[DF$x > 1, ] # data.frame needs that ',' as well
# x y
# 4 2 4
# 5 2 5
# 6 3 6
# 7 3 7
# 8 3 8
## (2) using with()
DF[with(DF, x > 1), ]
# x y
# 4 2 4
# 5 2 5
# 6 3 6
# 7 3 7
# 8 3 8
In example (2) above, x is treated as a variable, that is, a column name, because the expression is evaluated inside the function with(). In data.table, the argument with= works in a similar way. It is TRUE by default, in which case the names used in j are treated as variables (column names). Setting with=FALSE disables the ability to refer to columns as variables, thereby restoring the "data.frame mode".
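The parallel between with() and with= can be sketched on the same example data. The names sum_xy, rows, cols and y_only are introduced here for illustration only.

```r
library(data.table)

DT <- data.table(x = c(1, 1, 1, 2, 2, 3, 3, 3), y = 1:8)

# with = TRUE (the default): j is evaluated with columns visible as variables,
# much like with(DT, x + y) in base R
sum_xy       <- DT[, x + y]
same_as_with <- identical(sum_xy, with(DT, x + y))

# The row filter in i can also refer to columns directly
rows <- DT[x > 1]

# with = FALSE: j is taken as column names (or positions), as with a data.frame
cols   <- "y"
y_only <- DT[, cols, with = FALSE]
names(y_only)
```

With the default, j yields the values of x + y as a plain vector. With with = FALSE, the value of cols is used to look up a column, and the result is a data.table.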