After installing the data.table function using install.packages('data.table') load the package -
> library(data.table)
fread is short for fast read -
> statecsv <- fread("Data/statedata.csv") # Reading CSV file
> head(statecsv)
State state_code region governor psychRegions
1: Alabama AL South Republican Friendly and Conventional
2: Arizona AZ West Republican Relaxed and Creative
3: Arkansas AR South Republican Friendly and Conventional
4: California CA West Democrat Relaxed and Creative
5: Colorado CO West Democrat Friendly and Conventional
6: Connecticut CT Northeast Democrat Temperamental and Uninhibited
extraversion agreeableness conscientiousness neuroticism openness instagram
1: 55.5 52.7 55.5 48.7 42.7 0.64
2: 50.6 46.6 58.4 38.1 54.7 0.18
3: 49.9 52.7 41.0 56.2 40.3 0.46
4: 51.4 49.0 43.2 39.1 65.0 1.47
5: 45.3 47.5 58.8 34.3 57.9 -1.04
6: 57.6 38.6 34.2 53.4 53.9 0.37
facebook retweet entrepreneur gdpr privacy university mortgage volunteering
1: 1.65 0.35 0.26 -0.77 0.58 1.74 1.41 -1.49
2: -0.26 -0.57 0.56 -0.31 -0.45 -0.77 1.01 0.96
3: 1.10 -0.60 0.25 -0.60 0.69 0.02 -0.66 -1.19
4: -0.42 0.48 0.50 1.12 0.23 -1.92 -0.88 0.57
5: -1.06 -0.90 0.02 0.59 -0.22 -0.44 1.49 1.01
6: -0.98 1.15 0.07 0.71 0.36 0.37 0.48 1.28
museum scrapbook modernDance
1: -1.11 0.06 -1.27
2: -0.13 0.34 0.41
3: -0.95 0.85 -1.44
4: 0.05 -0.69 0.38
5: 0.72 -0.34 -0.29
6: 1.18 -0.81 0.55
> class(statecsv)
[1] "data.table" "data.frame"
So, functions that accept a data.frame will work just fine on data.table as well.
A simple demonstration of how fast fread is:
> # Create a big data frame
> big_df <- data.frame(x = rnorm(10E5),
+ y = runif(10E5))
> # create a temporary file
> file <- tempfile()
> # write the data frame to the file
> write.table(big_df, file=file, col.names = T, sep="\t")
Now using read.table(), read the file and find the time it took using system.time():
> system.time(read.table(file, header=T, sep="\t"))
user system elapsed
7.29 0.14 7.74
> system.time(fread(file, header=T, sep="\t"))
user system elapsed
0.28 0.05 0.22
SO using fread(), data can be read in faster speed than the regular read.table() function.
Using data.table(df) or as.data.table(df) -
> class(iris)
[1] "data.frame"
> iris2 <- data.table(iris)
> class(iris2)
[1] "data.table" "data.frame"
> iris3 <- as.data.table(iris)
> class(iris3)
[1] "data.table" "data.frame"
Get R Programming assignment help at www.homeworkhelponline.net.
Using setDT(df) -
> iris4 <- copy(iris)
> class(iris4)
[1] "data.frame"
Converting inplace -
> setDT(iris4)
> class(iris4)
[1] "data.table" "data.frame"
Using as.data.frame(dt) or data.frame(dt)-
> class(iris4)
[1] "data.table" "data.frame"
> iris5 <- data.frame(iris4)
> class(iris5)
[1] "data.frame"
Using setDF(dt)-
> class(iris4)
[1] "data.table" "data.frame"
> setDF(iris4)
> class(iris4)
[1] "data.frame"
The built in mtcars data frame has rownames but data table rows do not contain any rownames.
> head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
> class(mtcars)
[1] "data.frame"
> mtcars2 <- copy(mtcars)
> setDT(mtcars2)
> head(mtcars2) # rownames are gone
mpg cyl disp hp drat wt qsec vs am gear carb
1: 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
2: 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
3: 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
4: 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
5: 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
6: 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
So remember to put the rownames in a column first -
> mtcars2$carname <- rownames(mtcars)
> head(mtcars2)
mpg cyl disp hp drat wt qsec vs am gear carb carname
1: 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4
2: 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4 Wag
3: 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Datsun 710
4: 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 Hornet 4 Drive
5: 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 Hornet Sportabout
6: 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 Valiant
Gives all the columns where cyl is 4 and gear is 4 -
> mtcars2[cyl==4 & gear==4, ]
mpg cyl disp hp drat wt qsec vs am gear carb carname
1: 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 Datsun 710
2: 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 Merc 240D
3: 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 Merc 230
4: 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 Fiat 128
5: 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 Honda Civic
6: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 Toyota Corolla
7: 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 Fiat X1-9
8: 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 Volvo 142E
To specify columns -
> mtcars2[cyl==4 & gear==4,c(12,4,1,8)]
carname hp mpg vs
1: Datsun 710 93 22.8 1
2: Merc 240D 62 24.4 1
3: Merc 230 95 22.8 1
4: Fiat 128 66 32.4 1
5: Honda Civic 52 30.4 1
6: Toyota Corolla 65 33.9 1
7: Fiat X1-9 66 27.3 1
8: Volvo 142E 109 21.4 1
The above code is equivalent to this one shown below. By supplying a list of column names we can get the specified columns -
> mtcars2[cyl==4 & gear==4, list(carname, hp, mpg, vs)]
carname hp mpg vs
1: Datsun 710 93 22.8 1
2: Merc 240D 62 24.4 1
3: Merc 230 95 22.8 1
4: Fiat 128 66 32.4 1
5: Honda Civic 52 30.4 1
6: Toyota Corolla 65 33.9 1
7: Fiat X1-9 66 27.3 1
8: Volvo 142E 109 21.4 1
Instead of using list, a simple . also does the work!
> mtcars2[cyl==4 & gear==4, .(carname, hp, mpg, vs)]
carname hp mpg vs
1: Datsun 710 93 22.8 1
2: Merc 240D 62 24.4 1
3: Merc 230 95 22.8 1
4: Fiat 128 66 32.4 1
5: Honda Civic 52 30.4 1
6: Toyota Corolla 65 33.9 1
7: Fiat X1-9 66 27.3 1
8: Volvo 142E 109 21.4 1
By specifying the column names as string we can pass a vector of strings to the columns place -
> mtcars2[cyl>=4 & gear>4, c('carname', 'hp', 'mpg', 'drat')]
carname hp mpg drat
1: Porsche 914-2 91 26.0 4.43
2: Lotus Europa 113 30.4 3.77
3: Ford Pantera L 264 15.8 4.22
4: Ferrari Dino 175 19.7 3.62
5: Maserati Bora 335 15.0 3.54
If a column name isn’t specified directly in the function but in a vector outside the function then we have to set with=F:
> cols <- c('carname', 'hp', 'mpg', 'drat')
> mtcars2[, cols, with=F]
carname hp mpg drat
1: Mazda RX4 110 21.0 3.90
2: Mazda RX4 Wag 110 21.0 3.90
3: Datsun 710 93 22.8 3.85
4: Hornet 4 Drive 110 21.4 3.08
5: Hornet Sportabout 175 18.7 3.15
6: Valiant 105 18.1 2.76
7: Duster 360 245 14.3 3.21
8: Merc 240D 62 24.4 3.69
9: Merc 230 95 22.8 3.92
10: Merc 280 123 19.2 3.92
11: Merc 280C 123 17.8 3.92
12: Merc 450SE 180 16.4 3.07
13: Merc 450SL 180 17.3 3.07
14: Merc 450SLC 180 15.2 3.07
15: Cadillac Fleetwood 205 10.4 2.93
16: Lincoln Continental 215 10.4 3.00
17: Chrysler Imperial 230 14.7 3.23
18: Fiat 128 66 32.4 4.08
19: Honda Civic 52 30.4 4.93
20: Toyota Corolla 65 33.9 4.22
21: Toyota Corona 97 21.5 3.70
22: Dodge Challenger 150 15.5 2.76
23: AMC Javelin 150 15.2 3.15
24: Camaro Z28 245 13.3 3.73
25: Pontiac Firebird 175 19.2 3.08
26: Fiat X1-9 66 27.3 4.08
27: Porsche 914-2 91 26.0 4.43
28: Lotus Europa 113 30.4 3.77
29: Ford Pantera L 264 15.8 4.22
30: Ferrari Dino 175 19.7 3.62
31: Maserati Bora 335 15.0 3.54
32: Volvo 142E 109 21.4 4.11
carname hp mpg drat
Let’s look at the column names of the mtcars2 data table -
> names(mtcars2)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "vs" "am" "gear" "carb" "carname"
Using setnames() renaming can be done inplace.
Let’s rename the carname to CarName:
> setnames(mtcars2, 'carname', 'CarName')
> names(mtcars2)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "vs" "am" "gear" "carb" "CarName"
Let’s take only the columns CarName, mpg, cyl, hp, gear to the data table named dt:
> dt <- mtcars2[,list(CarName, mpg, cyl, hp, gear)]
> head(dt)
CarName mpg cyl hp gear
1: Mazda RX4 21.0 6 110 4
2: Mazda RX4 Wag 21.0 6 110 4
3: Datsun 710 22.8 4 93 4
4: Hornet 4 Drive 21.4 6 110 3
5: Hornet Sportabout 18.7 8 175 3
6: Valiant 18.1 6 105 3
Let’s add a new column to the table which shows kmpg (km per gallon) calculated from the column mpg:
> dtcopy <- copy(dt)
> dtcopy[,kmpg := mpg * 1.6]
> head(dtcopy)
CarName mpg cyl hp gear kmpg
1: Mazda RX4 21.0 6 110 4 33.60
2: Mazda RX4 Wag 21.0 6 110 4 33.60
3: Datsun 710 22.8 4 93 4 36.48
4: Hornet 4 Drive 21.4 6 110 3 34.24
5: Hornet Sportabout 18.7 8 175 3 29.92
6: Valiant 18.1 6 105 3 28.96
To add multiple columns:
> dtcopy <- copy(dt)
> dtcopy[,`:=`(kmpg = mpg*1.6,
+ bhp = hp*0.986,
+ mileage_type = ifelse(mpg > 22, 'high', 'low'))]
> # bhp is distinct from hp because it takes into account power loss due to friction
> head(dtcopy)
CarName mpg cyl hp gear kmpg bhp mileage_type
1: Mazda RX4 21.0 6 110 4 33.60 108.460 low
2: Mazda RX4 Wag 21.0 6 110 4 33.60 108.460 low
3: Datsun 710 22.8 4 93 4 36.48 91.698 high
4: Hornet 4 Drive 21.4 6 110 3 34.24 108.460 low
5: Hornet Sportabout 18.7 8 175 3 29.92 172.550 low
6: Valiant 18.1 6 105 3 28.96 103.530 low
To see only the columns that is added new to the data table use list or . symbol:
> dt1 <- dt[, .(CarName,
+ kmpg = mpg*1.6,
+ bhp = hp*0.986)]
> head(dt1)
CarName kmpg bhp
1: Mazda RX4 33.60 108.460
2: Mazda RX4 Wag 33.60 108.460
3: Datsun 710 36.48 91.698
4: Hornet 4 Drive 34.24 108.460
5: Hornet Sportabout 29.92 172.550
6: Valiant 28.96 103.530
> pdt <- data.table(Date=seq(as.Date("2010/1/1"), as.Date("2019/1/1"), "years"),
+ Prod=round(runif(10,1000,20000)))
> pdt
Date Prod
1: 2010-01-01 8121
2: 2011-01-01 11675
3: 2012-01-01 12676
4: 2013-01-01 13028
5: 2014-01-01 3545
6: 2015-01-01 10370
7: 2016-01-01 1105
8: 2017-01-01 18542
9: 2018-01-01 9225
10: 2019-01-01 8349
> pdt[ , Lag := shift(Prod, 1, type="lag")] # lag by 1
> pdt
Date Prod Lag
1: 2010-01-01 8121 NA
2: 2011-01-01 11675 8121
3: 2012-01-01 12676 11675
4: 2013-01-01 13028 12676
5: 2014-01-01 3545 13028
6: 2015-01-01 10370 3545
7: 2016-01-01 1105 10370
8: 2017-01-01 18542 1105
9: 2018-01-01 9225 18542
10: 2019-01-01 8349 9225
> pdt[ , Lead := shift(Prod, 1, type="lead")] # lead by 1
> pdt
Date Prod Lag Lead
1: 2010-01-01 8121 NA 11675
2: 2011-01-01 11675 8121 12676
3: 2012-01-01 12676 11675 13028
4: 2013-01-01 13028 12676 3545
5: 2014-01-01 3545 13028 10370
6: 2015-01-01 10370 3545 1105
7: 2016-01-01 1105 10370 18542
8: 2017-01-01 18542 1105 9225
9: 2018-01-01 9225 18542 8349
10: 2019-01-01 8349 9225 NA
> pdt[ , `:=`(Lag3 = shift(Prod, 3, type="lag"),
+ Lead2 = shift(Prod, 2, type="lead"))]
> pdt
Date Prod Lag Lead Lag3 Lead2
1: 2010-01-01 8121 NA 11675 NA 12676
2: 2011-01-01 11675 8121 12676 NA 13028
3: 2012-01-01 12676 11675 13028 NA 3545
4: 2013-01-01 13028 12676 3545 8121 10370
5: 2014-01-01 3545 13028 10370 11675 1105
6: 2015-01-01 10370 3545 1105 12676 18542
7: 2016-01-01 1105 10370 18542 13028 9225
8: 2017-01-01 18542 1105 9225 3545 8349
9: 2018-01-01 9225 18542 8349 10370 NA
10: 2019-01-01 8349 9225 NA 1105 NA
Using ! to delete single or multiple columns:
> colnames(mtcars2)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "vs" "am" "gear" "carb" "CarName"
> del <- mtcars2[,!'vs', with=F]
> colnames(del)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "am" "gear" "carb" "CarName"
Using variables to assign the column names and passing it to the function: (use with=F)
> dropcols <- c('drat','wt','qsec','am','carb')
> colnames(mtcars2)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "vs" "am" "gear" "carb" "CarName"
> del <- mtcars2[,!dropcols, with=F]
> colnames(del)
[1] "mpg" "cyl" "disp" "hp" "vs" "gear" "CarName"
Another way is to using := to assign and setting the value to NULL :
> mtcars3 <- copy(mtcars2)
> colnames(mtcars3)
[1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec"
[8] "vs" "am" "gear" "carb" "CarName"
> mtcars3[, c('vs','wt','am') := NULL]
> colnames(mtcars3)
[1] "mpg" "cyl" "disp" "hp" "drat" "qsec" "gear"
[8] "carb" "CarName"
Let’s group by gear and calculate the mean of mpg:
> dt[, .(mean_mpg = mean(mpg)), by=gear]
gear mean_mpg
1: 4 24.53333
2: 3 16.10667
3: 5 21.38000
> head(dt)
CarName mpg cyl hp gear
1: Mazda RX4 21.0 6 110 4
2: Mazda RX4 Wag 21.0 6 110 4
3: Datsun 710 22.8 4 93 4
4: Hornet 4 Drive 21.4 6 110 3
5: Hornet Sportabout 18.7 8 175 3
6: Valiant 18.1 6 105 3
Let’s add the new column to the data set:
> dt2 <- copy(dt)
> dt2[, mean_mpg := round(mean(mpg),3), by=gear]
> head(dt2)
CarName mpg cyl hp gear mean_mpg
1: Mazda RX4 21.0 6 110 4 24.533
2: Mazda RX4 Wag 21.0 6 110 4 24.533
3: Datsun 710 22.8 4 93 4 24.533
4: Hornet 4 Drive 21.4 6 110 3 16.107
5: Hornet Sportabout 18.7 8 175 3 16.107
6: Valiant 18.1 6 105 3 16.107
We can also group by multiple columns:
> dt2 <- copy(dt)
> dt2[, .(mean_mpg = mean(mpg)), by=.(gear, cyl)]
gear cyl mean_mpg
1: 4 6 19.750
2: 4 4 26.925
3: 3 6 19.750
4: 3 8 15.050
5: 3 4 21.500
6: 5 4 28.200
7: 5 8 15.400
8: 5 6 19.700
> dt[, .N, by=cyl]
cyl N
1: 6 7
2: 4 11
3: 8 14
> dt[, .N, by=gear]
gear N
1: 4 12
2: 3 15
3: 5 5
In descending order:
> dt[, .N, by=gear][order(N, decreasing = T),]
gear N
1: 3 15
2: 4 12
3: 5 5
In Ascending order:
> dt[, .N, by=gear][order(N, decreasing = F),]
gear N
1: 5 5
2: 4 12
3: 3 15
Using previous result to calculate the later:
> dtcopy[,`:=`(kmpg = mpg*1.6,
+ bhp = hp*0.986)][, mileage_type := ifelse(kmpg > 30,
+ 'high', 'low')]
> head(dtcopy)
CarName mpg cyl hp gear kmpg bhp mileage_type
1: Mazda RX4 21.0 6 110 4 33.60 108.460 high
2: Mazda RX4 Wag 21.0 6 110 4 33.60 108.460 high
3: Datsun 710 22.8 4 93 4 36.48 91.698 high
4: Hornet 4 Drive 21.4 6 110 3 34.24 108.460 high
5: Hornet Sportabout 18.7 8 175 3 29.92 172.550 low
6: Valiant 18.1 6 105 3 28.96 103.530 low
Setting one or more keys on a data.table enables it to perform binary search, which is many order of magnitudes faster than linear search, especially for large data.
Using setkey() key can be set:
> setkey(dt, CarName)
To check the keys for a data table:
> key(dt)
[1] "CarName"
To remove keys:
> setkey(dt, NULL)
Setting multiple keys:
> setkey(dt, CarName, cyl)
> # It first sorts in CarName's ascending order then in cyl's ascending order
> key(dt)
[1] "CarName" "cyl"
Using key we can filter/find rows that satisfies specific conditions too fast
> DT <- data.table(X = rep(c('a','b','c'), each=10),Y = rnorm(30))
> setkey(DT, X)
> key(DT)
[1] "X"
> DT['c']
X Y
1: c 1.4711518
2: c -0.1449743
3: c -0.3280360
4: c -1.4182936
5: c 0.9825383
6: c 1.0700048
7: c 0.4630494
8: c -0.7679049
9: c -1.6136755
10: c -0.9212752
> setkey(dt, CarName)
> key(dt)
[1] "CarName"
Let’s make two tables:
> dt1 <- dt[c(1:12,15,18), .(CarName, mpg, cyl)]
> dt2 <- dt[c(4:10,14:16), .(CarName, gear)]
> key(dt1);key(dt2)
[1] "CarName"
[1] "CarName"
Since we have already defined the key we can easily merge the data tables without defining key again:
> merge(dt1, dt2)
CarName mpg cyl gear
1: Chrysler Imperial 14.7 8 3
2: Datsun 710 22.8 4 4
3: Dodge Challenger 15.5 8 3
4: Duster 360 14.3 8 3
5: Ferrari Dino 19.7 6 5
6: Fiat 128 32.4 4 4
7: Fiat X1-9 27.3 4 4
8: Lincoln Continental 10.4 8 3
> merge(dt1, dt2, all=T)
CarName mpg cyl gear
1: AMC Javelin 15.2 8 NA
2: Cadillac Fleetwood 10.4 8 NA
3: Camaro Z28 13.3 8 NA
4: Chrysler Imperial 14.7 8 3
5: Datsun 710 22.8 4 4
6: Dodge Challenger 15.5 8 3
7: Duster 360 14.3 8 3
8: Ferrari Dino 19.7 6 5
9: Fiat 128 32.4 4 4
10: Fiat X1-9 27.3 4 4
11: Ford Pantera L 15.8 8 NA
12: Honda Civic 30.4 4 NA
13: Hornet Sportabout NA NA 3
14: Lincoln Continental 10.4 8 3
15: Lotus Europa NA NA 5
16: Mazda RX4 21.0 6 NA
> merge(dt1, dt2, all.x=T)
CarName mpg cyl gear
1: AMC Javelin 15.2 8 NA
2: Cadillac Fleetwood 10.4 8 NA
3: Camaro Z28 13.3 8 NA
4: Chrysler Imperial 14.7 8 3
5: Datsun 710 22.8 4 4
6: Dodge Challenger 15.5 8 3
7: Duster 360 14.3 8 3
8: Ferrari Dino 19.7 6 5
9: Fiat 128 32.4 4 4
10: Fiat X1-9 27.3 4 4
11: Ford Pantera L 15.8 8 NA
12: Honda Civic 30.4 4 NA
13: Lincoln Continental 10.4 8 3
14: Mazda RX4 21.0 6 NA
> merge(dt1, dt2, all.y=T)
CarName mpg cyl gear
1: Chrysler Imperial 14.7 8 3
2: Datsun 710 22.8 4 4
3: Dodge Challenger 15.5 8 3
4: Duster 360 14.3 8 3
5: Ferrari Dino 19.7 6 5
6: Fiat 128 32.4 4 4
7: Fiat X1-9 27.3 4 4
8: Hornet Sportabout NA NA 3
9: Lincoln Continental 10.4 8 3
10: Lotus Europa NA NA 5
dcast.data.table(dt, x ~ y, fun.aggregate = list(functions), value.var='column name of value'):
> dcast.data.table(mtcars2, gear ~ cyl, fun.aggregate = list(length), value.var = 'mpg')
gear 4 6 8
1: 3 1 2 12
2: 4 8 4 0
3: 5 2 1 2
> dcast.data.table(mtcars2, cyl ~ vs,
+ fun.aggregate = list(sum, mean),
+ value.var = 'mpg')
cyl mpg_sum_0 mpg_sum_1 mpg_mean_0 mpg_mean_1
1: 4 26.0 267.3 26.00000 26.730
2: 6 61.7 76.5 20.56667 19.125
3: 8 211.4 0.0 15.10000 NaN
set(dt, i, j, value)
It is usually used in for-loops and is literally thousands of times faster.
Here is a demonstration of how fast the set() function is compared to other functions:
> m = matrix(1,nrow=100000,ncol=100)
> DF = as.data.frame(m)
> DT = as.data.table(m)
>
> system.time(for (i in 1:10000) DF[i,1] <- i)
user system elapsed
2.73 1.72 4.47
> system.time(for (i in 1:10000) set(DT,i,1L,i))
user system elapsed
0.03 0.00 0.03
Here you can find more about the data.table package.