dplyr

Should tidyr be grouped together with readr?

tibble
- as_tibble
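- rownames_to_column, etc.
Row operations: slice, filter
Column operations: select, and the helper functions for selecting columns
Creating columns: mutate, rename
- Debugging with browser() in the middle of a mutate call
Joins: join
Grouping: group_by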
gather / spread
nest / unnest
separate
Combining with purrr
How to use the older do
distinct, sample_n
summarize, count
arrange

Adding row numbers

To add sequential row numbers, dplyr::row_number() does the job.

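library(dplyr)  # provides the %>% pipe used below

iris %>%
  dplyr::as_tibble() %>%
  dplyr::mutate(Row_Num = dplyr::row_number()) %>%  # sequential number per row
  dplyr::select(Row_Num, dplyr::everything())       # move Row_Num to the front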

## # A tibble: 150 x 6
##    Row_Num Sepal.Length Sepal.Width Petal.Length Petal.Width Species
##      <int>        <dbl>       <dbl>        <dbl>       <dbl> <fct>  
##  1       1          5.1         3.5          1.4         0.2 setosa 
##  2       2          4.9         3            1.4         0.2 setosa 
##  3       3          4.7         3.2          1.3         0.2 setosa 
##  4       4          4.6         3.1          1.5         0.2 setosa 
##  5       5          5           3.6          1.4         0.2 setosa 
##  6       6          5.4         3.9          1.7         0.4 setosa 
##  7       7          4.6         3.4          1.4         0.3 setosa 
##  8       8          5           3.4          1.5         0.2 setosa 
##  9       9          4.4         2.9          1.4         0.2 setosa 
## 10      10          4.9         3.1          1.5         0.1 setosa 
## 11      11          5.4         3.7          1.5         0.2 setosa 
## 12      12          4.8         3.4          1.6         0.2 setosa 
## 13      13          4.8         3            1.4         0.1 setosa 
## 14      14          4.3         3            1.1         0.1 setosa 
## 15      15          5.8         4            1.2         0.2 setosa 
## 16      16          5.7         4.4          1.5         0.4 setosa 
## 17      17          5.4         3.9          1.3         0.4 setosa 
## 18      18          5.1         3.5          1.4         0.3 setosa 
## 19      19          5.7         3.8          1.7         0.3 setosa 
## 20      20          5.1         3.8          1.5         0.3 setosa 
## 21      21          5.4         3.4          1.7         0.2 setosa 
## 22      22          5.1         3.7          1.5         0.4 setosa 
## 23      23          4.6         3.6          1           0.2 setosa 
## 24      24          5.1         3.3          1.7         0.5 setosa 
## 25      25          4.8         3.4          1.9         0.2 setosa 
## 26      26          5           3            1.6         0.2 setosa 
## 27      27          5           3.4          1.6         0.4 setosa 
## 28      28          5.2         3.5          1.5         0.2 setosa 
## 29      29          5.2         3.4          1.4         0.2 setosa 
## 30      30          4.7         3.2          1.6         0.2 setosa 
## # … with 120 more rows
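
As a small extension of the example above, here is a sketch of how row_number() behaves once the data is grouped: the numbering restarts within each group. The column names Row_Num and Row_In_Group and the object name numbered are illustrative choices, not anything fixed by dplyr.

```r
library(dplyr)

numbered <- iris %>%
  as_tibble() %>%
  mutate(Row_Num = row_number()) %>%        # numbered over the whole table
  group_by(Species) %>%
  mutate(Row_In_Group = row_number()) %>%   # restarts at 1 for each species
  ungroup() %>%                             # drop the grouping again
  select(Row_Num, Row_In_Group, everything())

numbered
```

If only the overall row number is needed, tibble::rowid_to_column() is another option that places the new column first without a separate select().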

Changes in dplyr 0.8

Displaying what dplyr operations did

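Being able to check that you have not accidentally done something odd seems useful, but issues such as namespace conflicts probably make it tedious to use all the time.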

elbersb/tidylog: Tidylog provides feedback about basic dplyr operations. It provides simple wrapper functions for the most common functions, such as filter, mutate, select, and group_by.
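
A minimal sketch of one way to try tidylog while sidestepping the namespace concern, assuming the tidylog package is installed: call its wrappers with an explicit tidylog:: prefix instead of attaching the package, so dplyr's own verbs are not masked. The object name four_cyl and the column mpg_round are hypothetical, and the exact wording of the log messages depends on the tidylog version.

```r
library(dplyr)  # attached only for the %>% pipe

# Each tidylog wrapper runs the corresponding dplyr verb and then prints a
# short summary of what changed (as a message).
four_cyl <- mtcars %>%
  tidylog::filter(cyl == 4) %>%               # reports rows removed
  tidylog::mutate(mpg_round = round(mpg)) %>% # reports the new variable
  tidylog::select(mpg, mpg_round, cyl)        # reports columns dropped

four_cyl
```

The trade-off is the one noted above: a step written as dplyr::filter() bypasses the logging entirely, so mixing the two prefixes in one pipeline logs only some of the steps.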