Update: There is an R function called reshape in the stats package that does the same thing, just not within the tidyverse framework.

If you've used Stata you might be familiar with its reshape command, which makes a wide dataset long and vice versa. The equivalent in the tidyverse would be the gather (wide to long) and spread (long to wide) functions from the tidyr package. The difference is that gather and spread work on key-value pairs, emphasis on the singular "value", while reshape is fine with having multiple values associated with a single key.

For example, take a (fake) wide dataset called scores, with an id column plus columns running from age2000 to scores2010, and suppose you want one row per id and year, with columns id, year, age and scores. In Stata you would do this with something like

    reshape long age scores, i(id) j(year)

The command is reshape long because we are converting the data from a wide format to a long format. After that come the names of the variables that are in the wide format, here age and scores. Both of these variables have a numeric part, which ends up in year.

With the tidyr functions, you need to first gather:

    scores_vlong = scores %>%
      tidyr::gather("key2", "value", c(age2000:scores2010))

Then extract and spread:

    scores_vlong %>%
      tidyr::extract("key2", c("colname", "year"),
                     "([a-z]+)(\\d+)") %>%  # pattern assumed here: letters, then digits
      tidyr::spread("colname", "value")
    # resulting columns: id year age scores

I've been thinking about writing a function to automate this process for a while and finally got some impetus to do so when Paul Goldsmith-Pinkham asked about the issue on Twitter:

    "For #rstats users - how do I reshape wide / long using gather and spread with multiple variables?" (Paul G-P, January 13, 2018)

Plus it was a good way to practice working with quasiquotation.

gather_multivalue and spread_multivalue are basically wrappers around this sequence of steps. So now you only need one line:

    # equivalent
    gather_multivalue(scores, "year", age2000:scores2010)
    gather_multivalue(scores, "year", -id)

gather_multivalue also asks you to specify a regular expression (regex) for how to extract the key and values. The default regex assumes that the columns are of the form (word)(number). I like that regex gives you some flexibility if you get columns with slightly weird or varying patterns, as in a dataset like scores_dumb:

    twydyverse::gather_multivalue(scores_dumb, "year",
      regex = "(+)(\\d+)")
    # resulting columns: id year age scores

Of course the tradeoff is that you need to specify a regex, but for the purposes of working with column names I don't imagine that's likely to get too complicated.

These functions are available in my personal package, twydyverse.
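For comparison, the base R reshape function mentioned in the update can do the same wide-to-long conversion without any packages. Here is a minimal sketch, assuming a made-up scores data frame: the values are invented for illustration and are not the original fake dataset, and the column names simply follow the age2000 to scores2010 pattern used in this post.

```r
# Made-up wide data: one row per id, one column per variable-year pair.
scores <- data.frame(
  id         = 1:3,
  age2000    = c(20, 31, 45),
  age2010    = c(30, 41, 55),
  scores2000 = c(88, 92, 79),
  scores2010 = c(90, 85, 81)
)

# Wide to long: each element of `varying` lists the wide columns that
# collapse into one long column, named by the matching entry of `v.names`.
scores_long <- reshape(
  scores,
  direction = "long",
  idvar     = "id",
  varying   = list(c("age2000", "age2010"), c("scores2000", "scores2010")),
  v.names   = c("age", "scores"),
  timevar   = "year",
  times     = c(2000, 2010)
)

# Sort by id and year and drop the generated row names for readability.
scores_long <- scores_long[order(scores_long$id, scores_long$year), ]
rownames(scores_long) <- NULL

# Long back to wide: sep = "" rebuilds names such as age2000.
scores_wide <- reshape(
  scores_long,
  direction = "wide",
  idvar     = "id",
  timevar   = "year",
  v.names   = c("age", "scores"),
  sep       = ""
)
```

Listing the column groups explicitly in varying is more verbose than a regex, but it avoids relying on reshape guessing how to split the column names.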