Reshaping and deduping long data to wide in R

Question

I've been trying, without success, to reshape this long data into wide format by ID in Excel. I also tried dcast in R, but it did not produce the result I was hoping for.

I've included a CSV file with the data's current layout in cells A1:C10 and my preferred layout in cells F1:N4. I initially tried Excel, but since I've never used Power Query, I figured reshape2 or dcast should be able to accomplish the same thing.

In R I did:

olddata_wide$ID <- factor(olddata_wide$ID)

widedf <- dcast(df1, ID  ~ paydate, value.var="Type")

This just gave me an output of dates.

Kithuzzz · Answer 1 · Mar 17, 2023

Using pivot_wider and rename (the .by argument of mutate needs dplyr 1.1.0 or later):

library(dplyr)
library(tidyr)

repl <- c("1st_transaction" = "type_1", "2nd_transaction" = "type_2", 
  "3rd_transaction" = "type_3", "4th_transaction" = "type_4")

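df %>% 
  # number each ID's rows in their current order: 1, 2, 3, ...
  mutate(n = row_number(), .by = ID) %>% 
  # spread type and paydate into type_1..type_4 and paydate_1..paydate_4
  pivot_wider(names_from = n, values_from = c(type, paydate)) %>% 
  # rename the type_* columns using the new = old pairs in repl
  rename(all_of(repl))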
# A tibble: 3 × 9
  ID     1st_transacti…¹ 2nd_t…² 3rd_t…³ 4th_t…⁴ payda…⁵ payda…⁶ payda…⁷ payda…⁸
  <chr>  <chr>           <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 AAL100 H               H       B       NA      8/28/2… 8/28/2… 8/28/2… NA     
2 AAC926 H               H       NA      NA      8/28/2… 8/28/2… NA      NA     
3 ABR765 V               H       H       B       8/17/2… 8/28/2… 8/28/2… 8/28/2…
# … with abbreviated variable names ¹`1st_transaction`, ²`2nd_transaction`,
#   ³`3rd_transaction`, ⁴`4th_transaction`, ⁵paydate_1, ⁶paydate_2, ⁷paydate_3,
#   ⁸paydate_4
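
As for why the original dcast call returned dates: with ID ~ paydate on the right-hand side, every distinct paydate becomes its own column. If you would rather stay with dcast, here is a minimal sketch that assumes the data.table version of dcast (not reshape2's), since that one accepts several value.var columns and lets you use rowid() in the formula. The df below just repeats the sample data so the snippet runs on its own; dt and wide are names I made up.

```r
library(data.table)

# Same sample data as in the Data section, repeated so this runs standalone
df <- data.frame(
  ID = c("AAL100", "AAL100", "AAL100", "AAC926", "AAC926",
         "ABR765", "ABR765", "ABR765", "ABR765"),
  paydate = c("8/28/2019", "8/28/2020", "8/28/2021", "8/28/2017", "8/28/2018",
              "8/17/2016", "8/28/2020", "8/28/2021", "8/28/2022"),
  type = c("H", "H", "B", "H", "H", "V", "H", "H", "B")
)

dt <- as.data.table(df)

# rowid(ID) numbers the rows within each ID (1, 2, 3, ...), so the columns
# are keyed by transaction number instead of by the date itself
wide <- dcast(dt, ID ~ rowid(ID), value.var = c("type", "paydate"))

print(wide)
```

This should give one row per ID with columns type_1 to type_4 and paydate_1 to paydate_4; the type_* columns can be renamed afterwards with setnames() if you want the "1st_transaction" style names.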

Data

df <- structure(list(ID = c("AAL100", "AAL100", "AAL100", "AAC926", 
"AAC926", "ABR765", "ABR765", "ABR765", "ABR765"), paydate = c("8/28/2019", 
"8/28/2020", "8/28/2021", "8/28/2017", "8/28/2018", "8/17/2016", 
"8/28/2020", "8/28/2021", "8/28/2022"), type = c("H", "H", "B", 
"H", "H", "V", "H", "H", "B")), class = "data.frame", row.names = c(NA, 
-9L))
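
One caveat: row_number() numbers the rows in whatever order they appear, so the "1st", "2nd", ... columns only reflect payment order if each ID's rows are already sorted by date. The sample data happens to be sorted. If your real file might not be, a hedged variation is to parse paydate as a Date and sort before numbering. This assumes the dates are all month/day/year, as in the sample; wide is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same sample data as above, repeated so this runs standalone
df <- data.frame(
  ID = c("AAL100", "AAL100", "AAL100", "AAC926", "AAC926",
         "ABR765", "ABR765", "ABR765", "ABR765"),
  paydate = c("8/28/2019", "8/28/2020", "8/28/2021", "8/28/2017", "8/28/2018",
              "8/17/2016", "8/28/2020", "8/28/2021", "8/28/2022"),
  type = c("H", "H", "B", "H", "H", "V", "H", "H", "B")
)

wide <- df %>%
  # parse the m/d/Y strings so they sort chronologically, not alphabetically
  mutate(paydate = as.Date(paydate, format = "%m/%d/%Y")) %>%
  # sort each ID's payments by date before numbering them
  arrange(ID, paydate) %>%
  mutate(n = row_number(), .by = ID) %>%
  pivot_wider(names_from = n, values_from = c(type, paydate))

print(wide)
```

Note that arrange() also reorders the IDs alphabetically, and the paydate_* columns come out as Date rather than character.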