If you’re a fan of the tidyverse, check out purrr::reduce(). It’s a modern take on base R’s Reduce(), with syntax consistent with other purrr functions (.x and .y as argument names) and handy shortcuts like ~ .x + .y for inline functions. Like Reduce(), it reduces left to right by default; pass .dir = "backward" to go right to left (the older reduce_right() has been deprecated since purrr 0.3.0). Worth a look if you want a more polished, tidyverse-friendly alternative!
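As a minimal sketch of these points (the input vector is made up for illustration), the snippet below compares the formula shortcut with base R's Reduce() and shows how the direction changes the result for a non-associative operation such as subtraction:

```r
library(purrr)

# Left-to-right sum using the formula shortcut: ((1 + 2) + 3) + 4
total <- reduce(1:4, ~ .x + .y)

# Base R equivalent, for comparison
total_base <- Reduce(`+`, 1:4)

# Direction matters once the operation is not associative
left  <- reduce(1:4, ~ .x - .y)                     # ((1 - 2) - 3) - 4
right <- reduce(1:4, ~ .x - .y, .dir = "backward")  # 1 - (2 - (3 - 4))
```

For addition both directions agree, which is why the direction argument only becomes interesting for operations like subtraction or list building.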
Here’s an intermediate-level example of using the reduce() function from the purrr package to join multiple dataframes:
library(purrr)
library(dplyr)

# Create three sample dataframes representing different aspects of customer data
customers <- data.frame(
  customer_id = 1:5,
  name = c("Alice", "Bob", "Charlie", "Diana", "Edward"),
  age = c(32, 45, 28, 36, 52)
)

orders <- data.frame(
  order_id = 101:108,
  customer_id = c(1, 2, 2, 3, 3, 3, 4, 5),
  order_date = as.Date(c("2023-01-15", "2023-01-20", "2023-02-10", "2023-01-05",
                         "2023-02-15", "2023-03-20", "2023-02-25", "2023-03-10")),
  amount = c(120.50, 85.75, 200.00, 45.99, 75.25, 150.00, 95.50, 210.25)
)

feedback <- data.frame(
  feedback_id = 201:206,
  customer_id = c(1, 2, 3, 3, 4, 5),
  rating = c(4, 5, 3, 4, 5, 4),
  feedback_date = as.Date(c("2023-01-20", "2023-01-25", "2023-01-10",
                            "2023-02-20", "2023-03-01", "2023-03-15"))
)

# List of dataframes to join with the joining column
dataframes_to_join <- list(
  list(df = customers, by = "customer_id"),
  list(df = orders, by = "customer_id"),
  list(df = feedback, by = "customer_id")
)

# Using reduce to join all dataframes
# Start with customers dataframe and progressively join the others
joined_data <- reduce(
  dataframes_to_join[-1],  # Exclude first dataframe as it's our starting point
  function(acc, x) {
    left_join(acc, x$df, by = x$by)
  },
  .init = dataframes_to_join[[1]]$df  # Start with customers dataframe
)

# View the result
print(joined_data)
   customer_id    name age order_id order_date amount feedback_id rating
1            1   Alice  32      101 2023-01-15 120.50         201      4
2            2     Bob  45      102 2023-01-20  85.75         202      5
3            2     Bob  45      103 2023-02-10 200.00         202      5
4            3 Charlie  28      104 2023-01-05  45.99         203      3
5            3 Charlie  28      104 2023-01-05  45.99         204      4
6            3 Charlie  28      105 2023-02-15  75.25         203      3
7            3 Charlie  28      105 2023-02-15  75.25         204      4
8            3 Charlie  28      106 2023-03-20 150.00         203      3
9            3 Charlie  28      106 2023-03-20 150.00         204      4
10           4   Diana  36      107 2023-02-25  95.50         205      5
11           5  Edward  52      108 2023-03-10 210.25         206      4
   feedback_date
1     2023-01-20
2     2023-01-25
3     2023-01-25
4     2023-01-10
5     2023-02-20
6     2023-01-10
7     2023-02-20
8     2023-01-10
9     2023-02-20
10    2023-03-01
11    2023-03-15
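This example demonstrates how to use reduce() to join multiple dataframes sequentially: each step takes the accumulated result and left-joins the next dataframe onto it. Note that rows multiply when the key repeats on both sides. Charlie has three orders and two feedback entries, so he appears in six rows, and dplyr 1.1.0 and later warns about the many-to-many relationship in the second join. This pattern is particularly useful in data integration tasks where you need to combine several data sources that share a common identifier.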
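When every table uses the same key, the per-table by bookkeeping is not strictly needed. A more compact variant is sketched below. The tables tbl_a, tbl_b and tbl_c and their columns are hypothetical stand-ins, kept tiny and with unique keys so the result is easy to check by hand:

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in tables sharing an `id` key (unique within each table)
tbl_a <- data.frame(id = c(1, 2, 3), x = c("p", "q", "r"))
tbl_b <- data.frame(id = c(1, 2), y = c(10, 20))
tbl_c <- data.frame(id = c(2, 3), z = c(TRUE, FALSE))

# Fold the list into one table; `acc` holds the result of the joins so far
joined <- reduce(
  list(tbl_a, tbl_b, tbl_c),
  function(acc, df) left_join(acc, df, by = "id")
)
```

Because the first list element serves as the starting value, no .init is required here. Rows of tbl_a without a match in a later table are kept, with NA in the columns that table contributes.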