r - How to join (merge) data frames (inner, outer, left, right)? -


given 2 data frames:

df1 = data.frame(customerid = c(1:6), product = c(rep("toaster", 3), rep("radio", 3))) df2 = data.frame(customerid = c(2, 4, 6), state = c(rep("alabama", 2), rep("ohio", 1)))  df1 #  customerid product #           1 toaster #           2 toaster #           3 toaster #           4   radio #           5   radio #           6   radio  df2 #  customerid   state #           2 alabama #           4 alabama #           6    ohio 

how can database style, i.e., sql style, joins? is, how get:

  • an inner join of df1 , df2:
    return rows in left table have matching keys in right table.
  • an outer join of df1 , df2:
    returns rows both tables, join records left have matching keys in right table.
  • a left outer join (or left join) of df1 , df2
    return rows left table, , rows matching keys right table.
  • a right outer join of df1 , df2
    return rows right table, , rows matching keys left table.

extra credit:

how can sql style select statement?

by using merge function , optional parameters:

inner join: merge(df1, df2) work these examples because r automatically joins frames common variable names, want specify merge(df1, df2, = "customerid") make sure matching on fields desired. can use by.x , by.y parameters if matching variables have different names in different data frames.

outer join: merge(x = df1, y = df2, = "customerid", = true)

left outer: merge(x = df1, y = df2, = "customerid", all.x = true)

right outer: merge(x = df1, y = df2, = "customerid", all.y = true)

cross join: merge(x = df1, y = df2, = null)

just inner join, want explicitly pass "customerid" r matching variable. think it's best explicitly state identifiers on want merge; it's safer if input data.frames change unexpectedly , easier read later on.


Comments

Popular posts from this blog

aws api gateway - SerializationException in posting new Records via Dynamodb Proxy Service in API -

asp.net - Problems sending emails from forum -