I am trying to join two datasets. It should be a basic left_join, but every time I try, the new column (the one I want to merge in) ends up completely filled with NA. This is not a column class issue: I have changed the classes so they all match. There is no white space; I have trimmed it all. All the column names match. I cannot figure out for the life of me what the error is here. As an example, the two datasets look roughly like this:
date | hour | site | PNC_stat |
---|---|---|---|
2021-03-03 | 0 | Chelsea | 19203.2 |
2021-03-03 | 1 | Chelsea | 72837.2 |
2021-03-03 | 2 | Chelsea | 23683.1 |
2021-03-03 | 0 | Winthrop | 27728.2 |
2021-03-03 | 1 | Winthrop | 8374728 |
and the dataset to merge:
date | hour | site | PNC_mob |
---|---|---|---|
2021-03-03 | 0 | Chelsea | 1837238.5 |
2021-03-03 | 1 | Chelsea | 2314.2 |
2021-03-03 | 2 | Chelsea | 283147.2 |
2021-03-03 | 0 | Winthrop | 9385.3 |
2021-03-03 | 1 | Winthrop | 83934.2 |
This basic code should do the trick:
all1 <- right_join(all_stat, mob, by=c("NEAR_SITE", "date", "hour"))
And yet either the entire PNC_mob column is appended as NA, OR sometimes the PNC_mob column has values ONLY for one site group (i.e. all the Chelsea rows have this column filled in, but the others say NA).
Please tell me what I am doing wrong here, I have used this function in the past with no issue.
For context, the two data frames do not have the same number of rows, so I need the rows of the first df to repeat for all of their matches in the second df, but again this has always worked with a basic left_join.
Your example data has a column named site, but your code tries to join by "NEAR_SITE".
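
To illustrate, here is a minimal sketch that rebuilds the two example tables from the question and joins them on the column name they actually share. The values are copied from the tables above; the column types (Date for date, numeric for hour) and the helper names shared_keys and unmatched are my own assumptions, so adjust them to your real data.

```r
library(dplyr)

# Example data copied from the tables in the question.
# Column types are assumed: Date for `date`, numeric for `hour`.
all_stat <- tibble(
  date     = as.Date("2021-03-03"),
  hour     = c(0, 1, 2, 0, 1),
  site     = c("Chelsea", "Chelsea", "Chelsea", "Winthrop", "Winthrop"),
  PNC_stat = c(19203.2, 72837.2, 23683.1, 27728.2, 8374728)
)

mob <- tibble(
  date    = as.Date("2021-03-03"),
  hour    = c(0, 1, 2, 0, 1),
  site    = c("Chelsea", "Chelsea", "Chelsea", "Winthrop", "Winthrop"),
  PNC_mob = c(1837238.5, 2314.2, 283147.2, 9385.3, 83934.2)
)

# Which column names do the two data frames really have in common?
shared_keys <- intersect(names(all_stat), names(mob))
print(shared_keys)

# Join on the shared key columns, using `site` rather than "NEAR_SITE".
all1 <- left_join(all_stat, mob, by = c("site", "date", "hour"))

# Rows of all_stat that found no partner in mob; ideally this is empty.
unmatched <- anti_join(all_stat, mob, by = c("site", "date", "hour"))
print(unmatched)
```

If one of your real data frames calls the column NEAR_SITE and the other calls it site, a named vector such as by = c("NEAR_SITE" = "site", "date", "hour") tells dplyr which columns correspond. If anti_join still returns rows after that, comparing unique(all_stat$site) with unique(mob$site) (and likewise for date and hour) should show which key values differ between the two data frames.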