I am aiming to plot a choropleth map (plotted using the polygons from a geodatabase) that will count the number of instances per region (wildlife management units) from a separate data frame (currently saved as xlsx but can be changed). Ideally it would then be able to then filter the data based on timeframe, species, method, etc. Both the data and geodatabase have the same column title for the wmu's, WMUNIT_CODE.
I can plot the geodatabase using this code:
library(sf)
library(ggplot2)
library(dplyr)
library(readxl)
alberta <- st_read("data/bf_geoadmin_21-06-2024.gdb")
alberta_layers <- st_layers(dsn = "data/bf_geoadmin_21-06-2024.gdb")
wmu <- st_read("data/bf_geoadmin_21-06-2024.gdb", layer = "WildlifeManagementUnit")
plot(wmu)
However, all the tutorials I've been able to find suggest merging or joining the data. Whenever I try to do so it doesn't work as the is more than one entry per WMUNIT_CODE in the xlsx. Is there a way to get r to count the instances for each WMU code?
I have tried the following code along with many other failed attempts.
cwd_data <- read_excel("data/cwd_data.xlsx")
View(cwd_data)
alberta_merged <- wmu %>%
left_join(cwd, by = c("WMUNIT_CODE" = "WMUNIT_CODE")) %>%
mutate(nb_equip = ifelse(is.na(nb_equip), 0.01, nb_equip))
xlsx head
cwd_data %>% head(n = 30) %>% dput()
structure(list(
`Case Number` = c("95", "96", "97", "98", "99",
"100", "101", "102", "103", "104", "105", "106", "107", "108",
"109", "110", "111", "112", "113", "114", "115", "116", "117",
"118", "119", "120", "121", "122", "123", "124"),
`Method of Detection` = c("Found dead (road kill)",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance",
"Hunter surveillance", "Hunter surveillance"),
`Date Killed
(dd,mm,yyyy)` = c("4/11/2011",
"5/11/2011", "2/11/2011", "1/11/2011", "3/11/2011", "12/11/2011",
"12/11/2011", "11/11/2011", "10/11/2011", "5/11/2011", "19/11/2011",
"9/11/2011", "3/11/2011", "24/11/2011", "21/11/2011", "12/11/2011",
"5/12/2011", "30/11/2011", "12/11/2011", "11/11/2011", "2/12/2011",
"16/11/2011", "24/11/2011", "6/12/2011", "3/12/2011", "21/11/2011",
"17/12/2011", "2/12/2011", "1/12/2011", "21/10/2011"),
Year = c("2011",
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011",
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011",
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011",
"2011", "2011", "2011", "2011", "2011"),
WMUNIT_CODE = c("00234",
"00203", "00203", "00202", "00202", "00200", "00150", "00202",
"00163", "00150", "00119", "00200", "00163", "00151", "00203",
"00202", "00730", "00150", "00150", "00150", "00151", "00150",
"00234", "00728", "00150", "00202", "00202", "00150", "00728",
"00150"),
Species = c("Mule Deer", "Mule Deer", "Mule Deer",
"Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer",
"Mule Deer", "Mule deer", "White-tail deer", "Mule deer", "Mule deer",
"Mule deer", "Mule deer", "Mule Deer", "Mule Deer", "White-tail Deer",
"Mule Deer", "Mule Deer", "White-tail Deer", "Mule Deer", "Mule Deer",
"Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer",
"Mule Deer", "Mule Deer"),
Sex = c("Female", "Male", "Female",
"Female", "Female", "Female", "Male", "Female", "Male", "Male",
"Male", "Female", "Male", "Male", "Male", "Female", "Male", "Male",
"Female", "Male", "Male", "Male", "Male", "Female", "Male", "Male",
"Female", "Male", "Male", "Male"), Age = c("Adult", "Yearling",
"Adult", "Adult", "Adult", "Adult", "Adult", "Yearling", "Adult",
"Adult", "Yearling", "Adult", "Adult", "Adult", "Adult", "Adult",
"Yearling", "Adult", "Adult", "Adult", "Adult", "Adult", "Adult",
"Adult", "Adult", "Adult", "Adult", "Adult", "Adult", "Adult"
)),
row.names = c(NA, -30L), class = c("tbl_df", "tbl", "data.frame"
))
I tried all of your suggestions and kept getting error responses or being unable to plot after joining the data. This lead me to discover that for a few of the primary key WMUNIT_CODE in the wmu geodatabase there are several multipolygons per WMUNIT_CODE. Annoyingly, each of them have different shape details. Would I be correct in assuming this is what is causing issues? Should I try to edit the geodatabase to remove the extras? How would you do that? Would it be possible to join the extras together?
wmu |> count(WMUNIT_CODE) |> filter(n > 1)
Simple feature collection with 3 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -115.2175 ymin: 49.03546 xmax: -110.8206 ymax: 52.85801
Geodetic CRS: NAD83
WMUNIT_CODE n SHAPE
1 00718 6 MULTIPOLYGON (((-111.6199 4...
2 00728 3 MULTIPOLYGON (((-110.8684 5...
3 00794 3 MULTIPOLYGON (((-115.1348 5...
Here's a few ways I've tried to group them:
#Method 1
wmu_grouped<-wmu[,.("WMUNIT_CODE$00718"=st_union(WMUNIT_CODE$00718)), by=Id]
Error: unexpected numeric constant in "wmu_grouped<-wmu[,.("WMUNIT_CODE$00718"=st_union(WMUNIT_CODE$00718"
#Method 2
wmu_grouped <- rbind(188,189,190,191,192,193)
plot(wmu) # This just created a separate table of the 6 rows with no data.
wmu_grouped <- rbind(wmu([188,],[189,],[190,],[191,],[192,],[193,]))
Error: unexpected '[' in "wmu_grouped <- rbind(wmu(["
wmu_grouped <- rbind((wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error: unexpected ',' in "wmu <- rbind((wmu[188,],"
#Method 3 - I think this is the closest to working but it didn't change the quantity of entries for that WMUNIT_CODE
st_union(wmu[188,],(wmu[189,]))
Simple feature collection with 1 feature and 8 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -111.8976 ymin: 49.09596 xmax: -111.7863 ymax: 49.11547
Geodetic CRS: NAD83
WMUNIT_NAME WMUNIT_CODE SHAPE_Length SHAPE_Area WMUNIT_NAME.1 WMUNIT_CODE.1
188 Writing-On-Stone Provincial Park 00718 0.07247036 0.0001988734 Writing-On-Stone Provincial Park 00718
SHAPE_Length.1 SHAPE_Area.1 SHAPE
188 0.162602 0.0005893379 MULTIPOLYGON (((-111.7975 4...
Warning message:
attribute variables are assumed to be spatially constant throughout all geometries
#Method 4 - trying to build upon Method 3
wmu_grouped<- wmu |> st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported
wmu_grouped<- wmu %>% st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported
st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported
wmu %>% left_join(cwd |> count(WMUNIT_CODE))
might do what you want? To the extent you want to do filtering by time, species, etc. it would be done before thecount
step. Alternatively, do the join as you've been doing and add the count afterward.wmu_grouped <- wmu |> group_by(WMUNIT_CODE) |> summarise()
. Or If you preferer base syntax, you can useaggregate()
instead. I'd also consider browsing thoughsf
Vignettes and perhaps e-books linked at r-spatial.github.io/sf/… , relevant geocompx chpt - r.geocompx.org/attr#vector-attribute-aggregationaggregate(rep(1,nrow(cwd_data)), list(cwd_data$WMUNIT_CODE), sum)
ORtable(cwd_data$WMUNIT_CODE)