How to create a 'count' choropleth map? Large dataframe with many different values overlain on separate geodatabase

Question

I am aiming to plot a choropleth map (plotted using the polygons from a geodatabase) that will count the number of instances per region (wildlife management units) from a separate data frame (currently saved as xlsx but can be changed). Ideally it would then be able to then filter the data based on timeframe, species, method, etc. Both the data and geodatabase have the same column title for the wmu's, WMUNIT_CODE.

section of the dataframe wmu geodatabase plotted

I can plot the geodatabase using this code:

library(sf)
library(ggplot2)
library(dplyr)
library(readxl)
 
alberta <- st_read("data/bf_geoadmin_21-06-2024.gdb")

 alberta_layers <- st_layers(dsn = "data/bf_geoadmin_21-06-2024.gdb")
 
wmu <- st_read("data/bf_geoadmin_21-06-2024.gdb", layer = "WildlifeManagementUnit")
 
plot(wmu)

However, all the tutorials I've been able to find suggest merging or joining the data. Whenever I try to do so it doesn't work as the is more than one entry per WMUNIT_CODE in the xlsx. Is there a way to get r to count the instances for each WMU code?

I have tried the following code along with many other failed attempts.

 cwd_data <- read_excel("data/cwd_data.xlsx")
 
View(cwd_data)

 alberta_merged <- wmu %>%
     left_join(cwd, by = c("WMUNIT_CODE" = "WMUNIT_CODE")) %>%
     mutate(nb_equip = ifelse(is.na(nb_equip), 0.01, nb_equip))

xlsx head

cwd_data %>% head(n = 30) %>% dput()
structure(list(
`Case Number` = c("95", "96", "97", "98", "99", 
"100", "101", "102", "103", "104", "105", "106", "107", "108", 
"109", "110", "111", "112", "113", "114", "115", "116", "117", 
"118", "119", "120", "121", "122", "123", "124"), 
`Method of Detection` = c("Found dead (road kill)", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance", "Hunter surveillance", 
"Hunter surveillance", "Hunter surveillance"), 
`Date Killed
(dd,mm,yyyy)` = c("4/11/2011", 
"5/11/2011", "2/11/2011", "1/11/2011", "3/11/2011", "12/11/2011", 
"12/11/2011", "11/11/2011", "10/11/2011", "5/11/2011", "19/11/2011", 
"9/11/2011", "3/11/2011", "24/11/2011", "21/11/2011", "12/11/2011", 
"5/12/2011", "30/11/2011", "12/11/2011", "11/11/2011", "2/12/2011", 
"16/11/2011", "24/11/2011", "6/12/2011", "3/12/2011", "21/11/2011", 
"17/12/2011", "2/12/2011", "1/12/2011", "21/10/2011"), 
Year = c("2011", 
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2011", "2011", "2011", "2011", "2011", "2011", "2011", "2011", 
"2011", "2011", "2011", "2011", "2011"), 
WMUNIT_CODE = c("00234", 
"00203", "00203", "00202", "00202", "00200", "00150", "00202", 
"00163", "00150", "00119", "00200", "00163", "00151", "00203", 
"00202", "00730", "00150", "00150", "00150", "00151", "00150", 
"00234", "00728", "00150", "00202", "00202", "00150", "00728", 
"00150"), 
Species = c("Mule Deer", "Mule Deer", "Mule Deer", 
"Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", 
"Mule Deer", "Mule deer", "White-tail deer", "Mule deer", "Mule deer", 
"Mule deer", "Mule deer", "Mule Deer", "Mule Deer", "White-tail Deer", 
"Mule Deer", "Mule Deer", "White-tail Deer", "Mule Deer", "Mule Deer", 
"Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", "Mule Deer", 
"Mule Deer", "Mule Deer"), 
Sex = c("Female", "Male", "Female", 
"Female", "Female", "Female", "Male", "Female", "Male", "Male", 
"Male", "Female", "Male", "Male", "Male", "Female", "Male", "Male", 
"Female", "Male", "Male", "Male", "Male", "Female", "Male", "Male", 
"Female", "Male", "Male", "Male"), Age = c("Adult", "Yearling", 
"Adult", "Adult", "Adult", "Adult", "Adult", "Yearling", "Adult", 
"Adult", "Yearling", "Adult", "Adult", "Adult", "Adult", "Adult", 
"Yearling", "Adult", "Adult", "Adult", "Adult", "Adult", "Adult", 
"Adult", "Adult", "Adult", "Adult", "Adult", "Adult", "Adult"
)), 
row.names = c(NA, -30L), class = c("tbl_df", "tbl", "data.frame"
))

I tried all of your suggestions and kept getting error responses or being unable to plot after joining the data. This lead me to discover that for a few of the primary key WMUNIT_CODE in the wmu geodatabase there are several multipolygons per WMUNIT_CODE. Annoyingly, each of them have different shape details. Would I be correct in assuming this is what is causing issues? Should I try to edit the geodatabase to remove the extras? How would you do that? Would it be possible to join the extras together?

wmu |> count(WMUNIT_CODE) |> filter(n > 1)
Simple feature collection with 3 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -115.2175 ymin: 49.03546 xmax: -110.8206 ymax: 52.85801
Geodetic CRS:  NAD83
  WMUNIT_CODE n                          SHAPE
1       00718 6 MULTIPOLYGON (((-111.6199 4...
2       00728 3 MULTIPOLYGON (((-110.8684 5...
3       00794 3 MULTIPOLYGON (((-115.1348 5...

Here's a few ways I've tried to group them:

#Method 1
wmu_grouped<-wmu[,.("WMUNIT_CODE$00718"=st_union(WMUNIT_CODE$00718)), by=Id]
Error: unexpected numeric constant in "wmu_grouped<-wmu[,.("WMUNIT_CODE$00718"=st_union(WMUNIT_CODE$00718"

#Method 2
wmu_grouped <- rbind(188,189,190,191,192,193)
plot(wmu) # This just created a separate table of the 6 rows with no data.
wmu_grouped <- rbind(wmu([188,],[189,],[190,],[191,],[192,],[193,]))
Error: unexpected '[' in "wmu_grouped <- rbind(wmu(["
wmu_grouped <- rbind((wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error: unexpected ',' in "wmu <- rbind((wmu[188,],"

#Method 3 - I think this is the closest to working but it didn't change the quantity of entries for that WMUNIT_CODE
st_union(wmu[188,],(wmu[189,]))
Simple feature collection with 1 feature and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -111.8976 ymin: 49.09596 xmax: -111.7863 ymax: 49.11547
Geodetic CRS:  NAD83
                         WMUNIT_NAME WMUNIT_CODE SHAPE_Length   SHAPE_Area                    WMUNIT_NAME.1 WMUNIT_CODE.1
188 Writing-On-Stone Provincial Park       00718   0.07247036 0.0001988734 Writing-On-Stone Provincial Park         00718
    SHAPE_Length.1 SHAPE_Area.1                          SHAPE
188       0.162602 0.0005893379 MULTIPOLYGON (((-111.7975 4...
Warning message:
attribute variables are assumed to be spatially constant throughout all geometries 

#Method 4 - trying to build upon Method 3
wmu_grouped<-  wmu |> st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported
wmu_grouped<- wmu %>% st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported
st_union(wmu[188,],(wmu[189,]),(wmu[190,]),(wmu[191,]),(wmu[192,]),(wmu[193,]))
Error in Ops.sfc(left, right) : operation > not supported

wmu %>% left_join(cwd |> count(WMUNIT_CODE)) might do what you want? To the extent you want to do filtering by time, species, etc. it would be done before the count step. Alternatively, do the join as you've been doing and add the count afterward. — Jon Spring, Commented Jun 25 at 18:43
It's difficult to provide an answer without having access to your dataset, but you could try regular dplyr-style aggregation, sf will union those shapes for you: wmu_grouped <- wmu |> group_by(WMUNIT_CODE) |> summarise() . Or If you preferer base syntax, you can use aggregate() instead. I'd also consider browsing though sf Vignettes and perhaps e-books linked at r-spatial.github.io/sf/… , relevant geocompx chpt - r.geocompx.org/attr#vector-attribute-aggregation — margusl, Commented Jun 26 at 19:52
How about aggregate(rep(1,nrow(cwd_data)), list(cwd_data$WMUNIT_CODE), sum) OR table(cwd_data$WMUNIT_CODE) — G5W, Commented Jul 5 at 23:26

Collectives™ on Stack Overflow

How to create a 'count' choropleth map? Large dataframe with many different values overlain on separate geodatabase

0

Browse other questions tagged
r
dataframe
geospatial
r-sf
choropleth
or ask your own question.

Hot Network Questions

Collectives™ on Stack Overflow

0

Know someone who can answer? Share a link to this question via email, Twitter, or Facebook.

Browse other questions tagged rdataframegeospatialr-sfchoropleth or ask your own question.

Browse other questions tagged
r
dataframe
geospatial
r-sf
choropleth
or ask your own question.