I am trying to visualize several sets of data to show how many elements the groups have in common.

I was thinking of doing something similar to a Venn diagram, but for this representation there is a catch.

Here is a very simplified example of my problem. Say I have this list:

my_list <- list(c("A", "A", "A", "B", "B"), c("A", "A", "A", "C"), c("A", "A", "A", "D"))

I can represent it with a Venn diagram as follows:

library(VennDiagram)
display_venn <- function(x, a_category){
  grid.newpage()
  venn_object <- venn.diagram(x, category.names = a_category, filename = NULL)
  grid.draw(venn_object)
}
 
display_venn(my_list, a_category = c("set1", "set2", "set3"))

The output of this is:

[Image: Venn diagram output]

This is because each element in a vector is counted ONLY ONCE.
So it is as if the sets were ("A", "B"), ("A", "C"), and ("A", "D").

The problem is that this is not what I want.
I need a representation that shows the number of elements, duplicates included:
For set1: 3 in common, 2 alone.
For set2: 3 in common, 1 alone.
For set3: 3 in common, 1 alone.

Is there a "Venn diagram"-like library (at this point I am not even sure this is a Venn representation) that deals with duplicates?
Thanks for any help.

1 Answer

You need to change duplicate elements so that they're unique within each vector, but consistent across vectors. One approach is to number duplicates sequentially within each vector (e.g., your first vector would become c("A1", "A2", "A3", "B1", "B2")).

library(VennDiagram)

my_list |>
  lapply(\(x) paste0(sort(x), sequence(table(x)))) |>
  display_venn(a_category = c("set1", "set2", "set3"))

Created on 2024-03-01 with reprex v2.0.2
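
To check the counts without drawing anything, here is a minimal base R sketch of the same relabelling. The names relabelled, common, and alone are only illustrative, and "alone" here means "not shared by all three sets", which matches this example because no element is shared by just two sets.

```r
# Same data as in the question
my_list <- list(
  c("A", "A", "A", "B", "B"),
  c("A", "A", "A", "C"),
  c("A", "A", "A", "D")
)

# Number duplicates within each vector: "A", "A", "A" -> "A1", "A2", "A3"
relabelled <- lapply(my_list, \(x) paste0(sort(x), sequence(table(x))))

# Elements present in every set after relabelling
common <- Reduce(intersect, relabelled)

# Elements of each set that are not in that common core
alone <- lapply(relabelled, setdiff, common)

length(common)  # 3
lengths(alone)  # 2 1 1
```

These are the numbers the question asks for: 3 in common for every set, and 2, 1, and 1 elements alone.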
