Feature Request: Reduce Boilerplate to Silence Many-to-Many Join Warnings in dplyr #6993
Comments
I'm a user of dplyr. IMHO you're right. It's discussed in this closed issue:
We realise that this behaviour might be annoying for folks who really depend a lot on many to many joins, but our experience is that these warnings are useful for the overwhelming majority of dplyr users, and we don't have any intention to change the default behaviour at this time. If you find this really frustrating, I'd suggest making a couple of your own little helpers like this:
That will reduce the amount of typing you need to do while still keeping your code compact and easy to understand.
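For readers following along, a minimal sketch of what such helpers could look like. The names `left_join_mm()` and `inner_join_mm()` are hypothetical and not part of dplyr; the only dplyr feature relied on is the `relationship` argument of the join functions (dplyr 1.1.0 or later).

```r
library(dplyr)

# Hypothetical helper names; not part of dplyr.
# Each one forwards its arguments and declares the relationship up front,
# so dplyr does not warn about a many-to-many match.
left_join_mm <- function(x, y, ...) {
  left_join(x, y, ..., relationship = "many-to-many")
}

inner_join_mm <- function(x, y, ...) {
  inner_join(x, y, ..., relationship = "many-to-many")
}
```

Usage would then be `left_join_mm(x, y, by = "id")` in place of the longer call.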
Thanks @hadley, this is an acceptable solution, but I still do not love it. I am just curious whether you have any references for the "overwhelming majority" you're referring to here.
Brief description of the problem:
Conducting exploratory (i.e. not production-environment) data analysis often requires multiple many-to-many joins. dplyr's current behavior emits a warning for each join that does not explicitly specify `relationship = "many-to-many"`, and these warnings significantly clutter the console output.

This behavior introduces verbosity into the exploratory analysis process, where "many-to-many" relationships are expected, anticipated, and managed. The repeated need to specify the relationship parameter for each join operation to avoid these warnings is cumbersome and detracts from the efficiency of using dplyr as an exploration tool.

An option to globally silence these warnings would streamline exploratory data analyses, allowing for a focus on substantive inquiry and result interpretation.
Desired output:
A global option in dplyr to silence warnings for many-to-many joins, enhancing the experience by reducing repetitive boilerplate and focusing on relevant outputs.
Reprex:
In the above reprex, joining objects that have duplicate keys results in a many-to-many join, triggering a warning that is sometimes longer than the output itself. The proposed feature would allow such operations without the need to explicitly suppress the warning for each one, assuming a global setting had been enabled.
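A minimal sketch of the situation described, using made-up data frames `x` and `y` (illustrative only, and assuming dplyr 1.1.0 or later, where the `relationship` argument exists):

```r
library(dplyr)

# Made-up data: the key `id` is duplicated on both sides.
x <- tibble(id = c(1, 1, 2), x_val = c("a", "b", "c"))
y <- tibble(id = c(1, 1, 2), y_val = c("p", "q", "r"))

# Emits the many-to-many warning: id == 1 has several rows in both inputs.
warned <- left_join(x, y, by = "id")

# Same rows, no warning, because the relationship is declared explicitly.
quiet <- left_join(x, y, by = "id", relationship = "many-to-many")
```

Both calls return the same rows; the only difference is whether the warning is printed.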
Hypothetical implementation 1
Introduce a much faster-to-type alias, e.g. "rel" for "relationship" and "mm" for "many-to-many", with similar short codes for the other relationships.
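This can be approximated in user code today. A rough sketch, assuming a hypothetical wrapper `left_join_rel()` and hypothetical short codes (`"11"`, `"1m"`, `"m1"`, `"mm"`); dplyr itself only accepts the long names:

```r
library(dplyr)

# Hypothetical short codes mapped to the values dplyr's `relationship` accepts.
rel_codes <- c(
  "11" = "one-to-one",
  "1m" = "one-to-many",
  "m1" = "many-to-one",
  "mm" = "many-to-many"
)

# Hypothetical wrapper: `rel` is a short code, or NULL for dplyr's default.
left_join_rel <- function(x, y, ..., rel = NULL) {
  if (!is.null(rel)) {
    # Translate the short code; an unknown code raises an error.
    rel <- unname(rel_codes[match.arg(rel, names(rel_codes))])
  }
  left_join(x, y, ..., relationship = rel)
}
```

With this in place, `left_join_rel(x, y, by = "id", rel = "mm")` stands in for the full `relationship = "many-to-many"` spelling.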
Hypothetical implementation 2
Introduce an option to silence these warnings for a session.
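Until something like this exists in dplyr, the effect can be sketched in user code. The option name `my.quiet_many_to_many` and the wrapper `left_join_opt()` below are hypothetical; dplyr does not read this option, only the wrapper does.

```r
library(dplyr)

# Hypothetical wrapper: if the caller did not set `relationship` and the
# (hypothetical) session option is TRUE, declare the join many-to-many.
left_join_opt <- function(x, y, ..., relationship = NULL) {
  if (is.null(relationship) &&
      isTRUE(getOption("my.quiet_many_to_many", FALSE))) {
    relationship <- "many-to-many"
  }
  left_join(x, y, ..., relationship = relationship)
}

# Opt in once per session, e.g. at the top of an exploratory script.
options(my.quiet_many_to_many = TRUE)
```

An explicit `relationship` argument still takes precedence, so a stricter check such as `"one-to-one"` can be requested for individual joins even while the option is on.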