Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should desc() use vec_rank() internally on character vectors? #7045

Open
DavisVaughan opened this issue Jun 26, 2024 · 0 comments
Open

Should desc() use vec_rank() internally on character vectors? #7045

DavisVaughan opened this issue Jun 26, 2024 · 0 comments

Comments

@DavisVaughan
Copy link
Member

See #7044

In particular, note that arrange(df, x) will sort x using the C locale if it is a character vector. But arrange(df, -desc(x)) (i.e. invert the desc() call, giving you the original order in theory) will sort x using the user's locale.

Normally a call like desc(x) is recognized and we don't even actually call desc() under the hood, we translate it to a "desc" value for the directions argument of vec_order_radix(), but in this case the - interferes and we actually evaluate the call.

That ends up calling desc() which does -xtfrm(x), and xtfrm() ends up using base::order(x), utilizing the user's locale.

I don't think we should remove usage of xtfrm() in desc(), since that is a generic that people have probably written S3 methods for, but maybe we can have special behavior for unclassed character vectors where it utilized vec_rank() instead (which uses the C locale)? It would not be a perfect fix, but it may be good enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
1 participant