Fixes for Ray-based shuffle #29

Merged 6 commits on Mar 16, 2023
@franklsf95 franklsf95 commented Mar 13, 2023

Things fixed:

  • PyResultSet now accepts a list of bytes returned as a worker's execution result.
  • Each reducer now receives the correct shuffle partitions.
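
A minimal sketch of the partition-routing fix, for illustration only. It assumes each mapper produces a list with one output per shuffle partition. The function name `select_shuffle_inputs` and the `mapper_outputs` layout are hypothetical and are not taken from this PR. Only the `concurrency == 1 or j == part` check mirrors the reviewed snippet.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_shuffle_inputs(
    mapper_outputs: Sequence[Sequence[T]], part: int, concurrency: int
) -> List[T]:
    """Collect the shuffle outputs that reducer `part` should read.

    Assumption: mapper_outputs[i][j] is what mapper i wrote for shuffle
    partition j. This layout is hypothetical, chosen for the example.
    """
    selected: List[T] = []
    for lst in mapper_outputs:
        for j, f in enumerate(lst):
            # A single reducer reads every shuffle partition; otherwise
            # reducer `part` reads only the partition with its own index.
            if concurrency == 1 or j == part:
                selected.append(f)
    return selected


outputs = [["m0p0", "m0p1"], ["m1p0", "m1p1"]]
print(select_shuffle_inputs(outputs, part=1, concurrency=2))
print(select_shuffle_inputs(outputs, part=0, concurrency=1))
```

With two reducers, reducer 1 picks up only the partition-1 output of every mapper. With a single reducer, it picks up everything.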

All queries pass except:

  • q15: views are not supported (known issue)

Tested on:

  • sf=1, 2 workers, single node
  • sf=1, 4 workers, single node
@franklsf95 franklsf95 changed the title [WIP] Bug Fixes for Ray-based shuffle Mar 13, 2023
@franklsf95 franklsf95 changed the title [WIP] Fixes for Ray-based shuffle Mar 14, 2023
if isinstance(lst, list):
    for j, f in enumerate(lst):
        if concurrency == 1 or j == part:
            # If concurrency is 1, pass in all shuffle partitions. Otherwise,

@andygrove Does this make sense?

@andygrove andygrove merged commit 177790a into datafusion-contrib:main Mar 16, 2023