Shuffle Service #5976

Closed
wants to merge 101 commits into from

Commits on Mar 13, 2022

  1. 9deb8c6
  2. Add MultiFile prototype

    mrocklin committed Mar 13, 2022
    3f62911
  3. 626eda0

Commits on Mar 14, 2022

  1. Add buffered comms

    mrocklin committed Mar 14, 2022
    7f48f77
  2. Move multi files to shuffle/

    mrocklin committed Mar 14, 2022
    99bb283

Commits on Mar 15, 2022

  1. add arrow

    Performance is good, still need to track down memory
    mrocklin committed Mar 15, 2022
    0d49ab9
  2. Handle buffers manually in multi_file

    This manages memory more smoothly.
    
    We still have issues though in that we're still passing around slices
    of arrow tables, which hold onto large references
    mrocklin committed Mar 15, 2022
    1e1311d
  3. Pass around only bytes

    This helps to reduce lots of extra unmanaged memory
    
    This flows pretty well right now.  I'm finding that it's useful to blend
    between the disk and comm buffer sizes.
    
    The abstractions in multi_file and multi_comm are getting a little bit
    worn down (it would be awkward to shift back to pandas), but maybe that's ok.
    mrocklin committed Mar 15, 2022
    b99329a
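
The two Mar 15 commits on buffer handling ("Handle buffers manually in multi_file" and "Pass around only bytes") describe holding shards as plain bytes and managing buffer sizes by hand. What follows is a minimal sketch of that idea, not the code in this PR: the `ShardBuffer` class, its `memory_limit` parameter, the `sink` callback, and the flush-the-largest-partition policy are all assumptions made for illustration. Each shard is copied into an independent `bytes` object so that no slice keeps a larger parent buffer alive, and the tracked total decides when to flush.

```python
from collections import defaultdict


class ShardBuffer:
    """Hypothetical buffer that holds shards as plain bytes, keyed by partition."""

    def __init__(self, memory_limit, sink):
        self.memory_limit = memory_limit   # max bytes to hold before flushing
        self.sink = sink                   # callable(partition_id, payload: bytes)
        self.shards = defaultdict(list)    # partition_id -> list of bytes objects
        self.sizes = defaultdict(int)      # partition_id -> buffered byte count
        self.total_size = 0                # bytes currently buffered overall

    def put(self, partition_id, payload):
        # Copy into an independent bytes object so that a slice (for example
        # a memoryview) does not keep its large parent buffer alive.
        data = bytes(payload)
        self.shards[partition_id].append(data)
        self.sizes[partition_id] += len(data)
        self.total_size += len(data)
        # Assumed policy: flush until we are back under the limit.
        while self.total_size > self.memory_limit:
            self.flush_largest()

    def flush_largest(self):
        # Hand the partition with the most buffered bytes to the sink.
        partition_id = max(self.sizes, key=self.sizes.get)
        payload = b"".join(self.shards.pop(partition_id))
        self.total_size -= self.sizes.pop(partition_id)
        self.sink(partition_id, payload)

    def flush(self):
        # Drain everything that is still buffered.
        while self.sizes:
            self.flush_largest()
```

Because the size accounting is explicit, the same structure could sit in front of either a disk writer or a comm, with a different `memory_limit` for each.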

Commits on Mar 16, 2022

  1. Clean up a few extra copies

    mrocklin committed Mar 16, 2022
    5781b7e
  2. Let comms continue without blocking on disk

    Isn't solid yet
    mrocklin committed Mar 16, 2022
    0ce6e01
  3. Move flush into multi_file.read

    This avoids a race
    mrocklin committed Mar 16, 2022
    3204997
  4. 8b11d6d
  5. Change configuration for smoother single-machine use

    We don't need a lot of comm buffer, and we also don't want more connections
    than machines (too much sitting in buffers).
    
    We also improve some printing
    mrocklin committed Mar 16, 2022
    62cc43d
  6. 5a64248
  7. Fix shard size accounting

    mrocklin committed Mar 16, 2022
    b901613
  8. 24092bb
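
The commit "Move flush into multi_file.read" (3204997) notes that flushing inside the read avoids a race, and "Let comms continue without blocking on disk" (0ce6e01) is about not stalling senders on writes. A minimal sketch of how those two goals can fit together, assuming an asyncio-based design: `BufferedFile`, its methods, and `demo` are hypothetical names, not the PR's `multi_file` API. `put` only appends to an in-memory list, and `read` always flushes first, so a reader cannot see a file that is missing buffered data.

```python
import asyncio
import os
import tempfile


class BufferedFile:
    """Hypothetical append-only file whose writes are buffered in memory."""

    def __init__(self, path):
        self.path = path
        self.pending = []            # payloads accepted but not yet on disk
        self._lock = asyncio.Lock()  # keeps concurrent flushes in order

    def put(self, payload):
        # Returns immediately, so the caller is never blocked on disk.
        self.pending.append(bytes(payload))

    def _append(self, payload):
        with open(self.path, "ab") as f:
            f.write(payload)

    async def flush(self):
        async with self._lock:
            while self.pending:
                payload = self.pending.pop(0)
                # Blocking write runs in a thread (asyncio.to_thread: Python 3.9+).
                await asyncio.to_thread(self._append, payload)

    async def read(self):
        # Flushing here, rather than relying on callers to flush first,
        # closes the window in which buffered data is not yet on disk.
        await self.flush()
        try:
            with open(self.path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return b""  # nothing was ever written


async def demo():
    with tempfile.TemporaryDirectory() as directory:
        f = BufferedFile(os.path.join(directory, "shard.bin"))
        f.put(b"abc")
        f.put(b"def")
        return await f.read()
```

Running `asyncio.run(demo())` returns the two payloads concatenated in the order they were put, even though neither was on disk when `read` was called.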

Commits on Mar 17, 2022

  1. Allow worker extensions to piggy-back on heartbeat

    To enable better diagnostics, it would be useful to allow worker
    extensions to piggy-back on the standard heartbeat.  This adds an
    optional "heartbeat" method to extensions, and, if present, calls a
    custom method that gets sent to the scheduler and processed by an
    extension of the same name.
    
    This also starts to store the extensions on the worker in a named
    dictionary.  Previously this was a list, but I'm not sure that it was
    actually used anywhere.  This is a breaking change without deprecation,
    but in a space that I suspect no one will care about.  I'm happy to
    provide a fallback if desired.
    mrocklin committed Mar 17, 2022
    7fbe4aa
  2. 89f6347
  3. Remove file cache

    mrocklin committed Mar 17, 2022
    bc81db8
  4. 27d2ab3
  5. Name scheduler extensions

    mrocklin committed Mar 17, 2022
    9fd6da0
  6. 8641abb
  7. fixup test

    mrocklin committed Mar 17, 2022
    34617db
  8. Add timing and diagnostics

    mrocklin committed Mar 17, 2022
    e1c0a4d
  9. fixup tests

    mrocklin committed Mar 17, 2022
    8c28b83
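
The first Mar 17 commit (7fbe4aa) describes the mechanism in prose: a worker extension may define an optional `heartbeat` method, its result travels with the standard heartbeat, and the scheduler extension registered under the same name processes it. A toy sketch of that dispatch, assuming nothing about the real classes: `ToyWorker`, `ToyScheduler`, the message layout, and the three extension classes are hypothetical and only model the name-based routing and the named dictionary of extensions.

```python
class ToyWorker:
    """Stand-in for a worker; extensions live in a dict keyed by name."""

    def __init__(self, address):
        self.address = address
        self.extensions = {}

    def heartbeat_message(self):
        # Only extensions that define the optional heartbeat() contribute.
        extensions = {
            name: ext.heartbeat()
            for name, ext in self.extensions.items()
            if hasattr(ext, "heartbeat")
        }
        return {"address": self.address, "extensions": extensions}


class ToyScheduler:
    """Stand-in for a scheduler; routes data to the extension of the same name."""

    def __init__(self):
        self.extensions = {}

    def handle_heartbeat(self, message):
        for name, data in message["extensions"].items():
            self.extensions[name].heartbeat(message["address"], data)


class ShuffleWorkerExtension:
    def __init__(self):
        self.bytes_written = 0

    def heartbeat(self):
        # Diagnostics to piggy-back on the regular heartbeat.
        return {"bytes_written": self.bytes_written}


class ShuffleSchedulerExtension:
    def __init__(self):
        self.latest = {}  # worker address -> most recent diagnostics

    def heartbeat(self, address, data):
        self.latest[address] = data


class QuietExtension:
    """Has no heartbeat method, so it is skipped."""
```

Keying extensions by name on both sides is what lets the scheduler find the matching handler without any extra registration step.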

Commits on Mar 18, 2022

  1. 29253b4
  2. Add back in manual addition of stealing extension

    Tests are failing.  I can't reproduce locally.  This is just blind
    praying that it fixes the problem.  It should be innocuous.
    mrocklin committed Mar 18, 2022
    1f79575
  3. cf9a939
  4. 6f2286e
  5. efedc04
  6. 9b7b03b
  7. 456de23
  8. make larger dashboard page

    mrocklin committed Mar 18, 2022
    aef2f61
  9. extend shuffling dashboard

    mrocklin committed Mar 18, 2022
    f79e923
  10. Don't offload file writes

    also remove printing
    mrocklin committed Mar 18, 2022
    1e0256f
  11. reduce comm memory limit

    mrocklin committed Mar 18, 2022
    97fb09c
  12. 76baf4b

Commits on Mar 19, 2022

  1. f58b2e9
  2. 9bc6ce6
  3. Merge branch 'heartbeat-extensions' of github.com:mrocklin/distributed into heartbeat-extensions

    mrocklin committed Mar 19, 2022
    57b4a42
  4. 1309c22
  5. c894f40
  6. Grey out unseen workers

    mrocklin committed Mar 19, 2022
    486320d
  7. flake8

    mrocklin committed Mar 19, 2022
    6419328
  8. remove old test

    This was the result of a bad merge conflict
    mrocklin committed Mar 19, 2022
    9e6aadc
  9. c202385
  10. cf51784
  11. bump y-axis, add kwargs

    mrocklin committed Mar 19, 2022
    c272458
  12. 1d114c2
  13. 4ba9923

Commits on Mar 21, 2022

  1. Remove errant print

    mrocklin committed Mar 21, 2022
    e7a5143
  2. b0cd7ae

Commits on Mar 22, 2022

  1. bd37f49
  2. 6e1af62
  3. 7776ecb
  4. 107b5a0
  5. clean up old methods

    mrocklin committed Mar 22, 2022
    dc8a7a4
  6. 6ef62a0
  7. Speed up tests

    mrocklin committed Mar 22, 2022
    8550a13
  8. 2e01aa8
  9. 4486584
  10. Update distributed/stealing.py

    Co-authored-by: Florian Jetter <fjetter@users.noreply.github.com>
    mrocklin and fjetter authored Mar 22, 2022
    01403b9
  11. use nonlocal

    mrocklin committed Mar 22, 2022
    97fdf2a
  12. Merge branch 'heartbeat-extensions' of github.com:mrocklin/distributed into heartbeat-extensions

    mrocklin committed Mar 22, 2022
    0bd1f89

Commits on Mar 23, 2022

  1. 182dc83
  2. Update distributed/shuffle/multi_file.py

    Co-authored-by: Ashwin Srinath <3190405+shwina@users.noreply.github.com>
    mrocklin and shwina authored Mar 23, 2022
    007ea90

Commits on Mar 25, 2022

  1. d27d0f3
  2. b50c61a
  3. dfc31fd
  4. cleanup hover

    mrocklin committed Mar 25, 2022
    ea000a3
  5. 4598577
  6. 2993643
  7. fe61116
  8. tests pass

    mrocklin committed Mar 25, 2022
    adccb02
  9. depend on pyarrow in CI

    mrocklin committed Mar 25, 2022
    3aabef0
  10. install dask@p2p-shuffle

    mrocklin committed Mar 25, 2022
    2af4974
  11. simplify dashboard charts

    mrocklin committed Mar 25, 2022
    1756fb2

Commits on Mar 28, 2022

  1. Move arrow utilities over to a separate file

    Also add pyarrow to precommit / mypy settings
    mrocklin committed Mar 28, 2022
    6694c84
  2. 154b21f

Commits on Mar 29, 2022

  1. 36956ef
  2. make multi_file tests pass

    mrocklin committed Mar 29, 2022
    2ab401a
  3. Add test for MultiComm

    mrocklin committed Mar 29, 2022
    90673d1

Commits on Mar 31, 2022

  1. Respond to feedback

    mrocklin committed Mar 31, 2022
    7d8954a
  2. 1796804
  3. 8de2793
  4. Python 3.10 (dask#5952)

    graingert authored and mrocklin committed Mar 31, 2022
    caa852f
  5. Cluster Dump SchedulerPlugin (dask#5983)

    Add SchedulerPlugin to dump state on cluster close
    
    This also adds a new method to SchedulerPlugins that runs directly before closing time
    sjperkins authored and mrocklin committed Mar 31, 2022
    f3fb682
  6. 0a1761d
  7. Update gpuCI RAPIDS_VER to 22.06 (dask#5962)

    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    github-actions[bot] authored and mrocklin committed Mar 31, 2022
    7cdb56f
  8. d0afbb1
  9. Remove support for PyPy (dask#6029)

    jrbourbeau authored and mrocklin committed Mar 31, 2022
    a74fd38
  10. Make test_reconnect async (dask#6000)

    This was flaky due to cleaning up resources.
    
    My experience is that making things async helps with this in general.
    I don't have strong confidence that this will fix the issue, but I do
    have mild confidence, and strong confidence that it won't hurt.
    mrocklin committed Mar 31, 2022
    bde718f
  11. dd857b8
  12. Add test for bad disk

    mrocklin committed Mar 31, 2022
    9efb27c
  13. 69bed31
  14. ec71091
  15. cleanup files properly

    mrocklin committed Mar 31, 2022
    016ed25
  16. cleanup extra futures

    mrocklin committed Mar 31, 2022
    8ee5605
  17. 1c3bfb9
  18. fa235ee

Commits on Apr 8, 2022

  1. df4348f