This issue was moved to a discussion.

Hydroflow 1.0 Roadmap #1074

Closed
MingweiSamuel opened this issue Feb 26, 2024 · 1 comment

MingweiSamuel commented Feb 26, 2024

P0

  1. Singleton syntax/usability

P1

  1. State externalization (dataflow); see State externalization meta issue #1059
  2. Hydroflow+ polish (documentation)
    • Prune Hydroflow operator set
  3. Standard library for Hydroflow+
    • distributed async/await (futures/promises/etc.)
    • actors
    • distributed protocols, CRDTs/[semi]rings/groups, KVS, transaction manager, BFT, etc.
  4. Deployment (hydro-deploy)
  5. Performant KVS
  6. Debugging/diagnostics/telemetry
  7. Decent Ops support
    • cleaner k8s integration?
    • Version management
  8. Lattices/properties (semantics)
    • Singletons, flows
    • ticks/deltas
    • David Chu optimization preconditions
    • Groups, rings?
  9. Dynamic/Auto-Ops
    • Live reconfiguration
    • Auto-elasticity
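
As a point of reference for item 8 (lattices/properties) and the CRDT entry under item 3, here is a minimal std-only sketch of what a lattice merge looks like. The `Merge` trait, `Max`, and `SetUnion` below are hypothetical names chosen for illustration and are not presented as the project's actual lattice API. The property that matters is that `merge` is associative, commutative, and idempotent, so replicas converge regardless of delivery order or duplication.

```rust
use std::collections::HashSet;

/// A join-semilattice: `merge` must be associative, commutative, and idempotent.
/// Hypothetical trait for illustration only.
pub trait Merge {
    /// Merges `other` into `self`; returns true if `self` changed.
    fn merge(&mut self, other: Self) -> bool;
}

/// Max lattice over u64: merging keeps the larger value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Max(pub u64);

impl Merge for Max {
    fn merge(&mut self, other: Self) -> bool {
        if other.0 > self.0 {
            self.0 = other.0;
            true
        } else {
            false
        }
    }
}

/// Set-union lattice (a grow-only set): merging takes the union.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct SetUnion(pub HashSet<u32>);

impl Merge for SetUnion {
    fn merge(&mut self, other: Self) -> bool {
        let before = self.0.len();
        self.0.extend(other.0);
        // The set only grows, so a larger length means something new arrived.
        self.0.len() > before
    }
}
```

Merging the same value twice reports a change only the first time, which is the idempotence that makes re-delivery harmless.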

P2

  1. Choose benchmarks (maybe a subset of the below)
  2. Networking?
    • Backpressure?
    • Reconnection?
    • Shared memory performance
      • No serialization overhead
  3. Integrations

P3

  1. Fault tolerance specs (on Hydroflow functions?)
  2. State externalization (for replication)
    1. Checkpointing (Checkpoint/migration handling #1049)
  3. Sequence operators (windowing)
    • (check out stream-it)
    • Caleb Stanford ordered streams
  4. Extended Performance
    • Cache locality
    • Vectorization / Columnar join
    • Instruction-level benchmarking (VTune, etc.)
  5. Dataflow Algebra Optimizations

MingweiSamuel pinned this issue Feb 26, 2024
MingweiSamuel changed the title from "1.0 Roadmap" to "Hydroflow 1.0 Roadmap" Feb 26, 2024

MingweiSamuel commented Mar 26, 2024

Moved here from #930

Performance

  • How low is our latency for a simple flow? Identity? Anna? Can we get to "line rate"? How does the KVS bench compare?
  • Can a fast box with many Hydroflow transducers handle a high-speed (40 Gb/s?) NIC?
  • Similar questions for bandwidth -- seems easy, but have we done something wrong?
  • What else should we benchmark?
  • We get lots of goodness from "shared nothing". What's the downside of explicit comm vs shared memory? Let's do some adversarial analysis of our weaknesses.
  • Revisit early timely comparison?

Expressivity

  • Are map and fold general enough for non-dataflow user code?
  • If you're justifying that you're "a general language", what constructs do you need to demonstrate?
  • What about linear algebra/ML?
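
To make the map/fold question concrete, here is a small sketch of a piece of "non-dataflow" user code, a sequential account state machine, written with nothing but `map` and `fold`. It uses plain `std` iterators rather than Hydroflow operators, and `Cmd`, `parse`, `apply`, and `run` are hypothetical names introduced only for this illustration. It does not settle the generality question, but it shows the shape: `map` handles per-item parsing, and `fold` carries the state.

```rust
/// Commands for a toy account state machine (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cmd {
    Deposit(u64),
    Withdraw(u64),
}

/// Parses input such as "deposit 10"; returns None for anything malformed.
pub fn parse(s: &str) -> Option<Cmd> {
    let mut parts = s.split_whitespace();
    let op = parts.next()?;
    let amount: u64 = parts.next()?.parse().ok()?;
    match op {
        "deposit" => Some(Cmd::Deposit(amount)),
        "withdraw" => Some(Cmd::Withdraw(amount)),
        _ => None,
    }
}

/// State transition: applies one (possibly unparsable) command to the balance.
pub fn apply(balance: u64, cmd: Option<Cmd>) -> u64 {
    match cmd {
        Some(Cmd::Deposit(n)) => balance + n,
        Some(Cmd::Withdraw(n)) if n <= balance => balance - n,
        // Unparsable input and overdrafts leave the balance unchanged.
        _ => balance,
    }
}

/// The whole program: `map` for stateless parsing, `fold` for the state machine.
pub fn run(inputs: &[&str]) -> u64 {
    inputs.iter().map(|s| parse(s)).fold(0, apply)
}
```

For example, `run(&["deposit 10", "withdraw 3", "bogus", "withdraw 100"])` deposits 10, withdraws 3, and ignores the last two inputs. Constructs that do not fit this shape as neatly (early exit, nested iteration, recursion) are the ones a generality argument would need to address.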

hydro-project locked and limited conversation to collaborators Aug 12, 2024
MingweiSamuel converted this issue into discussion #1387 Aug 12, 2024
MingweiSamuel unpinned this issue Nov 21, 2024
