LineaPy is really impressive, and it covers the flow end to end. I noticed the `to_pipeline` feature, which makes it very convenient to productionize the DF.
Ray is a distributed computation framework that can be used as a backend for Python to scale up workloads (https://docs.ray.io/en/master/).
Recently, Ray has been introducing Ray DAG, a Ray-style way to define a computation graph: the data passed between functions are object refs rather than the actual Python data. With this, we can execute the graph either in normal Ray mode or in workflow mode. The former simply executes the pipeline in a distributed way; the latter also checkpoints each step, which offers durability and fault tolerance.
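To make the "object refs instead of actual data" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Ray code: `concurrent.futures.Future` stands in for an object ref, and the `remote` decorator and the `load`, `double`, and `total` functions are hypothetical names made up for illustration. In Ray itself, the value behind a ref would be fetched with `ray.get`.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for a cluster: a small local thread pool (an assumption, not Ray).
pool = ThreadPoolExecutor(max_workers=4)


def remote(fn):
    """Hypothetical decorator: calling the function returns a ref (a Future) immediately."""
    def submit(*args):
        def run():
            # Refs among the arguments are resolved only when the task actually runs.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*resolved)
        return pool.submit(run)
    return submit


@remote
def load():
    return [1, 2, 3]


@remote
def double(xs):
    return [2 * x for x in xs]


@remote
def total(xs):
    return sum(xs)


# Each call hands back a ref; the caller never touches the intermediate data.
data_ref = load()
doubled_ref = double(data_ref)
total_ref = total(doubled_ref)

# Only the final ref is resolved into a real Python value.
result = total_ref.result()
pool.shutdown()
```

The caller only wires refs together, so the same graph description could in principle be handed to different executors, which is what makes the normal mode versus workflow mode split possible.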
I'm thinking about the possibility of adding it as a backend for LineaPy. One benefit of this integration is that we could scale up workloads easily by using Ray's functionality.
I haven't dug deep into the implementation, but it seems `to_pipeline` is implemented as a plugin, and the actual work is converting an internally defined Graph into code for the target backend. I feel the implementation should be straightforward.
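As a rough sketch of what such a conversion might look like, the snippet below walks a toy dependency graph in topological order and emits Ray DAG style source, one `.bind(...)` call per node. Everything about the graph here is an assumption: the dict layout, the node names, and the function names (`load_data`, `build_features`, `train_model`) are hypothetical and do not reflect LineaPy's internal Graph. The emitted text also assumes each function would be decorated with `@ray.remote` elsewhere.

```python
# Hypothetical graph: node name -> (function name, list of upstream node names).
graph = {
    "model": ("train_model", ["features"]),
    "features": ("build_features", ["raw"]),
    "raw": ("load_data", []),
}


def topo_order(graph):
    """Return node names so that every node comes after its dependencies."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in graph[name][1]:
            visit(dep)
        order.append(name)

    for name in graph:
        visit(name)
    return order


def to_ray_dag_source(graph):
    """Emit one `node = func.bind(deps...)` line per node, dependencies first."""
    lines = []
    for name in topo_order(graph):
        func, deps = graph[name]
        lines.append(f"{name} = {func}.bind({', '.join(deps)})")
    return "\n".join(lines)


source = to_ray_dag_source(graph)
print(source)
```

A real plugin would also need to emit the function definitions and imports, but the core step is this kind of graph-to-source translation.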
Thanks @iycheng, this is a very interesting feature request! We are familiar with Ray and we'd love to talk to you more about your use case. Can you please DM me on Slack for a follow-up?