Define abstractions for framework integration #279

Open
sebhrusen opened this issue Apr 6, 2021 · 2 comments
Labels
enhancement New feature or request framework For issues with frameworks in the current benchmark

Comments

@sebhrusen
Collaborator

The goal is to provide an abstraction and default implementation(s) for the most common scenarios.
This would also allow frameworks to support several versions easily.
Finally, a more structured framework runner will simplify the integration effort and standardize support for extra features like the _save_artifacts param.

1st suggestion (incomplete, and will change):

class FrameworkRunner:
    def __init__(self, config, dataset): pass
    def prepare_data(self): pass
    def fit(self, …): pass
    def predict(self, …): pass
    def get_result(self): pass
    def save_artifacts(self): pass
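
To make the suggestion concrete, here is a minimal runnable sketch of how such a base class could look. It is only an illustration, not a settled design: the `run` method, the default bodies, the dict-shaped `dataset`, and the `MajorityClassRunner` subclass are all assumptions. The arguments of `fit` and `predict` are left open in the proposal, so the sketch takes none and reads everything from `self`.

```python
from abc import ABC, abstractmethod


class FrameworkRunner(ABC):
    """Hypothetical base class; method names follow the proposal above."""

    def __init__(self, config, dataset):
        self.config = config
        self.dataset = dataset
        self.predictions = None

    def prepare_data(self):
        # Default (assumed): use the train/test parts of the dataset unchanged.
        self.train, self.test = self.dataset["train"], self.dataset["test"]

    @abstractmethod
    def fit(self):
        """Train the framework on self.train."""

    @abstractmethod
    def predict(self):
        """Return predictions for self.test."""

    def get_result(self):
        return {"predictions": self.predictions}

    def save_artifacts(self):
        # Default (assumed): nothing to save.
        return []

    def run(self):
        # Assumed orchestration: call the phases in a fixed order.
        self.prepare_data()
        self.fit()
        self.predictions = self.predict()
        self.save_artifacts()
        return self.get_result()


class MajorityClassRunner(FrameworkRunner):
    """Toy integration that only has to implement fit and predict."""

    def fit(self):
        labels = [label for _, label in self.train]
        self.majority = max(set(labels), key=labels.count)

    def predict(self):
        return [self.majority for _ in self.test]


dataset = {"train": [(0, "a"), (1, "a"), (2, "b")], "test": [3, 4]}
result = MajorityClassRunner(config={}, dataset=dataset).run()
```

With this shape, an integration only overrides the phases it needs, and shared features such as artifact saving live in one place.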
@sebhrusen sebhrusen changed the title define abstractions for framework runner Define abstractions for framework integration Apr 6, 2021
@sebhrusen sebhrusen added enhancement New feature or request framework For issues with frameworks in the current benchmark labels Apr 6, 2021
@PGijsbers
Collaborator

Based on our discussion, we should include some type of recovery mode.

@PGijsbers
Collaborator

With this refactor, it will also be easier to use the various "checkpoints" to store partial results and/or apply more dynamic time cut-offs. For example, the one-hour time limit could be strictly enforced for the fit call alone, while being (much) more lenient in the phases after fit, instead of having a single large budget for all phases combined. This would avoid both scenarios where EC2 instances live needlessly long because they hang in a fit call, and those where results are incomplete merely because the predict part took longer.
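
A rough sketch of the per-phase budget idea, assuming each phase is a plain callable. The function name `run_with_budgets` and the budget values are hypothetical. Note that the sketch only stops waiting for a phase that exceeds its budget; it does not kill the underlying work, which a real implementation would probably do by running the phase in a subprocess.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as PhaseTimeout


def run_with_budgets(phases, budgets):
    """Run (name, callable) phases in order, each under its own budget in seconds.

    Returns the results of the completed phases (the "checkpoints") and the
    name of the phase that timed out, or None if all phases finished.
    """
    checkpoints = {}
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        for name, phase in phases:
            future = pool.submit(phase)
            try:
                checkpoints[name] = future.result(timeout=budgets[name])
            except PhaseTimeout:
                # Partial results of earlier phases are still returned.
                return checkpoints, name
        return checkpoints, None
    finally:
        # Do not block on a phase that is still running.
        pool.shutdown(wait=False)


phases = [
    ("prepare_data", lambda: "data"),
    ("fit", lambda: time.sleep(0.5) or "model"),  # stands in for a hung fit
    ("predict", lambda: "predictions"),
]
# Strict budget for fit, lenient budgets elsewhere (values are illustrative).
budgets = {"prepare_data": 1.0, "fit": 0.05, "predict": 5.0}
done, timed_out = run_with_budgets(phases, budgets)
```

Here `timed_out` names the phase that exceeded its budget, and `done` holds whatever was completed before it, so partial results survive a cut-off.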
