Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare MAHF for crates.io #173

Closed
wants to merge 30 commits into from
Closed

Prepare MAHF for crates.io #173

wants to merge 30 commits into from

Conversation

Saethox
Copy link
Collaborator

@Saethox Saethox commented Jun 4, 2023

I'm making this a draft for now, so we can discuss the design decisions before anything is merged. This is a PR that emerged from some experimental changes I wanted to try out and have integrated before we publish on crates.io, and it subsequently got somewhat out of control (the line diff scares me).

In general, I'm pretty happy with how it turned out, but there is still much work to do apart from all of this.

Documentation

This PR should include

  • extensive documentation of the public API (structs, methods)
  • goal is that every struct and method has a code example, excluding trait implementations
  • extensive documentation of the concepts represented by the respective modules, i.e. the general goal and motivation of MAHF in lib, components, conditions, state, logging, heuristics, again with examples

I just didn't want to spend time documenting stuff we might change, so I put it off until now. It should definitely be part of this PR though.

Rustfmt

I added a rustfmt.toml with the options mentioned in #164, and adapted the ci.yml to use nighly rustfmt, because these options are nightly. clippy, check, and test are still performed on stable just like before. I just can't go back to the old imports after I have seen the beauty of what nightly rustfmt is capable of.

Changes

Problems

  • Old problems were removed, in accordance with Move problems::coco and problems::coco_bound into separate crates. #172, Load tsplib using the tspf crate. #169, Extract problems into own repos #155
  • The Evaluator component (now PopulationEvaluator) was changed from accessing the EvaluatorInstance in the State to having a T: Evaluator type parameter
    • This makes it possible to have multiple Evaluators in the state at the same time, and decouples Evaluator and Problem somewhat
  • Problem::default_evaluator was removed in favor of a TryDefault bound on Evaluator
  • The ObjectiveFunction trait is a simpler version of Evaluator for problems where the objective function can be easily specified, and Sequential<P> and Parallel<P> offer a default implementation of a sequential and parallel Evaluator, respectively, when P implements ObjectiveFunction
  • Simplification and refactoring of the additional problem traits, now with a proper hierarchy (e.g. LimitedVectorProblem depends on VectorProblem, etc.) and a separate TSP trait to access distance information

State

  • Finally, MutState is no more: State now allows for proper (mutable) borrowing of custom state at the same time (making Ref and RefMut part of the public API is kind of annoying, but it's still so much better than before IMO)
  • There is now full support of the entry API similar to HashMap::entry
  • Mutable borrow of a tuple of distinct types is possible like before
  • State<P> is now only a thin wrapper around the StateRegistry, to allow accessing custom state without having to add P as a type parameter everywhere where it's really not necessary (especially relevant for the new Extract mechanism)
  • Extract, ExtractRef, and ExtractRefMut now make it possible to access custom state without caring what the exact wrapper type is
    • Makes it possible to e.g. only specify in a component that you want to retrieve a f64 and do something with it, but the instantiator of the component can decide where this f64 comes from exactly
    • extractors themselves are static (don't take &self) for several reasons: dealing with boxed extractors was a major hassle, they are meant to only map values, not really modify them, and adding them through type parameters allows a pretty readable syntax, imo
    • IdFn<T> makes it possible to retrieve any custom state
    • ValueOf<T> makes it possible to retrieve the Deref::Target of a wrapper type, which is really handy
    • This makes it possible to write conditions like LessThan<ValueOf<Iterations>>::new(10) instead of implementing separate FixedIterations, FixedEvaluations, etc. conditions
    • Mapping components are possible, e.g. the Linear<I, O> component taking an input value from the extractor I and mapping it linearly onto the custom state extracted by O
    • State now defines try_ methods which return a Result<_, StateError>

Components and Conditions

  • initialize was renamed to init
  • There is now a proper require phase between init and execute, where only requirements can be defined
  • In general, state is now stored alongside the components managing it
  • The swarm and generative modules are now meant to hold PSO and ACO-like components, respectively, if we add more
  • Complex functionality (e.g. recombination, mutation operators) was refactored out of components into functions, which are then unit tested
  • The very repetitive tests on components that only tested an obvious thing (e.g. if the correct number of children are returned) were removed (if we decide to re-add them we should definitely use macros)
  • All Component and Condition methods now return a Result
    • eyre was used as error library because it has better backtrace information than anyhow, and a custom error type would be much more hassle
    • This should make it much easier to debug invalid configs
  • All components and conditions now have a from_params() method, which returns the unboxed component/condition
  • Much more components store their parameters publicly in the state to allow adaption by other components
  • Identifiers can be used to distinguish the same state from different components
    • E.g. both UniformMutation and NormalMutation store their MutationRate in the state, but if they use the same type they overwrite each other
    • Solution is to make MutationRate generic over the type of the component that stores it --> MutationRate<UniformMutation>
    • But the same component might be used multiple times in the same configuration, so they themselves are generic over some Identifier type --> UniformMutation<I: Identifier = Global>
    • The default Identifier is Global, but can be replaced with A, B, etc., or own types
    • Because Rust is weird, the type sometimes has to be enclosed in <> if the default identifier is used, but it's not really a problem (should be documented though because it took me a while to figure it out)
  • The concept of termination was removed, there are now only conditions
  • The LessThan, EveryN, and ChangeOf conditions are now very flexible in combination with extractors
  • The wrapper components Mutator, Initializer, etc. were replaced with functions doing the same thing
    • This is because the wrapper components had three flaws
      • implementing init or require was very awkward for components only implementing the simpler trait, and would require adding these methods to each simpler trait
      • Serializing wrapped components resulted in something like Selector(ComponentImplementingSelection), which made an implementation detail part of the serialization
      • Errors in the Selector were not easily retraceable to the component that was wrapped
    • having the function do the work requires the component to actually implement Component, which is a little bit more code, but has none of these flaws
    • for components that don't overwrite init or require, we could add a derive macro that accepts the translator function and simply calls it in the execute implementation

Logging

  • tracking was renamed into logging
  • Trigger was removed in favor of just using Conditions
    • Conditions have a Serialize bound, but having a separate Trigger either leads to much code duplication or very weird conversion code
    • If it's important enough to be implemented as a Trigger, it might as well be reused as a Condition, so it can be one in the first place
  • The old Extractor (now EntryExtractor) trait is now built on top of the Extract API, which again is very comfortable because everything can be reused
  • LogSet is now LogConfig, because this makes more sense IMO

Other

  • Several traits for dealing with populations were added with very general implementations (e.g. converting individuals into their encoding all at once or the other way around), which really simplifies some component definitions
  • The par_experiment helper was added for dealing with experiments, this could (and probably should) be extended into a whole experiment framework, maybe some sort of Experiment builder with a bunch of options

Saethox added 16 commits June 4, 2023 23:52
- Remove built-in problems in favor of separate repos
- Split Evaluator further into ObjectiveFunction
- Fix the encoding of VectorProblem
- Add a separate trait for TSP
- MutState was removed in favor of Ref and RefMut
- StateRegistry handles borrowing and state management
- State allows access to problem-specific common state
- Added entry API
- Added separate handle for state requirements
- Added error handling to state access
- Add a separate require phase additionally to init
- Remove generation module in favor of having
  separate modules for mutation, recombination, swarm, and
  generative components
- Keep state-modifying components here instead of the state
  module
- Split complicated components functionality into separate
  functions with own unit tests
- Add an add_params() method to all components, which
  returns Self rather than the boxed trait
- Update PopulationEvaluator to use a generic Evaluator
- Store more parameters in the state to allow for adaption
- Introduce identifiers to distinguish between similar
  state and components
- Remove the concept of termination in favor of simple
  conditions
-
- Rename LogSet into LogConfig
- Allow multiple extractors for the same trigger
- Remove Trigger in favor of Condition
- Integrate saving to multiple file formats into Log
- Integrate the Extract API with log entries
- Always insert the Logger
- Allow specifying the Evaluator separately
@Saethox Saethox marked this pull request as draft June 4, 2023 23:03
@HeleNoir
Copy link
Collaborator

HeleNoir commented Jun 5, 2023

While this sounds really good in general, would it be possible to split this PR in smaller ones, each focussed on a specific aspect? Reviewing everything as a whole is a bit overwhelming and particularly prone to errors. Generally, we need a better PR workflow.

If its not (easily) possible, we can leave the PR as is, but next time, smaller ones would be better.

@luleyleo
Copy link
Collaborator

luleyleo commented Jun 5, 2023

@Saethox how much work would it be to split this up? Or at least move everything, where it is convenient, into separate PRs. For example, removing the old problem modules could easily be separated, right?

@Saethox
Copy link
Collaborator Author

Saethox commented Jun 5, 2023

While this sounds really good in general, would it be possible to split this PR in smaller ones, each focussed on a specific aspect? Reviewing everything as a whole is a bit overwhelming and particularly prone to errors. Generally, we need a better PR workflow.

I know, right 🙈. I was a little much too focused on getting it all to work together.

how much work would it be to split this up? Or at least move everything, where it is convenient, into separate PRs. For example, removing the old problem modules could easily be separated, right?

Splitting the problems is easily done, the other changes are more cumbersome. I'm in no real hurry to merge this (maybe before GECCO), so I could probably split this PR into three/four incremental PRs:

  • removing old problems and adding new problem trait hierarchy (the other changes depend on this)
  • State management, component refactors, and error handling (everything is very connected here, so I want to avoid splitting this up)
  • Logging rework (this is pretty independent from the rest)
  • Helpers for specifying and performing experiments (extend par_experiment)

Depending on how we want to handle the problem dependencies (mahf-tsplib -> mahf, mahf -> mahf-tsplib, mahf-problems -> mahf-tsplib -> mahf), the problem changes could be moved into the mahf-problems repo at this point.

We can then just use this draft as a proof that all of the concepts do work together, so I guess that's nice to know 🤷‍♂️.

@HeleNoir
Copy link
Collaborator

HeleNoir commented Jun 5, 2023

I think 3 or 4 PRs would be okay.

I'm in no real hurry to merge this (maybe before GECCO)

Well, I would prefer as soon as possible, but before GECCO would be okay. I have to start implementing #75 and I also need to play around with the logging a bit to get data for the evaluation (so I can finally implement it completely) and I fear if I start before this PR is done, I will cause additional problems.

@Saethox
Copy link
Collaborator Author

Saethox commented Jun 5, 2023

I have to start implementing #75

That's not really a problem, porting components to the new state management and returning a result instead of panicking is done in a few minutes, It's not like the functionality of the components gets any different through this, especially for specialized operators.

I also need to play around with the logging a bit to get data for the evaluation

Again, the general concept of logging is not that different, it's just more flexible than before. The log format is also exactly the same.

Saethox added a commit that referenced this pull request Jun 18, 2023
Integrate problem changes from PR draft #173
@Saethox
Copy link
Collaborator Author

Saethox commented Jul 5, 2023

Obsolete with #176 and #178

@Saethox Saethox closed this Jul 5, 2023
Saethox added a commit that referenced this pull request Jul 5, 2023
Integrate most of the changes from PR #173
@Saethox Saethox deleted the cleanup-and-restructuring branch July 5, 2023 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants