Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discuss code refactoring #22

Open
Necried opened this issue Jan 2, 2019 · 9 comments
Open

Discuss code refactoring #22

Necried opened this issue Jan 2, 2019 · 9 comments
Assignees

Comments

@Necried
Copy link
Collaborator

Necried commented Jan 2, 2019

The Main.elm file contains the entire logic of the application and is over 2000 lines long (after elm-format). Adding more features will only grow the size of the code. Discuss here a scalable way to refactor our current code into logical chunks!

@CSchank
Copy link
Owner

CSchank commented Jan 2, 2019

Here are some thoughts, feel free to comment. I'll need to add to this more as I think of things.

Modules

Creating Types.elm

Firstly, we should break off all types and type aliases into their own module that everything else can import. This will be required when we start breaking off stuff into other modules, and it will have the advantages of making things shorter in Main.elm as well as giving people a quick way to see all data types in the application.

Modules for each "page"

We should probably have modules which contain the logic for each main state of the application, ie Build and Simulate right now. Some things will be very easy to move, but others are trickier. For example, both Build and Simulate share machine rendering but with different features of each (Build, in particular, exposes a lot of features that Simulate doesn't have in terms of the machine render). So we have to figure out the best way to share that. Perhaps we should have a module just for machine rendering, but unfortunately a lot of the "page"-specific logic would be left (e.g. when to allow selection of states, dragging, etc). It would be nice if each module handled those specific features, but we don't want to duplicate code.

One potentially good way of doing it would be to have the "common denominator" of machine rendering in its own module, with a variable for modules to pass in their own extra shapes (and by extension, logic).

Update functions and their models

Right now, there are separate update functions for each "page" but they still return the full model type. This is likely not what we want. Instead, they should probably only be allowed to edit the data specific to themselves. For example, the Build module should only have access something like a BuildModel.

Helper functions

Some things are shared but they are not data types, e.g. the rendering of LaTeX equations which is used throughout the app. These should likely be placed into a helper module or modules. Should it be one module with all such things, or is there enough to warrant splitting it up into separate modules? For now we only have a handful of such things, including the LaTeX logic and text box logic (which will likely eventually be made part of GraphicSVG anyway).

Data Types

Refactoring Machine Data Type

Right now, the things for a single machine are scattered around all over the place unfortunately. In order to make things more sane, and to allow for future improvements, including handling multiple machines, it makes sense to refactor this so that the type alias Machine contains all of the information for a given machine. Therefore we should move the following fields into Machine and out of the main Model:

  • states : Set StateID
  • statePositions : StatePositions
  • stateTransitions : StateTransitions
  • stateNames : StateNames
  • transitionNames : TransitionNames
    We should also discuss whether the fields themselves could be simplified for better performance or less bookkeeping.

Splitting user (undo-able) data from UI data

In order to pave the way for undo and redo, we need to separate "undo-able" data from the other data.

@Necried
Copy link
Collaborator Author

Necried commented Jan 2, 2019

Comments

Most of the points mentioned above are very good. While I agree with most of them, here are some additional thoughts I have:

Is Types.elm necessary?

Probably having most, if not all, type declarations in a single module will make adding types or modifying types much simpler. However, this makes everything depend on this single module. Instead, we can decouple type declarations into their own modules where they are needed. E.g: I have Build.elm, so this is where we can put all our Build types in, instead of having it import Types.elm and introducing the danger of using datatypes that we don't actually want.

Make a Machine.elm?

The comments on refactoring the Machine data type are all relevant, and in addition to those, I propose that Machine itself deserves its own module that Build and Simulate imports. This dependency I have no problem with, since the state machine is the cornerstone of this entire application. What we need to be careful with is introducing app-state dependent data into Machine, therefore the Machine datatype should be as minimal as possible (lowest common denominator).

Module Layout

Having outlined some of the dependencies above, a nice ASCII directed-acyclic graph shows how we can restructure our application into:

                       /-->   Build.elm   --\
       Machine.elm  --<                        >--> Main.elm
                       \--> Simulate.elm  --/

(I'd like a markdown graph generator now! D:)

On Build,Simulate,Main.elm

The current Main.elm actually does a great job separating out what happens when the app is in Build versus Simulate, which will make refactoring alot easier. As already mentioned, Build and Simulate had their separated update functions (and possibly view too, but I haven't looked too closely). These can go into their own separate modules, and probably is just a matter of cut-pasting (yanking) them from the original main module.

By virtue of having type declarations living locally in Build or Simulate, we already separated the model part of TEA as well (hopefully?).
Main.elm will then contain the driving force of the application. This may include:

  • Environment variables, e.g. window size
  • Elm Subscriptors, e.g. holdingShift (which probably are also env-vars)
  • Initialization data

Since we've had everything in one place previously, some of these information may have leaked into our previous datatypes. This is a good opportunity to separate those out!

Once that's done, our model in Main.elm will probably just be environment variables, together with the Sum Type appState = BuildingState | SimulatingState.

Helpers are (Maybe) helpful

Anything that might be used in multiple files (and are not related to our data types), or will be future library functions themselves, we can create a Helper.elm (or something more elegantly named), and then just import them on a use-case basis. This file can theoretically be arbitrarily long as well, since we can rely on qualified imports to protect namespacing issues.

UndoList data strcture probably belongs in Build.elm?

This I need to consider further, but currently it makes sense to record our Build state as undo-able data. By separating "environment variables" out into Main, the idea is that all undo-able data will already be part of Build.elm and nothing more. Thoughts?

@CSchank
Copy link
Owner

CSchank commented Jan 6, 2019

Agreed with the Types.elm -- we can probably put things in their respective modules

Machine.elm sounds good, except..

I'm still not entirely sure how to have a Machine.elm module but still allow other modules to place their own features inside of it. Maybe the right thing to do is to have Machine.elm provide its own TEA instance, with messages for common things like mousing over states, clicking labels, etc etc.
We could also need an API for the view function of the Machine module, that can be controlled by the caller (e.g. showing a text box for editing states). I can see two ways of doing this. Either we put all possible view features into the machine module and let the caller control it, or we put only the basic view features into the module and just have Build, Simulate, etc draw other features on top of it. I'm not sure what's the best approach.

On Build,Simulate,Main.elm

This sounds good and shouldn't be too difficult. If you look at the model as it is now, you'll see that in addition to appState, Simulate stores some persistent data in another field. Perhaps each module should have access to both a persistent state and the one in appState. But then does each one also get access to the current Machine? Then maybe it should be a triple? And what about environment variables like holdingShift? Those should probably be read only. So, we could have an Environment type and each module must provide an update function as follows:

update : Environment -> Msg -> (Machine, Model, PersistentModel) -> ((Machine, Model, PersistentModel), Cmd Msg)

I think this almost provides the right level of abstraction. Some modules' PersistentModel could be blank. For example, Build doesn't have any persistent state at the moment that I can think of, but likely in the future things like the zoom level would be included there. However, can modules share (writeable) information (like zoom level)? Should we make Environment writeable, or maybe have one more SharedModel that is writeable? In this case, one question is whether Machine should just be wrapped up into the SharedModel, which is probably worthwhile. The update could then become:

update : Environment -> Msg -> (Model, PersistentModel, SharedModel) -> ((Model, PersistentModel, SharedModel), Cmd Msg)

Seems a bit complicated (especially thanks to the 3-tuple limit!), but this very clearly draws the line of what is writeable, readable, shareable, etc amongst modules.

On UndoList

One problem with the approach you stated is that it only applies to the Build module and we probably want undoing to be global. Without it being global, we could easily get inconsistencies across modules. For example, let's say you undo adding a transition in Build, it could then still be included in a tape in Simulate. Probably, we should just wrap the entire triple above into an undoList instance, and have the module update functions tell us whether to checkpoint or not:

update : Environment -> Msg -> (Model, PersistentModel, SharedModel) -> ((Model, PersistentModel, SharedModel), Bool, Cmd Msg)

If the Bool is True, then we call new, otherwise we replace the state without creating a checkpoint. I kind of like this! Main.elm can then handle all of the undoing, the modules barely need to know about it, and as the app grows, undos stay consistent across modules! What do you think?

@CSchank
Copy link
Owner

CSchank commented Jan 6, 2019

I just thought of something else. We might run into situations where a module wants to "clean up" or "prepare" things when the user if entering or exiting that module. So we should probably also have each module provide the following:

onEnter : Environment -> (Model, PersistentModel, SharedModel) -> ((Model, PersistentModel, SharedModel), Bool, Cmd Msg)
onExit : Environment -> (Model, PersistentModel, SharedModel) -> ((Model, PersistentModel, SharedModel), Bool, Cmd Msg)

Our Main.elm runtime will call the correct function at the correct time in order to give modules the chance to do things as they're entered or exited.

Come to think of it, each module also need to provide inits for

  • Model (which we'll call onEnter with)
  • PersistentModel (which will be used to initialize at the start of the app)

And we'll also need to provide global inits for Environment and SharedModel.

Overall, I think this will give us a nice API to keep adding things modularly. What do you think?

@CSchank
Copy link
Owner

CSchank commented Jan 10, 2019

Refactoring work started in b8b557c and 14438cd

@Necried
Copy link
Collaborator Author

Necried commented Feb 20, 2020

Minor gripe with the Bool in the return type - it doesn't document what exactly the boolean is being used for.

We could change it to a custom two-constructor datatype to better reflect what it is being used for

@Necried
Copy link
Collaborator Author

Necried commented Feb 23, 2021

Just my yearly check-in here to look up what the Bool was supposed to be doing - at least I'm getting the answer by reading what's above 😃

@CSchank
Copy link
Owner

CSchank commented Feb 23, 2021

Lol yes, I am noticing myself a lot of opportunities for refactoring, including re-use of code and making things more explicit. We should definitely clean things up, yet again! Though we luckily are in a much better place than the beginning of this thread suggests: one module with 2000 loc!

@CSchank
Copy link
Owner

CSchank commented Feb 23, 2021

Another thing to refactor: have all the "real actions" have their own message types. And then separate out more "meta messages" (like keyboard presses) from those, but have those actually create a newMsg which causes those actions. I think this would by itself reduce code duplication. I'm thinking along the lines of how you've implemented ToggleStart in Building.elm, in fact that's what made me think of this again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants