Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate docs to Documenter.jl #199

Merged
merged 10 commits into from
Sep 6, 2022
Merged

Migrate docs to Documenter.jl #199

merged 10 commits into from
Sep 6, 2022

Conversation

Saransh-cpp
Copy link
Member

Xref #187

Note
This PR does not improve the existing textual documentation in any way; instead, it migrates the current documentation to Documenter.jl.

See the new documentation in action here - https://saransh-cpp.github.io/Metalhead.jl!

cc: @theabhirath

@Saransh-cpp Saransh-cpp changed the title Migrate to Documenter.jl Migrate docs to Documenter.jl Aug 12, 2022
Copy link
Member Author

@Saransh-cpp Saransh-cpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments -

- run: julia --project=docs docs/make.jl
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maintainers will need to add this to the repository secrets.

],
format = Documenter.HTML(
canonical = "https://fluxml.ai/Metalhead.jl/stable/",
# analytics = "UA-36890222-9",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is supposed to be the same across all FluxML repositories.

@theabhirath
Copy link
Member

theabhirath commented Aug 12, 2022

This is amazing, thanks @Saransh-cpp! The current docs look good (and will be filled up considerably after this migration). Just one question, is there a way to perhaps migrate the docs for the 0.y.x releases (y<8)? Or alternatively, leave the current Publish docs untouched in a separate legacy URL as we migrate the current docs (this might involve less work)? Also cc @darsnack for comments

docs/src/api/models.md Outdated Show resolved Hide resolved
@theabhirath theabhirath added the documentation Documentation label Aug 12, 2022
@theabhirath theabhirath added this to the 0.8 milestone Aug 12, 2022
@ToucheSir
Copy link
Member

Are we sure this is a good idea? Among other things, the entirety of https://fluxml.ai/Metalhead.jl/dev/docstrings.html has disappeared. Having a good API reference was part of the reason we ditched Documenter in the first place, so I'd prefer not to regress there ;)

@theabhirath
Copy link
Member

theabhirath commented Aug 12, 2022

Are we sure this is a good idea? Among other things, the entirety of https://fluxml.ai/Metalhead.jl/dev/docstrings.html has disappeared. Having a good API reference was part of the reason we ditched Documenter in the first place, so I'd prefer not to regress there ;)

As soon as this PR is in (if it moves forward in the current shape), I'm going to try and refactor the documentation so that Layers, utility functions are visible as they should. But docstrings for private functions like building blocks for ResNets will stay hidden - I will write devdocs so that there's a clear explanation for what goes where and how, though.

The other major thing I will try and introduce is documenting what the difference is between the camelcase model constructors and the lowercase one (think ResNet vs resnet). Right now it seems as though it's just pretrained weights but that's actually not the only point of difference

@ToucheSir
Copy link
Member

To be clear, I don't just mean the content is missing, but that the reference page itself is as well. Documenter has extraordinarily poor support for listing and searching the API reference (or rather, it has no concept of one). Moving to Publish/eventually Pollen was a way out of this mess, so I'd prefer to try fixing whatever makes the current system unsatisfying to use than to subject users to the poor UX of the old solution.

@theabhirath
Copy link
Member

theabhirath commented Aug 12, 2022

To be clear, I don't just mean the content is missing, but that the reference page itself is as well. Documenter has extraordinarily poor support for listing and searching the API reference (or rather, it has no concept of one). Moving to Publish/eventually Pollen was a way out of this mess, so I'd prefer to try fixing whatever makes the current system unsatisfying to use than to subject users to the poor UX of the old solution.

Oh, I see. If there's a way to keep the docs for the versions separate then I think Pollen should be fine, but IIRC Lorenz mentioned that isn't possible at the moment. The Publish CI times are way too high, though, not to mention that it shows literally every function (even private ones) in the docs, meaning that I'm always afraid that people will end up using what is documented internal API but isn't really stable. I'm not sure if there's a way to work past this.

@Saransh-cpp
Copy link
Member Author

Saransh-cpp commented Aug 13, 2022

I don't have very strong opinions about either documentation style as I haven't used Metalhead as a user or developer, but some comments -

  • I agree that the current documentation is great for displaying the API reference, and it does seem to look better than Documenter.jl. One can search the API reference in a Documenter.jl based documentation too, but https://fluxml.ai/Metalhead.jl/dev/docstrings.html will always outperform this in terms of searching capability.
  • On the other hand, the current docs feel very alien-like in Flux's ecosystem, plus it does not look good from the point of view of a new Metalhead user.
  • Additionally, the API reference can also be placed in between pages using Documenter.jl, given that there is some discussion of clarifying the usage of the existing API reference. So I think Documenter.jl would give us more control over the documentation (for example - It will allow us to split the API reference into small blocks or pages to increase readability)?

And then yes, @theabhirath's concerns about exposing the internal API are valid, which should be taken care of even if this PR is closed.

@ToucheSir
Copy link
Member

ToucheSir commented Aug 13, 2022

It's not just that page. I can search for "LSTMCell" in the Flux docs and it will completely miss the docstring at https://fluxml.ai/Flux.jl/stable/models/layers/#Flux.LSTM. Publish can search inside docstrings, which is a huge boon when you know a couple of keywords but not the exact API.

  • On the other hand, the current docs feel very alien-like in Flux's ecosystem, plus it does not look good from the point of view of a new Metalhead user.

The alternative solution there is to migrate other repos off Documenter 😉. I do get it, but given how much the structure of Documenter sites varies between packages I'm not sure it's a deal-breaker.

  • So I think Documenter.jl would give us more control over the documentation (for example - It will allow us to split the API reference into small blocks or pages to increase readability)?

I agree this should be done, but it's not something only Documenter can do. The main problem we've had with Documenter is that it's way too free-form. Easy to add a docstring and forget to add to the docs or vice versa because there is no enforcement of structure. Unless one is doing regular cleanups, docs coverage tends to degrade over time in this scheme.

So ultimately, I would put the following criteria on a reversion to Documenter:

  1. Some way to make CI complain on dangling docstrings/undocumented functions (and a mechanism to opt out of that for functions we know are fine)
  2. Searching within docstrings (doesn't have to be perfect, but better than what we have with Flux right now)
  3. A high-level draft of how the API docs will be structured

1) and 2) are the make or break criteria. 3) I have confidence you'll be able to pull off if those two are addressed :)

@Saransh-cpp
Copy link
Member Author

It's not just that page. I can search for "LSTMCell" in the Flux docs and it will completely miss the docstring at https://fluxml.ai/Flux.jl/stable/models/layers/#Flux.LSTM.

Ah, I had experienced this but had no idea about this being a real issue. Definitely a big letdown :(

@theabhirath
Copy link
Member

theabhirath commented Aug 13, 2022

  1. Some way to make CI complain on dangling docstrings/undocumented functions

This already happens silently like so. I might have to look at the Documenter docs more closely to see if there's a way to A. opt out for certain docstrings (for example, this one errors for all modules in the project, which includes Flux) and B. to get it to error for some important docstrings or new docstrings until explicitly opted out of.

  1. Searching within docstrings

I believe the problem with the example you posted is that Documenter's current search does not search inside of code blocks. I will have to verify this with more searches, but I think that might be an issue. Apart from opening an issue upstream, what we could do to sort of replicate the current setup with Publish is to possibly have an @autodocs block with all the docstrings from all the modules, and warn the user that this is not necessarily the best place to look for usage examples. We can then handle proper, detailed docs for important functions ourselves (I'm always in awe of how well SciML is documented, especially with the copy-paste tutorials, and I'd like to replicate something like that for Metalhead at least).

@theabhirath
Copy link
Member

theabhirath commented Aug 13, 2022

Side note: the Lux docs look beautiful IMO, and the theme really adds to the polish of the documentation - not to mention that its search is more powerful, so it does manage to find LSTM inside of code as well. If only there's a way to make it work with multiple versions of packages, it might be the answer to all our solutions in combination with Documenter.jl.

Edit: there's a TODO here but not sure if there's already an extension that handles this or it's something that'll have to be cobbled together by hand

@Saransh-cpp
Copy link
Member Author

For the versioning part -

I have tried mike - https://github.com/jimporter/mike - on a personal project before, but it was too much work (for a personal project). It might work for Metalhead.

There is an open issue (since 2014) in mkdocs regarding this - mkdocs/mkdocs#193.

@Saransh-cpp
Copy link
Member Author

Saransh-cpp commented Aug 13, 2022

This already happens silently like so. I might have to look at the Documenter docs more closely to see if there's a way to A. opt out for certain docstrings (for example, this one errors for all modules in the project, which includes Flux) and B. to get it to error for some important docstrings or new docstrings until explicitly opted out of.

Unfortunately, there is no way to opt-out of the warnings generated from doc dependencies and no way to opt out of the errors (if set to strict - see below). See https://juliadocs.github.io/Documenter.jl/stable/lib/public/#Documenter.makedocs -

modules specifies a vector of modules that should be documented in source. If any inline docstrings from those modules are seen to be missing from the generated content then a warning will be printed during execution of makedocs. By default no modules are passed to modules and so no warnings will appear. This setting can be used as an indicator of the "coverage" of the generated documentation.

For erroring out on specific things, the strict keyword can be used (used by Flux.jl and Functors.jl at the moment) -

strict – if set to true, makedocs fails the build right before rendering if it encountered any errors with the document in the previous build phases. The keyword strict can also be set to a Symbol or Vector{Symbol} to specify which kind of error (or errors) should be fatal. Options are: :autodocs_block, :cross_references, :docs_block, :doctest, :eval_block, :example_block, :footnote, :linkcheck, :meta_block, :missing_docs, :parse_error, and :setup_block. Use Documenter.except to explicitly set non-fatal errors.


especially with the copy-paste tutorials

New Flux tutorials will most probably have this as a part of them 😉

@ToucheSir
Copy link
Member

I have tried mike - https://github.com/jimporter/mike - on a personal project before, but it was too much work (for a personal project). It might work for Metalhead.

Feel free to try this on a smaller repo first. If it works well, we can consider applying it across the org.

@darsnack
Copy link
Member

Lux's docs are a great example of well-written and organized docs. But the build system has many moving pieces, and I'm not sure it really provides much in the way of features. Documenter is still the core documentation engine, and since it doesn't support Literate.jl natively, it requires a separate conversion step to Documenter-friendly markdown first. Then you need to use Documenter.jl to generate plain markdown output (replacing docstrings etc.). Then finally you build and deploy the docs using a static site generator that is managed by pip (yuck!).

I also think Lux's theme, while pretty, is not really good for presentation of information. There's a lack of contrast in color and font weight. The titles are too lightweight, making it hard to visually separate content. The content width isn't constrained resulting in long horizontal paragraphs which are known to be difficult to read. And my personal peeve, the sidebar isn't sticky so I can't tell where I am, or I have to go through a series of clicks to get to the page I want.

If we are going to go through the hassle of using a static site generator for the docs, then why not use Franklin.jl? Having recently struggled with FluxML's website and based on my past experience, Franklin.jl is just so much simpler than all the other static site generators. And we already have a great example of good docs built with Franklin: Makie.jl. As a comparison, look at Makie vs Lux below:

Screen Shot 2022-08-13 at 4 54 56 PM Screen Shot 2022-08-13 at 4 58 20 PM

Screen Shot 2022-08-13 at 4 59 45 PM Screen Shot 2022-08-13 at 5 00 03 PM

Even at the small resolution above, you can clear tell each block of content from the other in Makie's docs. Every docstring is clearly separated in its own box. Franklin has better built-in support for running Julia code and Literate notebooks. And the search in Makie appears to be very comprehensive. Of course, all the theme stuff above is just CSS, but my point is that if we're going to put in the effort to build this system, then we should start for something that's closer to what we want (all Julia + better theme).


Using Publish for Metalhead was meant to be an experiment in a new documentation system for all of FluxML. I think while we gained some good features, like a really good, up-to-date API page, Publish isn't going to be sustainable without significant development. Personally, I think the best documentation system that presents a clear value-add is Pollen by orders of magnitude. It looks polished and has a reasonably good out-of-the-box theme for presentation. The stacked + connected pages system is a great feature for complex packages like in FluxML where you are following a tutorial and needing to quickly reference docstrings.

The point of switching to Documenter was to do something to alleviate the docs build time that was quick and dirty. If we are going to expend effort, then we should try to PR Pollen to add versioning (the only piece missing). Pollen also supports Jupyter notebooks which will let us write tutorials with fine-tuning pipelines without needing to re-run the training on CI (Jupyter is the only format that won't require manual copy-pasting of the cell output). If we don't want to help push Pollen forward, then we can either:

  • use @autodocs to generate the API pages and take the hit on search
  • statically paste the output of cells in the docs so that Publish doesn't need to run any live cells (the main source of slow down)

@theabhirath
Copy link
Member

theabhirath commented Aug 14, 2022

If we are going to go through the hassle of using a static site generator for the docs, then why not use Franklin.jl?

I was very interested in using Franklin.jl but I noticed that it was undergoing a big overhaul at Xranklin which is why I've actually not been advocating for it in the short term. I feel like Franklin and Pollen are in the same place right now - they're in that mid-level beta, where general feature-wise both are ready to be used but I'm afraid that if we want to use more specific features we might hit roadblocks (not to mention that current Franklin is notoriously slow to build, and Xranklin to my knowledge improves performance significantly).

Pollen has a couple of UI stuff I would like to smooth over (an easy close button on the tabs, long term maybe even a command palette for searching them because I stack up tabs very easily), not to mention dark mode and versioning. I'm also interested in seeing how much the CSS can be tweaked without losing the overall polished look and feel. Maybe @lorenzoh can explain what aspects can be worked out without in depth knowledge of Pollen and in possible drive-by PRs. There's also some stuff about fonts and code highlighting I'm not particularly satisfied with in the current Pollen version v/s Documenter, but I'm sure Lorenz has plans on improving those as soon as he gets time (I'd be happy to help wherever I can if I know enough, though).

Personally, I find that using Publish for Metalhead sticks out in the current FluxML ecosystem docs, so I'd rather go with using @autodocs for Documenter.jl if that's the two choices we have

@theabhirath
Copy link
Member

In case we are looking at other options, though, I also remembered that Quarto has Julia support. A potential left field option could be tying it up with Documenter (it has a very polished feel, and it doesn't have the same UI issues as the Lux theme)

@Saransh-cpp
Copy link
Member Author

There have been some discussions regarding using Pluto notebooks for writing tutorials and how-to guides in Flux. We can try using Pluto notebooks for the tutorials if we don't want to take a hit on the search and keep using the current documentation style (I personally don't like the overall user interface, but yes, the API search is superior).

A relevant comment - FluxML/Flux.jl#2016 (comment)

@darsnack
Copy link
Member

We discussed this off GH and decided to take the hit and switch to Documenter for now. I think this PR is nearly ready, there is just @theabhirath's comments that need addressing.

@theabhirath
Copy link
Member

theabhirath commented Aug 23, 2022

Wait this is weird but I just noticed https://saransh-cpp.github.io/Metalhead.jl/dev/api/models/ doesn't render the paper references accurately. Is this another Documenter thing?

Edit: nevermind, brainfade - turns out we're using the wrong syntax for some reference links 🤦🏽 At least we know what to correct 😁

@theabhirath
Copy link
Member

Also once this is in I think the Publish docs will stop deploying, so there probably needs to be a note somewhere to figure out how to navigate to older versions of the docs?

@Saransh-cpp
Copy link
Member Author

Also once this is in I think the Publish docs will stop deploying, so there probably needs to be a note somewhere to figure out how to navigate to older versions of the docs?

Yes! Maybe the maintainers can shift the Publish docs to a new URL ? I can then add a note in the README and in index.md.

@Saransh-cpp
Copy link
Member Author

Saransh-cpp commented Aug 23, 2022

Note: Metalhead.resnet has no docstring; hence the references pointing to it are broken. If a docstring is added to Metalhead.resnet, I can rebase this branch and make cross-references strict, which would be a nice thing to have at the beginning of the migration.

@theabhirath
Copy link
Member

Metalhead.resnet is the biggest docstring of the lot 😅 Writing it might take a day or two. I'll get it to it though

@theabhirath
Copy link
Member

theabhirath commented Sep 4, 2022

Alright now that #200 is in, this is the next PR we need to land for the docs transition to go through. A rebase should do the trick. I think most important docstrings are now in so we can make cross references strict and restructure later.

Also not sure how we're gonna keep the Publish docs around...

@darsnack
Copy link
Member

darsnack commented Sep 4, 2022

I can manually build the previous tagged doc versions and push them. @Saransh-cpp you had posted a snippet for doing this on some repo, but I can't find it now. Never mind, found it: FluxML/OneHotArrays.jl#19

@Saransh-cpp
Copy link
Member Author

I still cannot make the cross references strict because now mbconv_stage_builder's reference ends up in a 404 page 😅 I guess this would keep happening as Metalhead is under a huge refactor, so it should be okay to not keep the cross-references strict right now!

Comment on lines 3 to 5
```julia
using Flux, Metalhead
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably delete this now?

Comment on lines -18 to -19

See also [`reluadd`](@ref).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe swap these for references to Metalhead.actadd (and vice-versa below)?

@darsnack darsnack merged commit cc486bf into FluxML:master Sep 6, 2022
@Saransh-cpp Saransh-cpp deleted the docs branch September 6, 2022 15:41
@Saransh-cpp
Copy link
Member Author

Worked!! https://fluxml.ai/Metalhead.jl/dev/index.html - the previous versions are intact too!

@theabhirath
Copy link
Member

Thanks a lot for this @Saransh-cpp! This is going to make my life a lot easier with the new docs structure 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

4 participants