A new linear regression tutorial #2016

Saransh-cpp · 2022-07-05T20:22:58Z

This PR -

- Adds a new "Getting Started" section in the docs which I will be working on for the next couple of weeks
- Moves basics.md and overview.md to the "Getting Started" section. Currently, there are no changes in the files, but they will be changed soon.
- Adds a Linear Regression example to the "Getting Started" section

For the linear regression guide -

- Should I add the backpropagation step in the first half, or should I leave it to the Flux.Optimise.update! step?
- This guide overlaps with a lot of other textual information present but scattered in Flux's docs. These other texts will also be updated or moved to a better place soon. Basics.md, for example, does something very similar with dummy data, and the current "Getting Started" guide does something similar but with pre-defined weights.

PR Checklist

- Tests are added
- Entry in NEWS.md
- Documentation, if applicable

DhairyaLGandhi · 2022-07-05T21:08:09Z

Should we consider having a "getting started" on the website in addition to the docs?


Saransh-cpp · 2022-07-06T10:56:48Z

The first thought I had was to add a hyperlink on the website's Getting Started page, which would redirect the user to the docs' Getting Started page (similar to the Ecosystem page).

Should I also add the same tutorial (once approved) to the website, or should we link the section?

avik-pal · 2022-07-07T04:07:19Z

I think there is a consensus to not use Flux.params API any longer. We should not introduce it into new tutorials being written.

Saransh-cpp · 2022-07-08T16:51:44Z

Thank you for the suggestion! I took some time to go through Optimisers.jl, and I am assuming that the traditional Flux training method should be replaced with Optimisers.jl?

mcabbott · 2022-07-13T16:39:38Z

Agree we should move away from the whole weird Params story.

But perhaps this linear regression example should delay introducing Optimisers.jl as long as possible. For simple gradient descent, just writing out something like this:

dLdW, _, _ = gradient(loss, W, x, y)
W .= W .- 0.1 .* dLdW

might be a better level than immediately introducing optimiser state etc. It's unfortunate that this has quite a few moving parts, unlike train!'s apparent simplicity (although really train! hides a lot & this is also confusing). Maybe Optimisers.jl ought to be introduced along with an explanation that adding momentum helps, so that you know why there is a state?
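To illustrate the point about momentum and state, here is a minimal sketch in plain Julia, with no Flux involved. The toy data, the hand-derived gradient, the function names, and the values of `eta` and `rho` are all assumptions made for illustration, and the momentum update shown is one common formulation rather than necessarily the one Optimisers.jl uses.

```julia
# Plain gradient descent vs. gradient descent with momentum, written by hand.
# Hypothetical toy problem: fit y = 3x + 2 with a weight W and a bias b.
x = collect(Float32, -3:0.1:3)
y = 3f0 .* x .+ 2f0

predict(W, b, x) = W .* x .+ b

# Hand-derived gradient of the mean squared error with respect to W and b.
function grad(W, b, x, y)
    err = predict(W, b, x) .- y
    dLdW = 2 * sum(err .* x) / length(x)
    dLdb = 2 * sum(err) / length(x)
    return dLdW, dLdb
end

# Plain descent: each update depends only on the current gradient,
# so nothing has to be remembered between steps.
function descend(W, b; eta = 0.1f0, steps = 50)
    for _ in 1:steps
        dLdW, dLdb = grad(W, b, x, y)
        W -= eta * dLdW
        b -= eta * dLdb
    end
    return W, b
end

# Momentum: every parameter carries a velocity from one step to the next.
# That velocity is the "state" an optimiser has to store per array.
function descend_momentum(W, b; eta = 0.1f0, rho = 0.5f0, steps = 50)
    vW, vb = 0f0, 0f0
    for _ in 1:steps
        dLdW, dLdb = grad(W, b, x, y)
        vW = rho * vW + eta * dLdW
        vb = rho * vb + eta * dLdb
        W -= vW
        b -= vb
    end
    return W, b
end

W1, b1 = descend(0f0, 0f0)
W2, b2 = descend_momentum(0f0, 0f0)
```

Both variants should approach `W ≈ 3` and `b ≈ 2` here; the only structural difference is the pair `vW, vb` that must survive between iterations.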

Even something like this seems OK to me, pretty explicit, and makes you understand why you are about to see a tool for walking over the arrays:

m = Dense(1 => 1)
for step in 1:10
  dLdm, _, _ = gradient(loss, m, x, y)
  
  m.weight .= m.weight .- 0.1 .* dLdm.weight
  m.bias .= m.bias .- 0.1 .* dLdm.bias
end
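For comparison, a rough sketch of the same loop once the "tool for walking over the arrays" is brought in. It assumes Optimisers.jl's `setup` and `update!` functions and its `Descent` rule; the toy data, the `loss` definition, and the helper name `fit!` are hypothetical, and the loop is wrapped in a function so that the reassignment of `model` and `opt_state` also works when run as a script.

```julia
using Flux
import Optimisers

# Hypothetical toy data: 1×61 matrices, one column per sample.
x = reshape(collect(Float32, -3:0.1:3), 1, :)
y = 3f0 .* x .+ 2f0

loss(m, x, y) = Flux.mse(m(x), y)

function fit!(model, x, y; steps = 10)
    # `setup` walks over the model and creates optimiser state per array.
    opt_state = Optimisers.setup(Optimisers.Descent(0.1), model)
    for _ in 1:steps
        grads = gradient(m -> loss(m, x, y), model)[1]
        # `update!` walks over the model, its state and the gradient together,
        # replacing the per-field updates written out by hand above.
        opt_state, model = Optimisers.update!(opt_state, model, grads)
    end
    return model
end

model = Dense(1 => 1)
initial_loss = loss(model, x, y)
model = fit!(model, x, y)
final_loss = loss(model, x, y)
```

With plain `Descent` the state is trivial, which is why the hand-written version can get away without it.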

One more comment. In such tutorials, things like params = Flux.params(W, b) seem super-confusing. It would be nice to choose variable names which are very clearly things you've chosen, not features of Flux. flux_model is good.

(Link to rendered version: https://github.com/Saransh-cpp/Flux.jl/blob/linear-regression/docs/src/getting_started/linear_regression.md )

Saransh-cpp · 2022-07-14T11:52:43Z

Agreed, we should keep the linear regression example simple. I will update the tutorial to show the gradient descent algorithm in action.

I have been working on the logistic regression example locally and will update that with the same.

Saransh-cpp · 2022-07-31T18:34:50Z

I'll update this with the new train! definition once #2029 is merged. Right now this PR does not use train! in any way.

Saransh-cpp · 2022-08-07T13:26:17Z

Test: @ModelZookeeper commands

mcabbott

I had some comments on an earlier version... perhaps useful but not a full review.

docs/src/getting_started/linear_regression.md

mcabbott · 2022-08-15T03:31:50Z

docs/src/getting_started/linear_regression.md

+## Copy-pastable code
+### Dummy dataset
+```julia
+using Flux
+using Plots
+
+# data
+x = hcat(collect(Float32, -3:0.1:3)...)


It's nice to encourage people to try this out. But I fear that nobody will find this section until they have already copied plenty of pieces from above.

I also fear that the two copies will tend to drift out of sync. (Maybe this section should at least run as a smoke-test.)
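One possible shape for such a smoke test, sketched with Julia's `Test` standard library. The function `fit_line` is a hypothetical stand-in for the copy-pastable section, written without Flux so the sketch stays self-contained; the data and tolerances are assumptions.

```julia
using Test

# Hypothetical stand-in for the tutorial's copy-pastable code:
# fit a line to noiseless data with hand-written gradient descent.
function fit_line(x, y; eta = 0.1f0, steps = 200)
    W, b = 0f0, 0f0
    n = length(x)
    for _ in 1:steps
        err = W .* x .+ b .- y
        W -= eta * 2 * sum(err .* x) / n
        b -= eta * 2 * sum(err) / n
    end
    return W, b
end

x = collect(Float32, -3:0.1:3)
y = 3f0 .* x .+ 2f0

# The smoke test only checks that the code runs and recovers the known line.
@testset "linear regression smoke test" begin
    W, b = fit_line(x, y)
    @test W ≈ 3f0 atol = 1e-3
    @test b ≈ 2f0 atol = 1e-3
end
```

Running the real section this way in CI would flag the copy as soon as it stops working, though it would not by itself detect drift from the prose above it.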

Should we consider making these pages with Pluto, so that they are just a static version of something you can download and run?

Saransh-cpp · 2022-08-15T20:00:50Z

Thanks for the review, @mcabbott! Pluto sounds good! We can get rid of the copy-paste section if this is converted to a Pluto notebook (or file, not sure, will go through it in detail).

(SciML uses this copy-paste section at the top, but this code was too lengthy to be placed at the top)

Edit: A discussion about the documentation of Metalhead is going on at FluxML/Metalhead.jl#199, which could result in a uniform template for these getting started/quickstart guides.

Saransh-cpp · 2022-08-21T10:59:30Z

@mcabbott, JuliaManifolds/Manopt.jl renders Pluto notebooks in the Documenter documentation using the following make.jl contents - https://github.com/JuliaManifolds/Manopt.jl/blob/master/docs/make.jl#L1-L106.

This does look a bit hacky, and it also distorts the documentation when a user opens one of the rendered Pluto notebooks -

[Screenshots: the sidebar and the settings panel, each shown on a normal page and on a rendered Pluto notebook.]

Is there a legitimate way to render Pluto notebooks in Documenter's documentation? This hack does not look good to me. Alternatively, we could keep the section as it is and at the top link a Pluto notebook that can be downloaded for following along.

Here is what the converted Pluto notebook looks like - https://saransh-cpp.github.io/assets/pluto/linear_regression.jl.html

Note: I am just redirecting users to the html page generated by Pluto from here - https://saransh-cpp.github.io/blog/ (this will be removed once this PR is merged to avoid duplicate pages on the web).

Saransh-cpp · 2022-08-22T17:24:40Z

Currently, https://fluxml.ai/Flux.jl/stable/models/overview/ does something similar but is not as extensive as this guide. Should it be removed, or should it also be put under the "Getting Started" guide with a better title?

ToucheSir · 2022-08-23T01:05:42Z

What if anything do we lose if it's removed? I think it would be nice to do so, but any material in it which isn't covered elsewhere would need a new home.

Saransh-cpp · 2022-08-23T14:58:40Z

I think the following text can be used at the top of this guide. The rest of this page is definitely a subset of this guide.

Flux is a pure Julia ML stack that allows you to build predictive models. Here are the steps for a typical Flux program:

1. Provide training and test data
2. Build a model with configurable parameters to make predictions
3. Iteratively train the model by tweaking the parameters to improve predictions
4. Verify your model

Under the hood, Flux uses a technique called automatic differentiation to take gradients that help improve predictions. Flux is also fully written in Julia so you can easily replace any layer of Flux with your own code to improve your understanding or satisfy special requirements.

Here's how you'd use Flux to build and train the most basic of models, step by step.
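The four steps listed above could be sketched roughly as follows. This is an assumed toy setup, not the content of the overview page: the data, the train/test split, the names `flux_model` and `train_steps!`, and the learning rate are all hypothetical, and the parameters are updated by hand in the style suggested earlier in this thread.

```julia
using Flux

# 1. Provide training and test data (hypothetical: noiseless y = 3x + 2).
x = reshape(collect(Float32, -3:0.1:3), 1, :)
y = 3f0 .* x .+ 2f0
x_train, y_train = x[:, 1:2:end], y[:, 1:2:end]
x_test, y_test = x[:, 2:2:end], y[:, 2:2:end]

# 2. Build a model with configurable parameters.
flux_model = Dense(1 => 1)
loss(m, x, y) = Flux.mse(m(x), y)

# 3. Iteratively train by nudging the parameters against the gradient.
function train_steps!(model, x, y; eta = 0.1f0, steps = 100)
    for _ in 1:steps
        dLdm = gradient(m -> loss(m, x, y), model)[1]
        model.weight .-= eta .* dLdm.weight
        model.bias .-= eta .* dLdm.bias
    end
    return model
end
train_steps!(flux_model, x_train, y_train)

# 4. Verify the model on data it was not trained on.
test_loss = loss(flux_model, x_test, y_test)
```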

mcabbott · 2022-08-27T18:11:34Z

Currently, https://fluxml.ai/Flux.jl/stable/models/overview/ does something similar but is not as extensive as this guide. Should it be removed, or

One virtue of that page is that it's much shorter.

I like this PR's story, it's a nice ground-up explanation. But if you have met some of this before, and want a faster path to seeing how to write it in Julia's syntax and using Flux's pieces, then perhaps you prefer the older one.

Not sure what the ideal way to organise this material is...

Saransh-cpp · 2022-08-29T17:57:39Z

But if you have met some of this before, and want a faster path to seeing how to write it in Julia's syntax and using Flux's pieces, then perhaps you prefer the older one.

Yes, this makes sense. Maybe converting it into a "Quickstart" page under the "Getting Started" section?

mcabbott · 2022-08-29T19:14:30Z

In addition to the main manual

https://fluxml.ai/Flux.jl/stable/models/overview/

we also have some quite nice tutorials here:

https://fluxml.ai/tutorials.html

How do we make these all findable, and where does this new page go?

The main docs also seem a slightly awkward combination of introduction and reference. It's possible that the "Building models" heading should be split in two? Not sure.

Saransh-cpp · 2022-09-01T16:51:10Z

In addition to the main manual

https://fluxml.ai/Flux.jl/stable/models/overview/

we also have some quite nice tutorials here:

https://fluxml.ai/tutorials.html

How do we make these all findable, and where does this new page go?

While drafting the "Getting Started" section, I wanted to include only the guides that will get a user started with Flux. The website tutorials should cover topics that a user does not usually meet when starting out with ML/DL or an ML/DL package, for example, GANs and Transfer Learning. I think the DataLoader tutorial should be moved to the MLUtils page and the Deep Learning with Flux - A 60 Minute Blitz (September 2020) can be added to the "Getting Started" section.

Ideally, there should be a "Tutorials" heading on the docs' sidebar which should redirect users to the website's tutorials page. Similarly, the website's getting started page should redirect users to the docs' getting started section.

A nice infographic -

IMO, the DataLoader example is a "How-To Guide", the "Getting Started" guides are more inclined towards "Explanations", and the website tutorials are "Tutorials".

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

mcabbott · 2022-10-27T11:16:52Z

docs/make.jl

-            "Gradients and Layers" => "models/basics.md",
+            "Quick Start" => "getting_started/quickstart.md",
+            "Fitting a Line" => "getting_started/overview.md",
+            "Gradients and Layers" => "getting_started/basics.md",


Can I suggest leaving files where they are, until sure? I think moving them may break some links elsewhere. And we may re-organise this into Guide / Reference.

(Adding id & linking by that, not heading name nor file name, seems like the right solution.)

All the references associated with these pages now use ids! I have also reverted the structural changes.

mcabbott

Let's do it. Tutorials is the right section, IMO.

Saransh-cpp marked this pull request as draft July 5, 2022 20:23

Saransh-cpp marked this pull request as ready for review July 6, 2022 11:37

Saransh-cpp force-pushed the linear-regression branch from f3f6933 to fdc67c0 Compare July 15, 2022 08:04

Saransh-cpp force-pushed the linear-regression branch from d66b311 to a8f4788 Compare July 28, 2022 13:40

mcabbott added the documentation label Jul 28, 2022

Saransh-cpp force-pushed the linear-regression branch from 418a778 to 1884275 Compare July 31, 2022 18:31

Saransh-cpp force-pushed the linear-regression branch from 1884275 to 06cb07c Compare August 6, 2022 13:57

mcabbott reviewed Aug 15, 2022

Saransh-cpp force-pushed the linear-regression branch from f5cfd16 to adb49b2 Compare August 15, 2022 19:51

Saransh-cpp mentioned this pull request Aug 22, 2022

Migrate docs to Documenter.jl FluxML/Metalhead.jl#199

Merged

Saransh-cpp force-pushed the linear-regression branch 2 times, most recently from da20b62 to 2c9cdd1 Compare August 23, 2022 15:02

Saransh-cpp mentioned this pull request Sep 12, 2022

[Discussion]: Update and periodically test posts and model-zoo tutorials FluxML/fluxml.github.io#141

Open

Saransh-cpp and others added 22 commits October 27, 2022 16:02

Create a getting started section and add a new linear regression example

b7c4ae9

Minor improvements

2f74f37

Enable doctests

d3526e9

Update code blocks to get rid of Flux.params

a1e49ad

Update the text to manually run gradient descent

2605f92

Fix doctests

bca37be

Minor language fixes

7670145

Better variable names and cleaner print statements

288f4ad

@epcohs is deprecated

8cab77b

Update docs/src/getting_started/linear_regression.md

0a03ab5

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

Update docs/src/getting_started/linear_regression.md

f55603f

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

Update docs/src/getting_started/linear_regression.md

91b1260

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

Show data

055f6a4

More general regex

8f89bd7

Minor bug in the guide

51f8a38

Better introduction to a ML pipeline

36d7578

Move to the new Getting Started section?

df06a6d

Create a new 'tutorials' section

b67a3a9

Fix doctests

768543c

Try fixing spaces

13cb623

More doctest fixing

17d167e

Move to the existing tutorials section

0350e03

Saransh-cpp force-pushed the linear-regression branch from e1b88d1 to 0350e03 Compare October 27, 2022 10:33

mcabbott reviewed Oct 27, 2022

Revert structure + use ids

6b64b58

Saransh-cpp changed the title ~~A new linear regression tutorial + restructure docs directory~~ A new linear regression tutorial Oct 27, 2022

mcabbott approved these changes Nov 19, 2022

Merge branch 'master' into linear-regression

25eea17

ToucheSir merged commit da8ce81 into FluxML:master Nov 22, 2022

Saransh-cpp deleted the linear-regression branch November 22, 2022 06:17
