diff --git a/docs/make.jl b/docs/make.jl
index feec4e056..599f21b66 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -90,7 +90,6 @@ makedocs(; sitename="Lux.jl Docs",
         repo="github.com/LuxDL/Lux.jl", devbranch="main", devurl="dev",
         deploy_url="https://lux.csail.mit.edu", deploy_decision),
     draft=false,
-    warnonly=:linkcheck, # Lately it has been failing quite a lot but those links are actually fine
     pages)
 
 deploydocs(; repo="github.com/LuxDL/Lux.jl.git",
diff --git a/docs/src/manual/autodiff.md b/docs/src/manual/autodiff.md
index a8e16ed53..8413fb070 100644
--- a/docs/src/manual/autodiff.md
+++ b/docs/src/manual/autodiff.md
@@ -14,7 +14,7 @@ Lux. Additionally, we provide some convenience functions for working with AD.
 | [`ForwardDiff.jl`](https://github.com/JuliaDiff/ForwardDiff.jl) | Forward | ✔️ | ✔️ | ✔️ | Tier I |
 | [`ReverseDiff.jl`](https://github.com/JuliaDiff/ReverseDiff.jl) | Reverse | ✔️ | ❌ | ❌ | Tier II |
 | [`Tracker.jl`](https://github.com/FluxML/Tracker.jl) | Reverse | ✔️ | ✔️ | ❌ | Tier II |
-| [`Tapir.jl`](https://github.com/withbayes/Tapir.jl) | Reverse | ❓[^q] | ❌ | ❌ | Tier III |
+| [`Tapir.jl`](https://github.com/compintell/Tapir.jl) | Reverse | ❓[^q] | ❌ | ❌ | Tier III |
 | [`Diffractor.jl`](https://github.com/JuliaDiff/Diffractor.jl) | Forward | ❓[^q] | ❓[^q] | ❓[^q] | Tier III |
 
 [^e]: Currently Enzyme outperforms other AD packages in terms of CPU performance. However,
diff --git a/docs/src/manual/nested_autodiff.md b/docs/src/manual/nested_autodiff.md
index 9ea529fdf..f373a8f95 100644
--- a/docs/src/manual/nested_autodiff.md
+++ b/docs/src/manual/nested_autodiff.md
@@ -192,9 +192,9 @@ nothing; # hide
 
 Hutchinson Trace Estimation often shows up in machine learning literature to provide a fast
 estimate of the trace of a Jacobian Matrix. This is based off of
-[Hutchinson 1990](https://www.researchgate.net/publication/243668757_A_Stochastic_Estimator_of_the_Trace_of_the_Influence_Matrix_for_Laplacian_Smoothing_Splines) which
-computes the estimated trace of a matrix ``A \in \mathbb{R}^{D \times D}`` using random
-vectors ``v \in \mathbb{R}^{D}`` s.t. ``\mathbb{E}\left[v v^T\right] = I``.
+[Hutchinson 1990](https://www.nowozin.net/sebastian/blog/thoughts-on-trace-estimation-in-deep-learning.html)
+which computes the estimated trace of a matrix ``A \in \mathbb{R}^{D \times D}`` using
+random vectors ``v \in \mathbb{R}^{D}`` s.t. ``\mathbb{E}\left[v v^T\right] = I``.
 
 ```math
 \text{Tr}(A) = \mathbb{E}\left[v^T A v\right] = \frac{1}{V} \sum_{i = 1}^V v_i^T A v_i
diff --git a/docs/src/manual/performance_pitfalls.md b/docs/src/manual/performance_pitfalls.md
index 24a17dc14..92be45e0b 100644
--- a/docs/src/manual/performance_pitfalls.md
+++ b/docs/src/manual/performance_pitfalls.md
@@ -67,4 +67,4 @@ GPUArraysCore.allowscalar(false)
 `Lux.jl` is integrated with `DispatchDoctor.jl` to catch type instabilities. You can easily
 enable it by setting the `instability_check` preference. This will help you catch type
 instabilities in your code. For more information on how to set preferences, check out
-[`set_dispatch_doctor_preferences`](@ref).
+[`Lux.set_dispatch_doctor_preferences!`](@ref).
diff --git a/docs/src/manual/preferences.md b/docs/src/manual/preferences.md
index 88117b2ad..eaea213ee 100644
--- a/docs/src/manual/preferences.md
+++ b/docs/src/manual/preferences.md
@@ -50,8 +50,8 @@ By default, both of these preferences are set to `false`.
 ## [Dispatch Doctor](@id dispatch-doctor-preference)
 
 1. `instability_check` - Preference controlling the dispatch doctor. See the documentation
-   on [`set_dispatch_doctor_preferences!`](@ref) for more details. The preferences need to
-   be set for `LuxCore` and `LuxLib` packages. Both of them default to `disable`.
+   on [`Lux.set_dispatch_doctor_preferences!`](@ref) for more details. The preferences need
+   to be set for `LuxCore` and `LuxLib` packages. Both of them default to `disable`.
   - Setting the `LuxCore` preference sets the check at the level of `LuxCore.apply`. This
     essentially activates the dispatch doctor for all Lux layers.
   - Setting the `LuxLib` preference sets the check at the level of functional layer of
diff --git a/examples/Basics/main.jl b/examples/Basics/main.jl
index ec2696365..2ce2fd9df 100644
--- a/examples/Basics/main.jl
+++ b/examples/Basics/main.jl
@@ -3,7 +3,7 @@
 # This is a quick intro to [Lux](https://github.com/LuxDL/Lux.jl) loosely based on:
 #
 # 1. [PyTorch's tutorial](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html).
-# 2. [Flux's tutorial](https://fluxml.ai/Flux.jl/stable/tutorials/2020-09-15-deep-learning-flux/).
+# 2. Flux's tutorial (the link for which has now been lost to abyss).
 # 3. [Jax's tutorial](https://jax.readthedocs.io/en/latest/jax-101/index.html).
 #
 # It introduces basic Julia programming, as well `Zygote`, a source-to-source automatic
diff --git a/examples/BayesianNN/main.jl b/examples/BayesianNN/main.jl
index 4e38e2ee0..31e1635de 100644
--- a/examples/BayesianNN/main.jl
+++ b/examples/BayesianNN/main.jl
@@ -1,11 +1,13 @@
 # # Bayesian Neural Network
 
 # We borrow this tutorial from the
-# [official Turing Docs](https://turinglang.org/stable/tutorials/03-bayesian-neural-network/). We
-# will show how the explicit parameterization of Lux enables first-class composability with
-# packages which expect flattened out parameter vectors.
+# [official Turing Docs](https://turinglang.org/docs/tutorials/03-bayesian-neural-network/index.html).
+# We will show how the explicit parameterization of Lux enables first-class composability
+# with packages which expect flattened out parameter vectors.
 
-# We will use [Turing.jl](https://turinglang.org/stable/) with [Lux.jl](https://lux.csail.mit.edu/)
+# Note: The tutorial in the official Turing docs is now using Lux instead of Flux.
+
+# We will use [Turing.jl](https://turinglang.org/) with [Lux.jl](https://lux.csail.mit.edu/)
 # to implement implementing a classification algorithm. Lets start by importing the relevant
 # libraries.
 
diff --git a/examples/SymbolicOptimalControl/main.jl b/examples/SymbolicOptimalControl/main.jl
index 4fd5dc07d..7a96eb1a0 100644
--- a/examples/SymbolicOptimalControl/main.jl
+++ b/examples/SymbolicOptimalControl/main.jl
@@ -2,8 +2,8 @@
 
 # This tutorial is based on [SciMLSensitivity.jl tutorial](https://docs.sciml.ai/SciMLSensitivity/stable/examples/optimal_control/optimal_control/).
 # Instead of using a classical NN architecture, here we will combine the NN with a symbolic
-# expression from [DynamicExpressions.jl](https://symbolicml.org/DynamicExpressions.jl) (the
-# symbolic engine behind [SymbolicRegression.jl](https://astroautomata.com/SymbolicRegression.jl)
+# expression from [DynamicExpressions.jl](https://symbolicml.org/DynamicExpressions.jl/) (the
+# symbolic engine behind [SymbolicRegression.jl](https://astroautomata.com/SymbolicRegression.jl/)
 # and [PySR](https://github.com/MilesCranmer/PySR/)).
 
 # Here we will solve a classic optimal control problem with a universal differential
diff --git a/src/helpers/losses.jl b/src/helpers/losses.jl
index ba56cc97c..2f1df05eb 100644
--- a/src/helpers/losses.jl
+++ b/src/helpers/losses.jl
@@ -595,7 +595,7 @@ true
 ## Special Note
 
 This function takes any of the
-[`LossFunctions.jl`](https://juliaml.github.io/LossFunctions.jl/stable) public functions
+[`LossFunctions.jl`](https://juliaml.github.io/LossFunctions.jl/stable/) public functions
 into the Lux Losses API with efficient aggregation.
 """
 @concrete struct GenericLossFunction <: AbstractLossFunction
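A few illustrative notes on the material touched by this patch. First, the Hutchinson estimator referenced in the `nested_autodiff.md` hunk: the sketch below is plain Julia on a dense matrix (not the nested-AD/Jacobian machinery the manual builds up to), using Rademacher probe vectors, which satisfy the ``\mathbb{E}[v v^T] = I`` requirement; the function name `hutchinson_trace` is ours, purely for illustration.

```julia
using LinearAlgebra, Random, Statistics

# Hutchinson estimator: Tr(A) = E[vᵀ A v] ≈ (1/V) Σᵢ vᵢᵀ A vᵢ, with E[v vᵀ] = I.
# Rademacher vectors (entries ±1) satisfy that requirement on v.
function hutchinson_trace(A::AbstractMatrix, V::Integer; rng=Random.default_rng())
    D = size(A, 1)
    return mean(1:V) do _
        v = rand(rng, [-1.0, 1.0], D)  # Rademacher probe vector
        dot(v, A, v)                   # vᵀ A v via the 3-argument dot
    end
end

A = randn(Xoshiro(0), 64, 64)
tr(A), hutchinson_trace(A, 10_000)  # the two values should be close for large V
```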
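Second, the `instability_check` preference from the `performance_pitfalls.md` and `preferences.md` hunks. A minimal sketch of setting it persistently via Preferences.jl, assuming `set_preferences!` accepts the package modules directly (pass the package UUIDs otherwise) and that `"error"` is an accepted mode alongside the documented default `"disable"`:

```julia
using Preferences, LuxCore, LuxLib

# Persist the dispatch-doctor preference for both packages named in preferences.md.
# "error" is an assumption here; the documented default is "disable".
set_preferences!(LuxCore, "instability_check" => "error")
set_preferences!(LuxLib, "instability_check" => "error")

# Preferences are read at compile time, so restart Julia for the change to take effect.
```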
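Finally, the `GenericLossFunction` special note in `src/helpers/losses.jl`: a small usage sketch, assuming the `GenericLossFunction(loss_fn)` constructor shown in that docstring and that LossFunctions.jl losses such as `L2DistLoss` apply element-wise:

```julia
using Lux, LossFunctions

# Wrap a LossFunctions.jl loss into the Lux losses API (aggregation defaults to `mean`).
l2loss = GenericLossFunction(LossFunctions.L2DistLoss())

ŷ = Float32[1.1, 1.9, 3.1]   # predictions
y = Float32[1.0, 2.0, 3.0]   # targets
l2loss(ŷ, y)                 # ≈ mean(abs2, ŷ .- y)
```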