From 772bf0803f871f2c3516af76315fb4ed3b0948e2 Mon Sep 17 00:00:00 2001
From: Pietro Monticone <38562595+pitmonticone@users.noreply.github.com>
Date: Tue, 31 Jan 2023 23:20:06 +0100
Subject: [PATCH] Fix a few typos

---
 _weave/lecture02/optimizing.jmd                | 2 +-
 _weave/lecture03/sciml.jmd                     | 2 +-
 _weave/lecture04/dynamical_systems.jmd         | 2 +-
 _weave/lecture05/parallelism_overview.jmd      | 6 +++---
 _weave/lecture06/styles_of_parallelism.jmd     | 2 +-
 _weave/lecture07/discretizing_odes.jmd         | 8 ++++----
 _weave/lecture08/automatic_differentiation.jmd | 2 +-
 _weave/lecture09/stiff_odes.jmd                | 2 +-
 _weave/lecture10/estimation_identification.jmd | 2 +-
 _weave/lecture11/adjoints.jmd                  | 2 +-
 _weave/lecture13/gpus.jmd                      | 4 ++--
 _weave/lecture14/pdes_and_convolutions.jmd     | 4 ++--
 _weave/lecture15/diffeq_machine_learning.jmd   | 2 +-
 course/index.md                                | 2 +-
 14 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/_weave/lecture02/optimizing.jmd b/_weave/lecture02/optimizing.jmd
index b5bf3b4b..1a792496 100644
--- a/_weave/lecture02/optimizing.jmd
+++ b/_weave/lecture02/optimizing.jmd
@@ -1118,7 +1118,7 @@ interrupting every single `+`. Fortunately these function calls disappear during
 the compilation process due to what's known as inlining. Essentially, if the
 function call is determined to be "cheap enough", the actual function call is
 removed and the code is basically pasted into the function caller. We can
-force a function call to occur by teling it to not inline:
+force a function call to occur by telling it to not inline:
 
 ```julia
 @noinline fnoinline(x,y) = x + y
diff --git a/_weave/lecture03/sciml.jmd b/_weave/lecture03/sciml.jmd
index dc590bc3..ae495a54 100644
--- a/_weave/lecture03/sciml.jmd
+++ b/_weave/lecture03/sciml.jmd
@@ -679,7 +679,7 @@ sol = solve(prob)
 plot(sol,label=["Velocity" "Position"])
 ```
 
-Don't worry if you don't understand this sytnax yet: we will go over differential
+Don't worry if you don't understand this syntax yet: we will go over differential
 equation solvers and DifferentialEquations.jl in a later lecture.
 
 Let's say we want to learn how to predict the force applied on the spring at
diff --git a/_weave/lecture04/dynamical_systems.jmd b/_weave/lecture04/dynamical_systems.jmd
index 7ac431a8..5eba0845 100644
--- a/_weave/lecture04/dynamical_systems.jmd
+++ b/_weave/lecture04/dynamical_systems.jmd
@@ -395,7 +395,7 @@ step through the function pointer.
 
 What will approximately be the value of this dynamical system after 1000 steps
 if you start at `1.0` with parameter `p=0.25`? Can you guess without solving the
-system? Think about steady states and stabiltiy.
+system? Think about steady states and stability.
 
 ```julia
 solve_system(f,1.0,0.25,1000)
diff --git a/_weave/lecture05/parallelism_overview.jmd b/_weave/lecture05/parallelism_overview.jmd
index aeac0c68..ed3337ac 100644
--- a/_weave/lecture05/parallelism_overview.jmd
+++ b/_weave/lecture05/parallelism_overview.jmd
@@ -47,7 +47,7 @@ Each process can have many compute threads. These threads are the unit of
 execution that needs to be done. On each task is its own stack and a virtual
 CPU (virtual CPU since it's not the true CPU, since that would require that the
 task is ON the CPU, which it might not be because the task can be temporarily
-haulted). The kernel of the operating systems then *schedules* tasks, which
+halted). The kernel of the operating systems then *schedules* tasks, which
 runs them. In order to keep the computer running smooth, *context switching*,
 i.e. changing the task that is actually running, happens all the time.
 This is independent of whether tasks are actually scheduled in parallel or not.
@@ -82,7 +82,7 @@ be polled, then it will execute the command, and give the result. There are two
 variants:
 
 - Non-Blocking vs Blocking: Whether the thread will periodically poll for whether that task is complete, or whether it should wait for the task to complete before doing anything else
-- Synchronous vs Asynchronus: Whether to execute the operation as initiated by the program or as a response to an event from the kernel.
+- Synchronous vs Asynchronous: Whether to execute the operation as initiated by the program or as a response to an event from the kernel.
 
 I/O operations cause a *privileged context switch*, allowing the task which is
 handling the I/O to directly be switched to in order to continue actions.
@@ -117,7 +117,7 @@ a higher level interrupt which Julia handles the moment the safety locks says
 it's okay (these locks occur during memory allocations to ensure that memory
 is not corrupted).
 
-#### Asynchronus Calling Example
+#### Asynchronous Calling Example
 
 This example will become more clear when we get to distributed computing, but
 for think of `remotecall_fetch` as a way to run a command on a different computer.
diff --git a/_weave/lecture06/styles_of_parallelism.jmd b/_weave/lecture06/styles_of_parallelism.jmd
index 8a2ac87e..c12d6d1f 100644
--- a/_weave/lecture06/styles_of_parallelism.jmd
+++ b/_weave/lecture06/styles_of_parallelism.jmd
@@ -351,7 +351,7 @@ using blocking structures, one needs to be careful about deadlock!
 ### Two Programming Models: Loop-Level Parallelism and Task-Based Parallelism
 
 As described in the previous lecture, one can also use `Threads.@spawn` to
-do multithreading in Julia v1.3+. The same factors all applay: how to do locks
+do multithreading in Julia v1.3+. The same factors all apply: how to do locks
 and Mutex etc. This is a case of a parallelism construct having two alternative
 **programming models**. `Threads.@spawn` represents task-based parallelism, while
 `Threads.@threads` represents Loop-Level Parallelism or a parallel iterator
diff --git a/_weave/lecture07/discretizing_odes.jmd b/_weave/lecture07/discretizing_odes.jmd
index 9f1a4581..fdeeecc5 100644
--- a/_weave/lecture07/discretizing_odes.jmd
+++ b/_weave/lecture07/discretizing_odes.jmd
@@ -151,7 +151,7 @@ system by describing the force between two particles as:
 
 $$F = G \frac{m_1m_2}{r^2}$$
 
-where $r^2$ is the Euclidian distance between the two particles. From here, we
+where $r^2$ is the Euclidean distance between the two particles. From here, we
 use the fact that
 
 $$F = ma$$
@@ -451,7 +451,7 @@ that
 $$u(t+\Delta t) = u(t) + \Delta t f(u,p,t) + \mathcal{O}(\Delta t^2)$$
 
 This is a first order approximation because the error in our step can be
-expresed as an error in the derivative, i.e.
+expressed as an error in the derivative, i.e.
 
 $$\frac{u(t + \Delta t) - u(t)}{\Delta t} = f(u,p,t) + \mathcal{O}(\Delta t)$$
 
@@ -542,7 +542,7 @@ be larger, even if it matches another one asymptotically.
 
 ## What Makes a Good Method?
 
-### Leading Truncation Coeffcients
+### Leading Truncation Coefficients
 
 For given orders of explicit Runge-Kutta methods, lower bounds for the number
 of `f` evaluations (stages) required to receive a given order are known:
@@ -743,7 +743,7 @@ Stiffness can thus be approximated in some sense by the condition number of the
 Jacobian. The condition number of a matrix is its maximal eigenvalue divided by
 its minimal eigenvalue and gives a rough measure of the local timescale separations.
 If this value is large and one wants to resolve the slow dynamics,
-then explict integrators, like the explicit Runge-Kutta methods described before,
+then explicit integrators, like the explicit Runge-Kutta methods described before,
 have issues with stability. In this case implicit integrators (or other forms
 of stabilized stepping) are required in order to efficiently reach the end
 time step.
diff --git a/_weave/lecture08/automatic_differentiation.jmd b/_weave/lecture08/automatic_differentiation.jmd
index 3b11fa74..5c0c9711 100644
--- a/_weave/lecture08/automatic_differentiation.jmd
+++ b/_weave/lecture08/automatic_differentiation.jmd
@@ -583,7 +583,7 @@ for $e_i$ the $i$th basis vector, then
 
 $f(d) = f(d_0) + Je_1 \epsilon_1 + \ldots + Je_n \epsilon_n$
 
-computes all columns of the Jacobian simultaniously.
+computes all columns of the Jacobian simultaneously.
 
 ### Array of Structs Representation
 
diff --git a/_weave/lecture09/stiff_odes.jmd b/_weave/lecture09/stiff_odes.jmd
index 8035d74a..d73df3e0 100644
--- a/_weave/lecture09/stiff_odes.jmd
+++ b/_weave/lecture09/stiff_odes.jmd
@@ -478,7 +478,7 @@ The most common method in the Krylov subspace family of methods is the GMRES
 method. Essentially, in step $i$ one computes $\mathcal{K}_i$, and finds the
 $x$ that is the closest to the Krylov subspace, i.e. finds the $x \in \mathcal{K}_i$
 such that $\Vert Jx-v \Vert$ is minimized. At each step, it adds the new vector
-to the Krylov subspace after orthgonalizing it against the other vectors via
+to the Krylov subspace after orthogonalizing it against the other vectors via
 Arnoldi iterations, leading to an orthogonal basis of $\mathcal{K}_i$ which
 makes it easy to express $x$.
 
diff --git a/_weave/lecture10/estimation_identification.jmd b/_weave/lecture10/estimation_identification.jmd
index c8f08829..df60e2fb 100644
--- a/_weave/lecture10/estimation_identification.jmd
+++ b/_weave/lecture10/estimation_identification.jmd
@@ -560,7 +560,7 @@ unscaled gradient.
 
 ### Multi-Seeding
 
-Similarly to forward-mode having a dual number with multiple simultanious
+Similarly to forward-mode having a dual number with multiple simultaneous
 derivatives through partials $d = x + v_1 \epsilon_1 + \ldots + v_m \epsilon_m$,
 one can see that multi-seeding is an option in reverse-mode AD by, instead of
 pulling back a matrix instead of a row vector, where each row is a direction.
diff --git a/_weave/lecture11/adjoints.jmd b/_weave/lecture11/adjoints.jmd
index aa6ce416..3145ced8 100644
--- a/_weave/lecture11/adjoints.jmd
+++ b/_weave/lecture11/adjoints.jmd
@@ -143,7 +143,7 @@ does not need to be re-calculated.
 
 Using this style, Tracker.jl moves forward, building up the value and closures
 for the backpass and then recursively pulls back the input `Δ` to receive the
-derivatve.
+derivative.
 
 ### Source-to-Source AD
 
diff --git a/_weave/lecture13/gpus.jmd b/_weave/lecture13/gpus.jmd
index 945b9367..8d5a6ed8 100644
--- a/_weave/lecture13/gpus.jmd
+++ b/_weave/lecture13/gpus.jmd
@@ -74,7 +74,7 @@ Loop: fld f0, 0(x1)
 i > 0 && @goto Loop # Cycle 8
 ```
 
-With our given latencies and issueing one operation per cycle,
+With our given latencies and issuing one operation per cycle,
 we can execute the loop in 8 cycles. By reordering we can
 execute it in 7 cycles. Can we do better?
 
@@ -91,7 +91,7 @@ execute it in 7 cycles. Can we do better?
 
 By reordering the decrement we can hide the load latency.
 
-- How many cylces are overhead: 2
+- How many cycles are overhead: 2
 - How many stall cycles: 2
 - How many cycles are actually work: 3
diff --git a/_weave/lecture14/pdes_and_convolutions.jmd b/_weave/lecture14/pdes_and_convolutions.jmd
index b78afa63..51df8fdf 100644
--- a/_weave/lecture14/pdes_and_convolutions.jmd
+++ b/_weave/lecture14/pdes_and_convolutions.jmd
@@ -13,7 +13,7 @@ weave_options:
 At this point we have identified how the worlds of machine learning and scientific
 computing collide by looking at the parameter estimation problem. Training neural
 networks is parameter estimation of a function `f` where `f` is a neural
-network. Backpropogation of a neural network is simply the adjoint problem
+network. Backpropagation of a neural network is simply the adjoint problem
 for `f`, and it falls under the class of methods used in reverse-mode automatic
 differentiation. But this story also extends to structure. Recurrent neural
 networks are the Euler discretization of a continuous recurrent neural network,
@@ -82,7 +82,7 @@ m = Chain(
 
 ## Discretizations of Partial Differential Equations
 
-Now let's investigate discertizations of partial differential equations. A
+Now let's investigate discretizations of partial differential equations. A
 canonical differential equation to start with is the Poisson equation. This is
 the equation:
 
diff --git a/_weave/lecture15/diffeq_machine_learning.jmd b/_weave/lecture15/diffeq_machine_learning.jmd
index 4b2d97fa..486b5949 100644
--- a/_weave/lecture15/diffeq_machine_learning.jmd
+++ b/_weave/lecture15/diffeq_machine_learning.jmd
@@ -77,7 +77,7 @@ example:
 - [Hybrid differential equations](http://diffeq.sciml.ai/latest/features/callback_functions/) (DEs with event handling)
 
 For each of these equations, one can come up with an adjoint definition in order
-to define a backpropogation, or perform direct automatic differentiation of the
+to define a backpropagation, or perform direct automatic differentiation of the
 solver code. One such paper in this area includes
 [neural stochastic differential equations](https://arxiv.org/abs/1905.09883)
diff --git a/course/index.md b/course/index.md
index 64d015b5..8ce37cea 100644
--- a/course/index.md
+++ b/course/index.md
@@ -61,7 +61,7 @@ Homework 2: Parameter estimation in dynamical systems and overhead of parallelis
 - Definition of inverse problems with applications to clinical pharmacology and smartgrid optimization
 - Adjoint methods for fast gradients
-  - Automated adjoints through reverse-mode automatic differentiation (backpropogation)
+  - Automated adjoints through reverse-mode automatic differentiation (backpropagation)
   - Adjoints of differential equations
   - Using neural ordinary differential equations as a memory-efficient RNN for deep learning