From 4ff3364d4388e942984b03a377392bd282cf9fc8 Mon Sep 17 00:00:00 2001
From: stkrueppel <82206550+stkrueppel@users.noreply.github.com>
Date: Wed, 16 Jun 2021 11:04:46 +0200
Subject: [PATCH] Improvements to spike train basics after course

Add an explanation of lists of arrays and an exercise on using np.histogram with explicitly specified bin edges.
---
 day1_2_spike_train_basics.ipynb | 159 ++++++++++++++++++++++++++++----
 1 file changed, 140 insertions(+), 19 deletions(-)
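
Note for reviewers (not part of the notebook): a minimal sketch of the
"list of arrays" layout that the new intermezzo recommends, next to a
preallocated 2D parameter array. The spike times and the two parameters
(spike count, mean interspike interval) are made up for illustration and
are not taken from the course data.

```python
import numpy as np

# Hypothetical spike trains (in seconds) for three cells.
# Each cell has a different number of spikes, so a rectangular
# 2D array would not fit; a list of 1D arrays does.
spike_trains = [
    np.array([0.12, 0.50, 0.91, 1.40]),
    np.array([0.30, 0.75]),
    np.array([0.05, 0.22, 0.48, 0.80, 1.10, 1.35]),
]

# Preallocate one row per cell and one column per parameter.
# np.empty leaves the memory uninitialized, so every entry
# must be written before it is read.
n_cells = len(spike_trains)
parameter_storage = np.empty((n_cells, 2))

for i, train in enumerate(spike_trains):
    parameter_storage[i, 0] = train.size              # number of spikes
    parameter_storage[i, 1] = np.diff(train).mean()   # mean interspike interval

print(parameter_storage)
```

The per-cell results are rectangular (same parameters for every cell), which
is why they fit a 2D array while the spike trains themselves do not.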

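Note for reviewers (not part of the notebook): a small sketch of passing
explicit bin edges to `np.histogram`, as the new exercise asks. The ISI
values, the 5 ms bin size, and the variable `n_bins` are assumptions chosen
to keep the example short; the exercise itself uses 0.1 ms bins. Here the
edges are built from an integer bin count, which is one possible way to
sidestep floating-point rounding at the upper end of `np.arange` with a
fractional step.

```python
import numpy as np

# Hypothetical interspike intervals in seconds (made-up values).
isi = np.array([0.0012, 0.0031, 0.0034, 0.0048, 0.0120, 0.0290])

bin_size = 0.005  # 5 ms bins, chosen only to keep the example small
n_bins = 6        # together with bin_size this covers 0 to 0.03 s

# n_bins bins need n_bins + 1 edges.
bin_edges = np.arange(n_bins + 1) * bin_size
bin_centers = bin_edges[:-1] + bin_size / 2

# When `bins` is a sequence, np.histogram uses it as the bin edges.
isi_hist, _ = np.histogram(isi, bins=bin_edges)
print(isi_hist)
```
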
diff --git a/day1_2_spike_train_basics.ipynb b/day1_2_spike_train_basics.ipynb
index 388cf9f..447a0dc 100644
--- a/day1_2_spike_train_basics.ipynb
+++ b/day1_2_spike_train_basics.ipynb
@@ -99,7 +99,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
     "Print the shape of `example_spike_times`. What does the result mean? Print the time of the 10th spike in milliseconds."
    ]
@@ -120,7 +120,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Expected output**\n",
+    "**Expected output:**\n",
     "```\n",
     "(5391,)\n",
     "1634.5\n",
@@ -186,6 +186,76 @@
     "plt.title(\"Raster plot\");"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Intermezzo: Lists and arrays"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If we want to work with multiple neurons, we need data structures that can hold all that data, e.g. all spike trains. We could create one variable for each spike train, but that will get very tedious if we have many neurons. Instead, we want one variable for all spike trains."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's first remind ourselves of the pros and cons of lists and arrays:"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "|Lists|Arrays|\n",
+    "|--|--|\n",
+    "|can contain multiple datatypes|can only contain one datatype|\n",
+    "|indexing by slicing|indexing by slicing and boolean arrays|\n",
+    "|only a few methods/functions|fast and powerful computations using numpy|\n",
+    "| |numpy functions work on/return arrays anyway|\n",
+    "|1-dimensional (more dimensions only by nesting lists)|multiple dimensions possible|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "2-dimensional arrays are a great choice for storing e.g. per-cell parameters (mean firing rate etc.). The first dimension (rows) can denote the cell, the second dimension (columns) the parameter.\n",
+    "\n",
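+    "To avoid repeatedly growing a data structure, we often preallocate an array of the final size before we begin our analysis. Note that `np.empty` does not set the entries to zero: they hold arbitrary values until we overwrite them. Let's say we have 20 cells and we want to calculate 3 different parameters for each cell."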
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "parameter_storage = np.empty((20, 3))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "During our analysis, we can store the results in this 2D array."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For spike trains, however, 2D arrays are not well suited. This is because 2D arrays must be rectangular, i.e. each row has the same number of columns, whereas cells generally differ in the number of spikes in their spike trains.\n",
+    "\n",
+    "Instead, we could use arrays of arrays, lists of lists, or lists of arrays. All of these work, but we recommend lists of arrays.\n",
+    "\n",
+    "This means: all spike train data is stored in a list. The elements of the list are numpy arrays, one for each cell. Each array contains the spike train of that cell."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -242,7 +312,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
     "Write a function called `load_spike_trains_to_list`. As an argument, this function should take a list of filepaths called `list_of_paths`. It should return a list containing each neuron's spike times called `list_of_spikes`.\n",
     "\n",
@@ -293,7 +363,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Expected output:\n",
+    "**Expected output:**\n",
     "```\n",
     "[  0.5766   2.8239   4.5523 ... 481.387  482.4371 482.4677]\n",
     "```"
@@ -365,9 +435,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
-    "Let's focus on some neurons that responded similarly. Specifically, plot the first 32 seconds of the spike trains of the 1st, 12th, 13th, 14th, and 19th neurons in `all_spike_trains`, but not the rest."
+    "Let's focus on some neurons that have similar spike trains. Specifically, plot the first 32 seconds of the spike trains of the 1st, 12th, 13th, 14th, and 19th neurons in `all_spike_trains`, but not the rest."
    ]
   },
   {
@@ -428,7 +498,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Numpy provides the function `histogram` to calculate histograms."
+    "Numpy provides the function `histogram` to calculate histograms. We'll start by analyzing the first of the five neurons."
    ]
   },
   {
@@ -510,7 +580,21 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "Compared to the eventplot, we can now see the dynamics of the spike train much better."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the next three exercises, your goal will be to create a plot that shows the firing rates of all five neurons, so that we can compare them better."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Exercise:**\n",
     "\n",
     "Write a function called `firing_rate_histogram` that calculates the firing rate histogram from a spike train. As an argument, this function should take a spike train called `spike_train`. It should return two variables: First, the firing rate in each bin of the histogram, called `hist_firing_rate`. Second, the centers of the corresponding bins, called `hist_bin_centers`. The histogram should have 50 bins in the range from 0 to 8.\n",
     "\n",
@@ -542,7 +626,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Expected output:\n",
+    "**Expected output:**\n",
     "\n",
     "```93.75 4.88```"
    ]
@@ -551,11 +635,11 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
     "Write another function called `multi_frate_histograms` that calculates a histogram for each of multiple spike trains. As an argument, this function should take a list of spike trains called `list_of_spike_trains`. It should return two variables: First, a list called `multi_hist_firing_rate` that contains for each given spike train the corresponding firing rates in a histogram. Second, the bin centers of these histograms, called `hist_bin_centers`.\n",
     "\n",
-    "_Hint:_ Make use of a for-loop and the function you wrote above. Remember that the bin centers of all the histograms are the same, so you don't have to create a list for them."
+    "_Hint:_ Make use of a for-loop and the function you wrote above. Remember that the bin centers of all the histograms are the same, so you don't have to create a list for them. A single 1D array will suffice."
    ]
   },
   {
@@ -583,7 +667,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Expected output:\n",
+    "**Expected output:**\n",
     "\n",
     "```56.25 4.88```"
    ]
@@ -592,7 +676,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
     "Finally, create one plot that shows the histograms of the five neurons from the end of the last chapter (use `short_spike_trains`).\n",
     "\n",
@@ -642,7 +726,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**Exercise**\n",
+    "**Exercise:**\n",
     "\n",
     "Calculate the ISI histogram for the spike train `example_spike_times` that we loaded in the beginning. Choose a reasonable number of bins and focus on ISIs below 30ms. Then plot the histogram.\n",
     "\n",
@@ -672,7 +756,25 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "_Followup exercise:_ If you're done early, try plotting the histogram using matplotlib's `hist` function."
+    "What does this histogram tell us? Why are there so few intervals below ~3 ms?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "So far, when using `np.histogram`, we have specified the number of bins of the histogram. This is the easiest way of using it, but we might also want to directly control the size of the bins."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Exercise:**\n",
+    "\n",
+    "Create the same plot as above, but now explicitly set the size of the bins to 0.1 ms. You can do so by creating the bin edges manually and passing them as the parameter `bins` to `np.histogram`. Take a look at the documentation!\n",
+    "\n",
+    "_Hint:_ To create the bin edges, you can use `np.arange` with the `step` parameter."
    ]
   },
   {
@@ -681,9 +783,12 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Start your code here\n",
-    "# FIXME\n",
-    "plt.hist(isi, bins=50, range=(0, 0.03))\n",
+    "bin_size = 0.0001\n",
+    "bin_edges = np.arange(0, 0.03 + bin_size, bin_size)\n",
+    "bin_centers = bin_edges[:-1] + bin_size/2\n",
+    "isi_hist, bin_edges = np.histogram(isi, bins=bin_edges)\n",
+    "\n",
+    "plt.plot(bin_centers, isi_hist)\n",
     "plt.title(\"Interspike interval histogram\")\n",
     "plt.xlabel(\"Interspike interval [s]\")\n",
     "plt.ylabel(\"Number of occurences\");"
@@ -693,7 +798,23 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "What does this mean? Why are there so few intervals below ~3ms?"
+    "**Exercise:**\n",
+    "\n",
+    "Try plotting a 'real' histogram using matplotlib's `hist` function. Whether you specify the number of bins or the bin edges is up to you."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Start your code here\n",
+    "# FIXME\n",
+    "plt.hist(isi, bins=50, range=(0, 0.03))\n",
+    "plt.title(\"Interspike interval histogram\")\n",
+    "plt.xlabel(\"Interspike interval [s]\")\n",
+    "plt.ylabel(\"Number of occurrences\");"
    ]
   }
  ],