Key concepts on Deep Neural Networks
LATEST SUBMISSION GRADE
98.57%
1.Question 1
What is the "cache" used for in our implementation of forward propagation and backward propagation?
It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
It is used to cache the intermediate values of the cost function during training.
Correct
Correct, the "cache" records values from the forward propagation units and sends them to the backward propagation units, because they are needed to compute the chain rule derivatives.
1 / 1 point
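To make the idea concrete, here is a minimal sketch (not the course's exact starter code; the function names and the ReLU/sigmoid choice are illustrative) of a forward step that stores a cache and a backward step that reads it:

```python
import numpy as np

def linear_activation_forward(A_prev, W, b, activation):
    """One forward step; returns the activation and a cache for backprop."""
    Z = W @ A_prev + b                                   # linear part z^[l]
    A = np.maximum(0, Z) if activation == "relu" else 1 / (1 + np.exp(-Z))
    cache = (A_prev, W, b, Z)                            # values the backward pass will need
    return A, cache

def linear_activation_backward(dA, cache, activation):
    """Uses the cached forward values to apply the chain rule."""
    A_prev, W, b, Z = cache
    if activation == "relu":
        dZ = dA * (Z > 0)
    else:                                                # sigmoid
        s = 1 / (1 + np.exp(-Z))
        dZ = dA * s * (1 - s)
    m = A_prev.shape[1]
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db
```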
2.Question 2
Among the following, which ones are "hyperparameters"? (Check all that apply.)
activation values a^{[l]}
learning rate \alpha
Correct
number of iterations
Correct
weight matrices W^{[l]}
bias vectors b^{[l]}
size of the hidden layers n^{[l]}
Correct
number of layers L in the neural network
You didn’t select all the correct answers
0.857 / 1 point
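For contrast, a small sketch of how the two kinds of quantities typically show up in code (the specific values below are made up): hyperparameters are chosen by us, while the parameters W^{[l]} and b^{[l]} are learned by gradient descent.

```python
# Hyperparameters: set by hand (or by search) before/around training.
hyperparameters = {
    "learning_rate": 0.0075,              # alpha
    "num_iterations": 2500,
    "layer_dims": [12288, 20, 7, 5, 1],   # layer sizes n^[l] (illustrative values)
}

# Parameters: W^[l], b^[l] are *not* hyperparameters; they are initialized
# (e.g. parameters["W1"], parameters["b1"], ...) and then updated every iteration.
parameters = {}
```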
3.Question 3
Which of the following statements is true?
The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
Correct
1 / 1 point
4.Question 4
Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?
True
False
Correct
Forward propagation propagates the input through the layers. Although for a shallow network we may just write out all the lines (a^{[2]} = g^{[2]}(z^{[2]}), z^{[2]} = W^{[2]}a^{[1]} + b^{[2]}, ...), in a deeper network we cannot avoid a for loop iterating over the layers: (a^{[l]} = g^{[l]}(z^{[l]}), z^{[l]} = W^{[l]}a^{[l-1]} + b^{[l]}, ...).
1 / 1 point
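A sketch of what this looks like in code (assuming ReLU hidden layers, a sigmoid output unit, and parameters stored under keys "W1", "b1", ...): the computation is vectorized over the m examples in the columns of X, but the loop over the layers remains.

```python
import numpy as np

def l_model_forward(X, parameters):
    """Forward pass: vectorized over examples, explicit loop over layers."""
    L = len(parameters) // 2                 # parameters holds W1, b1, ..., WL, bL
    A = X
    for l in range(1, L + 1):                # the per-layer loop cannot be vectorized away
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]   # z^[l] = W^[l] a^[l-1] + b^[l]
        A = np.maximum(0, Z) if l < L else 1 / (1 + np.exp(-Z))       # ReLU hidden, sigmoid output
    return A
```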
5.Question 5
Assume we store the values for n^{[l]} in an array called layers, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?
Correct
1 / 1 point
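The multiple-choice options themselves are not reproduced in this file; a minimal initialization loop consistent with the correct answer (small random weights and zero biases are one common choice) would look like this:

```python
import numpy as np

n_x = 10                                     # illustrative input size; n_x is whatever the data dictates
layer_dims = [n_x, 4, 3, 2, 1]

parameters = {}
for l in range(1, len(layer_dims)):          # l = 1, ..., 4
    parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
```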
6.Question 6
Consider the following neural network.
How many layers does this network have?
The number of layers L is 4. The number of hidden layers is 3.
The number of layers L is 3. The number of hidden layers is 3.
The number of layers L is 4. The number of hidden layers is 4.
The number of layers L is 5. The number of hidden layers is 4.
Correct
Yes. As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.
1 / 1 point
7.Question 7
During forward propagation, in the forward function for a layer l you need to know what the activation function in that layer is (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for layer l is, since the gradient depends on it. True/False?
True
False
Correct
Yes, as you've seen in week 3, each activation has a different derivative. Thus, during backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
1 / 1 point
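For example, sketches of the two backward helpers (names are illustrative, not necessarily the assignment's exact API) show why both the cached Z and the activation type are needed to get dZ right:

```python
import numpy as np

def relu_backward(dA, Z):
    """dZ for a ReLU layer: the gradient passes only where Z > 0."""
    return dA * (Z > 0)

def sigmoid_backward(dA, Z):
    """dZ for a sigmoid layer: g'(Z) = g(Z) * (1 - g(Z))."""
    s = 1 / (1 + np.exp(-Z))
    return dA * s * (1 - s)
```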
8.Question 8
There are certain functions with the following properties:
(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?
True
False
Correct
1 / 1 point
9.Question 9
Consider the following 2-hidden-layer neural network:
Which of the following statements are True? (Check all that apply).
W^{[1]} will have shape (4, 4)
Correct
Yes. More generally, the shape of W^{[l]} is (n^{[l]}, n^{[l-1]}).
b^{[1]} will have shape (4, 1)
Correct
Yes. More generally, the shape of b^{[l]} is (n^{[l]}, 1).
W^{[1]} will have shape (3, 4)
b^{[1]} will have shape (3, 1)
W^{[2]} will have shape (3, 4)
Correct
Yes. More generally, the shape of W^{[l]} is (n^{[l]}, n^{[l-1]}).
b^{[2]} will have shape (1, 1)
W^{[2]} will have shape (3, 1)
b^{[2]} will have shape (3, 1)
Correct
Yes. More generally, the shape of b^{[l]} is (n^{[l]}, 1).
W^{[3]} will have shape (3, 1)
b^{[3]} will have shape (1, 1)
Correct
Yes. More generally, the shape of b^{[l]} is (n^{[l]}, 1).
W^{[3]} will have shape (1, 3)
Correct
Yes. More generally, the shape of W^{[l]} is (n^{[l]}, n^{[l-1]}).
b^{[3]} will have shape (3, 1)
1 / 1 point
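A quick way to check these shapes, assuming the pictured network has 4 inputs, hidden layers of 4 and 3 units, and 1 output unit:

```python
import numpy as np

# Assumed layer sizes for the pictured network: n^[0]=4, n^[1]=4, n^[2]=3, n^[3]=1.
n = [4, 4, 3, 1]
params = {}
for l in range(1, len(n)):
    params["W" + str(l)] = np.random.randn(n[l], n[l - 1])   # shape (n^[l], n^[l-1])
    params["b" + str(l)] = np.zeros((n[l], 1))               # shape (n^[l], 1)

for l in range(1, len(n)):
    print(f"W{l}: {params['W' + str(l)].shape}  b{l}: {params['b' + str(l)].shape}")
# W1: (4, 4)  b1: (4, 1)
# W2: (3, 4)  b2: (3, 1)
# W3: (1, 3)  b3: (1, 1)
```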
10.Question 10
Whereas the previous question used a specific network, in the general case what is the dimension of W^{[l]}, the weight matrix associated with layer l?
W^{[l]} has shape (n^{[l]}, n^{[l+1]})
W^{[l]} has shape (n^{[l-1]}, n^{[l]})
W^{[l]} has shape (n^{[l+1]}, n^{[l]})
W^{[l]} has shape (n^{[l]}, n^{[l-1]})
Correct
True
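A quick dimension check supporting the correct choice: in z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]}, the activation a^{[l-1]} has shape (n^{[l-1]}, 1) and z^{[l]} must have shape (n^{[l]}, 1), so W^{[l]} has to have shape (n^{[l]}, n^{[l-1]}).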