numpy.pad() pads an array.
>>> import numpy as np
>>> a = [1, 2, 3, 4, 5]
>>> np.pad(a, (2,3), 'constant', constant_values=(4, 6))
array([4, 4, 1, 2, 3, 4, 5, 6, 6, 6])
>>> a = [[1, 2], [3, 4]]
>>> np.pad(a, ((2, 2), (2, 2)), 'constant', constant_values=(0, 0))
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 1, 2, 0, 0],
       [0, 0, 3, 4, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])
Convolutional Layer
Zero-Padding
Padding serves two purposes: it preserves the height and width of the volume passed from the convolutional layer to the next layer, and it lets the convolution make use of the information at the edges of the image.
def zero_pad(X, pad):
    """
    Argument:
    X -- python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
    pad -- integer, amount of padding around each image on vertical and horizontal dimensions
    Returns:
    X_pad -- padded image of shape (m, n_H + 2*pad, n_W + 2*pad, n_C)
    """
    X_pad = np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)), 'constant', constant_values=(0, 0))
    return X_pad
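A quick interactive check of the resulting shape (the input here is just random illustrative data):
>>> x = np.random.randn(4, 3, 3, 2)
>>> zero_pad(x, 2).shape
(4, 7, 7, 2)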
Single Step of Convolution
Apply the filter to one slice of the previous layer's output to produce a single value.
def conv_single_step(a_slice_prev, W, b):
    """
    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)
    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice of the input data
    """
    # Element-wise product between a_slice_prev and W, plus the bias
    c = np.multiply(a_slice_prev, W) + b
    # Sum over all entries of the volume
    Z = np.sum(c)
    return Z
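For instance, with a 3x3 window over 4 channels (random illustrative data again), the result is a single scalar:
>>> a_slice_prev = np.random.randn(3, 3, 4)
>>> W = np.random.randn(3, 3, 4)
>>> b = np.random.randn(1, 1, 1)
>>> Z = conv_single_step(a_slice_prev, W, b)   # Z is a plain scalar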
Forward Propagation
Apply multiple filters to the input. Each filter produces a 2D matrix, and stacking the outputs of all the filters yields a 3D output volume.
def conv_forward(A_prev, W, b, hparameters):
"""
Arguments:
A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
b -- Biases, numpy array of shape (1, 1, 1, n_C)
hparameters -- python dictionary containing "stride" and "pad"
Returns:
Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
cache -- cache of values needed for the conv_backward() function
"""
# Retrieve dimensions from A_prev's shape
(m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
# Retrieve dimensions from W's shape
(f, f, n_C_prev, n_C) = W.shape
# Retrieve information from "hparameters"
stride = hparameters['stride']
pad = hparameters['pad']
# Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor.
n_H = int((n_H_prev - f + 2 * pad) / stride) + 1
n_W = int((n_W_prev - f + 2 * pad) / stride) + 1
# Initialize the output volume Z with zeros.
Z = np.zeros((m, n_H, n_W, n_C))
# Create A_prev_pad by padding A_prev
A_prev_pad = zero_pad(A_prev, pad)
for i in range(0, m): # loop over the batch of training examples
a_prev = A_prev_pad[i, :, :, :] # Select ith training example's padded activation
for h in range(0, n_H): # loop over vertical axis of the output volume
for w in range(0, n_W): # loop over horizontal axis of the output volume
for c in range(0, n_C): # loop over channels (= #filters) of the output volume
# Find the corners of the current "slice"
vert_start = h * stride
vert_end = vert_start + f
horiz_start = w * stride
horiz_end = horiz_start + f
# Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell).
a_slice_prev = a_prev[vert_start:vert_end, horiz_start:horiz_end, :]
# Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron.
Z[i, h, w, c] = conv_single_step(a_slice_prev, W[:,:,:,c], b[:,:,:,c])
# Making sure your output shape is correct
assert(Z.shape == (m, n_H, n_W, n_C))
# Save information in "cache" for the backprop
cache = (A_prev, W, b, hparameters)
return Z, cache
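A shape check with made-up hyperparameters (8 filters of size 2, stride 2, pad 2, on random data), where n_H = n_W = (4 - 2 + 2*2)/2 + 1 = 4:
>>> A_prev = np.random.randn(10, 4, 4, 3)
>>> W = np.random.randn(2, 2, 3, 8)
>>> b = np.random.randn(1, 1, 1, 8)
>>> Z, cache = conv_forward(A_prev, W, b, {"stride": 2, "pad": 2})
>>> Z.shape
(10, 4, 4, 8)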
Backward Propagation
$$ dA += \sum_{h=0}^{n_H} \sum_{w=0}^{n_W} W_c \times dZ_{hw} $$
where $W_c$ is the filter and $dZ_{hw}$ is the gradient corresponding to the output of the conv layer $Z$ at row $h$, column $w$.
$$ dW_c += \sum_{h=0}^{n_H} \sum_{w=0}^{n_W} a_{slice} \times dZ_{hw} $$
where $a_{slice}$ is the slice of the input image that was used to compute the output $z_{hw}$.
$$ db = \sum_h \sum_w dZ_{hw} $$
def conv_backward(dZ, cache):
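Only the signature survives above; the following is a minimal sketch of the full backward pass, implementing the three formulas directly. It reuses zero_pad and the "stride"/"pad" hyperparameters from conv_forward and assumes pad > 0 when stripping the padding; treat it as an illustration rather than the original implementation.
def conv_backward(dZ, cache):
    # Retrieve what conv_forward stored
    (A_prev, W, b, hparameters) = cache
    (f, f, n_C_prev, n_C) = W.shape
    stride = hparameters['stride']
    pad = hparameters['pad']
    (m, n_H, n_W, n_C) = dZ.shape
    # Initialize the gradients with the correct shapes
    dA_prev = np.zeros(A_prev.shape)
    dW = np.zeros(W.shape)
    db = np.zeros(b.shape)
    # Pad A_prev and dA_prev so the slices line up with the forward pass
    A_prev_pad = zero_pad(A_prev, pad)
    dA_prev_pad = zero_pad(dA_prev, pad)
    for i in range(m):
        a_prev_pad = A_prev_pad[i]
        da_prev_pad = dA_prev_pad[i]
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    a_slice = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
                    # dA += W_c * dZ_hw ; dW_c += a_slice * dZ_hw ; db += dZ_hw
                    da_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :] += W[:, :, :, c] * dZ[i, h, w, c]
                    dW[:, :, :, c] += a_slice * dZ[i, h, w, c]
                    db[:, :, :, c] += dZ[i, h, w, c]
        # Strip the padding to recover the gradient for this example (assumes pad > 0)
        dA_prev[i, :, :, :] = da_prev_pad[pad:-pad, pad:-pad, :]
    assert(dA_prev.shape == A_prev.shape)
    return dA_prev, dW, db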
Pooling Layer
The pooling layer reduces the height and width of its input, which cuts down the amount of computation. There are two kinds of pooling: max pooling and average pooling.
Forward Propagation
def pool_forward(A_prev, hparameters, mode="max"):
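Only the signature is shown above; a minimal sketch of the forward pooling pass could look like the following. It assumes hparameters carries the window size "f" and the "stride", and that no padding is used.
def pool_forward(A_prev, hparameters, mode="max"):
    # Retrieve dimensions and hyperparameters
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    f = hparameters["f"]
    stride = hparameters["stride"]
    # Output dimensions (no padding in the pooling layer)
    n_H = int((n_H_prev - f) / stride) + 1
    n_W = int((n_W_prev - f) / stride) + 1
    n_C = n_C_prev
    A = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):                    # loop over the training examples
        for h in range(n_H):              # loop over the vertical axis
            for w in range(n_W):          # loop over the horizontal axis
                for c in range(n_C):      # loop over the channels
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    a_prev_slice = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
                    # Reduce the window to a single value
                    if mode == "max":
                        A[i, h, w, c] = np.max(a_prev_slice)
                    elif mode == "average":
                        A[i, h, w, c] = np.mean(a_prev_slice)
    # Store the input and hyperparameters for pool_backward
    cache = (A_prev, hparameters)
    return A, cache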
Backward Propagation
Even though the pooling layer has no parameters to update during backpropagation, its gradient still has to be computed so it can be passed to the layers before it.
Max-Pooling
Use a mask to keep track of where the maximum value is, for example:
$$ X = \begin{bmatrix}
1 & 3 \\
4 & 2
\end{bmatrix} \quad \rightarrow \quad M = \begin{bmatrix}
0 & 0 \\
1 & 0
\end{bmatrix} $$
def create_mask_from_window(x):
If you have a matrix X and a scalar x, then A = (X == x) returns a matrix of the same size such that:
A[i,j] = True if X[i,j] = x
A[i,j] = False if X[i,j] != x
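That comparison is essentially the whole helper; a minimal sketch:
def create_mask_from_window(x):
    # True at the position of the maximum of x, False everywhere else
    mask = (x == np.max(x))
    return mask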
Average-Pooling
Compute a mask in which each position holds its (equal) contribution to $dZ$. For example:
$$ dZ = 1 \quad \rightarrow \quad dZ = \begin{bmatrix}
1/4 & 1/4 \\
1/4 & 1/4
\end{bmatrix} $$
def distribute_value(dz, shape):
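A minimal sketch that spreads dz evenly over a window of the given shape:
def distribute_value(dz, shape):
    # Dimensions of the pooling window
    (n_H, n_W) = shape
    # Every entry gets an equal share of dz
    average = dz / (n_H * n_W)
    a = np.ones(shape) * average
    return a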
Putting it together: Pooling backward
def pool_backward(dA, cache, mode="max"):
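Again only the signature is kept; the sketch below puts the two helpers together, assuming pool_forward cached (A_prev, hparameters) as in the earlier sketch.
def pool_backward(dA, cache, mode="max"):
    # Retrieve the forward-pass cache and hyperparameters
    (A_prev, hparameters) = cache
    stride = hparameters["stride"]
    f = hparameters["f"]
    (m, n_H, n_W, n_C) = dA.shape
    dA_prev = np.zeros(A_prev.shape)
    for i in range(m):
        a_prev = A_prev[i]
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    if mode == "max":
                        # Route the gradient only to the position of the maximum
                        a_prev_slice = a_prev[vert_start:vert_end, horiz_start:horiz_end, c]
                        mask = create_mask_from_window(a_prev_slice)
                        dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c] += mask * dA[i, h, w, c]
                    elif mode == "average":
                        # Spread the gradient evenly over the f x f window
                        dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c] += distribute_value(dA[i, h, w, c], (f, f))
    assert(dA_prev.shape == A_prev.shape)
    return dA_prev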