Backends
A backend is an implementation of a consistent interface, which provides basic operations for filtering N-dimensional arrays. These include filtering operations that build selectivity, pooling operations that build invariance, and an operation providing local contrast enhancement of an image.
Filtering
Four filter operations are supported. The operation DotProduct compares the
input neighborhood and the weight vector (i.e., prototype) using a dot
product, where each output is given by
\[ y = X^T W \]
for input neighborhood \(X\) (given as a vector) and weight vector
\(W\), where \(X^T\) denotes the matrix transpose. The operation
NormDotProduct is similar,
but constrains each vector to have unit norm. Thus, the output is given by
\[ y = \frac{X^T W}{\left\Vert X \right\Vert \left\Vert W \right\Vert} \]
where \(\left\Vert \cdot \right\Vert\) denotes the Euclidean norm.
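The two dot-product operations can be sketched in plain Python as follows. This is only an illustration, not the backend's actual implementation: the function names `dot_product`, `euclidean_norm`, and `norm_dot_product` are hypothetical, the input neighborhood and prototype are assumed to be flattened into equal-length lists of floats, and returning zero for a zero-norm vector is an assumption made here to avoid dividing by zero.

```python
import math


def dot_product(x, w):
    """Return X^T W for two equal-length vectors given as lists."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(xi * wi for xi, wi in zip(x, w))


def euclidean_norm(v):
    """Return the Euclidean norm of a vector."""
    return math.sqrt(dot_product(v, v))


def norm_dot_product(x, w):
    """Return X^T W divided by the product of the two Euclidean norms.

    Assumption: a zero-norm input yields 0.0 rather than an error.
    """
    denom = euclidean_norm(x) * euclidean_norm(w)
    if denom == 0.0:
        return 0.0
    return dot_product(x, w) / denom
```

Scaling the input by a positive constant changes the result of `dot_product` but leaves `norm_dot_product` unchanged, which is the practical effect of the unit-norm constraint.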
Instead of a dot product, the operation Rbf compares the input and
weight vectors using a radial basis function (RBF). Here, the output is
given as
\[ y = \exp\left( -\beta \left\Vert X - W \right\Vert^2 \right) \]
where \(\beta\) controls the sensitivity of the RBF. Constraining the
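vector norm of the arguments gives the final operation NormRbf, where the output is
given as
\[ y = \exp\left( -2\beta \left( 1 - \frac{X^T W}{\left\Vert X \right\Vert \left\Vert W \right\Vert} \right) \right) \]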
Here, we have used the bilinearity of the inner product to write the squared distance as
\[ \left\Vert V_a - V_b \right\Vert^2 = 2 - 2 V_a^T V_b \]
for unit vectors \(V_a\) and \(V_b\).
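The relationship between the two RBF operations can be checked numerically. The sketch below assumes the RBF output is the exponential of the negative squared Euclidean distance scaled by \(\beta\), and that NormRbf applies the same function after rescaling both vectors to unit norm. The names `rbf`, `unit`, and `norm_rbf` are hypothetical and are not taken from the backend interface.

```python
import math


def rbf(x, w, beta):
    """Return exp(-beta * ||x - w||^2) for equal-length vectors."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    sq_dist = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-beta * sq_dist)


def unit(v):
    """Return v rescaled to unit Euclidean norm."""
    n = math.sqrt(sum(vi * vi for vi in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [vi / n for vi in v]


def norm_rbf(x, w, beta):
    """Return exp(-2 * beta * (1 - cos)), where cos is the normalized dot product.

    For unit vectors the squared distance equals 2 - 2 * (dot product),
    so this matches rbf() applied to the normalized vectors.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    cos = sum(a * b for a, b in zip(unit(x), unit(w)))
    return math.exp(-2.0 * beta * (1.0 - cos))
```

Up to floating-point rounding, `norm_rbf(x, w, beta)` and `rbf(unit(x), unit(w), beta)` agree for any pair of non-zero vectors, so the normalized form needs only a dot product rather than an explicit distance computation.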
Pooling
Currently, the only supported operation is maximum-value pooling. For a local neighborhood of the input \(X\), this computes an output value as
\[ y = \max_{i} x_i \]
where \(x_i\) ranges over the elements of \(X\).
Max pooling has been argued to provide a good match to cortical response properties [1], and has been shown in practice to lead to better performance [2].
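A minimal sketch of maximum-value pooling over a 2-D array is given below. It assumes square neighborhoods of width `size` visited with a stride of `step`. Both parameters, and the function name `max_pool_2d`, are hypothetical choices for illustration and do not describe the backend's actual signature.

```python
def max_pool_2d(data, size, step):
    """Return the maximum of each size-by-size neighborhood of a 2-D list.

    Neighborhoods start every `step` rows and columns; windows that
    would extend past the edge of the input are skipped.
    """
    rows, cols = len(data), len(data[0])
    out = []
    for r in range(0, rows - size + 1, step):
        out_row = []
        for c in range(0, cols - size + 1, step):
            window = (
                data[i][j]
                for i in range(r, r + size)
                for j in range(c, c + size)
            )
            out_row.append(max(window))
        out.append(out_row)
    return out
```

Because each output depends only on the largest value in its window, shifting a strong response by a pixel or two within the window leaves the output unchanged, which is the sense in which pooling builds invariance.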
Contrast Enhancement
Given a local input neighborhood \(X\), the output is
\[ y = \frac{x_c - \mu}{\max(\sigma, \epsilon)} \]
where \(x_c\) is the center of the input neighborhood, \(\mu\) and \(\sigma\) are the mean and standard deviation of \(X\), and \(\epsilon\) is a bias term. This term is used to avoid the amplification of noise and to ensure a non-zero divisor.
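The contrast enhancement step for a single neighborhood might look like the sketch below. It rests on several assumptions that the text does not settle: the bias is combined with the standard deviation by taking the larger of the two, the standard deviation is the population (not sample) value, and the neighborhood is a square 2-D list of odd width. The function name `contrast_enhance` and the default bias of 1.0 are hypothetical.

```python
import math


def contrast_enhance(neighborhood, bias=1.0):
    """Return (x_c - mu) / max(sigma, bias) for one square neighborhood.

    Assumptions: odd width, population standard deviation, and a
    strictly positive bias acting as a floor on the divisor.
    """
    if bias <= 0.0:
        raise ValueError("bias must be positive to guarantee a non-zero divisor")
    width = len(neighborhood)
    values = [v for row in neighborhood for v in row]
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    x_c = neighborhood[width // 2][width // 2]
    return (x_c - mu) / max(sigma, bias)
```

In a nearly uniform patch the standard deviation is close to zero, so without the floor small fluctuations would be divided by a tiny number and blown up. With the bias in place, such a patch yields an output near zero instead.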
References
| [1] | Serre, T., Oliva, A. & Poggio, T., 2007. A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Sciences, 104(15), pp. 6424-6429. |
| [2] | Boureau, Y.-L. et al., 2010. Learning mid-level features for recognition. In Computer Vision and Pattern Recognition 2010. IEEE, pp. 2559-2566. |