Convolution

5 minute read

Convolution is pretty much just sliding 2 lists along each other and summing their element wise multiplication. The general equation for the $t^{th}$ result is: $$(f*g)(t) = \int^{\infty}_{-\infty}f(\tau)g(t-\tau)d\tau$$

Properties

Convolution has a couple of interesting properties, which boil down to it being commutative:

  • $f * g = g * f$ - I make use of this in the convolve2() function, as it makes things easier for me to understand
  • $\int (f * g) = \int g \cdot \int f$. This isn’t true in general for integrals, e.g. $\int (x \cdot x) \ne \int x \cdot \int x$
  • $f(t) * \delta(t) = f(t)$, where $\delta(t)$ is the Dirac delta function. In the case of lists, this is convolving $f(x)$ by $[1]$, which will simply return the given element of the list.

Code version

To play about with the idea, I wrote the following implementations of convolution. The first version is pretty much a straight forward implementation of the convolution equation, while the second one is more pythonic by playing around with the lists. These don’t implement proper convolution, as they only accept lists, rather generic numerical functions. That being said, a list can totally be viewed as a function that receives an integer (the index, i.e. $x$) and returns a value ($f(x)$). The at_index() function sort of does this, as it returns $0$ for any index that isn’t a valid index of the given list.

def at_index(l, i):
    if isinstance(i, int) and i >= 0 and len(l) > i:
        return l[i]
    return 0


def convolve(a, b):
    # Pretend to integrate from -inf to inf. Though there's really no point in
    # integrating over indexes that won't return anything, don't use indexes that
    # are `-max_index < i or i > max_index`.
    max_items = max(len(a), len(b))
    total_steps = len(a) + len(b) - 1
    sums = []
    for t in range(total_steps):
        sums.append(sum([
            at_index(a, tau) * at_index(b, t - tau)
            for tau in range(-max_items, max_items)
        ]))
    return sums


def convolve2(a, b):
    # Slide the shorter list along the longer one
    if len(a) > len(b):
        a, b = b, a

    # Pad the longer lists with zeros - the idea is to slide like (the smaller list
    # is reversed)
    #
    #  [d, c, b, a] ->
    #           [e, f, g, h, i, j, k, l]
    #
    # rather than
    #
    #  [d, c, b, a] ->
    #  [e, f, g, h, i, j, k, l]
    #
    # The `-1` is so that the sliding starts from the first element, rather than
    # outside the lists. The sums would be the same (as it would mulitply by 0),
    # but there would then be an extra [0] element.
    # The sliding results in the following zips:
    #
    #  [d, c, b, a]
    #  [0, 0, 0, e, f, g, h, i, j, k, l]
    #
    #  [d, c, b, a]
    #  [0, 0, e, f, g, h, i, j, k, l]
    #
    #  [d, c, b, a]
    #  [0, e, f, g, h, i, j, k, l]
    #
    #  ...
    #
    #  [d, c, b, a]
    #  [j, k, l]
    #
    #  [d, c, b, a]
    #  [k, l]
    #
    #  [d, c, b, a]
    #  [l]
    #
    longer = [0] * (len(a) - 1) + b

    return [
        sum([j*k for j,k in zip(reversed(a), longer[i:])])
        for i in range(len(longer))
    ]

Uses

This can come in handy in any algorithm that moves a list of items along another one.

Convolution networks

I first met with the concept of convolution in the context of convolution networks. Understanding them was proving hard, so I decided to first find out what the idea behind convolution is. In this case the list of items is a list of subsequent partitions of the input image (e.g. 5x5 pixels), which get convolved over a filter mask of the same size.

Moving average

A moving average is nothing more than convolving a list of values over the equivalent of [1/n for _ in range(n)].

Image blurring

For each pixel, get the subset of the image that is around it, and convolve them through a same sized filter to get the blurred image. This is pretty much just a moving average over the pixels (abusing the notation): $$p_{i,j} = \begin{bmatrix} p_{i-1, j-1} & p_{i-1, j} & p_{i-1, j+1} \\\ p_{i, j-1} & p_{i, j} & p_{i, j+1} \\\ p_{i+1, j-1} & p_{i+1, j} & p_{i+1, j+1} \end{bmatrix}*\begin{bmatrix} 1 & 1 & 1 \\\ 1 & 1 & 1 \\\ 1 & 1 & 1 \end{bmatrix}$$

Multiplication

Multiplication can be represented as a convolution of numbers transformed into lists of digits, i.e. $$a*b = (…, a_2 \cdot 10^2, a_1 \cdot 10, a_0 \cdot 1) * (…, b_2 \cdot 10^2, b_1 \cdot 10, b_0 \cdot 1)$$ An example of this could be $1234 * 5678$, which give:

1234 = 1000 + 200 + 30 + 4 = [1000 200 30 4]
5678 = 5000 + 600 + 70 + 8 = [5000 600 70 8]

Or

In [185]: a, b = [1000, 200, 30, 4], [5000, 600, 70, 8]

In [186]: sum(convolve(b, a))
Out[186]: 7006652

In [187]: 1234 * 5678
Out[187]: 7006652

Matrix convolution

Convolutions are often used in image filters. In that case the transformed pixels are given by:

$$g(x, y) = \omega * f(x, y) = \sum_{dx=-1}^a \sum_{dy=-b}^b \omega(dx, dy)f(x-dx, y - dy)$$

As an example, assuming a 3x3 image and a 3x3 filter, the middle pixel ([2, 2]) is given by:

$$\begin{pmatrix} \begin{bmatrix} a & b & c \\\ d & e & f \\\ g & h & i \end{bmatrix} * \begin{bmatrix} 1 & 2 & 3 \\\ 4 & 5 & 6 \\\ 7 & 8 & 9 \end{bmatrix} \end{pmatrix} [2, 2] = (i \cdot 1) + (h \cdot 2) + (g \cdot ) + (f \cdot 4) + (e \cdot 5) + (d \cdot 6) + (c \cdot 7) + (b \cdot 8) + (a \cdot 9)$$

This amounts to doing a basic dot product between a section of the image centered on the given pixel, with the transpose of the filter.