by Michael J. Aramini

This memo describes the design of an efficient software implementation of gray-scale digital image magnification by high-resolution bicubic spline interpolation. For simplicity, it will be assumed herein that the magnification factor is the same both horizontally and vertically.

In the following discussion, f is a two dimensional array of numbers representing the input digital image, F is a function of two variables representing the generated continuous image, and g is a two dimensional array of numbers representing the output digital image. H is used as an intermediate one dimensional array of functions of one variable.

The output image is represented by

gk, m = F(k/r, m/r)
where r is the magnification factor.

For high-resolution bicubic spline interpolation, the continuous image consists of a number of square regions, each region bordered on its corners by input pixels. The continuous image in each region is defined by high-resolution cubic spline interpolation in both the horizontal and vertical directions. This can be described mathematically as

F(x, y) = Hj-1(x) C3(y-j) + Hj(x) C2(y-j) + Hj+1(x) C1(y-j) + Hj+2(x) C0(y-j),
for j <= y < j+1
Hj(x) = fi-1, j C3(x-i) + fi, j C2(x-i) + fi+1, j C1(x-i) + fi+2, j C0(x-i),
for i <= x < i+1
C0(t) = -at3 + at2
C1(t) = -(a+2)t3 + (2a+3)t2 - at
C2(t) = (a+2)t3 - (a+3)t2 + 1
C3(t) = at3 - 2at2 + at
and a is a spline parameter such that -1 <= a <= 0.

Since we are magnifying from one discrete coordinate system to another, the magnification factor, r, must be a rational number, so we can represent it in a program with a pair of integer variables, n and d, such that r = n/d.

The program needs to access only one input scan line at a time, and generates only one output scan line at a time. Thus it is not necessary for the input or output images to exist entirely in main memory.

For high-resolution bicubic spline interpolation, the index of the lowest indexed input pixel on which an intermediate pixel depends is |_kd/n_|-1. Likewise, the index of the lowest indexed intermediate pixels on which an output pixel depends is |_md/n_|-1. In addition, the indices of the largest indexed pixels are |_kd/n_|+2 and |_md/n_|+2, respectively. This creates several difficulties, one is that the range of input pixels looked at would exceed the boundaries of the input image. An approximate solution for this is to pretend that the input image in three scan lines higher and three pixels per scan line wider than it really is and to presume that the values of these artificial pixels beyond the boundaries are the same as for the adjacent pixels within the input image. This solution introduces another difficulty, namely that there would be pixels with indices as low as -1 in this slightly larger input image, while C language arrays always start at index 0. One way around this is to shift the input image left one pixel position and down one scan line so that the actual input image data start at indices of 1 in each dimension, leaving room in row 0 and column 0 for artificial input pixels.

A look-up table, L, is initially generated for the mapping from the index of an output pixel and the index of the lowest indexed input pixel on which it depends, for a particular n, d, and range of output pixel indices. L can be generated by

L[k] = |_kd/n_|

In order to simplify the implementation, the definitions of F and H can be modified to

F(x, y) = Hj(x) C3(y-j) + Hj+1(x) C2(y-j) + Hj+2(x) C1(y-j) + Hj+3(x) C0(y-j),
for j <= y < j+1
Hj(x) = fi, j C3(x-i) + fi+1, j C2(x-i) + fi+2, j C1(x-i) + fi+3, j C0(x-i),
for i <= x < i+1

Performing interpolation in each dimension separately, rather than trying to compute the two dimensional interpolation directly, significantly saves computation time. To compute high-resolution bicubic spline interpolation directly requires 16 multiplications and 15 additions each per output pixel. To compute it in one dimension at a time requires only 4(r+1)/r multiplications and 3(r+1)/r additions each per output pixel, which, for all values of r greater than 1/3, results in a fewer number of operations. For example, for a typical magnification factor of 2.4, only 5.67 multiplications and 4.25 additions are required per output pixel, which is 68% fewer operations than for the direct computation!

To facilitate modeling high-resolution cubic spline interpolation in one dimension at a time, I will define h, an intermediate two dimensional array, each row of which is a digitally sampled version of the corresponding element of the one dimensional array of functions, H, by the following

hk, j = Hj( |_kd/n_| )
This array is a digital representation of an image magnified in only one dimension. As with the input and output images, in the actual implementation there is no reason for this entire intermediate image to reside in main memory. In fact, as I will later show, no more than four scan lines need to exist in main memory at any time.

Additional time can be saved by the use of look-up tables for the functions C0 , C1 , C2 , and C3 . These tables, c0 , c1 , c2 , and c3 , can efficiently be generated by

cl[k] = Cl((kd mod n) / n), for l = 0,1,2,3, and 0 <= k < n
cl[k] = cl[k mod n], for l = 0,1,2,3, and k >= n

The pixels of the intermediate image, h, can be computed using

hk, j = fL[k], j c3[k] + fL[k]+1, j c2[k] + fL[k]+2, j c1[k] + fL[k]+3, j c0[k]
Finally, the output image can be computed using
gk, m = hk, L[m] c3[m] + hk, L[m]+1 c2[m] + hk, L[m]+2 c1[m] + hk, L[m]+3 c0[m]
Note that this implies that four intermediate scan lines need to exist in main memory at a time.

To further speed up the computation, fixed point arithmetic, as simulated with integer operations and shifts, was used instead of floating-point.