This memo describes the design of an efficient software implementation of gray-scale digital image magnification by high-resolution bicubic spline interpolation. For simplicity, it will be assumed herein that the magnification factor is the same both horizontally and vertically.

In the following discussion, `f` is a two dimensional array of numbers
representing the input digital image, `F` is a function of two
variables representing the generated continuous image, and `g` is a
two dimensional array of numbers representing the output digital image.
`H` is used as an intermediate one dimensional array of functions of
one variable.

The output image is represented by

`g`_{k, m}=`F`(`k`/`r`,`m`/`r`)

For high-resolution bicubic spline interpolation, the continuous image consists of a number of square regions, each region bordered on its corners by input pixels. The continuous image in each region is defined by high-resolution cubic spline interpolation in both the horizontal and vertical directions. This can be described mathematically as

`F`(`x`,`y`) =`H`_{j-1}(`x`)`C`_{3}(`y`-`j`) +`H`_{j}(`x`)`C`_{2}(`y`-`j`) +`H`_{j+1}(`x`)`C`_{1}(`y`-`j`) +`H`_{j+2}(`x`)`C`_{0}(`y`-`j`),- for
`j`<=`y`<`j`+1

`H`_{j}(`x`) =`f`_{i-1, j}`C`_{3}(`x`-`i`) +`f`_{i, j}`C`_{2}(`x`-`i`) +`f`_{i+1, j}`C`_{1}(`x`-`i`) +`f`_{i+2, j}`C`_{0}(`x`-`i`),- for
`i`<=`x`<`i`+1 `C`_{0}(`t`) = -`a``t`^{3}+`a``t`^{2}`C`_{1}(`t`) = -(`a`+2)`t`^{3}+ (2`a`+3)`t`^{2}-`a``t``C`_{2}(`t`) = (`a`+2)`t`^{3}- (`a`+3)`t`^{2}+ 1`C`_{3}(`t`) =`a``t`^{3}- 2`a``t`^{2}+`a``t`

Since we are magnifying from one discrete coordinate system to another, the
magnification factor, `r`, must be a rational number, so we can
represent it in a program with a pair of integer variables, `n` and
`d`, such that `r` = `n`/`d`.

The program needs to access only one input scan line at a time, and generates only one output scan line at a time. Thus it is not necessary for the input or output images to exist entirely in main memory.

For
high-resolution
bicubic spline interpolation, the index of the lowest indexed input pixel
on which an intermediate pixel depends is
**|_**`k``d`/`n`**_|**-1.
Likewise, the index of the lowest indexed intermediate pixels on which an
output pixel depends is
**|_**`m``d`/`n`**_|**-1.
In addition, the indices of the largest indexed pixels are
**|_**`k``d`/`n`**_|**+2
and
**|_**`m``d`/`n`**_|**+2,
respectively.
This creates several difficulties, one is that the range of input pixels
looked at would exceed the boundaries of the input image.
An approximate solution for this is to pretend that the input image in three
scan lines higher and three pixels per scan line wider than it really is and
to presume that the values of these artificial pixels beyond the boundaries
are the same as for the adjacent pixels within the input image.
This solution introduces another difficulty, namely that there would be
pixels with indices as low as -1 in this slightly larger input image, while C
language arrays always start at index 0.
One way around this is to shift the input image left one pixel position and
down one scan line so that the actual input image data start at indices of 1
in each dimension, leaving room in row 0 and column 0 for artificial input
pixels.

A look-up table, `L`, is initially generated for the mapping from the
index of an output pixel and the index of the lowest indexed input pixel on
which it depends, for a particular `n`, `d`, and
range of output pixel indices.
`L` can be generated by

`L`[`k`] =**|_**`k``d`/`n`**_|**

In order to simplify the implementation, the definitions of `F` and
`H` can be modified to

`F`(`x`,`y`) =`H`_{j}(`x`)`C`_{3}(`y`-`j`) +`H`_{j+1}(`x`)`C`_{2}(`y`-`j`) +`H`_{j+2}(`x`)`C`_{1}(`y`-`j`) +`H`_{j+3}(`x`)`C`_{0}(`y`-`j`),- for
`j`<=`y`<`j`+1

`H`_{j}(`x`) =`f`_{i, j}`C`_{3}(`x`-`i`) +`f`_{i+1, j}`C`_{2}(`x`-`i`) +`f`_{i+2, j}`C`_{1}(`x`-`i`) +`f`_{i+3, j}`C`_{0}(`x`-`i`),- for
`i`<=`x`<`i`+1

Performing interpolation in each dimension separately, rather than trying to
compute the two dimensional interpolation directly, significantly saves
computation time.
To compute
high-resolution
bicubic spline interpolation directly requires 16 multiplications and 15
additions each per output pixel.
To compute it in one dimension at a time requires only
4(`r`+1)/`r` multiplications and
3(`r`+1)/`r` additions each per output pixel, which, for all
values of `r` greater than
^{1}/3, results in a fewer number of
operations.
For example, for a typical magnification factor of 2.4, only 5.67
multiplications and 4.25 additions are required per output pixel, which is
68% fewer operations than for the direct computation!

To facilitate modeling
high-resolution
cubic spline interpolation in one dimension at a time, I will define
`h`, an intermediate two dimensional array, each row of which is a
digitally sampled version of the corresponding element of the one dimensional
array of functions, `H`, by the following

`h`_{k, j}=`H`_{j}(**|_**`k``d`/`n`**_|**)

Additional time can be saved by the use of look-up tables for the functions
`C`_{0} , `C`_{1} ,
`C`_{2} , and `C`_{3} .
These tables, `c`_{0} , `c`_{1} ,
`c`_{2} , and `c`_{3} , can
efficiently be generated by

`c`_{l}[`k`] =`C`_{l}((`k``d`mod`n`) /`n`), for`l`= 0,1,2,3, and 0 <=`k`<`n``c`_{l}[`k`] =`c`_{l}[`k`mod`n`], for`l`= 0,1,2,3, and`k`>=`n`

The pixels of the intermediate image, `h`, can be computed using

`h`_{k, j}=`f`_{L[k], j}`c`_{3}[`k`] +`f`_{L[k]+1, j}`c`_{2}[`k`] +`f`_{L[k]+2, j}`c`_{1}[`k`] +`f`_{L[k]+3, j}`c`_{0}[`k`]

`g`_{k, m}=`h`_{k, L[m]}`c`_{3}[`m`] +`h`_{k, L[m]+1}`c`_{2}[`m`] +`h`_{k, L[m]+2}`c`_{1}[`m`] +`h`_{k, L[m]+3}`c`_{0}[`m`]

To further speed up the computation, fixed point arithmetic, as simulated with integer operations and shifts, was used instead of floating-point.