Previous: , Up: Arrays   [Contents][Index]


2.9 Array storage orders

Blitz++ is very flexible about the way arrays are stored in memory. Starting indices can be 0, 1, or arbitrary numbers; arrays can be stored in row major, column major or an order based on any permutation of the dimensions; each dimension can be stored in either ascending or descending order. An N dimensional array can be stored in N! 2^N possible ways.

Before getting into the messy details, a review of array storage formats is useful. If you’re already familiar with strides and bases, you might want to skip on to the next section.

2.9.1 Fortran and C-style arrays

Suppose we want to store this two-dimensional array in memory:

[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]

Row major vs. column major

To lay the array out in memory, it’s necessary to map the indices (i,j) into a one-dimensional block. Here are two ways the array might appear in memory:

[ 1 2 3 4 5 6 7 8 9 ]
[ 1 4 7 2 5 8 3 6 9 ]

The first order corresponds to a C or C++ style array, and is called row-major ordering: the data is stored first by row, and then by column. The second order corresponds to a Fortran style array, and is called column-major ordering: the data is stored first by column, and then by row.

The simplest way of mapping the indices (i,j) into one-dimensional memory is to take a linear combination.2 Here’s the appropriate linear combination for row major ordering:

memory offset = 3*i + 1*j

And for column major ordering:

memory offset = 1*i + 3*j

The coefficients of the (i,j) indices are called strides. For a row major storage of this array, the row stride is 3 – you have to skip three memory locations to move down a row. The column stride is 1 – you move one memory location to move to the next column. This is also known as unit stride. For column major ordering, the row and column strides are 1 and 3, respectively.

Bases

To throw another complication into this scheme, C-style arrays have indices which start at zero, and Fortran-style arrays have indices which start at one. The first valid index value is called the base. To account for a non-zero base, it’s necessary to include an offset term in addition to the linear combination. Here’s the mapping for a C-style array with i=0..3 and j=0..3:

memory offset =  0 + 3*i + 1*j

No offset is necessary since the indices start at zero for C-style arrays. For a Fortran-style array with i=1..4 and j=1..4, the mapping would be:

memory offset = -4 + 3*i + 1*j

By default, Blitz++ creates arrays in the C-style storage format (base zero, row major ordering). To create a Fortran-style array, you can use this syntax:

Array<int,2> A(3, 3, FortranArray<2>());

The third parameter, FortranArray<2>(), tells the Array constructor to use a storage format appropriate for two-dimensional Fortran arrays (base one, column major ordering).

A similar object, ColumnMajorArray<N>, tells the Array constructor to use column major ordering, with base zero:

Array<int,2> B(3, 3, ColumnMajorArray<2>());

This creates a 3x3 array with indices i=0..2 and j=0..2.

In addition to supporting the 0 and 1 conventions for C and Fortran-style arrays, Blitz++ allows you to choose arbitrary bases, possibly different for each dimension. For example, this declaration creates an array whose indices have ranges i=5..8 and j=2..5:

Array<int,2> A(Range(5,8), Range(2,5));

2.9.2 Creating custom storage orders

All Array constructors take an optional parameter of type GeneralArrayStorage<N_rank>. This parameter encapsulates a complete description of the storage format. If you want a storage format other than C or Fortran-style, you have two choices:

The next sections describe how to modify a GeneralArrayStorage<N_rank> object to suit your needs.

In higher dimensions

In more than two dimensions, the choice of storage order becomes more complicated. Suppose we had a 3x3x3 array. To map the indices (i,j,k) into memory, we might choose one of these mappings:

memory offset = 9*i + 3*j + 1*k
memory offset = 1*i + 3*j + 9*k

The first corresponds to a C-style array, and the second to a Fortran-style array. But there are other choices; we can permute the strides (1,3,9) any which way:

memory offset = 1*i + 9*j + 3*k
memory offset = 3*i + 1*j + 9*k
memory offset = 3*i + 9*j + 1*k
memory offset = 9*i + 1*j + 3*k

For an N dimensional array, there are N! such permutations. Blitz++ allows you to select any permutation of the dimensions as a storage order. First you need to create an object of type GeneralArrayStorage<N_rank>:

GeneralArrayStorage<3> storage;

GeneralArrayStorage<N_rank> contains a vector called ordering which controls the order in which dimensions are stored in memory. The ordering vector will contain a permutation of the numbers 0, 1, ..., N_rank-1. Since some people are used to the first dimension being 1 rather than 0, a set of symbols (firstDim, secondDim, ..., eleventhDim) are provided which make the code more legible.

The ordering vector lists the dimensions in increasing order of stride. You can access this vector using the member function ordering(). A C-style array, the default, would have:

storage.ordering() = thirdDim, secondDim, firstDim;

meaning that the third index (k) is associated with the smallest stride, and the first index (i) is associated with the largest stride. A Fortran-style array would have:

storage.ordering() = firstDim, secondDim, thirdDim;

Reversed dimensions

To add yet another wrinkle, there are some applications where the rows or columns need to be stored in reverse order.3

Blitz++ allows you to store each dimension in either ascending or descending order. By default, arrays are always stored in ascending order. The GeneralArrayStorage<N_rank> object contains a vector called ascendingFlag which indicates whether each dimension is stored ascending (true) or descending (false). To alter the contents of this vector, use the ascendingFlag() method:

// Store the third dimension in descending order
storage.ascendingFlag() = true, true, false;

// Store all the dimensions in descending order
storage.ascendingFlag() = false, false, false;

Setting the base vector

GeneralArrayStorage<N_rank> also has a base vector which contains the base index value for each dimension. By default, the base vector is set to zero. FortranArray<N_rank> sets the base vector to one.

To set your own set of bases, you have two choices:

Here are some examples of the first approach:

// Set all bases equal to 5
storage.base() = 5;    

// Set the bases to [ 1 0 1 ]
storage.base() = 1, 0, 1;

And of the second approach:

// Have bases of 5, but otherwise C-style storage
Array<int,3> A(Range(5,7), Range(5,7), Range(5,7));

// Have bases of [ 1 0 1 ] and use a custom storage
Array<int,3> B(Range(1,4), Range(0,3), Range(1,4), storage);

Working simultaneously with different storage orders

Once you have created an array object, you will probably never have to worry about its storage order. Blitz++ should handle arrays of different storage orders transparently. It’s possible to mix arrays of different storage orders in one expression, and still get the correct result.

Note however, that mixing different storage orders in an expression may incur a performance penalty, since Blitz++ will have to pay more attention to differences in indexing than it normally would.

You may not mix arrays with different domains in the same expression. For example, adding a base zero to a base one array is a no-no. The reason for this restriction is that certain expressions become ambiguous, for example:

Array<int,1> A(Range(0,5)), B(Range(1,6));
A=0;
B=0;
using namespace blitz::tensor;
int result = sum(A+B+i);

Should the index i take its domain from array A or array B? To avoid such ambiguities, users are forbidden from mixing arrays with different domains in an expression.

Debug dumps of storage order information

In debug mode (-DBZ_DEBUG), class Array provides a member function dumpStructureInformation() which displays information about the array storage:

Array<float,4> A(3,7,8,2,FortranArray<4>());
A.dumpStructureInformation(cerr);

The optional argument is an ostream to dump information to. It defaults to cout. Here’s the output:

Dump of Array<f, 4>:
ordering_      = (0,1,2,3)
ascendingFlag_ = (1,1,1,1)
base_          = (1,1,1,1)
length_        = (3,7,8,2)
stride_        = (1,3,21,168)
zeroOffset_    = -193
numElements()  = 336
isStorageContiguous() = 1

A note about storage orders and initialization

When initializing arrays with comma delimited lists, note that the array is filled in storage order: from the first memory location to the last memory location. This won’t cause any problems if you stick with C-style arrays, but it can be confusing for Fortran-style arrays:

Array<int,2> A(3, 3, FortranArray<2>());
A = 1, 2, 3,
    4, 5, 6,
    7, 8, 9;
cout << A << endl;

The output from this code excerpt will be:

A = 3 x 3
         1         4         7 
         2         5         8
         3         6         9

This is because Fortran-style arrays are stored in column major order.

2.9.3 Storage orders example

#include <blitz/array.h>

BZ_USING_NAMESPACE(blitz)

int main()
{
    // 3x3 C-style row major storage, base zero
    Array<int,2> A(3, 3);

    // 3x3 column major storage, base zero
    Array<int,2> B(3, 3, ColumnMajorArray<2>());

    // A custom storage format: 
    // Indices have range 0..3, 0..3
    // Column major ordering
    // Rows are stored ascending, columns stored descending
    GeneralArrayStorage<2> storage;
    storage.ordering() = firstRank, secondRank;
    storage.base() = 0, 0;
    storage.ascendingFlag() = true, false;

    Array<int,2> C(3, 3, storage);

    // Set each array equal to
    // [ 1 2 3 ]
    // [ 4 5 6 ]
    // [ 7 8 9 ]

    A = 1, 2, 3,
        4, 5, 6, 
        7, 8, 9;

    cout << "A = " << A << endl;

    // Comma-delimited lists initialize in memory-storage order only.
    // Hence we list the values in column-major order to initialize B:

    B = 1, 4, 7, 2, 5, 8, 3, 6, 9;

    cout << "B = " << B << endl;

    // Array C is stored in column major, plus the columns are stored
    // in descending order!

    C = 3, 6, 9, 2, 5, 8, 1, 4, 7;

    cout << "C = " << C << endl;

    Array<int,2> D(3,3);
    D = A + B + C;

#ifdef BZ_DEBUG
    A.dumpStructureInformation();
    B.dumpStructureInformation();
    C.dumpStructureInformation();
    D.dumpStructureInformation();
#endif

    cout << "D = " << D << endl;

    return 0;
}

And the output:

A = (0,2) x (0,2)
[ 1 2 3 
  4 5 6 
  7 8 9 ]

B = (0,2) x (0,2)
[ 1 2 3 
  4 5 6 
  7 8 9 ]

C = (0,2) x (0,2)
[ 1 2 3 
  4 5 6 
  7 8 9 ]

D = (0,2) x (0,2)
[ 3 6 9 
  12 15 18 
  21 24 27 ]


Footnotes

(2)

Taking a linear combination is sufficient for dense, asymmetric arrays, such as are provided by the Blitz++ Array class.

(3)

For example, certain bitmap formats store image rows from bottom to top rather than top to bottom.


Previous: , Up: Arrays   [Contents][Index]