2.5 Arrays

An array is an object with components arranged according to a Cartesian coordinate system. In general, these components may be any Lisp data objects.

The number of dimensions of an array is called its rank (this terminology is borrowed from APL); the rank is a non-negative integer. Likewise, each dimension is itself a non-negative integer. The total number of elements in the array is the product of all the dimensions.

An implementation of Common Lisp may impose a limit on the rank of an array, but this limit may not be smaller than 7. Therefore, any Common Lisp program may assume the use of arrays of rank 7 or less. (A program may determine the actual limit on array ranks for a given implementation by examining the constant array-rank-limit.)

It is permissible for a dimension to be zero. In this case, the array has no elements, and any attempt to access an element is in error. However, other properties of the array, such as the dimensions themselves, may be used. If the rank is zero, then there are no dimensions, and the product of the dimensions is then by definition 1. A zero-rank array therefore has a single element.

An array element is specified by a sequence of indices. The length of the sequence must equal the rank of the array. Each index must be a non-negative integer strictly less than the corresponding array dimension. Array indexing is therefore zero-origin, not one-origin as in (the default case of) Fortran.

As an example, suppose that the variable foo names a 3-by-5 array. Then the first index may be 0, 1, or 2, and the second index may be 0, 1, 2, 3, or 4. One may refer to array elements using the function aref; for example, (aref foo 2 1) refers to element (2, 1) of the array. Note that aref takes a variable number of arguments: an array, and as many indices as the array has dimensions. A zero-rank array has no dimensions, and therefore aref would take such an array and no indices, and return the sole element of the array.

In general, arrays can be multidimensional, can share their contents with other array objects, and can have their size altered dynamically (either enlarging or shrinking) after creation. A one-dimensional array may also have a fill pointer.

Multidimensional arrays store their components in row-major order; that is, internally a multidimensional array is stored as a one-dimensional array, with the multidimensional index sets ordered lexicographically, last index varying fastest. This is important in two situations: (1) when arrays with different dimensions share their contents, and (2) when accessing very large arrays in a virtual-memory implementation. (The first situation is a matter of semantics; the second, a matter of efficiency.)

An array that is not displaced to another array, has no fill pointer, and is not to have its size adjusted dynamically after creation is called a simple array. The user may provide declarations that certain arrays will be simple. Some implementations can handle simple arrays in an especially efficient manner; for example, simple arrays may have a more compact representation than non-simple arrays.

If one or more of the :adjustable, :fill-pointer, and :displaced-to arguments is true when make-array is called, then whether the resulting array is simple is unspecified; but if all three arguments are false, then the resulting array is guaranteed to be simple.

2.5.1 Vectors

One-dimensional arrays are called vectors in Common Lisp and constitute the type vector (which is therefore a subtype of array). Vectors and lists are collectively considered to be sequences. They differ in that any component of a one-dimensional array can be accessed in constant time, whereas the average component access time for a list is linear in the length of the list; on the other hand, adding a new element to the front of a list takes constant time, whereas the same operation on an array takes time linear in the length of the array.

A general vector (a one-dimensional array that can have any data object as an element but that has no additional paraphernalia) can be notated by notating the components in order, separated by whitespace and surrounded by #( and ). For example:

#(a b c)                    ;A vector of length 3
#()                         ;An empty vector
#(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47)
                            ;A vector containing the primes below 50

Note that when the function read parses this syntax, it always constructs a simple general vector. _____________________________________________________

Rationale: Many people have suggested that brackets be used to notate vectors, as [a b c] instead of #(a b c). This notation would be shorter, perhaps more readable, and certainly in accord with cultural conventions in other parts of computer science and mathematics. However, to preserve the usefulness of the user-definable macro-character feature of the function read, it is necessary to leave some characters to the user for this purpose. Experience in MacLisp has shown that users, especially implementors of languages for use in artificial intelligence research, often want to define special kinds of brackets. Therefore Common Lisp avoids using brackets and braces for any syntactic purpose.

___________________________________________________________________________________________________________

Implementations may provide certain specialized representations of arrays for efficiency in the case where all the components are of the same specialized (typically numeric) type. All implementations provide specialized arrays for the cases when the components are characters (or rather, a special subset of the characters); the one-dimensional instances of this specialization are called strings. All implementations are also required to provide specialized arrays of bits, that is, arrays of type (array bit); the one-dimensional instances of this specialization are called bit-vectors.

2.5.2 Strings

base-string  (vector base-char)
simple-base-string  (simple-array base-char (*))

An implementation may support other string subtypes as well. All Common Lisp functions that operate on strings treat all strings uniformly; note, however, that it is an error to attempt to insert an extended character into a base string.

The type string is therefore a subtype of the type vector.

A string can be written as the sequence of characters contained in the string, preceded and followed by a " (double quote) character. Any " or \ character in the sequence must additionally have a \ character before it.

For example:

"Foo"                         ;A string with three characters in it
""                            ;An empty string
"\"APL\\360?\" he cried."     ;A string with twenty characters
"|x| = |-x|"                  ;A ten-character string

Notice that any vertical bar | in a string need not be preceded by a \. Similarly, any double quote in the name of a symbol written using vertical-bar notation need not be preceded by a \. The double-quote and vertical-bar notations are similar but distinct: double quotes indicate a character string containing the sequence of characters, whereas vertical bars indicate a symbol whose name is the contained sequence of characters.

The characters contained by the double quotes, taken from left to right, occupy locations within the string with increasing indices. The leftmost character is string element number 0, the next one is element number 1, the next one is element number 2, and so on.

Note that the function prin1 will print any character vector (not just a simple one) using this syntax, but the function read will always construct a simple string when it reads this syntax.

2.5.3 Bit-Vectors

A bit-vector can be written as the sequence of bits contained in the string, preceded by #*; any delimiter character, such as whitespace, will terminate the bit-vector syntax. For example:

#*10110     ;A five-bit bit-vector; bit 0 is a 1
#*          ;An empty bit-vector

The bits notated following the #*, taken from left to right, occupy locations within the bit-vector with increasing indices. The leftmost notated bit is bit-vector element number 0, the next one is element number 1, and so on.

The function prin1 will print any bit-vector (not just a simple one) using this syntax, but the function read will always construct a simple bit-vector when it reads this syntax.