Associative arrays generalize the more familiar notions of matrices, graphs, databases, and spreadsheets.

Formally, an **associativey array** is a map $\mathbf{A} \colon I \times J \to S$ where (i) $I$ and $J$ are arbitrary sets whose elements are called 'row indices' (or 'row labels') and 'column indices' (or 'column labels'), respectively, (ii) $S$ is the underlying set of a semiring $(S, \oplus, \otimes, 0, 1)$, and (iii) $\mathbf{A}(i, j) = 0$ for all but finitely many pairs $(i, j) \in I \times J$.

A **semiring** is a quintuple $(S, \oplus, \otimes, 0, 1)$ where $\oplus$ is a commutative ($a \oplus b = b \oplus a$) and associative ($a \oplus (b \oplus c) = (a \oplus b) \oplus c$) binary operation with identity $0$ ($a \oplus 0 = 0 \oplus a = a$) and $\otimes$ is an associative binary operation with identity $1$ which distributes over $\oplus$ ($a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$ and $(a \oplus b) \otimes c = (a \otimes c) \oplus (b \otimes c)$) and which has $0$ as an annihilator ($a \otimes 0 = 0 \otimes a = 0$). The choice of semiring is usually fixed for a particular context, and includes the *max-min algebra* $(\mathbb{R} \cup \{\pm \infty\}, \max, \min, -\infty, \infty)$, the *max-plus algebra* $(\mathbb{R} \cup \{-\infty\}, \max, +, -\infty, 0)$, and the usual *plus-times algebra* $(\mathbb{R}, +, \cdot, 0, 1)$.

In comparison, an $m \times n$ matrix with real entries can be described as a map $\mathbf{A} \colon \{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\} \to \mathbb{R}$, so associative arrays generalize matrices both by allowing for more general row and column indices and alternative values and algebraic operations associated with those values. Like matrices, associative arrays support their own versions of matrix addition, matrix multiplication, and element-wise multiplication—given associative arrays $\mathbf{A} \colon I_\mathbf{A} \times J_\mathbf{A} \to S$ and $\mathbf{B} \colon I_\mathbf{B} \times J_\mathbf{B} \to S$ then

$$\begin{align*}

(\mathbf{A} \oplus \mathbf{B})(i, j) & = \mathbf{A}(i, j) \oplus \mathbf{B}(i, j), & (\text{associative array addition}) \\

(\mathbf{A} \mathbin{{\oplus}.{\otimes}} \mathbf{B})(i, j) & = \bigoplus_k{\mathbf{A}(i,k) \otimes \mathbf{B}(k, j)} & (\text{associative array multiplication}) \\

(\mathbf{A} \otimes \mathbf{B})(i, j) & = \mathbf{A}(i, j) \otimes \mathbf{B}(i, j), & (\text{associative array element-wise multiplication})

\end{align*}$$

with the understanding that if $\mathbf{C} \colon I \times J \to S$ is an associative array then $\mathbf{C}(i, j) = 0$ whenever $(i, j) \notin I \times J$, leveraging the assumed algebraic properties of $S$. These operations can be shown to satisfy many natural algebraic properties—e.g., associative array multiplication is associative and distributes over associative array addition—allowing them to be used in any linear algreba application that normally applies matrices or any other object which matrices can model. Moreover, the added flexibility of associative arrays can lead to shortened and simpler code by keeping track of row and column labels along the way or by utilizing nonstandard semirings.

A standard reference for the theory and application of associative arrays is Mathematics of Big Data by Jeremy Kepner and myself, particularly covering the D4M (Dynamic Distributed Dimensional Data Model) technology architected by Jeremy Kepner implemented in Python, MATLAB, and Julia. The Python implementation, D4M.py, is written and maintained by me.