Merge pull request #178 from asmeurer/docs-fixes

Small fixes to the indexing guide
Quansight-Labs · May 24, 2024 · a1379ac
2 parents 0ba06d3 + 94100f0
commit a1379ac

Showing 7 changed files with 136 additions and 86 deletions.
diff --git a/docs/indexing-guide/integer-indices.md b/docs/indexing-guide/integer-indices.md
@@ -140,6 +140,7 @@ Therefore, negative indices are primarily a syntactic convenience that
 allows one to specify parts of a list that would otherwise need to be
 specified in terms of the size of the list.
 
+(integer-indices-bounds-checking)=
 If an integer index is greater than or equal to the size of the list, or less
 than negative the size of the list (`i >= len(a)` or `i < -len(a)`), then it
 is out of bounds and will raise an `IndexError`.
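
A minimal sketch of the bounds rule stated in the hunk above. The list `a` and the helper `in_bounds` are hypothetical names used only for illustration. For a list of size 4, the valid integer indices should be `-4` through `3`:

```python
a = [100, 101, 102, 103]  # hypothetical example list of size 4


def in_bounds(i, size):
    """Return True if integer index i is valid for a sequence of this size."""
    return -size <= i < size


# Try one index just outside each end of the valid range and one just inside.
results = {}
for i in (-5, -4, 3, 4):
    try:
        results[i] = a[i]
    except IndexError:
        results[i] = "IndexError"

# The helper and the actual indexing agree on which indices are valid.
for i, r in results.items():
    assert in_bounds(i, len(a)) == (r != "IndexError")
```

Here `-4` and `3` select the first and last elements, while `-5` and `4` raise `IndexError`.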

diff --git a/docs/indexing-guide/multidimensional-indices/boolean-arrays.md b/docs/indexing-guide/multidimensional-indices/boolean-arrays.md
@@ -671,7 +671,7 @@ Or if it had no actual `0`s:[^0-d-mask-footnote]
     with the shape `(0,)` to the shape `(0,)`, and so this is what gets
     assigned, i.e., "nothing" (of shape `(0,)`) gets assigned to "nothing" (of
     matching shape `(0,)`). This is one reason why [broadcasting
-    rules](broadcasting) apply even to dimensions of size 0.
+    rules](broadcasting) apply even to dimensions of size `0`.
 
 ```py
 >>> a = np.asarray([1, 1, 2])

diff --git a/docs/indexing-guide/multidimensional-indices/integer-arrays.md b/docs/indexing-guide/multidimensional-indices/integer-arrays.md
@@ -1,7 +1,7 @@
 # Integer Array Indices
 
 ```{note}
-In this section, and [the next](boolean-arrays), do not confuse the *array
+In this section and [the next](boolean-arrays), do not confuse the *array
 being indexed* with the *array that is the index*. The former can be anything
 and have any dtype. It is only the latter that is restricted to being integer
 or boolean.
@@ -35,7 +35,7 @@ elements of the array in order (or possibly [reversed order](negative-steps)
 for slices), whereas this array has elements completely shuffled from `a`, and
 some are even repeated.
 
-However, we could "cheat" a bit here, and do something like
+However, we could "cheat" a bit here and do something like
 
 ```py
 >>> new_array = np.array([[a[0], a[2], a[0]],
@@ -47,7 +47,7 @@ array([[100, 102, 100],
 
 This is the array we want. We sort of constructed it using only indexing
 operations, but we didn't actually do `a[idx]` for some index `idx`. Instead,
-we just listed the index of each individual element.
+we just listed the indices of each individual element.
 
 An integer array index is essentially this "cheating" method, but as a single
 index. Instead of listing out `a[0]`, `a[2]`, and so on, we just create a
@@ -78,25 +78,19 @@ Note that `a[idx]` above is not the same size as `a` at all. `a` has 4
 elements and is 1-dimensional, whereas `a[idx]` has 6 elements and is
 2-dimensional. `a[idx]` also contains some duplicate elements from `a`, and
 there are some elements which aren't selected at all. Indeed, we could take
-*any* integer array of any shape, and as long as the elements are between 0
-and 3, `a[idx]` would create a new array with the same shape as `idx` with
-corresponding elements selected from `a`.
+*any* integer array `idx` of any shape, and as long as the elements are
+between 0 and 3, `a[idx]` would create a new array with the same shape as
+`idx` with corresponding elements selected from `a`.
 
-A useful way to think about integer array indexing is that it generalizes
-[integer indexing](../integer-indices.md). With integer indexing, we are
-effectively indexing using a 0-dimensional integer array, that is, a single
-integer.[^integer-scalar-footnote] This always selects the corresponding
-element from the given axis and removes the dimension. That is, it replaces
-that dimension in the shape with `()`, the "shape" of the integer index.
-
-Similarly,
+The shape of `a` is `(4,)` and the shape of `a[idx]` is `(2, 3)`, the same as the
+shape of `idx`. In general:
 
 > **an integer array index `a[idx]` selects elements from the specified axis
-> and replaces the dimension in the shape with the shape of the index array
-> `idx`.**
+> and replaces the selected dimension in the shape of `a` with the shape of
+> the index array `idx`.**
 
-
-For example:
+For example, in `a[idx].shape`, `4` is replaced with `(2, 3)`. Consider what
+happens when `a` has more than one dimension:
 
 ```
 >>> a = np.empty((3, 4))
@@ -107,13 +101,67 @@ For example:
 (3, 2, 2)
 ```
 
-In particular, even when the index array `idx` has more than one dimension, an
+Here `a.shape` is `(3, 4)` and `idx.shape` is `(2, 2)`. In `a[idx].shape`, the
+`3` is replaced with `(2, 2)`, giving `(2, 2, 4)`, and in `a[:, idx].shape`,
+the `4` is replaced with `(2, 2)`, giving `(3, 2, 2)`.
+
+A useful way to think about integer array indexing is that it generalizes
+[integer indexing](../integer-indices.md). With integer indexing, we are
+effectively indexing using a 0-dimensional integer array, that is, a single
+integer. This always selects the corresponding element from the given axis and
+removes the dimension. That is, it replaces that dimension in the shape with
+`()` (i.e., nothing), the "shape" of the integer index. The result of indexing
+with an `int` and a corresponding 0-D array is exactly the
+same.[^integer-scalar-footnote]
+
+[^integer-scalar-footnote]:
+    <!-- This is the only way to cross reference a footnote across documents -->
+    (integer-scalar-footnote-ref)=
+
+    There is one difference between `a[0]` and `a[asarray(0)]`. The
+    latter is considered an advanced index, so it does not create a
+    [view](views-vs-copies):
+
+    ```py
+    >>> a = np.empty((2, 3))
+    >>> a[0].base is a
+    True
+    >>> print(a[np.array(0)].base)
+    None
+    ```
+
+    In ndindex,
+    [`IntegerArray.reduce()`](ndindex.IntegerArray.reduce) will always convert
+    a 0-D array index into an [`Integer`](ndindex.integer.Integer).
+
+```py
+>>> idx = np.asarray(0) # 0-D array
+>>> idx.shape
+()
+>>> a = np.arange(12).reshape((3, 4))
+>>> a[idx].shape # replaces (3,) with ()
+(4,)
+>>> a[:, idx].shape # replaces (4,) with ()
+(3,)
+>>> a[idx] # a[asarray(0)] is the exact same as a[0]
+array([0, 1, 2, 3])
+>>> a[0]
+array([0, 1, 2, 3])
+```
+
+Note that even when the index array `idx` has more than one dimension, an
 integer array index still only selects elements from a single axis of `a`. It
 would appear that this limits the ability to arbitrarily shuffle elements of
-`a` using integer indexing. For instance, suppose we want to create the array
-`[105, 100]` from the above 2-D `a`. Based on the above examples, it might not
-seem possible, since the elements `105` and `100` are not in the same row or
-column of `a`.
+`a` using integer indexing. For instance, suppose we have the 2-D array
+
+```
+>>> a = np.array([[100, 101, 102],
+...               [103, 104, 105]])
+```
+
+and we wanted to use indexing to create the array `[105, 100]`. Based on the
+above examples, this might not seem possible, since the elements `105` and
+`100` are not in the same row or column of `a`.
 
 However, this is doable by providing multiple integer array
 indices:
@@ -285,6 +333,28 @@ array([100, 101, 103])
 array([100, 101, 103])
 ```
 
+### Bounds Checking
+
+As with [integer indices](../integer-indices.md), integer array indexing uses
+bounds checking, with the [same rule as integer
+indices](integer-indices-bounds-checking).
+
+> **If any entry in an integer array index is greater than `size - 1` or less
+> than `-size`, where `size` is the size of the dimension being indexed, an
+> `IndexError` is raised.**
+
+```py
+>>> a = np.array([100, 101, 102, 103]) # as above
+>>> a[[2, 3, 4]]
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+IndexError: index 4 is out of bounds for axis 0 with size 4
+>>> a[[-5, -4, -3]]
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+IndexError: index -5 is out of bounds for axis 0 with size 4
+```
+
 (integer-array-broadcasting)=
 ### Broadcasting
 
@@ -360,28 +430,6 @@ The ndindex methods
 [`expand()`](ndindex.Tuple.expand) will broadcast array indices together into
 a canonical form.
 
-[^integer-scalar-footnote]:
-    <!-- This is the only way to cross reference a footnote across documents -->
-    (integer-scalar-footnote-ref)=
-
-    In fact, if the integer array index itself has
-    shape `()`, then the behavior is identical to simply using an `int` with
-    the same value. So it's a true generalization. In ndindex,
-    [`IntegerArray.reduce()`](ndindex.IntegerArray.reduce) will always convert
-    a 0-D array index into an [`Integer`](ndindex.integer.Integer).
-
-    However, there is one difference between `a[0]` and `a[asarray(0)]`. The
-    latter is considered an advanced index, so it does not create a
-    [view](views-vs-copies):
-
-    ```py
-    >>> a = np.empty((2, 3))
-    >>> a[0].base is a
-    True
-    >>> print(a[np.array(0)].base)
-    None
-    ```
-
 (outer-indexing)=
 #### Outer Indexing
 
@@ -474,8 +522,8 @@ axis, i.e., exactly the arrays we want.
 
 This is why NumPy automatically broadcasts integer array indices together.
 
-> **Outer indexing arrays can be constructed by inserting size-1 dimensions
-> into the desired "outer" integer array indices so that the non-size-1
+> **Outer indexing arrays can be constructed by inserting size `1` dimensions
+> into the desired "outer" integer array indices so that the non-size `1`
 > dimension for each is in the indexing dimension.**
 
 For example,
@@ -492,7 +540,7 @@ Here, we use [newaxis](newaxis.md) along with `:` to turn `idx0` and
 `idx1` into shape `(2, 1)` and `(1, 3)` arrays, respectively. These then
 automatically broadcast together to give the desired outer index.
 
-This "insert size-1 dimensions" operation can also be performed automatically
+This "insert size `1` dimensions" operation can also be performed automatically
 with the {external+numpy:func}`numpy.ix_` function.[^ix-footnote]
 
 [^ix-footnote]: `ix_()` is currently limited to only support 1-D input arrays
@@ -599,7 +647,7 @@ For example, consider:
 ...                [103, 104, 105]]])
 ```
 
-This is the same `a` as in the above examples, except it has an extra size-1
+This is the same `a` as in the above examples, except it has an extra size `1`
 dimension:
 
 ```py

diff --git a/docs/indexing-guide/multidimensional-indices/newaxis.md b/docs/indexing-guide/multidimensional-indices/newaxis.md
@@ -15,11 +15,11 @@ None
 True
 ```
 
-`newaxis`, as the name suggests, adds a new axis. This new axis has size `1`.
-The new axis is added at the corresponding location within the array. A size
-`1` axis neither adds nor removes any elements from the array. Using the
-[nested lists analogy](what-is-an-array.md), it essentially adds a new "layer"
-to the list of lists.
+`newaxis`, as the name suggests, adds a new axis to an array. This new axis
+has size `1`. The new axis is added at the corresponding location within the
+array shape. A size `1` axis neither adds nor removes any elements from the
+array. Using the [nested lists analogy](what-is-an-array.md), it essentially
+adds a new "layer" to the list of lists.
 
 
 ```py
@@ -77,7 +77,7 @@ within the index `a[0, :2]`:
 
 In each case, the exact same elements are selected: `0` always targets the
 first axis, and `:2` always targets the second axis. The only difference is
-where the size-1 axis is inserted:
+where the size `1` axis is inserted:
 
 ```py
 >>> a[np.newaxis, 0, :2]
@@ -127,11 +127,11 @@ its position in the tuple index after removing any `newaxis` indices.
 Equivalently, `newaxis` indices can be thought of as adding new axes *after*
 the existing axes are indexed.
 
-A size-1 axis can always be inserted anywhere in an array's shape without
+A size `1` axis can always be inserted anywhere in an array's shape without
 changing the underlying elements.
 
 An array index can include multiple instances of `newaxis` (or `None`). Each
-will add a size-1 axis in the corresponding location.
+will add a size `1` axis in the corresponding location.
 
 **Exercise:** Can you determine the shape of this array, given that `a.shape`
 is `(3, 2, 4)`?
@@ -151,7 +151,7 @@ a[np.newaxis, 0, newaxis, :2, newaxis, ..., newaxis]
 
 In summary,
 
-> **`np.newaxis` (which is just an alias for `None`) inserts a new size-1 axis
+> **`np.newaxis` (which is just an alias for `None`) inserts a new size `1` axis
   in the corresponding location in the tuple index. The remaining,
   non-`newaxis` indices in the tuple index are indexed as if the `newaxis`
   indices were not there.**
@@ -184,22 +184,22 @@ array([[ 0],
 `(3, 1)` column vector.
 
 But the most common usage is due to [broadcasting](broadcasting). The key idea
-of broadcasting is that size-1 dimensions are not directly useful, in the
+of broadcasting is that size `1` dimensions are not directly useful, in the
 sense that they could be removed without actually changing anything about the
 underlying data in the array. So they are used as a signal that that dimension
 can be repeated in operations. `newaxis` is therefore useful for inserting
-these size-1 dimensions in situations where you want to force your data to be
-repeated. For example, suppose we have the two arrays
+these size `1` dimensions in situations where you want to force your data to
+be repeated. For example, suppose we have the two arrays
 
 ```py
 >>> x = np.array([1, 2, 3])
 >>> y = np.array([100, 200])
 ```
 
 and suppose we want to compute an "outer" sum of `x` and `y`, that is, we want
-to compute every combination of `i + j` where `i` is from `x` and `j` is from
+to compute every combination of `a + b` where `a` is from `x` and `b` is from
 `y`. The key realization here is that what we want is simply to
-repeat each entry of `x` 3 times, to correspond to each entry of `y`, and
+repeat each entry of `x` 2 times, to correspond to each entry of `y`, and
 respectively repeat each entry of `y` 3 times, to correspond to each entry of
 `x`. And this is exactly the sort of thing broadcasting does! We only need to
 make the shapes of `x` and `y` match in such a way that the broadcasting will
@@ -217,7 +217,7 @@ from `x`, and the second dimension will correspond to values from `y`, i.e.,
 `a[i, j]` will be `x[i] + y[j]`. Thus the resulting array will have shape `(3,
 2)`. So to make `x` (which is shape `(3,)`) and `y` (which is shape `(2,)`)
 broadcast to this, we need to make them `(3, 1)` and `(1, 2)`, respectively.
-This can easily be done with `np.newaxis`.
+This can easily be done with `np.newaxis`:
 
 ```py
 >>> x[:, np.newaxis].shape
@@ -245,7 +245,7 @@ array([[101, 201],
        [103, 203]])
 ```
 
-Note: broadcasting automatically prepends shape `1` dimensions, so the
+Note: broadcasting automatically prepends size `1` dimensions, so the
 `y[np.newaxis, :]` operation is unnecessary.
 
 ```py
@@ -255,7 +255,7 @@ array([[101, 201],
        [103, 203]])
 ```
 
-As we saw [before](single-axis-tuple), size-1 dimensions may seem redundant,
+As we saw [before](single-axis-tuple), size `1` dimensions may seem redundant,
 but they are not a bad thing. Not only do they allow indexing an array
 uniformly, they are also very important in the way they interact with NumPy's
 broadcasting rules.
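
The `newaxis` behavior described in the hunks above can be checked with a short sketch. `x` and `y` are the arrays from the diff. The array `b` is a hypothetical stand-in, used only for its shape:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([100, 200])

# np.newaxis is an alias for None, so both spellings give the same index.
assert np.newaxis is None

# Insert a size 1 axis so that x has shape (3, 1). Broadcasting prepends a
# size 1 dimension to y automatically, so y[np.newaxis, :] is not needed.
outer = x[:, np.newaxis] + y

# Each entry of the result is x[i] + y[j].
for i in range(len(x)):
    for j in range(len(y)):
        assert outer[i, j] == x[i] + y[j]

# Several newaxis indices may appear in one index; each adds a size 1 axis
# at the corresponding location in the shape.
b = np.empty((3, 2, 4))  # hypothetical array, used only for its shape
multi = b[np.newaxis, :, np.newaxis]
```

In this sketch `outer` has shape `(3, 2)` and `multi` has shape `(1, 3, 1, 2, 4)`: the slice `:` targets the first axis of `b`, and the two remaining axes are kept at the end.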

diff --git a/docs/indexing-guide/multidimensional-indices/tuples.md b/docs/indexing-guide/multidimensional-indices/tuples.md
@@ -42,7 +42,7 @@ array([[[16, 17, 18, 19],
 ```
 
 We also observe that integer indices remove the axis, and slices keep the axis
-(even when the resulting axis has size-1):
+(even when the resulting axis has size `1`):
 
 ```py
 >>> a[0].shape
@@ -344,14 +344,14 @@ because it means that you can index the array
 uniformly.[^size-1-dimension-footnote] And this doesn't apply just to
 indexing. Many NumPy functions reduce the number of dimensions of their output
 (for example, {external+numpy:func}`numpy.sum`), but they have a `keepdims`
-argument to retain the dimension as a size-1 dimension instead.
+argument to retain the dimension as a size `1` dimension instead.
 
 [^size-1-dimension-footnote]: In this example, if we knew that we were always
     going to select exactly one element (say, the second one) from the first
     dimension, we could equivalently use `a[1, np.newaxis]` (see
     [](../integer-indices.md) and [](newaxis.md)). The advantage of this is
     that we would get an error if the first dimension of `a` didn't actually
-    have `2` elements, whereas `a[1:2]` would just silently give a size-0
+    have `2` elements, whereas `a[1:2]` would just silently give a size `0`
     array.
 
 There are two final facts about tuple indices that should be noted before we
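
The footnote and the `keepdims` remark in the hunk above can be illustrated with a small sketch. The arrays `a` and `short` are hypothetical examples, not the arrays used in the guide:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))      # hypothetical array with 2 rows
short = np.arange(3).reshape((1, 3))  # hypothetical array with only 1 row

# Both spellings select the second row and keep the first axis with size 1.
assert a[1:2].shape == (1, 3)
assert a[1, np.newaxis].shape == (1, 3)
assert np.array_equal(a[1:2], a[1, np.newaxis])

# With too few rows, the slice silently gives a size 0 axis...
empty_shape = short[1:2].shape

# ...whereas the integer index raises an IndexError.
try:
    short[1, np.newaxis]
    raised = False
except IndexError:
    raised = True

# keepdims retains the reduced axis as a size 1 axis instead of removing it.
kept_shape = np.sum(a, axis=0, keepdims=True).shape
dropped_shape = np.sum(a, axis=0).shape
```

Which spelling is preferable depends on whether a silent size `0` result or an error is the more useful failure mode for the code at hand.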

diff --git a/docs/indexing-guide/multidimensional-indices/what-is-an-array.md b/docs/indexing-guide/multidimensional-indices/what-is-an-array.md
@@ -26,9 +26,9 @@ subsets of it.
 ```
 
 You can imagine all sorts of different things you'd want to do with your
-scores that might involve selecting individual scores or ranges of scores (for
+scores that might involve selecting individual scores or ranges of scores. For
 example, with the above examples, we could easily compute the average score of
-our last three games, and see how it compares to our first game). So hopefully
+our last three games, and see how it compares to our first game. So hopefully
 you are convinced that at least the types of indices we have learned so far
 are useful.
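
A minimal sketch of the computation this passage describes. The scores list from the guide is not part of this hunk, so the values below are hypothetical:

```python
# Hypothetical scores; the first entry is the first game played.
scores = [70, 65, 71, 80, 73]

first_game = scores[0]    # integer index: a single score
last_three = scores[-3:]  # slice: the last three scores

# Average of the last three games, compared against the first game.
average_last_three = sum(last_three) / len(last_three)
improved = average_last_three > first_game
```

Only an integer index and a slice are needed here, which are the index types the passage refers to.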