Commit b3396a9

Document map_selection core operation (#617)
* Document `map_selection` core operation.
* Rework dependency tree diagram for core and primitive ops to include `map_selection`
* Update five-layer design diagram
* Add rechunk apidoc
* Add note about `general_blockwise`
* Fix double backticks
1 parent c75556f commit b3396a9

8 files changed: +175 -154 lines changed


cubed/core/ops.py

Lines changed: 34 additions & 0 deletions
@@ -775,6 +775,28 @@ def map_selection(
     max_num_input_blocks,
     **kwargs,
 ) -> "Array":
+    """
+    Apply a function to selected subsets of an input array using standard NumPy indexing notation.
+
+    Parameters
+    ----------
+    func : callable
+        Function to apply to every block to produce the output array.
+        Must accept ``block_id`` as a keyword argument (with same meaning as for ``map_blocks``).
+    selection_function : callable
+        A function that maps an output chunk key to one or more selections on the input array.
+    x: Array
+        The input array.
+    shape : tuple
+        Shape of the output array.
+    dtype : np.dtype
+        The ``dtype`` of the output array.
+    chunks : tuple
+        Chunk shape of blocks in the output array.
+    max_num_input_blocks : int
+        The maximum number of input blocks read from the input array.
+    """
+
     def key_function(out_key):
         # compute the selection on x required to get the relevant chunk for out_key
         in_sel = selection_function(out_key)
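
The `selection_function` contract described in the docstring can be illustrated without Cubed. The following is a minimal pure-Python sketch, not Cubed's implementation: `emulate_map_selection`, `every_other`, and `identity` are hypothetical names, the sketch handles only 1-D lists, it omits the `dtype` and `max_num_input_blocks` parameters, and the `("out", i)` key layout is an assumption made for illustration.

```python
def emulate_map_selection(func, selection_function, x, shape, chunks):
    """Hypothetical 1-D stand-in for map_selection (illustration only)."""
    (length,) = shape
    (chunk_len,) = chunks
    num_blocks = -(-length // chunk_len)  # ceiling division
    out = []
    for i in range(num_blocks):
        # Assumed key layout: (array name, block index).
        out_key = ("out", i)
        # Ask which part of the input is needed for this output block.
        selection = selection_function(out_key)
        block = x[selection]
        # func must accept block_id as a keyword argument.
        out.extend(func(block, block_id=(i,)))
    return out


x = list(range(10))
chunk_len = 4  # chunk length of the OUTPUT array
step = 2       # we want the equivalent of x[::2]


def every_other(out_key):
    # Output block i covers output positions [i * chunk_len, (i + 1) * chunk_len),
    # which correspond to input positions scaled by `step`.
    i = out_key[1]
    start = i * chunk_len * step
    stop = min((i + 1) * chunk_len * step, len(x))
    return slice(start, stop, step)


def identity(block, block_id=None):
    return block


result = emulate_map_selection(
    identity, every_other, x, shape=(5,), chunks=(chunk_len,)
)
print(result)
```

Each output block is computed only from the slice that `every_other` returns for it, which is the property that lets a chunked implementation bound how many input blocks it reads per output block.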
@@ -1009,6 +1031,18 @@ def wrap(*a, block_id=None, **kw):
 
 
 def rechunk(x, chunks, target_store=None):
+    """Change the chunking of an array without changing its shape or data.
+
+    Parameters
+    ----------
+    chunks : tuple
+        The desired chunks of the array after rechunking.
+
+    Returns
+    -------
+    cubed.Array
+        An array with the desired chunks.
+    """
     if isinstance(chunks, dict):
         chunks = {validate_axis(c, x.ndim): v for c, v in chunks.items()}
         for i in range(x.ndim):
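
The new `rechunk` docstring says the operation changes chunking "without changing its shape or data". A toy pure-Python sketch of that invariant follows. It is not how Cubed rechunks: `split_into_chunks` and `rechunk_1d` are hypothetical helpers, and this version flattens everything in memory, which a real chunked-array system would avoid.

```python
def split_into_chunks(values, chunk_len):
    """Split a flat list into consecutive blocks of at most chunk_len items."""
    return [values[i:i + chunk_len] for i in range(0, len(values), chunk_len)]


def rechunk_1d(blocks, chunk_len):
    """Regroup blocks into a new chunk length; element order is preserved."""
    flat = [value for block in blocks for value in block]
    return split_into_chunks(flat, chunk_len)


blocks = split_into_chunks(list(range(10)), 4)
rechunked = rechunk_1d(blocks, 5)

# Only the block boundaries differ; the flattened contents are identical.
flat_before = [v for b in blocks for v in b]
flat_after = [v for b in rechunked for v in b]
print(blocks)
print(rechunked)
```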

docs/design.md

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ Cubed is composed of five layers: from the storage layer at the bottom, to the A
44

55
![Five layer diagram](images/design.svg)
66

7-
Blue blocks are implemented in Cubed, green in Rechunker, and red in other projects like Zarr and Beam.
7+
Blue blocks are implemented in Cubed; red blocks in other projects like Zarr and Lithops.
88

99
Let's go through the layers from the bottom:
1010

@@ -14,7 +14,7 @@ Every _array_ in Cubed is backed by a Zarr array. This means that the array type
1414

1515
## Runtime
1616

17-
Cubed uses external runtimes for computation. It follows the Rechunker model (and uses its algorithm) to delegate tasks to stateless executors, which include Python (in-process), Lithops, Modal, and Apache Beam.
17+
Cubed uses external runtimes for computation, delegating tasks to stateless executors, which include Python (in-process), Lithops, Modal, and Apache Beam.
1818

1919

2020
## Primitive operations
@@ -45,8 +45,7 @@ These are built on top of the primitive operations, and provide functions that a
4545
4646
elemwise
4747
map_blocks
48-
map_direct
49-
index
48+
map_selection
5049
reduction
5150
arg_reduction
5251
```

docs/images/design.svg

Lines changed: 1 addition & 1 deletion

docs/images/map_selection.svg

Lines changed: 1 addition & 0 deletions

docs/images/ops.dot

Lines changed: 7 additions & 10 deletions
@@ -11,21 +11,20 @@ digraph {
     // core
     elemwise [style="filled"; fillcolor="#ffd8b1";];
     map_blocks [style="filled"; fillcolor="#ffd8b1";];
-    map_direct [style="filled"; fillcolor="#ffd8b1";];
+    map_selection [style="filled"; fillcolor="#ffd8b1";];
     reduction [style="filled"; fillcolor="#ffd8b1";];
     arg_reduction [style="filled"; fillcolor="#ffd8b1";];
 
     elemwise -> blockwise;
     map_blocks -> blockwise;
-    map_direct -> map_blocks;
+    map_selection -> blockwise;
     reduction -> blockwise;
-    reduction -> rechunk;
     arg_reduction -> reduction;
 
     // array API
 
     // array object
-    __getitem__ -> map_direct
+    __getitem__ -> map_selection
 
     // elementwise
     add -> elemwise
@@ -34,12 +33,11 @@ digraph {
     // linear algebra
     matmul -> blockwise;
     matmul -> reduction;
-    outer -> blockwise;
 
     // manipulation
-    concat -> map_direct;
+    concat -> blockwise;
     reshape -> rechunk;
-    reshape -> map_direct;
+    reshape -> blockwise;
     squeeze -> map_blocks;
 
     // searching
@@ -51,18 +49,17 @@ digraph {
     // utility
     all -> reduction;
 
-
     {
         rank = min;
 
         // fix horizontal placing with invisible edges
         edge[style=invis];
-        add -> negative -> outer -> matmul -> __getitem__ -> concat -> reshape -> squeeze -> argmax -> sum -> all;
+        add -> negative -> squeeze -> __getitem__ -> concat -> matmul -> sum -> all -> argmax -> reshape;
         rankdir = LR;
     }
     {
         rank = same;
-        elemwise; map_blocks; reduction;
+        elemwise; map_blocks; map_selection; reduction;
     }
     {
         rank = max;
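
The reworked dependency tree can be checked by walking its edges. The sketch below copies only the post-commit edges visible in the hunks above, so edges outside the displayed hunks (for example those of `negative`, `argmax`, and `sum`) are missing. `EDGES` and `sink_ops` are hypothetical names introduced for illustration, and a "sink" here simply means a node with no outgoing edge in this partial graph.

```python
# Post-commit edges as shown in the ops.dot hunks (partial graph).
EDGES = {
    "elemwise": ["blockwise"],
    "map_blocks": ["blockwise"],
    "map_selection": ["blockwise"],
    "reduction": ["blockwise"],
    "arg_reduction": ["reduction"],
    "__getitem__": ["map_selection"],
    "add": ["elemwise"],
    "matmul": ["blockwise", "reduction"],
    "concat": ["blockwise"],
    "reshape": ["rechunk", "blockwise"],
    "squeeze": ["map_blocks"],
    "all": ["reduction"],
}


def sink_ops(op):
    """Return the set of sink nodes reachable from op by following edges."""
    targets = EDGES.get(op)
    if not targets:
        # No outgoing edges listed: op is itself a sink.
        return {op}
    found = set()
    for target in targets:
        found |= sink_ops(target)
    return found


for name in ("__getitem__", "reshape", "all"):
    print(name, sorted(sink_ops(name)))
```

With the `reduction -> rechunk` edge removed by this commit, `reshape` is the only listed operation that still reaches `rechunk`.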
