---
engine: knitr
knitr: true
syntax-definition: "../Assets/zig.xml"
---


```{r}
#| include: false
source("../zig_engine.R")
knitr::opts_chunk$set(
    auto_main = FALSE,
    build_type = "lib"
)
```




# Introducing Vectors and SIMD {#sec-vectors-simd}

In this chapter, I'm going to discuss vectors in Zig, which are
related to SIMD operations (and have no relationship with the `std::vector` class
from C++).

## What is SIMD?

SIMD (*Single Instruction/Multiple Data*) is a group of operations that is widely used
in video/audio editing programs, and also in graphics applications. SIMD is not a new technology,
but the massive use of SIMD on ordinary desktop computers is somewhat recent. In the old days, SIMD
was used only on supercomputer models.

Most modern CPU models (from AMD, Intel, etc.), whether in a desktop or in a
notebook, have support for SIMD operations. So, only if you have a very old CPU model installed in your
computer is it possible that SIMD operations are not available to you.

Why have people started using SIMD in their software? The answer is performance.
But what, precisely, does SIMD do to achieve better performance? Well, in essence, SIMD operations are a different
strategy for achieving parallel computing in your program, and therefore, for making calculations faster.

The basic idea behind SIMD is to have a single instruction that operates over multiple data
elements at the same time. When you perform a normal scalar operation, like, for example, four add instructions,
each addition is performed separately, one after another. But with SIMD, these four add instructions
are translated into a single instruction, and, as a consequence, the four additions are performed
in parallel, at the same time.

Currently, the following groups of operators can be used on vector objects. All of
these operators are applied element-wise and in parallel by default.

- Arithmetic (`+`, `-`, `/`, `*`, `@divFloor()`, `@sqrt()`, `@ceil()`, `@log()`, etc.).
- Bitwise operators (`>>`, `<<`, `&`, `|`, `~`, etc.).
- Comparison operators (`<`, `>`, `==`, etc.).
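
As a quick sketch of how the comparison operators behave (assuming the same `stdout` setup used by the other code examples in this book), comparing two vector objects produces a new vector of booleans, with one boolean per element-wise comparison:

```{zig}
#| auto_main: true
#| build_type: "run"
const v1 = @Vector(4, u32){4, 12, 37, 9};
const v2 = @Vector(4, u32){10, 22, 5, 12};
// Each element of `v1` is compared against the
// corresponding element of `v2`, in parallel
const less_than: @Vector(4, bool) = v1 < v2;
try stdout.print("{any}\n", .{less_than});
```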
| 53 | + |
| 54 | + |
## Vectors {#sec-what-vectors}

A SIMD operation is usually performed through a "SIMD intrinsic", which is just a fancy
name for a function that performs a SIMD operation. These SIMD intrinsics (or "SIMD functions")
always operate over a special type of object, called a "vector". So,
in order to use SIMD, you have to create a "vector object".

A vector object is usually a fixed-size block of 128 bits (16 bytes).
As a consequence, most vectors that you find in the wild are essentially arrays that contain 2 values of 8 bytes each,
or 4 values of 4 bytes each, or 8 values of 2 bytes each, etc.
However, different CPU models may have different extensions (or "implementations") of SIMD,
which may offer more types of vector objects that are bigger in size (256 bits or 512 bits),
to accommodate more data in a single vector object.

You can create a new vector object in Zig by using the `@Vector()` built-in function. Inside this function,
you specify the vector length (the number of elements in the vector), and the data type of the elements
of the vector. Only primitive data types are supported in these vector objects.
In the example below, I'm creating two vector objects (`v1` and `v2`) of 4 elements of type `u32` each.

Also notice in the example below that a third vector object (`v3`) is created from the
sum of the previous two vector objects (`v1` plus `v2`). Therefore,
math operations over vector objects take place element-wise by default, because
the same operation (in this case, addition) is transformed into a single instruction
that is replicated in parallel, across all elements of the vectors.


```{zig}
#| auto_main: true
#| build_type: "run"
const v1 = @Vector(4, u32){4, 12, 37, 9};
const v2 = @Vector(4, u32){10, 22, 5, 12};
const v3 = v1 + v2;
try stdout.print("{any}\n", .{v3});
```

This is how SIMD introduces more performance into your program. Instead of using a for loop
to iterate through the elements of `v1` and `v2`, adding them together one element at a time,
we enjoy the benefits of SIMD, which performs all 4 additions in parallel, at the same time.

Therefore, the `@Vector` structure in Zig is essentially the Zig representation of SIMD vector objects.
But the elements of these vector objects will be operated on in parallel if, and only if, your current CPU model
supports SIMD operations. If your CPU model does not support SIMD, then the `@Vector` structure will
likely produce performance similar to that of a "for loop solution".
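
For comparison, here is a minimal sketch of that "for loop solution" (assuming the same `stdout` setup as the earlier examples), where each addition is performed separately, one element at a time:

```{zig}
#| auto_main: true
#| build_type: "run"
const v1 = @Vector(4, u32){4, 12, 37, 9};
const v2 = @Vector(4, u32){10, 22, 5, 12};
// The scalar equivalent of `v1 + v2`: one addition per iteration
var result: [4]u32 = undefined;
for (0..4) |i| {
    result[i] = v1[i] + v2[i];
}
try stdout.print("{any}\n", .{result});
```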


### Transforming arrays into vectors

There are different ways you can transform a normal array into a vector object.
You can either use implicit conversion (which is when you assign the array to
a vector object directly), or use slices to create a vector object from a normal array.

In the example below, we implicitly convert the array `a1` into a vector object (`v1`)
of length 4. All we had to do was explicitly annotate the data type of the vector object,
and then assign the array object to this vector object.

Also notice in the example below that a second vector object (`v2`) is created
by taking a slice of the array object (`a1`), and then dereferencing it (with `.*`) to
store its elements in this vector object.


```{zig}
#| auto_main: true
#| build_type: "run"
const a1 = [4]u32{4, 12, 37, 9};
const v1: @Vector(4, u32) = a1;
const v2: @Vector(2, u32) = a1[1..3].*;
_ = v1; _ = v2;
```


It is worth emphasizing that only arrays and slices whose sizes
are compile-time known can be transformed into vectors. Vectors, in general,
are structures that work only with compile-time known sizes. Therefore, if
you have an array whose size is runtime-known, then you first need to
copy it into an array with a compile-time known size, before transforming it into a vector.
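
As a minimal sketch of that last point (assuming the same `stdout` setup as the earlier examples, and with the slice length `n` standing in for a value that would only be known at runtime), we can copy the runtime-known data into a fixed-size array first:

```{zig}
#| auto_main: true
#| build_type: "run"
const data = [_]u32{4, 12, 37, 9, 22, 5};
var n: usize = 4; // pretend this length is only known at runtime
_ = &n;
const runtime_slice = data[0..n]; // a slice with a runtime-known size
// Copy the slice into an array whose size is compile-time known:
var buffer: [4]u32 = undefined;
@memcpy(buffer[0..], runtime_slice);
// Now the array can be transformed into a vector:
const v1: @Vector(4, u32) = buffer;
try stdout.print("{any}\n", .{v1});
```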



### The `@splat()` function

You can use the `@splat()` built-in function to create a vector object that is filled
with the same value across all of its elements. This function was created to offer a quick
and easy way to directly convert a scalar value (a.k.a. a single value, like a single character, or a single integer, etc.)
into a vector object.

Thus, we can use `@splat()` to convert a single value, like the integer `16`, into a vector object
of length 1. But we can also use this function to convert the same integer `16` into a
vector object of length 10, that is filled with ten `16` values. The example below demonstrates
this idea.

```{zig}
#| auto_main: true
#| build_type: "run"
const v1: @Vector(10, u32) = @splat(16);
try stdout.print("{any}\n", .{v1});
```



### Careful with vectors that are too big

As I described in @sec-what-vectors, each vector object is usually a small block of 128, 256 or 512 bits.
This means that a vector object is usually small in size, and when you go in the opposite direction,
by creating a vector object in Zig that is very big in size (i.e. sizes that are close to $2^{20}$),
you usually end up with crashes and loud errors from the compiler.

For example, if you try to compile the program below, you will likely face segmentation faults, or LLVM errors, during
the build process. Just be careful not to create vector objects that are too big in size.

```{zig}
#| eval: false
const v1: @Vector(1000000, u32) = @splat(16);
_ = v1;
```

```
Segmentation fault (core dumped)
```