Commit d764c72

Merge pull request #48 from pedropark99/vectors
Add chapter to talk about SIMD and Vectors
2 parents e1ed191 + a6a9b25 commit d764c72

54 files changed (+8095 -32 lines changed)

Chapters/01-zig-weird.qmd

+3 -3
@@ -39,7 +39,7 @@ Zig as a modern and better version of C.
 In the author's personal interpretation, Zig is tightly connected with "less is more".
 Instead of trying to become a modern language by adding more and more features,
 many of the core improvements that Zig brings to the
-table are actually about removing annoying and evil behaviours/features from C and C++.
+table are actually about removing annoying behaviours/features from C and C++.
 In other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.
 As a result, analyzing, writing and debugging applications become much easier and simpler in Zig, than it is in C or C++.
 

@@ -1421,7 +1421,7 @@ details about it. Just as a quick recap:
 
 But, for now, this amount of knowledge is enough for us to continue with this book.
 Later, over the next chapters we will still talk more about other parts of
-Zig's syntax that are also equally important as the other parts. Such as:
+Zig's syntax that are also equally important. Such as:
 
 
 - How Object-Oriented programming can be done in Zig through *struct declarations* at @sec-structs-and-oop.
@@ -1430,7 +1430,7 @@ Zig's syntax that are also equally important as the other parts. Such as:
 - Pointers and Optionals at @sec-pointer;
 - Error handling with `try` and `catch` at @sec-error-handling;
 - Unit tests at @sec-unittests;
-- Vectors;
+- Vectors at @sec-vectors-simd;
 - Build System at @sec-build-system;
 
 

Chapters/15-vectors.qmd

+176
@@ -0,0 +1,176 @@
---
engine: knitr
knitr: true
syntax-definition: "../Assets/zig.xml"
---


```{r}
#| include: false
source("../zig_engine.R")
knitr::opts_chunk$set(
    auto_main = FALSE,
    build_type = "lib"
)
```




# Introducing Vectors and SIMD {#sec-vectors-simd}

In this chapter, I'm going to discuss vectors in Zig, which are
related to SIMD operations (i.e. they have no relationship with the `std::vector` class
from C++).

## What is SIMD?

SIMD (*Single Instruction/Multiple Data*) is a group of operations that are widely used
in video/audio editing programs, and also in graphics applications. SIMD is not a new technology,
but the widespread use of SIMD on ordinary desktop computers is somewhat recent. In the old days, SIMD
was only used on supercomputers.

Most modern CPU models (from AMD, Intel, etc.), whether in a desktop or in a
notebook, support SIMD operations. However, if you have a very old CPU model installed in your
computer, then it is possible that you have no support for SIMD operations.

Why have people started using SIMD in their software? The answer is performance.
But what precisely does SIMD do to achieve better performance? Well, in essence, SIMD operations are a different
strategy to get parallel computing in your program, and therefore, make calculations faster.

The basic idea behind SIMD is to have a single instruction that operates over multiple data
at the same time. When you perform normal scalar operations, like for example, four add instructions,
each addition is performed separately, one after another. But with SIMD, these four add instructions
are translated into a single instruction, and, as a consequence, the four additions are performed
in parallel, at the same time.

Currently, the following groups of operators can be used on vector objects. All of
these operators are applied element-wise and in parallel by default.

- Arithmetic (`+`, `-`, `/`, `*`, `@divFloor()`, `@sqrt()`, `@ceil()`, `@log()`, etc.).
- Bitwise operators (`>>`, `<<`, `&`, `|`, `~`, etc.).
- Comparison operators (`<`, `>`, `==`, etc.).
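
To make the list above more concrete, here is a minimal sketch that applies one operator from each group to two small vectors. It is not one of the book's original examples: the alias `Vec4` and the input values are arbitrary choices made only for illustration, and the vector syntax it uses (`@Vector()`) is introduced properly in the next section.

```zig
const std = @import("std");

// Illustrative alias (an assumption of this sketch): a vector of four u32 lanes.
const Vec4 = @Vector(4, u32);

pub fn main() void {
    const a = Vec4{ 12, 10, 7, 255 };
    const b = Vec4{ 10, 12, 3, 15 };

    // Arithmetic: each lane is computed independently of the others.
    const sum = a + b; // lanes: 22, 22, 10, 270
    const quot = a / b; // lanes: 1, 0, 2, 17 (integer division)

    // Bitwise AND, lane by lane.
    const masked = a & b; // lanes: 8, 8, 3, 15

    // A comparison produces a vector of booleans, one per lane.
    const greater = a > b; // lanes: true, false, true, true

    std.debug.print("{any}\n{any}\n{any}\n{any}\n", .{ sum, quot, masked, greater });
}
```

Notice that the comparison does not collapse into a single `bool`. You get one boolean per element, in a vector of the same length.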


## Vectors {#sec-what-vectors}

A SIMD operation is usually performed through a "SIMD intrinsic", which is just a fancy
name for a function that performs a SIMD operation. These SIMD intrinsics (or "SIMD functions")
always operate over a special type of object, called a "vector". So,
in order to use SIMD, you have to create a "vector object".

A vector object is usually a fixed-size block of 128 bits (16 bytes).
As a consequence, most vectors that you find in the wild are essentially arrays that contain 2 values of 8 bytes each,
or 4 values of 4 bytes each, or 8 values of 2 bytes each, etc.
However, different CPU models may have different extensions (or "implementations") of SIMD,
which may offer bigger vector objects (256 bits or 512 bits)
that accommodate more data in a single vector object.

You can create a new vector object in Zig by using the `@Vector()` built-in function. Inside this function,
you specify the vector length (number of elements in the vector), and the data type of the elements
of the vector. Only primitive data types are supported in these vector objects.
In the example below, I'm creating two vector objects (`v1` and `v2`) of 4 elements of type `u32` each.

Also notice in the example below that a third vector object (`v3`) is created from the
sum of the previous two vector objects (`v1` plus `v2`). As this shows,
math operations over vector objects take place element-wise by default, because
the same operation (in this case, addition) is transformed into a single instruction
that is replicated in parallel, across all elements of the vectors.


```{zig}
#| auto_main: true
#| build_type: "run"
const v1 = @Vector(4, u32){4, 12, 37, 9};
const v2 = @Vector(4, u32){10, 22, 5, 12};
const v3 = v1 + v2;
try stdout.print("{any}\n", .{v3});
```

This is how SIMD introduces more performance in your program. Instead of using a for loop
to iterate through the elements of `v1` and `v2`, adding them together one element at a time,
we enjoy the benefits of SIMD, which performs all 4 additions in parallel, at the same time.

Therefore, the `@Vector` structure in Zig is essentially the Zig representation of SIMD vector objects.
But the elements of these vector objects will be operated on in parallel if, and only if, your current CPU model
supports SIMD operations. If your CPU model does not support SIMD, then the `@Vector` structure will
likely deliver performance similar to that of a "for loop solution".
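
Since the ideal vector length depends on the CPU, you may prefer not to hard-code it. The sketch below is only a suggestion, under the assumption that your Zig version exposes `std.simd.suggestVectorLength()` in the standard library (I believe older releases used a different name for it). The fallback of 4 lanes, used when no suggestion is available, is an arbitrary choice of mine.

```zig
const std = @import("std");

// Number of u32 lanes suggested for the compilation target, or 4 as an
// arbitrary fallback (an assumption of this sketch) when nothing is suggested.
const lanes = std.simd.suggestVectorLength(u32) orelse 4;
const Vec = @Vector(lanes, u32);

pub fn main() void {
    const ones: Vec = @splat(1);
    const twos = ones + ones; // every lane holds 2
    std.debug.print("lanes = {d}, first lane = {d}\n", .{ lanes, twos[0] });
}
```

The value of `lanes` is resolved at compile time, so it can be used directly inside `@Vector()`. The number that gets printed depends on the machine you compile for.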


### Transforming arrays into vectors

There are different ways to transform a normal array into a vector object.
You can either use implicit conversion (which is when you assign the array to
a vector object directly), or use slices to create a vector object from a normal array.

In the example below, we implicitly convert the array `a1` into a vector object (`v1`)
of length 4. All we had to do was explicitly annotate the data type of the vector object,
and then assign the array object to this vector object.

Also notice in the example below that a second vector object (`v2`) is created
by taking a slice of the array object (`a1`), and then dereferencing the pointer to this
slice (`.*`), so that the resulting array is stored in this vector object.


```{zig}
#| auto_main: true
#| build_type: "run"
const a1 = [4]u32{4, 12, 37, 9};
const v1: @Vector(4, u32) = a1;
const v2: @Vector(2, u32) = a1[1..3].*;
_ = v1; _ = v2;
```


It is worth emphasizing that only arrays and slices whose sizes
are compile-time known can be transformed into vectors. Vectors in general
are structures that work only with compile-time known sizes. Therefore, if
you have an array whose size is only known at runtime, then you first need to
copy it into an array with a compile-time known size, before transforming it into a vector.
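
One possible way to deal with runtime-sized data is to walk through it in fixed-size chunks. The sketch below is my own illustration of that idea, not an official recipe: the helper `sumSlice()` and the constant `lanes` are hypothetical names. It relies on the `slice[i..][0..lanes]` idiom, which produces a pointer to an array whose length is compile-time known, even though `i` and the slice length are only known at runtime.

```zig
const std = @import("std");

const lanes = 4;
const Vec = @Vector(lanes, u32);

// Hypothetical helper: sums a slice whose length is only known at runtime.
fn sumSlice(values: []const u32) u32 {
    var acc: Vec = @splat(0);
    var i: usize = 0;
    // Full chunks: `values[i..][0..lanes]` has a compile-time known length,
    // so dereferencing it yields a `[lanes]u32` array that coerces to a vector.
    while (i + lanes <= values.len) : (i += lanes) {
        const chunk: Vec = values[i..][0..lanes].*;
        acc += chunk;
    }
    // Add the lanes together, then finish the leftover elements one by one.
    var total: u32 = @reduce(.Add, acc);
    while (i < values.len) : (i += 1) {
        total += values[i];
    }
    return total;
}

pub fn main() void {
    const data = [_]u32{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    std.debug.print("{d}\n", .{sumSlice(&data)});
}
```

With the 10 values above, two full chunks of 4 elements go through the vector, and the remaining 2 elements are handled by the scalar loop at the end. The printed result is 55.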



### The `@splat()` function

You can use the `@splat()` built-in function to create a vector object that is filled
with the same value across all of its elements. This function was created to offer a quick
and easy way to directly convert a scalar value (a.k.a. a single value, like a single character, or a single integer, etc.)
into a vector object.

Thus, we can use `@splat()` to convert a single value, like the integer `16`, into a vector object
of length 1. But we can also use this function to convert the same integer `16` into a
vector object of length 10, that is filled with 10 `16` values. The example below demonstrates
this idea.

```{zig}
#| auto_main: true
#| build_type: "run"
const v1: @Vector(10, u32) = @splat(16);
try stdout.print("{any}\n", .{v1});
```
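
A common use of `@splat()` is to combine a vector with a scalar value, for example to multiply every element by the same factor. The sketch below shows one way to do it. The helper `scale()` and the alias `Vec4` are hypothetical names that I'm introducing just for this illustration.

```zig
const std = @import("std");

const Vec4 = @Vector(4, f32);

// Hypothetical helper: multiplies every lane of `v` by the scalar `factor`.
fn scale(v: Vec4, factor: f32) Vec4 {
    // Broadcast the scalar to all four lanes, so that both operands
    // of the multiplication are vectors of the same type.
    const factors: Vec4 = @splat(factor);
    return v * factors;
}

pub fn main() void {
    const v = Vec4{ 1.0, 2.0, 3.0, 4.0 };
    // The lanes of the result are 0.5, 1.0, 1.5 and 2.0.
    std.debug.print("{any}\n", .{scale(v, 0.5)});
}
```

Because the result type of `@splat()` is inferred from the type annotation, the same call works for any vector length and element type.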



### Careful with vectors that are too big

As I described at @sec-what-vectors, each vector object is usually a small block of 128, 256 or 512 bits.
This means that a vector object is usually small in size, and when you try to go in the opposite direction,
by creating a vector object in Zig that is very big (i.e. with a number of elements close to $2^{20}$),
you usually end up with crashes and loud errors from the compiler.

For example, if you try to compile the program below, you will likely face segmentation faults or LLVM errors during
the build process. Just be careful not to create vector objects that are too big.

```{zig}
#| eval: false
const v1: @Vector(1000000, u32) = @splat(16);
_ = v1;
```

```
Segmentation fault (core dumped)
```

ZigExamples/image_filter/src/test.zig

+19 -13
@@ -36,19 +36,27 @@ fn read_data_to_buffer(ctx: *png.spng_ctx, buffer: []u8) !void {
 
 fn apply_image_filter(buffer: []u8) !void {
     const len = buffer.len;
-    const red_factor: f16 = 0.2126;
-    const green_factor: f16 = 0.7152;
-    const blue_factor: f16 = 0.0722;
-    var index: u64 = 0;
+    var rv: @Vector(1080000, f16) = @splat(0.0);
+    var gv: @Vector(1080000, f16) = @splat(0.0);
+    var bv: @Vector(1080000, f16) = @splat(0.0);
+
+    var index: usize = 0;
+    var vec_index: usize = 0;
     while (index < (len - 4)) : (index += 4) {
-        const rf: f16 = @floatFromInt(buffer[index]);
-        const gf: f16 = @floatFromInt(buffer[index + 1]);
-        const bf: f16 = @floatFromInt(buffer[index + 2]);
-        const y_linear: f16 = ((rf * red_factor) + (gf * green_factor) + (bf * blue_factor));
-        buffer[index] = @intFromFloat(y_linear);
-        buffer[index + 1] = @intFromFloat(y_linear);
-        buffer[index + 2] = @intFromFloat(y_linear);
+        rv[vec_index] = @floatFromInt(buffer[index]);
+        gv[vec_index] = @floatFromInt(buffer[index + 1]);
+        bv[vec_index] = @floatFromInt(buffer[index + 2]);
+        vec_index += 1; // one vector slot per RGBA pixel, so the three channels stay aligned
     }
+
+    const rfactor: @Vector(1080000, f16) = @splat(0.2126);
+    const gfactor: @Vector(1080000, f16) = @splat(0.7152);
+    const bfactor: @Vector(1080000, f16) = @splat(0.0722);
+    rv = rv * rfactor;
+    gv = gv * gfactor;
+    bv = bv * bfactor;
+    const result = rv + gv + bv;
+    try stdout.print("{any}\n", .{result});
 }
 
 fn save_png(image_header: *png.spng_ihdr, buffer: []u8) !void {
@@ -82,12 +90,10 @@ pub fn main() !void {
 
     var gpa = std.heap.GeneralPurposeAllocator(.{}){};
     const allocator = gpa.allocator();
-    var image_header = try get_image_header(ctx);
     const output_size = try calc_output_size(ctx);
     var buffer = try allocator.alloc(u8, output_size);
     @memset(buffer[0..], 0);
 
     try read_data_to_buffer(ctx, buffer[0..]);
     try apply_image_filter(buffer[0..]);
-    try save_png(&image_header, buffer[0..]);
 }

ZigExamples/vectors/build.zig

+91
@@ -0,0 +1,91 @@
const std = @import("std");

// Although this function looks imperative, note that its job is to
// declaratively construct a build graph that will be executed by an external
// runner.
pub fn build(b: *std.Build) void {
    // Standard target options allows the person running `zig build` to choose
    // what target to build for. Here we do not override the defaults, which
    // means any target is allowed, and the default is native. Other options
    // for restricting supported target set are available.
    const target = b.standardTargetOptions(.{});

    // Standard optimization options allow the person running `zig build` to select
    // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall. Here we do not
    // set a preferred release mode, allowing the user to decide how to optimize.
    const optimize = b.standardOptimizeOption(.{});

    const lib = b.addStaticLibrary(.{
        .name = "vectors",
        // In this case the main source file is merely a path, however, in more
        // complicated build scripts, this could be a generated file.
        .root_source_file = b.path("src/root.zig"),
        .target = target,
        .optimize = optimize,
    });

    // This declares intent for the library to be installed into the standard
    // location when the user invokes the "install" step (the default step when
    // running `zig build`).
    b.installArtifact(lib);

    const exe = b.addExecutable(.{
        .name = "vectors",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    // This declares intent for the executable to be installed into the
    // standard location when the user invokes the "install" step (the default
    // step when running `zig build`).
    b.installArtifact(exe);

    // This *creates* a Run step in the build graph, to be executed when another
    // step is evaluated that depends on it. The next line below will establish
    // such a dependency.
    const run_cmd = b.addRunArtifact(exe);

    // By making the run step depend on the install step, it will be run from the
    // installation directory rather than directly from within the cache directory.
    // This is not necessary, however, if the application depends on other installed
    // files, this ensures they will be present and in the expected location.
    run_cmd.step.dependOn(b.getInstallStep());

    // This allows the user to pass arguments to the application in the build
    // command itself, like this: `zig build run -- arg1 arg2 etc`
    if (b.args) |args| {
        run_cmd.addArgs(args);
    }

    // This creates a build step. It will be visible in the `zig build --help` menu,
    // and can be selected like this: `zig build run`
    // This will evaluate the `run` step rather than the default, which is "install".
    const run_step = b.step("run", "Run the app");
    run_step.dependOn(&run_cmd.step);

    // Creates a step for unit testing. This only builds the test executable
    // but does not run it.
    const lib_unit_tests = b.addTest(.{
        .root_source_file = b.path("src/root.zig"),
        .target = target,
        .optimize = optimize,
    });

    const run_lib_unit_tests = b.addRunArtifact(lib_unit_tests);

    const exe_unit_tests = b.addTest(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests);

    // Similar to creating the run step earlier, this exposes a `test` step to
    // the `zig build --help` menu, providing a way for the user to request
    // running the unit tests.
    const test_step = b.step("test", "Run unit tests");
    test_step.dependOn(&run_lib_unit_tests.step);
    test_step.dependOn(&run_exe_unit_tests.step);
}
