@@ -983,10 +983,21 @@ The first project that we are going to build and discuss in this book is a base6
But in order for us to build such a thing, we need to get a better understanding on how strings work in Zig.
So let's discuss this specific aspect of Zig.

- In Zig, a string literal (or a string object if you prefer) is a pointer to a null-terminated array
- of bytes. Each byte in this array is represented by an `u8` value, which is an unsigned 8 bit integer,
+ In Zig, a string literal value is just a pointer to a null-terminated array of bytes (i.e. the same thing as a C string).
+ However, a string object in Zig is a little more than just a pointer. A string object
+ in Zig is an object of type `[]const u8`, and, this object always contains two things: the
+ same null-terminated array of bytes that you would find in a string literal value, plus a length value.
+ Each byte in this "array of bytes" is represented by an `u8` value, which is an unsigned 8 bit integer,
so, it is equivalent to the C data type `unsigned char`.

+ ```{zig}
+ #| eval: false
+ // This is a string literal value:
+ "A literal value";
+ // This is a string object:
+ const object: []const u8 = "A string object";
+ ```
+
Zig always assumes that this sequence of bytes is UTF-8 encoded. This might not be true for every
sequence of bytes you have it, but is not really Zig's job to fix the encoding of your strings
(you can use [`iconv`](https://www.gnu.org/software/libiconv/)[^libiconv] for that).
@@ -1015,7 +1026,7 @@ pub fn main() !void {


If you want to see the actual bytes that represents a string in Zig, you can use
- a `for` loop to iterate trough each byte in the string, and ask Zig to print each byte as an hexadecimal
+ a `for` loop to iterate through each byte in the string, and ask Zig to print each byte as an hexadecimal
value to the terminal. You do that by using a `print()` statement with the `X` formatting specifier,
like you would normally do with the [`printf()` function](https://cplusplus.com/reference/cstdio/printf/)[^printfs] in C.

@@ -1026,9 +1037,9 @@ like you would normally do with the [`printf()` function](https://cplusplus.com/
const std = @import("std");
const stdout = std.io.getStdOut().writer();
pub fn main() !void {
-     const string_literal = "This is an example of string literal in Zig";
+     const string_object = "This is an example of string literal in Zig";
    try stdout.print("Bytes that represents the string object: ", .{});
-     for (string_literal) |byte| {
+     for (string_object) |byte| {
        try stdout.print("{X} ", .{byte});
    }
    try stdout.print("\n", .{});
@@ -1037,8 +1048,8 @@ pub fn main() !void {

### Strings in C

- At first glance, this looks very similar to how C treats strings as well. That is, string values
- in C are also treated internally as an array of bytes, and this array is also null-terminated.
+ At first glance, this looks very similar to how C treats strings as well. In more details, string values
+ in C are treated internally as an array of arbitrary bytes, and this array is also null-terminated.

But one key difference between a Zig string and a C string, is that Zig also stores the length of
the array inside the string object. This small detail makes your code safer, because is much
@@ -1074,16 +1085,16 @@ Number of elements in the array: 25
```

But in Zig, you do not have to do this, because the object already contains a `len`
- field which stores the length information of the array. As an example, the `string_literal` object below is 43 bytes long:
+ field which stores the length information of the array. As an example, the `string_object` object below is 43 bytes long:


```{zig}
#| auto_main: false
const std = @import("std");
const stdout = std.io.getStdOut().writer();
pub fn main() !void {
-     const string_literal = "This is an example of string literal in Zig";
-     try stdout.print("{d}\n", .{string_literal.len});
+     const string_object = "This is an example of string literal in Zig";
+     try stdout.print("{d}\n", .{string_object.len});
}
```

@@ -1095,19 +1106,19 @@ Now, we can inspect better the type of objects that Zig create. To check the typ
is a array of 4 elements. Each element is a signed integer of 32 bits which corresponds to the data type `i32` in Zig.
That is what an object of type `[4]i32` is.

- But if we look closely at the type of the `string_literal` object below, you will find that this object is a
+ But if we look closely at the type of the `string_object` object below, you will find that this object is a
constant pointer (hence the `*const` annotation) to an array of 43 elements (or 43 bytes). Each element is a
single byte (more precisely, an unsigned 8 bit integer - `u8`), that is why we have the `[43:0]u8` portion of the type below.
- In other words, the string stored inside the `string_literal` object is 43 bytes long.
+ In other words, the string stored inside the `string_object` object is 43 bytes long.
That is why you have the type `*const [43:0]u8` below.

- In the case of `string_literal`, it is a constant pointer (`*const`) because the object `string_literal` is declared
- as constant in the source code (in the line `const string_literal = ...`). So, if we changed that for some reason, if
- we declare `string_literal` as a variable object (i.e. `var string_literal = ...`), then, `string_literal` would be
+ In the case of `string_object`, it is a constant pointer (`*const`) because the object `string_object` is declared
+ as constant in the source code (in the line `const string_object = ...`). So, if we changed that for some reason, if
+ we declare `string_object` as a variable object (i.e. `var string_object = ...`), then, `string_object` would be
just a normal pointer to an array of unsigned 8-bit integers (i.e. `*[43:0]u8`).

Now, if we create an pointer to the `simple_array` object, then, we get a constant pointer to an array of 4 elements (`*const [4]i32`),
- which is very similar to the type of the `string_literal` object. This demonstrates that a string object (or a string literal)
+ which is very similar to the type of the `string_object` object. This demonstrates that a string object (or a string literal)
in Zig is already a pointer to an array.

Just remember that a "pointer to an array" is different than an "array". So a string object in Zig is a pointer to an array
@@ -1120,12 +1131,12 @@ of bytes, and not simply an array of bytes.
const std = @import("std");
const stdout = std.io.getStdOut().writer();
pub fn main() !void {
-     const string_literal = "This is an example of string literal in Zig";
+     const string_object = "This is an example of string literal in Zig";
    const simple_array = [_]i32{1, 2, 3, 4};
    try stdout.print("Type of array object: {}", .{@TypeOf(simple_array)});
    try stdout.print(
        "Type of string object: {}",
-         .{@TypeOf(string_literal)}
+         .{@TypeOf(string_object)}
    );
    try stdout.print(
        "Type of a pointer that points to the array object: {}",
@@ -1162,9 +1173,9 @@ the unicode point 570 is actually stored inside the computer’s memory as the b
const std = @import("std");
const stdout = std.io.getStdOut().writer();
pub fn main() !void {
-     const string_literal = "Ⱥ";
+     const string_object = "Ⱥ";
    try stdout.print("Bytes that represents the string object: ", .{});
-     for (string_literal) |char| {
+     for (string_object) |char| {
        try stdout.print("{X} ", .{char});
    }
}