|
| 1 | +""" |
| 2 | +297. Serialize and Deserialize Binary Tree |
| 3 | +
|
| 4 | +https://leetcode.com/problems/serialize-and-deserialize-binary-tree |
| 5 | +
|
| 6 | +NOTES |
| 7 | + * Serialization and deserialization have real-world applicability in |
| 8 | +   software engineering, making this a great problem! |
| 9 | +
|
| 10 | +/!\ NOTE /!\ |
| 11 | +This solution description comprises my original thought process, backtracking |
| 12 | +when a solution was nonviable, and a final correct solution. I've retroactively |
| 13 | +provided additional context in order to call attention to flaws in my original |
| 14 | +approach. |
| 15 | +
|
| 16 | +--- |
| 17 | +
|
| 18 | +Serialization is the process of converting information (e.g., a data structure |
| 19 | +or object) into a sequence of bits, so that it can be stored on disk or in |
| 20 | +memory, or transmitted over a network. Deserialization is the reverse process: |
| 21 | +reconstructing the original information from its serialized form. |
| 22 | +
|
| 23 | +Serialization and deserialization are facilitated by a codec (short for |
| 24 | +coder/decoder), which enables both data compression and data conversion. |
| 25 | +
|
| 26 | +Let's recall from 'Construct Binary Tree from Preorder and Inorder Traversal' |
| 27 | +that: |
| 28 | +
|
| 29 | + >No single traversal order (pre-order, post-order, or in-order) uniquely |
| 30 | + identifies the structure of a tree... |
| 31 | +
|
| 32 | +/!\ NOTE /!\ |
| 33 | +This statement is only true when we do not account for gaps in the tree. Adding |
| 34 | +gaps as well as node values gives us enough information to uniquely determine |
| 35 | +the structure of the tree. |
| 36 | +
|
| 37 | +Following the quote, my original plan was for the serialized tree to store both |
| 38 | +the pre-order and in-order traversals in order to uniquely reconstruct the tree. |
| 39 | +
|
| 40 | +Each node has a value between -1000 and 1000, which means we will need to |
| 41 | +represent 2001 numbers. To find the minimum number of bits required to |
| 42 | +represent all possible node values in the tree, we need to find n where 2^n ≥ |
| 43 | +2001, since each bit pattern must uniquely identify a number. Taking the log |
| 44 | +(base 2) of both sides results in the following: |
| 45 | +
|
| 46 | +    n ≥ log2(2001) ≈ 10.97 |
| 47 | +
|
| 48 | +Therefore, rounding up, we need 11 bits (2^11 = 2048) to represent 2001 values. |
| 49 | +Further leveraging the fact that node values are constrained to [-1000, 1000], |
| 50 | +we can create a simple codec by concatenating both the pre-order and in-order |
| 51 | +traversals. A special sequence, 01111111111 (1023 in base-10), is used to |
| 52 | +denote the termination of the pre-order sequence and start of the in-order |
| 53 | +sequence. NOTE: 1023 lies outside the node value range, so it cannot collide. |
| 54 | +
|
| 55 | +/!\ NOTE /!\ |
| 56 | +This approach only works for trees with node values that are unique. Looking |
| 57 | +back at 'Construct Binary Tree from Preorder and Inorder Traversal', this was |
| 58 | +one of the problem constraints: |
| 59 | +
|
| 60 | + >preorder and inorder consist of *unique* values. |
| 61 | +
|
| 62 | +So, the correct solution involves using either a depth-first or breadth-first |
| 63 | +traversal, while accounting for gaps. Null nodes are denoted by a null marker. |
| 64 | +Here, we can reuse the special sequence above to denote gaps in the tree. An |
| 65 | +added element of complexity to this approach is that the deserialization logic |
| 66 | +must account for the fact that the serialization does not include all gaps. |
| 67 | +
|
| 68 | +/!\ NOTE /!\ |
| 69 | +Creating a serialization that accounts for all gaps in the tree, essentially |
| 70 | +representing a complete binary tree, exceeds the time limit. |
| 71 | +
|
| 72 | +In the end, I learned a new algorithm for serializing and deserializing binary |
| 73 | +trees. This is the same algorithm used by LeetCode. |
| 74 | +
|
| 75 | +Example: |
| 76 | +
|
| 77 | + [1, 2, 3, None, None, 4, 5, 6, 7] |
| 78 | +
|
| 79 | +        1 |
| 80 | +        ● |
| 81 | +       / \ |
| 82 | +      2   3 |
| 83 | +      ●   ● |
| 84 | +         / \ |
| 85 | +        4   5 |
| 86 | +        ●   ● |
| 87 | +       / \ |
| 88 | +      6   7 |
| 89 | +      ●   ● |
| 90 | +""" |
| 91 | + |
| 92 | +from collections import deque |
| 93 | + |
| 94 | +from src.classes import TreeNode |
| 95 | + |
| 96 | + |
| 97 | +class Codec: |
| 98 | +    NULL_MARKER = 1023  # 01111111111 in two's complement |
| 99 | + |
| 100 | +    def serialize(self, root: TreeNode | None) -> str: |
| 101 | +        """ |
| 102 | +        Encodes a tree into a string. |
| 103 | +        """ |
| 104 | +        s = "" |
| 105 | + |
| 106 | +        if not root: |
| 107 | +            return s |
| 108 | + |
| 109 | +        q: deque[TreeNode | None] = deque([root]) |
| 110 | + |
| 111 | +        while q: |
| 112 | +            curr: TreeNode | None = q.popleft() |
| 113 | +            if curr: |
| 114 | +                # Mask to 11 bits so negatives are true two's complement. |
| 115 | +                s += format(curr.val & 0x7FF, "011b") |
| 116 | +                q.append(curr.left) |
| 117 | +                q.append(curr.right) |
| 118 | +            else: |
| 119 | +                s += format(self.NULL_MARKER, "011b") |
| 120 | + |
| 121 | +        return s |
| 122 | + |
| 123 | +    def deserialize(self, data: str) -> TreeNode | None: |
| 124 | +        """ |
| 125 | +        Decodes a string into a tree. |
| 126 | +        """ |
| 127 | +        values: list[int | None] = [] |
| 128 | + |
| 129 | +        # Iterate over the data in 11-bit chunks. |
| 130 | +        for i in range(0, len(data), 11): |
| 131 | +            bits = data[i : i + 11] |
| 132 | +            # Parse the chunk as an unsigned integer. |
| 133 | +            value = int(bits, 2) |
| 134 | +            # If the leftmost bit is 1, the value was negative, so we have to |
| 135 | +            # convert from unsigned to two's complement by subtracting 2^11 |
| 136 | +            # (2048). |
| 137 | +            if bits[0] == "1": |
| 138 | +                value -= 1 << 11 |
| 139 | +            if value == self.NULL_MARKER: |
| 140 | +                values.append(None) |
| 141 | +            else: |
| 142 | +                values.append(value) |
| 143 | + |
| 144 | +        if not values: |
| 145 | +            return None |
| 146 | + |
| 147 | +        root = TreeNode(val=values[0]) |
| 148 | +        q: deque[TreeNode] = deque([root]) |
| 149 | +        i = 1 |
| 150 | + |
| 151 | +        # The crucial property of this algorithm is that i increments by 2 |
| 152 | +        # every iteration, while nodes are only enqueued if values[i] is not |
| 153 | +        # None. This keeps our index into values aligned with the possible |
| 154 | +        # left and right child nodes of the current node under consideration. |
| 155 | +        # This allows the tree structure to be determined without storing |
| 156 | +        # additional null nodes. |
| 157 | +        while q and i < len(values): |
| 158 | +            curr: TreeNode = q.popleft() |
| 159 | +            if values[i] is not None: |
| 160 | +                left = TreeNode(val=values[i]) |
| 161 | +                curr.left = left |
| 162 | +                q.append(left) |
| 163 | +            i += 1 |
| 164 | +            if values[i] is not None: |
| 165 | +                right = TreeNode(val=values[i]) |
| 166 | +                curr.right = right |
| 167 | +                q.append(right) |
| 168 | +            i += 1 |
| 169 | + |
| 170 | +        return root |
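
The 11-bit two's complement scheme that the comments in `deserialize` describe can be tried in isolation. The following is a minimal sketch rather than part of the diff: `BITS`, `MASK`, `encode_value`, and `decode_value` are hypothetical names introduced for illustration, and it assumes node values stay within [-1000, 1000] as the problem states.

```python
BITS = 11  # assumed width: 2**11 = 2048 >= 2001 distinct node values
MASK = (1 << BITS) - 1  # eleven 1-bits


def encode_value(value: int) -> str:
    """Return the 11-bit two's complement bit string for value."""
    # Masking maps negative numbers into the upper half of the unsigned
    # range, so the leftmost bit acts as the sign bit.
    return format(value & MASK, f"0{BITS}b")


def decode_value(bits: str) -> int:
    """Invert encode_value: parse as unsigned, then restore the sign."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << BITS
    return value


print(encode_value(5))                    # 00000000101
print(encode_value(-5))                   # 11111111011
print(decode_value(encode_value(-1000)))  # -1000
```

Because every valid node value is at most 1000, the pattern 01111111111 (1023) never appears as an encoded value, which is what lets it double as the null marker.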
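
The docstring's note that the pre-order plus in-order approach only works for unique values can be checked with a two-node counterexample. This is an illustrative sketch only: `Node` is a hypothetical stand-in for the repository's `TreeNode`, and `preorder` and `inorder` are helper names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:  # hypothetical stand-in for the repo's TreeNode
    val: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node: Optional[Node]) -> list[int]:
    """Visit root, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.val] + preorder(node.left) + preorder(node.right)


def inorder(node: Optional[Node]) -> list[int]:
    """Visit left subtree, then root, then right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.val] + inorder(node.right)


# Two structurally different trees that share a duplicated value.
tree_a = Node(1, left=Node(1))
tree_b = Node(1, right=Node(1))

print(preorder(tree_a) == preorder(tree_b))  # True
print(inorder(tree_a) == inorder(tree_b))    # True
print(tree_a == tree_b)                      # False
```

Both traversals agree for the two trees even though their shapes differ, so the pair of traversals cannot tell them apart. Recording null markers during a single traversal avoids this ambiguity.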