This repository has been archived by the owner on Dec 22, 2021. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 14
/
Copy pathChangeLog
217 lines (165 loc) · 7.85 KB
/
ChangeLog
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
2009-08-20 Sylvain Collange <[email protected]>
* Fixed number of bits of operand 4 constant segment
* Added Abs/Sat and rounding mode modifiers to cvt0 (I2I)
* Fixed number of bits of operand 3
* Added mad24.s16 and mad24 with carry
* Reduced madness in mad.f32/mac.f32 (hopefully)
* Added .atom flag for atomic shared memory moves (SM 1.2)
* Added double-precision instruction decoding (SM 1.3)
2006-06-20 Wladimir J. van der Laan <[email protected]>
* Fix command line options, especially -k
* Default to plain not ansi output
2008-06-11 Wladimir J. van der Laan <[email protected]>
* Rename decuda to decuda.py and cudasm to cudasm.py,
this is more convenient for windows users.
2008-01-03 Wladimir J. van der Laan <[email protected]>
* Add 0x12 instruction (load from shared) to assembler and
disassembler
2007-12-22 Wladimir J. van der Laan <[email protected]>
* Fix deltaf instruction for disassembler
2007-12-07 Wladimir J. van der Laan <[email protected]>
* re-assembly fixes
* support inversion (~) on logic ops
* assembler operand lookup restructured
* factor out important constants from assembler rule table, this
makes it easier to just cache the data structure later on, for
improved load time.
* Release 0.4.1
2007-12-06 Wladimir J. van der Laan <[email protected]>
* add all half instructions into instruction table
* fix flip bits for half add instruction
* simplify operand specification in instruction table
* finish rules for all halfops except tex
* finish rules for all immops
* add .half on disassembly of half instructions (for easier
re-assembly)
* remove debugging stuff from CubinFile, change Formatter in
preparation to re-parse segments
* Immediate values ending on 0xf were being rejected, solved
* disassemble bar.sync in a sane way
2007-12-05 Wladimir J. van der Laan <[email protected]>
* found out and/xor/or/not instruction format
* fix not instruction
* add support for .half 32 bit instructions (for now, only add and
mov)
* fix constant segment in half instructions
* post-increment offset registers ($ofs+=4)
* remove [$ofsX+$rX] addressing mode as it doesn't exist
2007-12-03 Wladimir J. van der Laan <[email protected]>
* cvt.neg instruction added
2007-12-02 Wladimir J. van der Laan <[email protected]>
* "Immediate operand 3" is only allowed on a few instructions, mad
and others that use subsubop bits as an input are certainly not
one of them.
* Version 0.4.0
2007-12-01 Wladimir J. van der Laan <[email protected]>
* Improve error reporting of assembler somewhat
2007-11-30 Wladimir J. van der Laan <[email protected]>
* mul24 instruction fixes (quite nasty)
* basic flow control
* labels are resolved
* predication (in and out)
* assembler is now functional
* format predicates and labels in re-parsable format
* Logic.py contains all condition code / flag combinations
2007-11-29 Wladimir J. van der Laan <[email protected]>
* cvt instruction fixes (half/full reg confusion)
* more assembler work
* Instruction.__repr__() actually changed operands. Oops.
* added cudasm
* cubin file loading and writing factored out to CubinFile class
2007-11-27 Wladimir J. van der Laan <[email protected]>
* Finally found out the meaning of subsubop in ld.const: type, like
in shared memory load
* Release 0.3.1
2007-11-25 Wladimir J. van der Laan <[email protected]>
* Assembler rules for operand selection cleaned up
* st.shared disassembler cleaned up, bit meanings now known
* there is no st or ld, only mov, without changes of perspective
2007-11-23 Wladimir J. van der Laan <[email protected]>
* Beginnings of assembler
* Get rid of multiplier in s[....] and c[....]
* d.u32 instruction to inject raw data
* Fix bug in disassemble offset registers, where $ofs4 was
erronously displayed as $ofs1
* More addressing fixes
2007-11-18 Wladimir J. van der Laan <[email protected]>
* Update readme on global memory segments
* promote OP_SIGN_NONE operands if one of the operands is signed
* Interpretation of alt bit for add.u32 instruction was wrong, it
flips the result, not one of the operands
* Register naming to decimal
2007-11-13 Wladimir J. van der Laan <[email protected]>
* Instructions can source from constant segment 2 or 3 as well
* Fill in condition codes
* Output registers added (as used in vertex shaders)
* Float16 conversion
* Release 0.3.0
* All condition codes complete (afaik)
2007-11-12 Wladimir J. van der Laan <[email protected]>
* decudaraw util - decode raw cubin code from commandline, useful
for disassembling shaders in mmio logs.
2007-11-09 Wladimir J. van der Laan <[email protected]>
* subsubop and subop are now a property instead of an attribute.
This helps find cases in which they are missed.
* Added an extensive internal debugging system to account for
each bit set in instruction words. This will, ultimately, help
a lot when writing the assembler counterpart.
* Heed 'ignore result' bit, no more register $r7f.
* addc (add with carry) instruction.
* add rounding op for add.f32.
* sop.4 and sop.5 defined
* some instructions are not predicated (nop, bar, some flow control)
* rounding mode for mad
2007-11-08 Wladimir J. van der Laan <[email protected]>
* Last argument of mad.f32 comes from register
* Release 0.2.9
* remove self.flags, this served no purpose beyond to confuse
bitfields
* solve an issue with constant operands (hopefully)
* self.bits(i,mask) instead of manually and/shift to clean up code
* mark visited bits, warn if we neglected any
2007-11-03 Wladimir J. van der Laan <[email protected]>
* Sanitize output type of tex instructions to b32. We can't be sure
what this is anyway, as it depends on the type of sampler.
* Print embedded const segments
* Extend README on G80 hardware
* Add .end modifier to last instruction, to make sure the GPU stops
executing this has a special marker.
* Texture instruction types 3 and 5 are unknown.
2007-11-02 Wladimir J. van der Laan <[email protected]>
* st.local and friends can use the index register+immediate as well
* Try to get some meaning out of unknown predicates like @$10
* Version 0.2.6
* ld.pred and st.pred instructions
* There are four offset registers, and there was an hidden extra bit
signifying bit 2 of the offset
* Predicate registers are now separate register class, as they should be
* Disassemble all kernels in the cubin file by default
* Version 0.2.7
* Sign bits in the add.f32 instruction are now used
* ldgpu loads from internal register space, nothing more
* tex instruction solved
* Version 0.2.8
2007-10-31 Wladimir J. van der Laan <[email protected]>
* Grid parameters aren't in a separate space after all, just in a
special type (u16)
* Version 0.2.5
2007-10-30 Wladimir J. van der Laan <[email protected]>
* Change G80 hardware section in Readme to reflect new findings
* Version 0.2.4
2007-10-29 Wladimir J. van der Laan <[email protected]>
* Make shared memory / grid parameter handling saner
* Version 0.2.3 release
* Multiple constant segments (up to 16)
* cvn.sat instruction is decoded without glitch now
* half-registers ($rx.low, $rx.high) now supported as separate operand type
* Fix handling of offset registers
2007-10-27 Wladimir J. van der Laan <[email protected]>
* Version 0.2.2 was released.
* Add better error reporting, so buggy texture instructions don't
crash disassembly.
* Version 0.2.1 was released.
* Adds -p and -n options, and solves some problem is disassembling
ld.shared instructions.
* This file was created.