Skip to content
This repository was archived by the owner on Sep 9, 2024. It is now read-only.

Possible bug (skipping last bytes) in uncompress function due to sIdx and eIdx #55

Open
CanisLupus opened this issue Oct 1, 2017 · 4 comments

Comments

@CanisLupus
Copy link

Doesn't the following code cause the end bytes of the input to be skipped if sIdx is not 0?

exports.uncompress = function (input, output, sIdx, eIdx) {
        sIdx = sIdx || 0
        eIdx = eIdx || (input.length - sIdx)
        // Process each sequence in the incoming data
        for (var i = sIdx, n = eIdx, j = 0; i < n;) {
        // ...

I had trouble with this until I noticed that eIdx is (input.length - sIdx) instead of input.length, causing the last "sIdx" bytes to not be processed.

Is this correct? sIdx and eIdx are not documented in the params above the function, but still. :)

@pierrec
Copy link
Owner

pierrec commented Oct 1, 2017

sIdx is the starting index (default=0)
eIdx is the end index (default=length of buffer - start index)

If sIdx is supplied and eIdx is not, then all bytes from sIdx are processed.
If sIdx is supplied and eIDx is as well, then all the specified bytes are processed. Of course, in that case the indexes must be valid for the block to be decoded.

I think the signature was originally defined as per the C version, but the indexes are never used. Should probably be removed. Want to take a stab?

@CanisLupus
Copy link
Author

The indexes were useful for me, since I was trying to decode the mozlz4 format used on a few Firefox files and it uses a custom 8-byte header plus the data size field, all of which I had to skip, so I used sIdx only. Of course, it could be done by slicing the input externally, but this felt better. :)

But "length minus start index" is the part I'm not understanding. If we pass sIdx but not eIdx, the uncompress function will process input from sIdx to length MINUS sIdx, meaning that if we have an input of 100 bytes and try to skip the first 10, it will process not the remaining 90 bytes, but only 80, since the last 10 are not included. Additionally, if we try to skip the first 50 bytes, sIdx will get 50 and eIdx will get 100 - 50, which is also 50, so no processing is done (starts and ends at index 50).

I would also argue that the variable "n" is unnecessary, since it is a synonym of eIdx.

Am I missing something? I was decoding a semi-custom lz4 format, after all, so I might not be understanding how the real format is supposed to work. Maybe "n" or "eIdx" was supposed to be something else? Or maybe I'm just wrong. :)

@pierrec
Copy link
Owner

pierrec commented Oct 1, 2017

Ah I see, you are right indeed.
I think that the intent was to specify none or both start/end indexes, not just one of them, in which case it is broken.
Should just remove the start/end index and slice the input accordingly, it is much clearer IMHO.

@CanisLupus
Copy link
Author

Glad I wasn't seeing things. :) If you want to remove them it's your call, of course. I'm okay with slicing the input myself in my use-case, though slice returns a shallow copy, while using the indexes doesn't create new arrays. It is also really easy to fix by removing the "- sIdx" (and maybe replace "n" with "eIdx"), but I understand either way.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants