Fix encapsulated pixeldata handling #103

weanti · 2025-06-26T13:20:33Z

Fix constructing frame offsets for encapsulated pixel data.
Fix reading frames from encapsulated pixel data.

Remove restrictions for bits stored value, because it is not valid in all cases e.g. CT.

jcupitt · 2025-06-30T10:56:48Z

Hi @weanti,

This looks great! Thank you for doing this work.

Could you explain what kinds of DICOM you are working with? Do you have some sample DICOMs I could use for testing?

You probably saw the other PR (#102): we should probably merge that first, since it covers some of the same ground.

weanti · 2025-06-30T11:58:53Z

Hi,

thank you for your response.
In the meantime I did further testing and found problems: memory overwrites and offset handling errors.
Sure, I'll try to attach or send some test files. By the way, I used JPEG 2000 encoded DICOM images.
First I need to polish this PR.

jcupitt · 2025-06-30T15:01:01Z

OK, let's tag this as a draft for now.

weanti · 2025-07-01T19:27:29Z

I think I solved the problems.
I have attached a bunch of DICOM files. See the readme.txt for important properties.
UPDATE: uploading files doesn't seem to work. Here is a link to a shared zip file: https://drive.google.com/file/d/1UPgq89YLx94XyiuGOupR2t0LtgQ4VnkS/view?usp=sharing

weanti · 2025-08-27T07:04:19Z

@jcupitt What's your opinion about this PR? The other PR (the one before this) is updated. If that is merged, then I'll rebase and update my PR.

jcupitt · 2025-08-27T09:22:18Z

Hi @weanti, sorry, I was on holiday and then got distracted by other projects.

I'll look this over again now.

jcupitt · 2025-08-27T11:48:28Z

... I saw one final tiny issue in #102, when that's resolved I'll look at this more closely.

weanti · 2025-08-27T12:04:34Z

Thank you for the feedback.

weanti · 2025-09-30T09:02:51Z

Resolved conflicts.

jcupitt · 2025-09-30T19:43:55Z

I'll read this tomorrow. Thanks for the update!

jcupitt · 2025-10-01T07:47:57Z

I downloaded the zip file, these are useful samples!

What's the licence? Could we add them to the test suite?

weanti · 2025-10-01T09:10:42Z

I downloaded the compsamples_j2k archive (a set of JPEG 2000 compressed images) from ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG04
This includes the SC1_J2KR and VL1_J2KR images.
Another set of images was downloaded from https://www.dcmtk.org/download/images/ (nema97cd.zip).
This contains im309 in the gems/dlx folder.
I think these are public domain.
I'll try to find the source of segmentation_j2k.

jcupitt · 2025-10-01T08:33:07Z

src/dicom-file.c

-                                       &length);
+    uint32_t length = 0;
+    char* frame_data = NULL;
+    if ( dcm_is_encapsulated_transfer_syntax(filehandle->desc.transfer_syntax_uid) )


There's a getter for this, I would write:

    const char *syntax = dcm_filehandle_get_transfer_syntax_uid(filehandle);
    uint32_t length = 0;
    char* frame_data = NULL;
    if (dcm_is_encapsulated_transfer_syntax(syntax)) {

jcupitt · 2025-10-01T09:04:20Z

src/dicom-parse.c

        .big_endian = is_big_endian(),
    };

+    const uint8_t bytes_per_pixel = desc->bits_allocated/8;


bits_allocated is defined as a uint16, so I think uint8 could (potentially!) be too small here. Let's make this uint16 as well.

libdicom style is for spaces around operators, so:

const uint16_t bytes_per_pixel = desc->bits_allocated / 8;
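
To make the width concern concrete, here is a minimal sketch of a frame size calculation that takes bits allocated into account. The helper `frame_size_bytes()` and its parameters are hypothetical and not part of libdicom, and the sketch assumes a whole-byte sample size. Note that a `uint8_t` would silently truncate any quotient above 255: for example, `(uint8_t) (4096 / 8)` evaluates to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libdicom: size in bytes of one
 * uncompressed frame. Bits Allocated is a 16-bit value, so the derived
 * byte count per sample is kept in a uint16_t as well.
 */
static uint64_t
frame_size_bytes(uint16_t rows,
                 uint16_t columns,
                 uint16_t samples_per_pixel,
                 uint16_t bits_allocated)
{
    /* Assumption of this sketch: samples occupy a whole number of bytes. */
    assert(bits_allocated % 8 == 0);

    const uint16_t bytes_per_sample = bits_allocated / 8;

    /* Widen to 64 bits before multiplying so the product cannot overflow. */
    return (uint64_t) rows * columns * samples_per_pixel * bytes_per_sample;
}
```

With these assumptions, a 512 x 512 single-sample frame with 16 bits allocated comes out at 524288 bytes.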

jcupitt · 2025-10-01T09:08:28Z

src/dicom-parse.c

+    while( position < frame_end_offset && tag != TAG_SQ_DELIM )
+    {
+        read_uint32(&state, &fragment_length, &position);
+        dcm_read(&state, fragment, fragment_length, &position );


You need dcm_require() here: you must loop on dcm_read() if you need a certain number of bytes.

I would also check the return status of dcm_require() in case of IO errors.

So this second loop could perhaps be (untested!):

    char *fragment = value;
    while (position < frame_end_offset) {
        uint32_t tag;
        read_tag(&state, &tag, &position);
        if (tag == TAG_SQ_DELIM) {
            break;
        }
        uint32_t fragment_length;
        read_uint32(&state, &fragment_length, &position);
        if (!dcm_require(&state, fragment, fragment_length, &position)) {
            free(value);
            return NULL;
        }
        fragment += fragment_length;
    }
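
The general pattern behind this advice (keep reading until the requested number of bytes has arrived, and report failure otherwise) can be sketched as below. This is not libdicom's implementation of dcm_require(): `ShortReader`, `short_read()` and `read_exact()` are hypothetical names, and the reader is a small in-memory stand-in that deliberately returns short reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory source that hands out at most `chunk` bytes per
 * call, to imitate the short reads a real IO layer may produce.
 */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
    size_t chunk;
} ShortReader;

/* Copy up to `length` bytes; returns the number copied, 0 at end of data. */
static size_t
short_read(ShortReader *reader, void *buffer, size_t length)
{
    size_t available = reader->size - reader->pos;
    size_t n = length;

    if (n > reader->chunk) {
        n = reader->chunk;
    }
    if (n > available) {
        n = available;
    }
    memcpy(buffer, reader->data + reader->pos, n);
    reader->pos += n;

    return n;
}

/* Loop until exactly `length` bytes have been read.
 * Returns 1 on success, 0 if the source ran dry first.
 */
static int
read_exact(ShortReader *reader, void *buffer, size_t length)
{
    uint8_t *out = buffer;

    assert(reader != NULL && buffer != NULL);
    while (length > 0) {
        size_t n = short_read(reader, out, length);
        if (n == 0) {
            return 0;
        }
        out += n;
        length -= n;
    }

    return 1;
}
```

A single call to `short_read()` may return fewer bytes than asked for, which is why the caller has to check the result of `read_exact()` rather than assume the buffer is full.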

jcupitt · 2025-10-01T09:21:45Z

src/dicom-parse.c

+    }
+    uint32_t fragment_length = 0;
+    // first determine the total length of bytes to be read
+    while( position < frame_end_offset && tag != TAG_SQ_DELIM )


Two tag reads and checking the first tag at the head of the loop is a little ugly. How about (untested!):

    int64_t position = 0;
    // first determine the total length of bytes to be read
    *length = 0;
    while (position < frame_end_offset) {
        uint32_t tag;
        if (!read_tag(&state, &tag, &position)) {
            return NULL;
        }
        if (tag == TAG_SQ_DELIM) {
            break;
        }
        if (tag != TAG_ITEM) {
            dcm_error_set(error, DCM_ERROR_CODE_PARSE,
                          "reading frame item failed",
                          "no item tag found for frame item");
            return NULL;
        }
        uint32_t fragment_length;
        if (!read_uint32(&state, &fragment_length, &position)) {
            return NULL;
        }
        *length += fragment_length;
        dcm_seekcur(&state, fragment_length, &position);
    }

Now you only have one tag read and the tag check always tests the tag read just before.

You can reorder the loop below in the same way.

Thanks. I managed to simplify the other loop.
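
For anyone who wants to try the single-tag-read loop outside libdicom, here is a self-contained sketch that walks an in-memory item sequence and sums the fragment lengths. It assumes a little-endian transfer syntax and a buffer that starts at the first fragment's Item tag; `total_fragment_length()` and the `get_*` helpers are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_TAG 0xFFFEE000u      /* Item (FFFE,E000) */
#define SEQ_DELIM_TAG 0xFFFEE0DDu /* Sequence Delimitation Item (FFFE,E0DD) */

static uint16_t
get_u16le(const uint8_t *p)
{
    return (uint16_t) (p[0] | (p[1] << 8));
}

static uint32_t
get_u32le(const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
        ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Group and element are each stored as a little-endian 16-bit value. */
static uint32_t
get_tag(const uint8_t *p)
{
    return ((uint32_t) get_u16le(p) << 16) | get_u16le(p + 2);
}

/* Sum the fragment lengths found in buf[0..size).
 * Returns 1 on success, 0 if the data is truncated or a tag is unexpected.
 */
static int
total_fragment_length(const uint8_t *buf, size_t size, uint64_t *total)
{
    size_t position = 0;

    assert(buf != NULL && total != NULL);
    *total = 0;
    while (position < size) {
        /* One tag read per iteration, checked immediately afterwards. */
        if (size - position < 4) {
            return 0;
        }
        uint32_t tag = get_tag(buf + position);
        position += 4;

        if (tag == SEQ_DELIM_TAG) {
            break;
        }
        if (tag != ITEM_TAG) {
            return 0;
        }

        if (size - position < 4) {
            return 0;
        }
        uint32_t fragment_length = get_u32le(buf + position);
        position += 4;

        /* Skip the fragment payload, refusing lengths past the buffer. */
        if (fragment_length > size - position) {
            return 0;
        }
        *total += fragment_length;
        position += fragment_length;
    }

    return 1;
}
```

The bounds checks stand in for the IO error handling in the review suggestion: every read is validated before the position advances.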

jcupitt · 2025-10-01T09:53:39Z

src/dicom-parse.c

+                }
+                if ( fragment_idx < num_frames )
+                {
+                    offsets[fragment_idx] = offsets[fragment_idx-1] + 8/*tag and length field size*/ + length;


I'm not sure I understand. I think you can have many fragments in each frame, so don't you need two loops here?

    for (i = 0; i < num_frames; i++) {
        record position as start of frame i
        while (frame has fragments)
            skip fragment
    }

This part rests on an assumption. It may well be what the standard says, but the standard is not easy reading.
This code covers the case where there is no Basic Offset Table (BOT), so the frame offsets are absent and we need to find them ourselves.
The assumption is this: if there is no BOT, each frame consists of exactly one fragment.
Why I assume this: if each frame could have an arbitrary number of fragments, it would be impossible to find out which fragment belongs to which frame. To assign fragments to frames without a BOT, every frame would have to have the same number of fragments.
We could use a different assumption here: number of fragments = C * number of frames, where C >= 1 is an integer. This would complicate things a bit. I think encapsulation is used mainly for compressed Transfer Syntaxes, and there is no guarantee that every frame compresses into exactly C fragments.
Feel free to ask, or to propose a different approach for assigning fragments to frames.
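
Under the one-fragment-per-frame assumption, the offset calculation from the commit message (previous offset + tag and length size + fragment length) can be sketched as follows. This is an illustration only, not the code in this PR: `offsets_from_fragment_lengths()` is a hypothetical helper, and offsets are taken relative to the first fragment's Item tag, which is how a Basic Offset Table expresses them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_HEADER_SIZE 8 /* 4-byte Item tag + 4-byte length field */

/* Hypothetical helper: derive frame offsets from fragment lengths,
 * assuming exactly one fragment per frame. Offset 0 is the first byte
 * of the first fragment's Item tag.
 */
static void
offsets_from_fragment_lengths(const uint32_t *fragment_lengths,
                              size_t num_frames,
                              uint64_t *offsets)
{
    assert(fragment_lengths != NULL && offsets != NULL);
    if (num_frames == 0) {
        return;
    }

    offsets[0] = 0;
    for (size_t i = 1; i < num_frames; i++) {
        /* Skip the previous item's header and its payload. */
        offsets[i] = offsets[i - 1] + ITEM_HEADER_SIZE +
            fragment_lengths[i - 1];
    }
}
```

For fragment lengths of 10, 20 and 6 bytes this gives offsets 0, 18 and 46. If a frame were split over several fragments, the same walk would still find every fragment, but it could no longer tell where one frame ends and the next begins.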

jcupitt · 2025-10-01T09:56:06Z

Great! Still to do:

add some tests
add a line to the changelog and credit yourself
reformat for libdicom style
some minor restructuring, as noted

fedorov · 2025-10-02T16:00:51Z

@dclunie can you confirm what is the license for the images available from the NEMA FTP server?

@michaelonken @jriesmeier how about the images shared in https://www.dcmtk.org/download/images/ ?

dclunie · 2025-10-02T18:45:45Z

There is no specified license for the CAR97, NEMA97, or WG04 images - we had intended all of these to be publicly usable without restriction (we gave away the CDs at meetings, and shared the images online by anonymous ftp), but never defined a license.

michaelonken · 2025-10-03T09:59:51Z

David said it all ☝️

[Edit: The folder ddsm/ has its own README. The collection of images stems from the University of South Florida and has originally been provided in non-DICOM format. We converted it (I don't remember the exact context) and offered to host these images on the OFFIS servers (which they were fine with). I found a copy of the now-offline original website in the Internet archive. The images have been part of a research grant; maybe you find more information if you dig through the site.]

weanti · 2025-10-06T12:15:43Z

I'm working on tests. Will take some time due to limited availability.

weanti · 2025-10-09T12:00:17Z

Added some tests. These are more like functional tests, because the implementation that handles encapsulated pixel data is not part of the public API.
The test data is generated and may not be DICOM conformant, e.g. it may not be loadable by real applications. The tests focus on the "happy path". Shall I add some tests for the error cases as well?

Antal Ispanovity added 3 commits June 26, 2025 15:03

remove restriction for bits stored, because its value can be modality…

87c0436

… specific e.g. in case of CT it can be 12,13,14,15,16

fix offset reading and frame reading for encapsulated pixel data

835ed96

fix offset calculation for encapsualted fragments: length was essenti…

d190231

…ally not read, undefined length was assumed use more intuitive offset calculation: previous offset + tag and length size + current length

jcupitt marked this pull request as draft June 30, 2025 15:01

split regular and encapsulated frame parsing

0ceb9cc

fix indexing problem when creating offsets table fix frame size calculation: bits allocated was not considered

weanti marked this pull request as ready for review July 3, 2025 04:18

fix offset calculation for non-encapsulated frames: bits allocated wa…

c02f192

…s not considered

Merge branch 'main' into main

7a08c78

jcupitt reviewed Oct 1, 2025

coding style, code documentation, review fixes: simplification, error…

ab8b9f0

… handling, etc

functional tests for encapsulated pixel data handling

ac71ff4
