JSON BOM serialization "trims" whitespace from DPKG license text (XML does not) #135

# Background

On Debian and Ubuntu systems, the dpkg copyright files are (in modern times, thank goodness) intended to be [machine readable according to this spec](https://www.debian.org/doc/manuals/maint-guide/dreq.en.html#copyright). The [CycloneDX Linux generator](https://github.com/CycloneDX/cyclonedx-linux-generator) on Ubuntu faithfully replicates the text of the copyright file into `components/[]/licenses/[]/license/text/content`, as one might expect.

According to [AbstractBomJsonGenerator.java line 68](https://github.com/CycloneDX/cyclonedx-core-java/blob/a27a57cf92e7aa06c4c0a221e3c6ff22a42dbd73/src/main/java/org/cyclonedx/generators/json/AbstractBomJsonGenerator.java#L68), it would appear that *ALL STRINGS*, when serialized to JSON, are serialized with [TrimStringSerializer](https://github.com/CycloneDX/cyclonedx-core-java/blob/a27a57cf92e7aa06c4c0a221e3c6ff22a42dbd73/src/main/java/org/cyclonedx/util/TrimStringSerializer.java), which not only trims leading and trailing whitespace but also [removes interior whitespace, similar to how an HTML processor might](https://github.com/CycloneDX/cyclonedx-core-java/blob/a27a57cf92e7aa06c4c0a221e3c6ff22a42dbd73/src/main/java/org/cyclonedx/util/TrimStringSerializer.java#L35).

The [XML AbstractBomXmlGenerator.java](https://github.com/CycloneDX/cyclonedx-core-java/blob/a27a57cf92e7aa06c4c0a221e3c6ff22a42dbd73/src/main/java/org/cyclonedx/generators/xml/AbstractBomXmlGenerator.java) does not remove whitespace, which would seem to be the correct behavior.
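To make the effect concrete, here is a minimal sketch in plain Java. It does not call cyclonedx-core-java; the `collapse` helper is hypothetical and only assumes that the JSON path behaves roughly like "trim, then coalesce each run of whitespace into one space". The linked `TrimStringSerializer` source is the authority on what actually happens. The sample text is a made-up fragment in the dpkg copyright style, where line breaks and leading spaces carry meaning.

```java
public class WhitespaceDemo {

    // Hypothetical stand-in for the JSON behavior described above:
    // trim the ends, then collapse every whitespace run to a single space.
    // This is an assumption for illustration, not the library's code.
    static String collapse(String value) {
        return value == null ? null : value.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Made-up dpkg-style copyright fragment; continuation lines start
        // with a space and a lone "." marks a blank line.
        String original =
                "License: MIT\n"
              + " Permission is hereby granted\n"
              + " .\n";

        System.out.println("--- original (what the XML output preserves) ---");
        System.out.print(original);
        System.out.println("--- after trim + collapse (assumed JSON behavior) ---");
        System.out.println(collapse(original));
    }
}
```

Once the newlines are gone, the field and paragraph boundaries of the copyright file can no longer be recovered from the JSON output.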

# Bug

1. I would argue that not all strings in BOMs should have their whitespace removed and coalesced when converted to JSON. Copyright and license file text in particular are good examples where replicating the original is probably best.
2. I think the JSON and XML formats of the same BOM should contain identical data, including text/strings and their whitespace.

# History

* @stevespringett implemented `TrimStringSerializer` in https://github.com/CycloneDX/cyclonedx-core-java/commit/0fab7fb8855887f1e359d849732b9fbc8f89a01d#diff-9f5ef24a21ed10eaae782875e4efc0cd90cec8b7f598bee89d096f50431db5cc.
* That was in turn a cleanup of his earlier work in https://github.com/CycloneDX/cyclonedx-core-java/commit/2e5ace4d1979571eaaca82952ab243850d1d6400#diff-1ba2a4fedacc21e8c4b7d22713a15269ac6f187cd10332015d28de1c341e0465.

Without test cases to accompany either of those changes, it's hard for me to understand why they were made.  The history of [July 9th 2020](https://github.com/CycloneDX/cyclonedx-core-java/commits/master?before=a27a57cf92e7aa06c4c0a221e3c6ff22a42dbd73+350&branch=master) doesn't show PRs or groups of commits that seem to help me understand either.  The problem is that this behavior was obviously *desired*, but I'm not clear why or how it would be helpful.

# Potential Solutions

1. I can use the XML formatted output (at least for now), which does not appear to mangle the structure of the dpkg copyright files when converting them to license text.
2. I'd be happy to submit a PR with appropriate fixes, but I'm hoping @stevespringett might remember the reason behind this first. As noted above, at least some part of this behavior was desired, and I don't want to write code that breaks something important.

# Personal Note

This is my first comment to this project, and I look forward to working with you if possible.  I have both personal and professional interest in this area, and I hope to both integrate with and contribute to CycloneDX.
