Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[malwarebazaar-recent-additions] Allow optional storage of malware samples #3503

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ParamConstructor
Copy link

@ParamConstructor ParamConstructor commented Feb 26, 2025

Proposed changes

The core intent of this update is to support the "optional" storage of the malware files related to the Malware Bazaar data fetches. Some users may not wish to "actually" store the "real" malware files on their system disk. This could be due to security requirements, or could be because they never use the actual file and it is needlessly increasing their S3/MinIO storage requirements. This update allows for access to the Artifact entries and metadata about the malware artifacts - but removes the storage of the "real" malware when enabled. The malware is replaced with a file that is 85 bytes in size and when downloaded within OpenCTI states:

"This would normally be Malware, but we have disabled the saving of the real malware."

Changes:

  • Cleaned up code for autopep8/black/isort/pylint compliance.

  • Removed references to confidence_level, create_indicator, and update_existing_data as these within the connector are deprecated. This data is now handled within the OpenCTI platform based on the Connector User/Group configuration.

  • Added the following in the docker-compose.yml to show connector values could be in the config.yml or here and added the missing Auth-Key token reference that is required for the connector to function properly.

      - MALWAREBAZAAR_RECENT_ADDITIONS_USER_TOKEN=your-token-here # Free Auth-Key Required - https://bazaar.abuse.ch/api/#auth_key
  • Added the new key disable_malware_sample to both docker-compose.yml and config.yml.sample in order to support NOT actually storing the real malware in S3/MinIO storage.
      - MALWAREBAZAAR_RECENT_ADDITIONS_DISABLE_MALWARE_SAMPLE=true # If true, malware will be replaced with a benign Text file sample.
  disable_malware_sample: false # If true, malware will be replaced with a benign Text file sample.
  • Added code changes required to support this capability enhancement.

Related issues

  • None

Checklist

  • I consider the submitted work as finished
  • I tested the code for its functionality using different use cases
    • Tested connector with the disable_malware_sample set to true
    • Tested connector with the disable_malware_sample set to false
  • I added/update the relevant documentation (within connector code)
  • Where necessary I refactored code to improve the overall quality

Further comments

N/A

@ParamConstructor ParamConstructor force-pushed the malware-bazaar-disable-malware-storage branch 2 times, most recently from 8426772 to 4179485 Compare February 26, 2025 17:20
@helene-nguyen helene-nguyen self-assigned this Feb 27, 2025
@helene-nguyen helene-nguyen added the partner used to identify PR from patner label Feb 27, 2025
Comment on lines 10 to 11
- CONNECTOR_CONFIDENCE_LEVEL=40 # From 0 (Unknown) to 100 (Fully trusted); ENV or can be set in config.yml
- CONNECTOR_UPDATE_EXISTING_DATA=false # ENV or can be set in config.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ParamConstructor Thank you for your contribution!

Some remarks here: CONFIDENCE_LEVEL and UPDATE_EXISTING_DATA for connector are deprecated now.
You can find a full detailed explanation by @Powlinett from our team here: #3316 (review)

I see that are using it later as score, I suggest to remove those parameters and add for example MALWAREBAZAAR_RECENT_ADDITIONS_SCORE=40

Suggested change
- CONNECTOR_CONFIDENCE_LEVEL=40 # From 0 (Unknown) to 100 (Fully trusted); ENV or can be set in config.yml
- CONNECTOR_UPDATE_EXISTING_DATA=false # ENV or can be set in config.yml

Copy link
Author

@ParamConstructor ParamConstructor Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@helene-nguyen - I made suggested corrections and moved them down to the local connector level as MALWAREBAZAAR vars.

Based on this (#3526) - maybe I should just remove it and let the connector user (account connector runs as) default the score

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ParamConstructor I saw that you have rebased master, I assume everything is good now for this ?

@ParamConstructor ParamConstructor force-pushed the malware-bazaar-disable-malware-storage branch from 4179485 to 298ac3b Compare February 28, 2025 15:32
@ParamConstructor
Copy link
Author

ParamConstructor commented Feb 28, 2025

After rebase - two new capabilities had been added to create relationships and indicators via (bc63dde). This had some slight issues where OpenCTI URL and OpenCTI Token had been duplicated down to the MALWAREBAZAAR level (They are available via the config.yml or docker-compose.yml at the top level). I do not believe this duplication was the author's Intent? (@Noxurge - tagged to verify). Additionally, the new Indicators and Relationships should have a Marking for TLP:CLEAR. These issues/corrections are incorporated in the commit - (8a4d4ce) on this PR.

@Noxurge
Copy link
Contributor

Noxurge commented Feb 28, 2025

Hello @ParamConstructor! It wasn't my intention to do this duplication. My focus is to use OpenCTIApiClient to create the relationship and indicators with SHA256. Thanks for add TLP:CLEAR for this two capabilities. If you have any question for my features, feel free to ping me again.

@ParamConstructor ParamConstructor force-pushed the malware-bazaar-disable-malware-storage branch 3 times, most recently from d6d82aa to 127c93e Compare March 11, 2025 16:41
@ParamConstructor ParamConstructor force-pushed the malware-bazaar-disable-malware-storage branch from 127c93e to 2569b24 Compare March 11, 2025 17:03
@ParamConstructor
Copy link
Author

ParamConstructor commented Mar 11, 2025

@helene-nguyen - This code has been rebased to 6.5.6 and depreciated features/flags have been removed (i.e. confidence_level, create_indicator, and update_existing_data).

@SamuelHassine SamuelHassine force-pushed the master branch 3 times, most recently from e97d87b to cbdee1f Compare March 14, 2025 15:22
@helene-nguyen helene-nguyen added this to the PRs backlog milestone Mar 20, 2025
@animedbz16
Copy link
Contributor

If you are disabling the downloading of the actual malware artifact, why are you creating a separate artifact to upload into OpenCTI that is not the real artifact. It will not have the same hashes and would be confusing.

It seems more appropriate to instead modify the Artifact to just not include the payload_bin and instead use a url per Stix2.1 since both payload_bin and url are mutually exclusive.

https://docs.oasis-open.org/cti/stix/v2.1/csprd01/stix-v2.1-csprd01.html#_Toc16070681

You can then provide the url to where the artifact could actually be downloaded from, while not actually importing it here.

@ParamConstructor
Copy link
Author

animedbz16 - The code accounts for passing in the real artifacts hash values, even with replacement of the actual artifact (i.e. it will still keep the correct hash values for the real malware). The payload route was the original intent/behavior of the connector, so this PR seeks to not deviate too highly from that intent. Otherwise, you will have a mix of data in your platform after you update the connector (i.e. some with a URL and some with a file).

@animedbz16
Copy link
Contributor

animedbz16 commented Mar 27, 2025

I disagree with your statement. It is this PR itself that is deviating from STIX 2.1 standards, so I will reiterate more clearly:

The payload_bin is an optional property on a Stix2.1 Artifact object which should be a type binary where this property Specifies the binary data contained in the artifact as a base64-encoded string. This property MUST NOT be present if url is provided.

The fact that you are changing the Artifact itself is inappropriate because the payload_bin is supposed to be the base64-encoded string of the actual artifact... when it is included into the Stix 2.1 Object

If you don't want to store the actual payload_bin of the real artifact, Stix 2.1 conveniently provides a way for you to do this in a way that conforms with the standard by utilizing the url property where a string can be in a way that The value of this property MUST be a valid URL that resolves to the unencoded content. This property MUST NOT be present if payload_bin is provided.

So if you dont want to store the binary, which is what you are doing, and you already have the url of where the actual binary is located, then it seems prudent to store the actual URL into the Artifact so that if someone in the future wanted to reference and obtain that artifact they would have an Actual URL where they can hopefully still obtain it from... instead of having a pointless 85 byte file that serves no purpose and is misaligned with the definition of STIX 2.1 standards

image

@ParamConstructor
Copy link
Author

animedbz16 - The original behavior of the connector was not to store a URL, it stored the payload_bin. This modification is allowing OpenCTI "owners" activating this feature on their deployment to NOT store the malware and they would be doing this to avoid any reference to the physical malware file, as they don't wish to have a subsequent user inadvertently download and store it on their systems locally. Turning it off, then providing a URL to click - would expose a user to downloading the malware with 1-click (the thing the "owners" of that OpenCTI deployment were attempting to prevent). I am fully aware of the STIX spec - this is storing a payload_bin - it just doesn't "happen" to be the "real payload" - and this is for a specific purpose - but it is storing the "intended" payload should the feature be enabled. Not all OpenCTI deployments are used by a company/end user that needs the malware or even wants a reference to it other than its identifiers.

If you would like a URL option you could commit a subsequent option called "REPLACE_WITH_URL" or whatever and set it to True and have it do what you would like. Allowing for this 3rd option, don't download the malware - don't replace with a 85 byte txt, include a URL.

@animedbz16
Copy link
Contributor

OpenCTI does not provide a clickable link in the User Interface where someone could "download" the artifact. It is merely a place where the actual URL can be stored with the object so that someone could download the artifact if they desired.

The connector in its previous state downloads artifacts. This PR seeks to change this behavior to support creating Artifact objects but in a way that does not actually store the real payload_bin.

As I have already mentioned, there is a way to do this without storing a Fake / Incorrect payload_bin artifact in the system.

It is pointless to do this and the recommended way to create the STIX Artifact object would be to create it with the URL property.

If you look to see how this functions when uploading a Stix Artifact object without a payload_bin you can see how this functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
partner used to identify PR from patner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants