Skip to content

[filestream] Add native file identity. #14351

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 9, 2025

Conversation

belimawr
Copy link
Contributor

@belimawr belimawr commented Jun 27, 2025

Proposed commit message

This commit adds the native file identity to the filestream package

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

## Author's Checklist
## How to test this PR locally

  1. Build the package, deploy it using elastic-package
  2. Add the 'Custom Logs (filestream)' integration, disabling the 'Fingerprint file identity' and enabling 'Native file identity' (see screenshot below)
  3. Create a small file (< 1kb, ideally one line): echo "foo" > /tmp/log.log
  4. Configure 'Paths' to ingest the file
  5. Deploy an Elastic-Agent with this integration and ensure it ingest the file.

## Related issues

Screenshots

2025-07-02_14-50

@belimawr belimawr marked this pull request as ready for review June 27, 2025 14:56
@belimawr belimawr requested a review from a team as a code owner June 27, 2025 14:56
@belimawr belimawr requested review from AndersonQ and mauri870 June 27, 2025 14:56
@belimawr belimawr added the Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] label Jun 27, 2025
@elasticmachine
Copy link

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@belimawr belimawr self-assigned this Jun 27, 2025
@belimawr belimawr added the bugfix Pull request that fixes a bug issue label Jun 27, 2025
@andrewkroh andrewkroh added the Integration:filestream Custom Logs (Filestream) label Jun 27, 2025
- version: "1.2.0"
changes:
- description: Add the native file identity
type: bugfix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a bug because it should have had native file identity since the beginning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bug because disabling fingerprint is broken in 9.x. Actually in 9.x disabling the fingerprint only removes the offset and length settings, without actually changing the file identity 🤦‍♂️.

Another way to see this is: we released a breaking change in 9.0 and did not update the integration to correctly support it.

Comment on lines 141 to 155
{{#if fingerprint }}
{{#unless file_identity_native }}
file_identity.fingerprint: ~
prospector.scanner.fingerprint.enabled: true
prospector.scanner.fingerprint.offset: {{ fingerprint_offset }}
prospector.scanner.fingerprint.length: {{ fingerprint_length }}
{{/unless}}
{{/if}}

{{#if file_identity_native }}
{{#unless fingerprint }}
file_identity.native: ~
prospector.scanner.fingerprint.enabled: false
{{/unless}}
{{/if}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Blocker]
I'm not sure it's behaving as expected. Better it isn't cleat to me what is the expected behaviour:

I played the user and enabled both, the result is none on the policy. Is it expected?
I think we could either define a precedence so when both are enabled it picks one, or if both are enabled produce an error.The error could be achieved by creating a invalid config. I'm not sure it's a good idea, but if we cannot do validation in the UI, it's a way to surface the error to the user.

[Blocker-ish suggestion|Question]
I believe there is a great change users will mix both file identity. Perhaps there is a option to use a drop down and select one or the other?
I know there are additional configuration to each one of them, but better at least ensure only one file identity is chosen.

[Suggestion]
As there isn't really an hierarchic between the elements in the advanced config, perhaps it could be achieved with a prefix in the name, like:

File Identity: Fingerprint: enabled
File Identity: Fingerprint: offset

File Identity: Native: enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played the user and enabled both, the result is none on the policy. Is it expected?

Yes, this is the expected result. Because both settings cannot be enabled at the same time and there is no way in Fleet UI to prevent this/show an error, I opted by ignoring both and keep the integration working rather than having the Filestream input fail to start.

I think we could either define a precedence so when both are enabled it picks one (...)

That's the behaviour now, well if both are enabled, then Filestream default setting is used. Which is fingerprint for 9.x and Native for 8.x/7.x

(...) or if both are enabled produce an error. The error could be achieved by creating a invalid config. I'm not sure it's a good idea, but if we cannot do validation in the UI, it's a way to surface the error to the user.

Because both define the same key in the YAML and Fleet UI renders the YAML, the error is quite hard to understand, see the screenshots.

Error screenshots

Error pop-up at the bottom of the screen:
2025-07-02_14-54_1

Clicking in see full error brings up the other pop-up
2025-07-02_14-54

Regardless of how it is going to look in Fleet UI, I don't think this is a very helpful error.

I believe there is a great change users will mix both file identity. Perhaps there is a option to use a drop down and select one or the other?

I've looked into something like that in the past (probably when I was resurrecting this integration) and I didn't find any good solution. As far as I know there is no drop-down option.

I know there are additional configuration to each one of them, but better at least ensure only one file identity is chosen.

The native file identity does not have any extra configuration, only the fingerprint has extra ones.

I added the single prefix:

Screenshot

2025-07-02_14-50

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endpoint/Defend has it:
Screenshot from 2025-07-07 16-41-54

if you're in a rush, I could approve it, but if it can be done, I believe we'd avoid a lot of issues down the line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have this merged by tomorrow, but I can wait a bit while exploring this option.

@elasticmachine
Copy link

💚 Build Succeeded

History

cc @belimawr

Copy link

Copy link
Member

@AndersonQ AndersonQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there seems there isn't a better option for the settings, endpoint is a special cases, coded directly in kibana. So I'm approving it. We can improve it later if we find a way to do so

@belimawr belimawr merged commit da0a934 into elastic:main Jul 9, 2025
7 checks passed
@belimawr belimawr deleted the add-native-file-identity branch July 9, 2025 12:38
@elastic-vault-github-plugin-prod

Package filestream - 1.2.0 containing this change is available at https://epr.elastic.co/package/filestream/1.2.0/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Pull request that fixes a bug issue Integration:filestream Custom Logs (Filestream) Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants