-
Notifications
You must be signed in to change notification settings - Fork 113
fix: restore proper license headers for third-party code #4134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ibution This commit addresses license compliance issues identified during an audit: 1. Restored original MIT license headers for third-party code: - pyright-language-service/src/*.ts (TypeFox monaco-languageclient) - common/workflow-operator/src/main/scala/com/kjetland/** (mbknor-jackson-jsonschema) - frontend/src/app/common/formly/array.type.ts (Google Angular) 2. Updated LICENSE file with proper third-party attribution section including full MIT license text for each bundled dependency 3. Updated .licenserc.yaml to exclude third-party files from Apache license header checking 4. Added sbt-license-report plugin (v1.7.0) for automated dependency license tracking and compliance auditing The third-party code (all MIT licensed, Category A) is compatible with Apache License 2.0 but requires proper attribution per Apache policy.
Remove pyright-language-service license header changes as they are already addressed in PR apache#4132. This commit now focuses only on: - mbknor-jackson-jsonschema (MIT license attribution) - Angular array.type.ts (MIT license attribution) - sbt-license-report plugin for dependency tracking
|
While this fixes some of the issues, I still think there are more remaining. I would also suggest that this task not be done via Claude or other LLMs. I don't think it's the right tool for the job. Using a script that an LLM might generate could be good, but there is no easy way to check the validity of the output an LLM would generate in this task. In fact it's exceedingly hard to validate this task, and easy to mistake it for being done correctly, as we have seen. Therefore the method in which it is done is important to scrutinize and have a high degree of confidence in. My method so far has been to look at the diff of the change that added all of the ASF headers (and inadvertently changed some), and look carefully at any instance where lines were removed instead of added. There are not many of these. Each of those should be scrutinized and marked as either appropriate or mistaken. |
Thank you for your review! I will manually check files and make the fix complete. I have one question regarding the location of attribution. Quoting from https://infra.apache.org/licensing-howto.html#permissive-deps,
Seems the attribution should still be put in LICENSE, not |
|
Ah yes, you are right about that being in LICENSE and not NOTICE. Sorry for the misdirection. I am too used to the way it is done in Asterix, where we never modify that file, we only modify the template that surrounds it. So I reflexively look too skeptically at modifications there. The change is looking better already- let me know when it's ready for a look. |
Thanks for confirming. Can you take another look on current PR? |
|
What about the pyright-language-service files that triggered this, i.e. https://github.com/apache/texera/pull/4132/files ? Should those fixes be part of this change? |
Yes, since currently #4132 is separate PR. Is it better to merge it with current PR and close that one? |
|
Yeah, I think so. This way this change can contain all of the fixes regarding inadvertently changed license headers. |
Please take another look. I merged the changes from #4132 |
|
Some things are still missing. For example, frontend/src/assets/svg/operator-view-result.svg has an ambiguous license. It seems like it was obtained from svgrepo.com, so it likely wasn't authored by anyone on Texera, but the license is not noted. |
Thank you for pointing them out. I fixed some of them. I just checked the folder |
|
Good job tracking those icon's actual licenses down, I couldn't find them myself. The one that's MIT licensed is fine, and I think the version of the CC license used by the other two are fine. If we can't find the license for the last one, we need to delete it. |
I found the license for the last one, and it is https://www.shutterstock.com/license, which is incompatible with apache. So I deleted it and its usage |
|
Cool. Hopefully that's everything then. |
LICENSE
Outdated
| Copyright (c) 2018 Google Inc. All Rights Reserved. | ||
| Source: https://angular.io | ||
|
|
||
| SVG Icons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep this grouped by license and not by artifact type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
pyright-language-service/src/main.ts
Outdated
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| //The source file can be referred to: https://github.com/TypeFox/monaco-languageclient/blob/main/packages/examples/src/python/server/main.ts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the MIT license header from TypeFox be here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| * under the License. | ||
| */ | ||
|
|
||
| declare module 'hocon-parser' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. What's the license of this file? Is it MIT-licensed from TypeFox?
parshimers
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small things, but I think the hard work of finding all the source inclusions is done.
LICENSE
Outdated
| CC BY 3.0 License (licenses/LICENSE-CC-BY-3.0.txt) | ||
| -------------------------------------------------- | ||
|
|
||
| NOTE: The following files use CC BY 3.0 license which requires attribution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not that they should be removed in the release, it's that they can only be packaged with binaries and not as part of the source release. it's not exactly clear to me how this applies to a svg. it is textual but not in the way source code is usually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed. I removed it.
parshimers
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two more small things
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're sure this file was written by someone in Texera? it seems like src/main.ts refers to it- was that file taken verbatim from TypeFox or was this file and that modification done by someone in Texera?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find the original file from TypeFox. Should I reach out to the people who introduced this file to confirm?
I replaced two CC icons with MIT-licensed icons. |
What changes were proposed in this PR?
The third-party code (all MIT licensed, Category A) is compatible with Apache License 2.0 but requires proper attribution per Apache policy. This PR addresses license compliance issues identified during an audit:
Restored original MIT license headers for third-party code:
common/workflow-operator/src/main/scala/com/kjetland/**(mbknor-jackson-jsonschema)frontend/src/app/common/formly/array.type.ts(Google Angular)frontend/src/app/common/formly/object.type.ts(Google Angular)frontend/src/app/common/formly/multischema.type.ts(Google Angular)frontend/src/app/common/formly/null.type.ts(Google Angular)Updated LICENSE file with third-party attribution following Apache Spark's approach:
licenses/directory for full license textCreated
licenses/LICENSE-MIT.txtcontaining the full MIT license textUpdated
.licenserc.yamlto exclude third-party files from Apache license header checkingReferences:
Any related issues, documentation, discussions?
Closes #4135. Related to #4132.
How was this PR tested?
Manual verification that:
licenses/LICENSE-MIT.txtcontains the complete MIT license textWas this PR authored or co-authored using generative AI tooling?
Co-authored with Claude code.