Skip to content

Fix #13398 (Clang import: use json output instead of debug output) #7646

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

danmar
Copy link
Owner

@danmar danmar commented Jul 6, 2025

No description provided.

@danmar danmar force-pushed the fix-13398-2 branch 6 times, most recently from bad7991 to a434fdc Compare July 7, 2025 05:47
@danmar
Copy link
Owner Author

danmar commented Jul 7, 2025

@firewave any opinion?
I've always considered clang import to be experimental it wasn't robust . but with the json input it should be possible to make it work well.
this PR deactivates support for c++ clang import because I think that it far from working. But if we work on it.. I hope we can reactivate it later..

@firewave
Copy link
Collaborator

firewave commented Jul 7, 2025

@firewave any opinion?
I've always considered clang import to be experimental it wasn't robust . but with the json input it should be possible to make it work well.

I am all for more parsable inputs. But as I pasted in the ticket the source documentation states that this format is not stable and the data might be incomplete. No such note is present with the old test output. So it would be good to check the official documentation and also get some feedback from upstream (you can just file a question as a ticket).

this PR deactivates support for c++ clang import because I think that it far from working. But if we work on it.. I hope we can reactivate it later..

I was not aware of that. It seemed to work rather well IIRC.

Running TEST_CPPCHECK_INJECT_CLANG=clang python3 -m pytest test/cli failures show several issues with language-related stuff. I will have a look at that. But IIRC most failures were about missing system includes and inconsistencies in the output.

@firewave
Copy link
Collaborator

firewave commented Jul 7, 2025

failures show several issues with language-related stuff. I will have a look at that.

Actually it is because the location is set after tokens have already been parsed:

cppcheck: lib/tokenlist.cpp:105: int TokenList::appendFileIfNew(std::string): Assertion `mTokensFrontBack->front == nullptr' failed.`

That should be fixed before the code is switched over to the other format.

I still need to understand how the information is not applied to the existing tokens.

I also think we should defer merging this until after the release.

@firewave
Copy link
Collaborator

firewave commented Jul 7, 2025

A problem with the existing unit tests is that they are (were?) lacking location data which would have existed in the actual data.

They should rather be generated from actual source code we dumped to AST. That would also solve possible issues with detecting changes in the Clang AST with newer versions. But I reckon getting such code for the existing tests is impossible...

@danmar danmar added the merge-after-next-release Wait with merging this PR until after the next Release label Jul 8, 2025
@danmar
Copy link
Owner Author

danmar commented Jul 8, 2025

A problem with the existing unit tests is that they are (were?) lacking location data which would have existed in the actual data.

I intentionally skipped location data in testclangimport, to make the tests more manageable.

But if that cause this problem: "Actually it is because the location is set after tokens have already been parsed:" then maybe we should tweak that test so it does not fail for testclangimport.

They should rather be generated from actual source code we dumped to AST. That would also solve possible issues with detecting changes in the Clang AST with newer versions. But I reckon getting such code for the existing tests is impossible...

We use actual code in clang_import-test.py they can detect such possible issues however they are pretty basic.

It should also be pretty easy to re-generate the ast dumps in the future in testclangimport.cpp. The original c/c++ code for each unit test is either shown in a comment in each test or is pretty clear in the ASSERT_EQUALS. So I think I can re-generate the ast dumps manually in an hour or so.
And I think it would be good to reuse these code examples somehow to detect changes in the Clang AST with newer versions.. but I don't suggest we will look into that in this PR

@danmar
Copy link
Owner Author

danmar commented Jul 8, 2025

I was not aware of that. It seemed to work rather well IIRC.

This PR breaks the handling. In the short term I think it's better to deactivate it. And then we can look into a fix later. It's not unfixable.

@danmar
Copy link
Owner Author

danmar commented Jul 8, 2025

I also think we should defer merging this until after the release.

sure 👍

@firewave
Copy link
Collaborator

firewave commented Jul 8, 2025

I intentionally skipped location data in testclangimport, to make the tests more manageable.

The problem is that it tests data which we will never encounter and requires us to have logic which is test-only (slightly similar to having different preprocessor logic in the unit tests than in actual application code). Hence the tickets about handled "invalid" locations as well as the current assertions.

It should also be pretty easy to re-generate the ast dumps in the future in testclangimport.cpp. The original c/c++ code for each unit test is either shown in a comment in each test or is pretty clear in the ASSERT_EQUALS. So I think I can re-generate the ast dumps manually in an hour or so.

That would be awesome.

And I think it would be good to reuse these code examples somehow to detect changes in the Clang AST with newer versions.. but I don't suggest we will look into that in this PR

Yeah, but I think this should be done before this PR is being applied - I assume I want to do it the other way around.

@danmar
Copy link
Owner Author

danmar commented Jul 8, 2025

Yeah, but I think this should be done before this PR is being applied - I assume I want to do it the other way around.

what do you suggest such test would do?

  1. extract c/c++ code from each testcase in testclangimport.
  2. run cppcheck --clang
  3. .. what exact verification do you expect here ..

@firewave
Copy link
Collaborator

firewave commented Jul 9, 2025

Yeah, but I think this should be done before this PR is being applied - I assume I want to do it the other way around.

what do you suggest such test would do?

1. extract c/c++ code from each testcase in testclangimport.

2. run cppcheck --clang

3. .. what exact verification do you expect here ..

Some (most? all?) of the tests could probably just be moved to a Python test which simply takes a source input and an expected Cppcheck debug output.

In cases where we need to check more internal stuff we should go with unit tests.

@firewave
Copy link
Collaborator

firewave commented Jul 9, 2025

  1. .. what exact verification do you expect here ..

The verification that the behavior should be the same between using --clang is implicitly tested by running the tests with the TEST_CPPCHECK_INJECT_CLANG=clang.

There is also the idea to make sure that the AST/processed output stays the same over a bigger input (see https://trac.cppcheck.net/ticket/12358).

@danmar
Copy link
Owner Author

danmar commented Jul 10, 2025

Some (most? all?) of the tests could probably just be moved to a Python test which simply takes a source input and an expected Cppcheck debug output.

When you say "move".. Let's keep the unit tests that does not execute clang. I think the advantage with the unit tests that then we test that the "original" clang ast generates the expected debug output. So it detects regressions.

Then it does make sense also as an additional test to generate ast with newer clang and check that the debug output matches. So failures here would not be regressions but just missing forward compatibility..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-after-next-release Wait with merging this PR until after the next Release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants