Skip to content

Commit e440b07

Browse files
Improve correctness of our Clang tooling infrastructure. (carbon-language#392)
This restructures the `compile_flags.txt` to use the downloaded libc++ system headers and avoid needing a virtual include directory to be built. It still needs _some_ Bazel build to complete before working in order to have the libc++ system headers downloaded and the symlink to the Bazel tree created. One (very) tricky part of making this work is to work around bugs in Clang's tooling layer that incorrectly handle `..` path components after traversing symlinks. To avoid this, we add a custom symlinks (`bazel-execroot` and `bazel-clang-toolchain`) that hide the relevant traversal of the Bazel layout to find build artifacts and the downloaded toolchain. These symlinks will be broken until a build with Bazel downloads the toolchain and creates the basic output tree structure. It also adds a `create_compdb.py` script. Running this script improves the tooling fidelity by taking a few steps: 1. It queries Bazel to find all the relevant files and adds them to a `compile_commands.json` database that allows `clangd` and other tools to index the entire project for improved cross-references, etc. 2. It builds all the generated files with Bazel so that they can be included successfully. This is very fast in my testing, taking only 10s of seconds. It is also very likely to be cached effectively. 3. It translates the arguments from `compile_flags.txt` to make them persistently use the built generated files include paths so that nothing breaks even as different targets are built potentially with different configurations. There are still some limitations. - It still requires running Bazel before anything works, even if a fast run. - It will require re-running if new generated files are added and needed but not built. - It assumes that the standard Bazel symlink names are used and available. Much of the Python here was written by @geoffromer in carbon-language#384 -- I've adapted it here after discussing to try to fill in some of the blanks and use a slightly different approach to querying Bazel. I use the normal `bazel query` rather than `bazel aquery`. This, for example, allows the index to reliably cover header files in header-only libraries more directly (rather than relying on transitive inclusion). It also seems a bit simpler too parse, but that is a pretty minor difference. Co-authored-by: Geoffrey Romer <[email protected]>
1 parent 7e1721d commit e440b07

7 files changed

+178
-10
lines changed

.gitignore

+6
Original file line numberDiff line numberDiff line change
@@ -10,3 +10,9 @@
1010

1111
# VSCode creates this directory in ways that are hard to prevent.
1212
/.vscode/
13+
14+
# Directories created by clangd
15+
/.cache/clangd
16+
17+
# Compilation database used by clangd
18+
compile_commands.json

.pre-commit-config.yaml

+1-1
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ repos:
1717
- id: check-executables-have-shebangs
1818
- id: check-merge-conflict
1919
- id: check-symlinks
20-
exclude: '^website/jekyll/site/_includes$'
20+
exclude: '^(website/jekyll/site/_includes|bazel-(clang-toolchain|execroot))$'
2121
- id: check-yaml
2222
- id: detect-private-key
2323
- id: end-of-file-fixer

bazel-clang-toolchain

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
bazel-out/../../../external/bootstrap_clang_toolchain

bazel-execroot

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
bazel-out/../../carbon

compile_flags.txt

+11-8
Original file line numberDiff line numberDiff line change
@@ -43,35 +43,38 @@
4343
-iquote
4444
bazel-bin
4545
-iquote
46-
bazel-carbon-lang/external/llvm-project
46+
bazel-execroot/external/llvm-project
4747
-iquote
4848
bazel-bin/external/llvm-project
4949
-iquote
50-
bazel-carbon-lang/external/llvm_terminfo
50+
bazel-execroot/external/llvm_terminfo
5151
-iquote
5252
bazel-bin/external/llvm_terminfo
5353
-iquote
54-
bazel-carbon-lang/external/llvm_zlib
54+
bazel-execroot/external/llvm_zlib
5555
-iquote
5656
bazel-bin/external/llvm_zlib
5757
-iquote
58-
bazel-carbon-lang/external/bazel_tools
58+
bazel-execroot/external/bazel_tools
5959
-iquote
6060
bazel-bin/external/bazel_tools
61-
-Ibazel-bin/external/llvm-project/llvm/_virtual_includes/gtest_internal_headers
61+
-Ibazel-execroot/external/llvm-project/llvm/utils/unittest/googletest/src
6262
-isystem
63-
bazel-carbon-lang/external/llvm-project/llvm/include
63+
bazel-execroot/external/llvm-project/llvm/include
6464
-isystem
6565
bazel-bin/external/llvm-project/llvm/include
6666
-isystem
67-
bazel-carbon-lang/external/llvm-project/llvm/utils/unittest/googlemock/include
67+
bazel-execroot/external/llvm-project/llvm/utils/unittest/googlemock/include
6868
-isystem
6969
bazel-bin/external/llvm-project/llvm/utils/unittest/googlemock/include
7070
-isystem
71-
bazel-carbon-lang/external/llvm-project/llvm/utils/unittest/googletest/include
71+
bazel-execroot/external/llvm-project/llvm/utils/unittest/googletest/include
7272
-isystem
7373
bazel-bin/external/llvm-project/llvm/utils/unittest/googletest/include
7474
-std=c++17
75+
-nostdinc++
76+
-isystem
77+
bazel-clang-toolchain/include/c++/v1
7578
-no-canonical-prefixes
7679
-Wno-builtin-macro-redefined
7780
-D__DATE__="redacted"

scripts/create_compdb.py

+156
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,156 @@
1+
#!/usr/bin/env python3
2+
3+
"""Create a compilation database for Clang tools like `clangd`.
4+
5+
If you want `clangd` to be able to index this project, run this script from
6+
the workspace root to generate a rich compilation database. After the first
7+
run, you should only need to run it if you encounter `clangd` problems, or if
8+
you want `clangd` to build an up-to-date index of the entire project. Note
9+
that in the latter case you may need to manually clear and rebuild clangd's
10+
index after running this script.
11+
12+
Note that this script will build generated files in the Carbon project and
13+
otherwise touch the Bazel build. It works to do the minimum amount necessary.
14+
Once setup, generally subsequent builds, even of small parts of the project,
15+
different configurations, or that hit errors won't disrupt things. But, if
16+
you do hit errors, you can get things back to a good state by fixing the
17+
build of generated files and re-running this script.
18+
"""
19+
20+
__copyright__ = """
21+
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
22+
Exceptions. See /LICENSE for license information.
23+
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
24+
"""
25+
26+
import json
27+
import os
28+
import re
29+
import shutil
30+
import subprocess
31+
import sys
32+
from pathlib import Path
33+
34+
# Change the working directory to the repository root so that the remaining
35+
# operations reliably operate relative to that root.
36+
os.chdir(Path(__file__).parent.parent)
37+
directory = Path.cwd()
38+
39+
# We use the `BAZEL` environment variable if present. If not, then we try to
40+
# use `bazelisk` and then `bazel`.
41+
bazel = os.environ.get("BAZEL")
42+
if not bazel:
43+
bazel = "bazelisk"
44+
if not shutil.which(bazel):
45+
bazel = "bazel"
46+
if not shutil.which(bazel):
47+
sys.exit("Unable to run Bazel")
48+
49+
# Load compiler flags. We do this first in order to fail fast if not run from
50+
# the workspace root.
51+
print("Reading the arguments to use...")
52+
try:
53+
with open("compile_flags.txt") as flag_file:
54+
arguments = [line.strip() for line in flag_file]
55+
except FileNotFoundError:
56+
sys.exit(Path(sys.argv[0]).name + " must be run from the project root")
57+
58+
# Prepend the `clang` executable path to the arguments that looks into our
59+
# downloaded Clang toolchain.
60+
arguments = [str(Path("bazel-clang-toolchain/bin/clang"))] + arguments
61+
62+
print("Building compilation database...")
63+
64+
# Find all of the C++ source files that we expect to compile cleanly as
65+
# stand-alone files. This is a bit simpler than scraping the actual compile
66+
# actions and allows us to directly index header-only libraries easily and
67+
# pro-actively index the specific headers in the project.
68+
source_files_query = subprocess.run(
69+
[
70+
bazel,
71+
"query",
72+
"--keep_going",
73+
"--output=location",
74+
'filter(".*\\.(h|cpp|cc|c|cxx)$", kind("source file", deps(//...)))',
75+
],
76+
check=True,
77+
stdout=subprocess.PIPE,
78+
stderr=subprocess.DEVNULL,
79+
universal_newlines=True,
80+
).stdout
81+
source_files = [
82+
Path(line.split(":")[0]) for line in source_files_query.splitlines()
83+
]
84+
85+
# Filter into the Carbon source files that we'll find directly in the
86+
# workspace, and LLVM source files that need to be mapped through the merged
87+
# LLVM tree in Bazel's execution root.
88+
carbon_files = [
89+
f.relative_to(directory)
90+
for f in source_files
91+
if f.parts[: len(directory.parts)] == directory.parts
92+
]
93+
llvm_files = [
94+
Path("bazel-execroot/external").joinpath(
95+
*f.parts[f.parts.index("llvm-project") :]
96+
)
97+
for f in source_files
98+
if "llvm-project" in f.parts
99+
]
100+
print(
101+
"Found %d Carbon source files and %d LLVM source files..."
102+
% (len(carbon_files), len(llvm_files))
103+
)
104+
105+
# Now collect the generated file labels.
106+
generated_file_labels = subprocess.run(
107+
[
108+
bazel,
109+
"query",
110+
"--keep_going",
111+
"--output=label",
112+
(
113+
'filter(".*\\.(h|cpp|cc|c|cxx|def|inc)$",'
114+
'kind("generated file", deps(//...)))'
115+
),
116+
],
117+
check=True,
118+
stdout=subprocess.PIPE,
119+
stderr=subprocess.DEVNULL,
120+
universal_newlines=True,
121+
).stdout.splitlines()
122+
print("Found %d generated files..." % (len(generated_file_labels),))
123+
124+
# Directly build these labels so that indexing can find them. Allow this to
125+
# fail in case there are build errors in the client, and just warn the user
126+
# that they may be missing generated files.
127+
print("Building the generated files so that tools can find them...")
128+
subprocess.run([bazel, "build", "--keep_going"] + generated_file_labels)
129+
130+
131+
# Manually translate the label to a user friendly path into the Bazel output
132+
# symlinks.
133+
def _label_to_path(s):
134+
# Map external repositories to their part of the output tree.
135+
s = re.sub(r"^@([^/]+)//", r"bazel-bin/external/\1/", s)
136+
# Map this repository to the root of the output tree.
137+
s = s if not s.startswith("//") else "bazel-bin/" + s[len("//") :]
138+
# Replace the colon used to mark the package name with a slash.
139+
s = s.replace(":", "/")
140+
# Convert to a native path.
141+
return Path(s)
142+
143+
144+
generated_files = [_label_to_path(label) for label in generated_file_labels]
145+
146+
# Generate compile_commands.json with an entry for each C++ input.
147+
entries = [
148+
{
149+
"directory": str(directory),
150+
"file": str(f),
151+
"arguments": arguments + [str(f)],
152+
}
153+
for f in carbon_files + llvm_files + generated_files
154+
]
155+
with open("compile_commands.json", "w") as json_file:
156+
json.dump(entries, json_file, indent=2)

setup.cfg

+2-1
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
[flake8]
66
max-line-length = 80
77
exclude = website/jekyll/build
8+
# E203: This warning is not PEP 8 compliant.
89
# E402: Allow the pythonpath modifications before repo-local imports.
910
# W503: flake8 v3.8.4 is inconsistent with black v20.8b1 (pre-commit run -a).
10-
ignore = E402,W503
11+
ignore = E203,E402,W503

0 commit comments

Comments
 (0)