-
Notifications
You must be signed in to change notification settings - Fork 15.5k
[C++20][Modules] Improve namespace look-up performance for modules. #171769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-clang-modules Author: Michael Park (mpark) ChangesProblemGiven code such as The following synthetic script demonstrates the problem: #/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}
TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR
trap "rm -rf \"$TMPDIR\"" EXIT
echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done
if $USE_MODULES; then
seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
fi
> a.cpp
if $USE_MODULES; then
for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi
echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done
echo "}" >> a.cpp
if $USE_MODULES; then
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
$(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done)
else
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fiAs of 575d689, without modules ( With this PR, without modules there's no change (as expected) at 4.5s, but with modules it improves to about 5.2s. ApproachThe approach taken here aims to maintain status-quo with respect to the input and output of modules. That is, the This PR continues to read all of the external namespace decls from The other half of the problem is to write out all of the external namespaces that we used to have available in Alternatives AttemptedI tried reading fewer declarations on the Full diff: https://github.com/llvm/llvm-project/pull/171769.diff 2 Files Affected:
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index aec61322fb8be..5f66beefd6388 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -555,7 +555,26 @@ namespace {
using MacroDefinitionsMap =
llvm::StringMap<std::pair<StringRef, bool /*IsUndef*/>>;
-using DeclsMap = llvm::DenseMap<DeclarationName, SmallVector<NamedDecl *, 8>>;
+
+class DeclsSet {
+ SmallVector<NamedDecl *, 64> Decls;
+ llvm::SmallPtrSet<NamedDecl *, 8> Found;
+
+public:
+ NamedDecl *const *data() const { return Decls.data(); }
+
+ bool empty() const { return Decls.empty(); }
+ size_t size() const { return Decls.size(); }
+
+ bool insert(NamedDecl *ND) {
+ auto [_, Inserted] = Found.insert(ND);
+ if (Inserted)
+ Decls.push_back(ND);
+ return Inserted;
+ }
+};
+
+using DeclsMap = llvm::DenseMap<DeclarationName, DeclsSet>;
} // namespace
@@ -8702,14 +8721,22 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC,
return false;
// Load the list of declarations.
- SmallVector<NamedDecl *, 64> Decls;
- llvm::SmallPtrSet<NamedDecl *, 8> Found;
+ DeclsSet DS;
auto Find = [&, this](auto &&Table, auto &&Key) {
for (GlobalDeclID ID : Table.find(Key)) {
NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
- if (ND->getDeclName() == Name && Found.insert(ND).second)
- Decls.push_back(ND);
+ if (ND->getDeclName() != Name)
+ continue;
+ // Special case for namespaces: There can be a lot of redeclarations of
+ // some namespaces, and we import a "key declaration" per imported module.
+ // Since all declarations of a namespace are essentially interchangeable,
+ // we can optimize namespace look-up by only storing the key declaration
+ // of the current TU, rather than storing N key declarations where N is
+ // the # of imported modules that declare that namespace.
+ if (isa<NamespaceDecl>(ND))
+ ND = cast<NamedDecl>(getKeyDeclaration(ND));
+ DS.insert(ND);
}
};
@@ -8744,8 +8771,8 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC,
Find(It->second.Table, Name);
}
- SetExternalVisibleDeclsForName(DC, Name, Decls);
- return !Decls.empty();
+ SetExternalVisibleDeclsForName(DC, Name, DS);
+ return !DS.empty();
}
void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
@@ -8763,7 +8790,15 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
for (GlobalDeclID ID : It->second.Table.findAll()) {
NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
- Decls[ND->getDeclName()].push_back(ND);
+ // Special case for namespaces: There can be a lot of redeclarations of
+ // some namespaces, and we import a "key declaration" per imported module.
+ // Since all declarations of a namespace are essentially interchangeable,
+ // we can optimize namespace look-up by only storing the key declaration
+ // of the current TU, rather than storing N key declarations where N is
+ // the # of imported modules that declare that namespace.
+ if (isa<NamespaceDecl>(ND))
+ ND = cast<NamedDecl>(getKeyDeclaration(ND));
+ Decls[ND->getDeclName()].insert(ND);
}
// FIXME: Why a PCH test is failing if we remove the iterator after findAll?
@@ -8773,9 +8808,9 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
findAll(ModuleLocalLookups, NumModuleLocalVisibleDeclContexts);
findAll(TULocalLookups, NumTULocalVisibleDeclContexts);
- for (DeclsMap::iterator I = Decls.begin(), E = Decls.end(); I != E; ++I) {
- SetExternalVisibleDeclsForName(DC, I->first, I->second);
- }
+ for (auto &[Name, DS] : Decls)
+ SetExternalVisibleDeclsForName(DC, Name, DS);
+
const_cast<DeclContext *>(DC)->setHasExternalVisibleStorage(false);
}
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 667e04049dac8..231e0a206ab11 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -4396,20 +4396,20 @@ class ASTDeclContextNameLookupTrait
template <typename Coll> data_type getData(const Coll &Decls) {
unsigned Start = DeclIDs.size();
- for (NamedDecl *D : Decls) {
+ auto AddDecl = [this](NamedDecl *D) {
NamedDecl *DeclForLocalLookup =
getDeclForLocalLookup(Writer.getLangOpts(), D);
if (Writer.getDoneWritingDeclsAndTypes() &&
!Writer.wasDeclEmitted(DeclForLocalLookup))
- continue;
+ return;
// Try to avoid writing internal decls to reduced BMI.
// See comments in ASTWriter::WriteDeclContextLexicalBlock for details.
if (Writer.isGeneratingReducedBMI() &&
!DeclForLocalLookup->isFromExplicitGlobalModule() &&
IsInternalDeclFromFileContext(DeclForLocalLookup))
- continue;
+ return;
auto ID = Writer.GetDeclRef(DeclForLocalLookup);
@@ -4423,7 +4423,7 @@ class ASTDeclContextNameLookupTrait
ModuleLocalDeclsMap.insert({Key, DeclIDsTy{ID}});
else
Iter->second.push_back(ID);
- continue;
+ return;
}
break;
case LookupVisibility::TULocal: {
@@ -4432,7 +4432,7 @@ class ASTDeclContextNameLookupTrait
TULocalDeclsMap.insert({D->getDeclName(), DeclIDsTy{ID}});
else
Iter->second.push_back(ID);
- continue;
+ return;
}
case LookupVisibility::GenerallyVisibile:
// Generally visible decls go into the general lookup table.
@@ -4440,6 +4440,27 @@ class ASTDeclContextNameLookupTrait
}
DeclIDs.push_back(ID);
+ };
+ for (NamedDecl *D : Decls) {
+ if (isa<NamespaceDecl>(D) && D->isFromASTFile()) {
+ // In ASTReader, we stored only the key declaration of a namespace decl
+ // for this TU rather than storing all of the key declarations from each
+ // imported module. If we have an external namespace decl, this is that
+ // key declaration and we need to re-expand it to write out all of the
+ // key declarations from each imported module again.
+ //
+ // See comment 'ASTReader::FindExternalVisibleDeclsByName' for details.
+ ASTReader *Chain = Writer.getChain();
+ assert(Chain && "An external namespace decl without an ASTReader");
+ assert(D == Chain->getKeyDeclaration(D) &&
+ "An external namespace decl that is not "
+ "key declaration of this TU");
+ Chain->forEachImportedKeyDecl(D, [&AddDecl](const Decl *D) {
+ AddDecl(cast<NamedDecl>(const_cast<Decl*>(D)));
+ });
+ } else {
+ AddDecl(D);
+ }
}
return std::make_pair(Start, DeclIDs.size());
}
|
|
@llvm/pr-subscribers-clang Author: Michael Park (mpark) ChangesProblemGiven code such as The following synthetic script demonstrates the problem: #/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1500}
NUM_USES=${NUM_USES:-500000}
USE_MODULES=${USE_MODULES:-true}
TMPDIR=$(mktemp -d)
echo "Working in temp directory: $TMPDIR"
cd $TMPDIR
trap "rm -rf \"$TMPDIR\"" EXIT
echo "namespace N { inline void foo() {} }" > m1.h
for i in $(seq 2 $NUM_MODULES); do echo "namespace N {}" > m${i}.h; done
if $USE_MODULES; then
seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
fi
> a.cpp
if $USE_MODULES; then
for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
else
for i in $(seq 1 $NUM_MODULES); do echo "#include \"m${i}.h\"" >> a.cpp; done
fi
echo "int main() {" >> a.cpp
for i in $(seq 1 $NUM_USES); do echo " N::foo();" >> a.cpp; done
echo "}" >> a.cpp
if $USE_MODULES; then
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null \
$(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done)
else
time $CLANG -std=c++20 -Wno-experimental-header-units -c a.cpp -o /dev/null
fiAs of 575d689, without modules ( With this PR, without modules there's no change (as expected) at 4.5s, but with modules it improves to about 5.2s. ApproachThe approach taken here aims to maintain status-quo with respect to the input and output of modules. That is, the This PR continues to read all of the external namespace decls from The other half of the problem is to write out all of the external namespaces that we used to have available in Alternatives AttemptedI tried reading fewer declarations on the Full diff: https://github.com/llvm/llvm-project/pull/171769.diff 2 Files Affected:
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index aec61322fb8be..5f66beefd6388 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -555,7 +555,26 @@ namespace {
using MacroDefinitionsMap =
llvm::StringMap<std::pair<StringRef, bool /*IsUndef*/>>;
-using DeclsMap = llvm::DenseMap<DeclarationName, SmallVector<NamedDecl *, 8>>;
+
+class DeclsSet {
+ SmallVector<NamedDecl *, 64> Decls;
+ llvm::SmallPtrSet<NamedDecl *, 8> Found;
+
+public:
+ NamedDecl *const *data() const { return Decls.data(); }
+
+ bool empty() const { return Decls.empty(); }
+ size_t size() const { return Decls.size(); }
+
+ bool insert(NamedDecl *ND) {
+ auto [_, Inserted] = Found.insert(ND);
+ if (Inserted)
+ Decls.push_back(ND);
+ return Inserted;
+ }
+};
+
+using DeclsMap = llvm::DenseMap<DeclarationName, DeclsSet>;
} // namespace
@@ -8702,14 +8721,22 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC,
return false;
// Load the list of declarations.
- SmallVector<NamedDecl *, 64> Decls;
- llvm::SmallPtrSet<NamedDecl *, 8> Found;
+ DeclsSet DS;
auto Find = [&, this](auto &&Table, auto &&Key) {
for (GlobalDeclID ID : Table.find(Key)) {
NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
- if (ND->getDeclName() == Name && Found.insert(ND).second)
- Decls.push_back(ND);
+ if (ND->getDeclName() != Name)
+ continue;
+ // Special case for namespaces: There can be a lot of redeclarations of
+ // some namespaces, and we import a "key declaration" per imported module.
+ // Since all declarations of a namespace are essentially interchangeable,
+ // we can optimize namespace look-up by only storing the key declaration
+ // of the current TU, rather than storing N key declarations where N is
+ // the # of imported modules that declare that namespace.
+ if (isa<NamespaceDecl>(ND))
+ ND = cast<NamedDecl>(getKeyDeclaration(ND));
+ DS.insert(ND);
}
};
@@ -8744,8 +8771,8 @@ bool ASTReader::FindExternalVisibleDeclsByName(const DeclContext *DC,
Find(It->second.Table, Name);
}
- SetExternalVisibleDeclsForName(DC, Name, Decls);
- return !Decls.empty();
+ SetExternalVisibleDeclsForName(DC, Name, DS);
+ return !DS.empty();
}
void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
@@ -8763,7 +8790,15 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
for (GlobalDeclID ID : It->second.Table.findAll()) {
NamedDecl *ND = cast<NamedDecl>(GetDecl(ID));
- Decls[ND->getDeclName()].push_back(ND);
+ // Special case for namespaces: There can be a lot of redeclarations of
+ // some namespaces, and we import a "key declaration" per imported module.
+ // Since all declarations of a namespace are essentially interchangeable,
+ // we can optimize namespace look-up by only storing the key declaration
+ // of the current TU, rather than storing N key declarations where N is
+ // the # of imported modules that declare that namespace.
+ if (isa<NamespaceDecl>(ND))
+ ND = cast<NamedDecl>(getKeyDeclaration(ND));
+ Decls[ND->getDeclName()].insert(ND);
}
// FIXME: Why a PCH test is failing if we remove the iterator after findAll?
@@ -8773,9 +8808,9 @@ void ASTReader::completeVisibleDeclsMap(const DeclContext *DC) {
findAll(ModuleLocalLookups, NumModuleLocalVisibleDeclContexts);
findAll(TULocalLookups, NumTULocalVisibleDeclContexts);
- for (DeclsMap::iterator I = Decls.begin(), E = Decls.end(); I != E; ++I) {
- SetExternalVisibleDeclsForName(DC, I->first, I->second);
- }
+ for (auto &[Name, DS] : Decls)
+ SetExternalVisibleDeclsForName(DC, Name, DS);
+
const_cast<DeclContext *>(DC)->setHasExternalVisibleStorage(false);
}
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 667e04049dac8..231e0a206ab11 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -4396,20 +4396,20 @@ class ASTDeclContextNameLookupTrait
template <typename Coll> data_type getData(const Coll &Decls) {
unsigned Start = DeclIDs.size();
- for (NamedDecl *D : Decls) {
+ auto AddDecl = [this](NamedDecl *D) {
NamedDecl *DeclForLocalLookup =
getDeclForLocalLookup(Writer.getLangOpts(), D);
if (Writer.getDoneWritingDeclsAndTypes() &&
!Writer.wasDeclEmitted(DeclForLocalLookup))
- continue;
+ return;
// Try to avoid writing internal decls to reduced BMI.
// See comments in ASTWriter::WriteDeclContextLexicalBlock for details.
if (Writer.isGeneratingReducedBMI() &&
!DeclForLocalLookup->isFromExplicitGlobalModule() &&
IsInternalDeclFromFileContext(DeclForLocalLookup))
- continue;
+ return;
auto ID = Writer.GetDeclRef(DeclForLocalLookup);
@@ -4423,7 +4423,7 @@ class ASTDeclContextNameLookupTrait
ModuleLocalDeclsMap.insert({Key, DeclIDsTy{ID}});
else
Iter->second.push_back(ID);
- continue;
+ return;
}
break;
case LookupVisibility::TULocal: {
@@ -4432,7 +4432,7 @@ class ASTDeclContextNameLookupTrait
TULocalDeclsMap.insert({D->getDeclName(), DeclIDsTy{ID}});
else
Iter->second.push_back(ID);
- continue;
+ return;
}
case LookupVisibility::GenerallyVisibile:
// Generally visible decls go into the general lookup table.
@@ -4440,6 +4440,27 @@ class ASTDeclContextNameLookupTrait
}
DeclIDs.push_back(ID);
+ };
+ for (NamedDecl *D : Decls) {
+ if (isa<NamespaceDecl>(D) && D->isFromASTFile()) {
+ // In ASTReader, we stored only the key declaration of a namespace decl
+ // for this TU rather than storing all of the key declarations from each
+ // imported module. If we have an external namespace decl, this is that
+ // key declaration and we need to re-expand it to write out all of the
+ // key declarations from each imported module again.
+ //
+ // See comment 'ASTReader::FindExternalVisibleDeclsByName' for details.
+ ASTReader *Chain = Writer.getChain();
+ assert(Chain && "An external namespace decl without an ASTReader");
+ assert(D == Chain->getKeyDeclaration(D) &&
+ "An external namespace decl that is not "
+ "key declaration of this TU");
+ Chain->forEachImportedKeyDecl(D, [&AddDecl](const Decl *D) {
+ AddDecl(cast<NamedDecl>(const_cast<Decl*>(D)));
+ });
+ } else {
+ AddDecl(D);
+ }
}
return std::make_pair(Start, DeclIDs.size());
}
|
4c14d68 to
bcff2a7
Compare
ChuanqiXu9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the namespace decl special? Can we make this optimization more general?
I did try to generalize it for all redeclarable decls, but it failed a bunch of unit tests. My understanding is that namespaces are indeed special in this aspect, according to this comment in |
fe39409 to
dcad405
Compare
I think the only point in the comment are all declarations of a namespace are interchangable. While for class we may have: and they are not interchangable. But the idea of the patch still seems valuable and doable for at least classes. Maybe it is not a super stragihtfoward extension. But I believe the problems are solvable and it is worthy to look into the problems. I am not asking to address this in the PR. But let's add a FIXME to TODO for that. |
|
The patch itself looks good. But I like to test it internally before merging. Hope I can make it in next week. And also I think it is worthy to add a test for it. We are able to test it with unittests. e.g., we can test the number of the return decls of |
Right, I was giving this more thought earlier today and I do think there's likely to be an improvement to be had there. Probably along the lines of: keeping one class decl, and replacing it with the definition if one exists, etc rather than unconditionally taking one decl like we can with namespaces. For now though, I'll leave a note for it as you suggest. |
|
I've added a |
dcad405 to
2be2052
Compare
In our internal workloads, I didn't see noticeable changes in End-to-End build process. Maybe the reason is we've already done a so called "bottom up" build. But the good news is that I didn't find regression too. So LGTM.
Would you like to add a test for this? If you don't know how to do, you can look at clang/unittests/Serialization/LoadSpecLazilyTest.cpp for example. This is useful as lit test can't show all things. |
Thanks for checking! Yeah, unless you have fine-grained modules with 1K+ modules declaring the same namespaces, I wouldn't expect to see any improvements with this. For us, it's because at a certain scale with Thrift generated files, we end up importing 1K+ modules that all declare the
I'll be working on the test tomorrow! Thanks for the pointer. |
Maybe this is a nice point of named modules. We use thrift too. But we just wrap it into a single named module. Then we avoid the problem. |
3b2ecdc to
d1cd20d
Compare
ChuanqiXu9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks.
d1cd20d to
8b117ab
Compare
8b117ab to
1af1a44
Compare
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/20378 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/110/builds/6824 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/19450 Here is the relevant piece of the build log for the reference |
Problem
Given code such as
N::foo();, we perform name look-up onN. In the case whereNis a namespace declared in imported modules, one namespace decl (the "key declaration") for each module that declares a namespacefoois loaded and stored. In large scales where there are many such modules, (e.g., 1,500) and many uses (e.g., 500,000), this becomes extremely inefficient because every look-up (500,000 of them) return 1,500 results.The following synthetic script demonstrates the problem:
As of 575d689, without modules (
USE_MODULES=false) this takes about 4.5s, whereas with modules (USE_MODULES=true), this takes about 37s.With this PR, without modules there's no change (as expected) at 4.5s, but with modules it improves to about 5.2s.
Approach
The approach taken here aims to maintain status-quo with respect to the input and output of modules. That is, the
ASTReaderandASTWriterboth read and write the same declarations as it did before. The difference is in the middle part: theStoredDeclsMapinDeclContext. TheStoredDeclsMapis roughly amap<DeclarationName, StoredDeclsList>. Currently, we read all of the external namespace decls fromASTReader, they all get stored into theStoredDeclsList, and theASTWriteriterates through that list and writes out the results.This PR continues to read all of the external namespace decls from
ASTReader, but only stores one namespace decl in theStoredDeclsList. This is okay since the reading of the decls handles all of the merging and chaining of the namespace decls, and as long as they're loaded and chained, returning one for look-up purposes is sufficient.The other half of the problem is to write out all of the external namespaces that we used to store in
StoredDeclsListbut no longer. For this, we take advantage of theKeyDeclsdata structure inASTReader.KeyDeclsis roughly amap<Decl *, vector<GlobalDeclID>>, and it stores a mapping from the canonical decl of a redeclarable decl to a list ofGlobalDeclIDs where each ID represents a "key declaration" from each imported module. More to the point, if we read external namespacesN1,N2,N3inASTReader, we'll either haveN1mapped to[N2, N3], or some newly local canonical decl mapped to[N1, N2, N3]. Either way, we can visitN1,N2, andN3by doingASTReader::forEachImportedKeyDecls(N1, Visitor), and we leverage this to maintain the current behavior of writing out all of the imported namespace decls inASTWriter.Alternatives Attempted
ASTReaderside, and writing out fewer declarations on theASTWriterside, and neither options worked at all.StoredDeclsListinto two pieces, one with non-namespace decls and one with only namespace decls, but that didn't work well... I think because the order of the declarations matter sometimes, and maybe also because the declaration replacement logic gets more complicated.SemaLookuplevel. Basically, retrieve all the stored decls but deduplicate populating theLookupResulthere. This did improve things slightly, but not quite enough, and this solution seemed cleaner in the end anyway.