-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-45423: [C++] Don't require Boost library with ARROW_TESTING=ON/ARROW_BUILD_SHARED=OFF #45424
Conversation
…ON/ARROW_BUILD_SHARED=OFF Because we can use `libarrow_testing.a` without `boost::filesystem` when `arrow::util::Process` isn't used.
@github-actions crossbow submit test-build-cpp-fuzz |
|
Revision: 06a70af Submitted crossbow builds: ursacomputing/crossbow @ actions-a3bc80b5c0
|
@@ -1256,14 +1256,19 @@ endif() | |||
# - Gandiva has a compile-time (header-only) dependency on Boost, not runtime. | |||
# - Tests need Boost at runtime. | |||
# - S3FS and Flight benchmarks need Boost at runtime. | |||
# - arrow_testing uses boost::filesystem. So arrow_testing requires |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace boost::filesystem
with std::filesystem
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No... It's used by boost::process
...:
arrow/cpp/src/arrow/testing/process.cc
Lines 76 to 92 in 1567be0
# ifdef BOOST_PROCESS_USE_V2 | |
namespace process = BOOST_PROCESS_V2_NAMESPACE; | |
namespace filesystem = process::filesystem; | |
// For Boost < 1.87.0 | |
# ifdef BOOST_PROCESS_V2_ASIO_NAMESPACE | |
namespace asio = BOOST_PROCESS_V2_ASIO_NAMESPACE; | |
# else | |
namespace asio = process::net; | |
# endif | |
# elif defined(BOOST_PROCESS_HAVE_V1) | |
namespace process = boost::process::v1; | |
namespace filesystem = boost::process::v1::filesystem; | |
# else | |
namespace process = boost::process; | |
namespace filesystem = boost::filesystem; | |
# endif | |
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, wait. We may be able to replace boost::filesystem
with std::filesystem
. We don't need to use boost::filesystem
by replacing boost::filesystem
with std::filesystem
here:
arrow/cpp/src/arrow/testing/process.cc
Lines 251 to 283 in 1567be0
Result<filesystem::path> ResolveCurrentExecutable() { | |
// See https://stackoverflow.com/a/1024937/10194 for various | |
// platform-specific recipes. | |
filesystem::path path; | |
boost::system::error_code error_code; | |
# if defined(__linux__) | |
path = filesystem::canonical("/proc/self/exe", error_code); | |
# elif defined(__APPLE__) | |
char buf[PATH_MAX + 1]; | |
uint32_t bufsize = sizeof(buf); | |
if (_NSGetExecutablePath(buf, &bufsize) < 0) { | |
return Status::Invalid("Can't resolve current exe: path too large"); | |
} | |
path = filesystem::canonical(buf, error_code); | |
# elif defined(_WIN32) | |
char buf[MAX_PATH + 1]; | |
if (!GetModuleFileNameA(NULL, buf, sizeof(buf))) { | |
return Status::Invalid("Can't get executable file path"); | |
} | |
path = filesystem::canonical(buf, error_code); | |
# else | |
ARROW_UNUSED(error_code); | |
return Status::NotImplemented("Not available on this system"); | |
# endif | |
if (error_code) { | |
// XXX fold this into the Status class? | |
return Status::IOError("Can't resolve current exe: ", error_code.message()); | |
} else { | |
return path; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. We may need to use boost::filesystem
here...:
arrow/cpp/src/arrow/testing/process.cc
Lines 290 to 292 in 1567be0
new process::process(ctx_, executable_, args_, env, | |
keep_stderr_ ? process::process_stdio{{}, {}, {}} | |
: process::process_stdio{{}, {}, nullptr})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine... It seems requiring non-trivial work to replace boost::process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find any issue after a quick search. Is this something that worth doing? Perhaps we need to vendor a lightweight platform-independent process library like https://gitlab.com/eidheim/tiny-process-library to get rid of boost::process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I heard @pitrou suggesting we wanted to remove our boost dependencies (or something related) but I might be mistaken
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is worthwhile and I just did similar things for the Apache Avro C++ library: https://issues.apache.org/jira/browse/AVRO-4095. However, I don't know if there is any good alternative to boost::process. If the solution is clear, I'm happy to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unfortunately we must rely on boost::process
for now. The only alternative AFAIK would be to write our own cross-platform process-launching code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I agree with removing Boost dependency. But it's not easy and it requires many changes.
Let's discuss it in #45440.
BTW, can we merge this to fix test-build-cpp-fuzz failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you @kou
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit deccce1. There were 8 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
We can use
libarrow_testing.a
withoutboost::filesystem
whenarrow::util::Process
isn't used.What changes are included in this PR?
ARROW_TESTING=ON
requires Boost. IfARROW_BUILD_SHARED=ON
is also used, Boost libraries are also required.Are these changes tested?
Yes.
Are there any user-facing changes?
No.