Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-45423: [C++] Don't require Boost library with ARROW_TESTING=ON/ARROW_BUILD_SHARED=OFF #45424

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

kou
Copy link
Member

@kou kou commented Feb 5, 2025

Rationale for this change

We can use libarrow_testing.a without boost::filesystem when arrow::util::Process isn't used.

What changes are included in this PR?

ARROW_TESTING=ON requires Boost. If ARROW_BUILD_SHARED=ON is also used, Boost libraries are also required.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

…ON/ARROW_BUILD_SHARED=OFF

Because we can use `libarrow_testing.a` without `boost::filesystem`
when `arrow::util::Process` isn't used.
@kou
Copy link
Member Author

kou commented Feb 5, 2025

@github-actions crossbow submit test-build-cpp-fuzz

Copy link

github-actions bot commented Feb 5, 2025

⚠️ GitHub issue #45423 has been automatically assigned in GitHub to PR creator.

Copy link

github-actions bot commented Feb 5, 2025

Revision: 06a70af

Submitted crossbow builds: ursacomputing/crossbow @ actions-a3bc80b5c0

Task Status
test-build-cpp-fuzz GitHub Actions

@@ -1256,14 +1256,19 @@ endif()
# - Gandiva has a compile-time (header-only) dependency on Boost, not runtime.
# - Tests need Boost at runtime.
# - S3FS and Flight benchmarks need Boost at runtime.
# - arrow_testing uses boost::filesystem. So arrow_testing requires
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace boost::filesystem with std::filesystem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... It's used by boost::process...:

# ifdef BOOST_PROCESS_USE_V2
namespace process = BOOST_PROCESS_V2_NAMESPACE;
namespace filesystem = process::filesystem;
// For Boost < 1.87.0
# ifdef BOOST_PROCESS_V2_ASIO_NAMESPACE
namespace asio = BOOST_PROCESS_V2_ASIO_NAMESPACE;
# else
namespace asio = process::net;
# endif
# elif defined(BOOST_PROCESS_HAVE_V1)
namespace process = boost::process::v1;
namespace filesystem = boost::process::v1::filesystem;
# else
namespace process = boost::process;
namespace filesystem = boost::filesystem;
# endif
#endif

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, wait. We may be able to replace boost::filesystem with std::filesystem. We don't need to use boost::filesystem by replacing boost::filesystem with std::filesystem here:

Result<filesystem::path> ResolveCurrentExecutable() {
// See https://stackoverflow.com/a/1024937/10194 for various
// platform-specific recipes.
filesystem::path path;
boost::system::error_code error_code;
# if defined(__linux__)
path = filesystem::canonical("/proc/self/exe", error_code);
# elif defined(__APPLE__)
char buf[PATH_MAX + 1];
uint32_t bufsize = sizeof(buf);
if (_NSGetExecutablePath(buf, &bufsize) < 0) {
return Status::Invalid("Can't resolve current exe: path too large");
}
path = filesystem::canonical(buf, error_code);
# elif defined(_WIN32)
char buf[MAX_PATH + 1];
if (!GetModuleFileNameA(NULL, buf, sizeof(buf))) {
return Status::Invalid("Can't get executable file path");
}
path = filesystem::canonical(buf, error_code);
# else
ARROW_UNUSED(error_code);
return Status::NotImplemented("Not available on this system");
# endif
if (error_code) {
// XXX fold this into the Status class?
return Status::IOError("Can't resolve current exe: ", error_code.message());
} else {
return path;
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. We may need to use boost::filesystem here...:

new process::process(ctx_, executable_, args_, env,
keep_stderr_ ? process::process_stdio{{}, {}, {}}
: process::process_stdio{{}, {}, nullptr}));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine... It seems requiring non-trivial work to replace boost::process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any issue after a quick search. Is this something that worth doing? Perhaps we need to vendor a lightweight platform-independent process library like https://gitlab.com/eidheim/tiny-process-library to get rid of boost::process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I heard @pitrou suggesting we wanted to remove our boost dependencies (or something related) but I might be mistaken

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is worthwhile and I just did similar things for the Apache Avro C++ library: https://issues.apache.org/jira/browse/AVRO-4095. However, I don't know if there is any good alternative to boost::process. If the solution is clear, I'm happy to fix this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately we must rely on boost::process for now. The only alternative AFAIK would be to write our own cross-platform process-launching code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I agree with removing Boost dependency. But it's not easy and it requires many changes.

Let's discuss it in #45440.

BTW, can we merge this to fix test-build-cpp-fuzz failure?

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Feb 5, 2025
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @kou

@kou kou merged commit deccce1 into apache:main Feb 6, 2025
45 checks passed
@kou kou removed the awaiting changes Awaiting changes label Feb 6, 2025
@kou kou deleted the ci-cpp-fuzz branch February 6, 2025 07:43
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit deccce1.

There were 8 benchmark results with an error:

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants