Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DISCUSS] Exceptions vs status codes #14

Open
gaborkaszab opened this issue Dec 16, 2024 · 23 comments
Open

[DISCUSS] Exceptions vs status codes #14

gaborkaszab opened this issue Dec 16, 2024 · 23 comments

Comments

@gaborkaszab
Copy link
Collaborator

I'm pretty sure there are pros and cons for each side and people might get into religious fights on this topic. Let's try to come to a conclusion which one we should use in this project.

@gaborkaszab gaborkaszab changed the title Exceptions vs status codes [DISCUSS] Exceptions vs status codes Dec 16, 2024
@pitrou
Copy link
Member

pitrou commented Dec 16, 2024

The main thing I like about the "Status" style is that it makes error propagation and handling (or lack thereof) explicit. However, I have no religious preference :)

@wgtmac
Copy link
Member

wgtmac commented Dec 16, 2024

As an Arrow developer who has experienced both status and exception in the same repo, I feel inclined to use exception to make the code shorter and easier to debug. It is preferred (perhaps biased) to use exception for modern C++: https://learn.microsoft.com/en-us/cpp/cpp/errors-and-exception-handling-modern-cpp?view=msvc-170

@zhjwpku
Copy link
Collaborator

zhjwpku commented Dec 17, 2024

I came across this expected[0] class which seems conform to the idea of Status/Result, but it's in c++23.

[0] https://en.cppreference.com/w/cpp/utility/expected

@zhjwpku
Copy link
Collaborator

zhjwpku commented Dec 17, 2024

Facebook's Folly has a Expected class [0], Google's Abseil has a StatusOr class [1] all for the same purpose.

[0] https://github.com/facebook/folly/blob/main/folly/Expected.h
[1] https://github.com/abseil/abseil-cpp/blob/master/absl/status/statusor.h

@mapleFU
Copy link
Member

mapleFU commented Dec 17, 2024

Personally I think if iceberg-cpp is just for parsing the metadata, all is ok for me. If it's also able to handle the dataset layer, I think exception might making maintaining the internal status of async fetching a bit more hard ( though they're already hard )

But I also have no religious preference :)

@wgtmac
Copy link
Member

wgtmac commented Jan 10, 2025

I came across this expected[0] class which seems conform to the idea of Status/Result, but it's in c++23.

[0] https://en.cppreference.com/w/cpp/utility/expected

I'm not sure if this is a good reason to use C++23. Choosing C++20 was arbitrary decision from me.

I just checked that expected implementation is complete from major compilers: https://en.cppreference.com/w/cpp/compiler_support/23

@zhjwpku
Copy link
Collaborator

zhjwpku commented Jan 10, 2025

I came across this expected[0] class which seems conform to the idea of Status/Result, but it's in c++23.
[0] https://en.cppreference.com/w/cpp/utility/expected

I'm not sure if this is a good reason to use C++23. Choosing C++20 was arbitrary decision from me.

I'm not sure if there will be any difficulty for other projects to include iceberg-cpp if we choose a fairly
new standard, other than that, I'm no objection for C++23.

I just checked that expected implementation is complete from major compilers: https://en.cppreference.com/w/cpp/compiler_support/23

good to know.

@zhjwpku
Copy link
Collaborator

zhjwpku commented Jan 13, 2025

While looking the std::expected usage, I'm thinking no matter we choose the std::expected way or Arrow's Result way, it is no harm that we add a Status data structure, it's useful in the following two cases:

  1. directly return Status, example: Status Close();, the returned value can state whether the call success or failed for some reason.
  2. return std::expected<value_type, Status> or implicitly wrapped by Result<value_type>, example: std::expected<int64_t, Status> Tell(), either return the current access position of a opened file or the failure reason.

Thought? @wgtmac @pitrou @gaborkaszab @mapleFU

@lidavidm
Copy link
Member

(FWIW, I believe the...expected...way of using std::expected when there is no return value is std::expected<void, Error>. You could always using Status = expected<void, Error> + using Result = expected<T, Error>.)

@gaborkaszab
Copy link
Collaborator Author

I'm hesitant to introduce a dependency for C++23. Some projects might not move that fast and we'd make the adoption hard for them.
I'd still go for using exceptions rather. It would make the interfaces and the code itself way cleaner, that matters for API's and libraries like this that are used by other tools. I haven't heard anyone from the Java implementation side complaining about the difficulty of debugging code with exceptions.

@pitrou
Copy link
Member

pitrou commented Jan 13, 2025

Well, you can find a std::expected backport for C++20, for example https://github.com/zeus-cpp/expected

Edit: I'm not saying that std::expected should be preferred over exceptions. Just that the language compatibility problem can be solved if std::expected is preferred.

@pitrou
Copy link
Member

pitrou commented Jan 13, 2025

Ah, by the way, a nice consequence of using exceptions is that you can throw them from constructors, while with std::expected you would have to define static factory functions instead.

@pitrou
Copy link
Member

pitrou commented Jan 13, 2025

I went looking around for comparisons of exceptions and std::expected and I found an insightful tidbit here: https://www.foonathan.net/2017/12/exceptions-vs-expected/

[Exceptions] are not easily composable: There is only one current exception, you can’t have multiple ones. This was a problem, for example, for the C++17 parallel algorithms. What if an exception is thrown in multiple of the worker threads? How to report all of them back to the caller? The implementation gave up on solving that problem and just decided to terminate the program if any exception is thrown.

If this is still the case with C++20 (you can't store an exception on the side to later wrap or re-throw it, for example), then this sounds like a major annoyance.

@lidavidm
Copy link
Member

There is std::current_exception/std::rethrow_exception, though it has some intricacies. (And I don't think there's anything like ExceptionGroup in Python.)

@lidavidm
Copy link
Member

I'll say I personally find it easier to debug code with exceptions (catch throw vs having to carefully figure out the right breakpoint), though on the flip side if we do end up with exceptions hopefully we end up with a single or clearly documented small set of exceptions we throw and clear documentation on which APIs throw which exceptions.

(Also, that operator try proposal looks interesting)

@pitrou
Copy link
Member

pitrou commented Jan 13, 2025

There is std::current_exception/std::rethrow_exception, though it has some intricacies. (And I don't think there's anything like ExceptionGroup in Python.)

Hmm... apparently std::exception_ptr doesn't allow accessing the underlying exception object, except by re-throwing and then catching it?

@ormandi
Copy link

ormandi commented Jan 23, 2025

+1 for Status/StatusOr/RETURN_IF_ERROR/ASSIGN_OR_RETURN combo (more about some of these: https://abseil.io/docs/cpp/guides/status). In fact, I'd probably just depend on these:

These are battle tested, explicit constructs with good test gtest integration, compile time support and it is not so verbose with the proper use of RETURN_IF_ERROR and ASSIGN_OR_RETURN in my opinion which are unfortunately not part of the public API of abseil , but they are ubiquitous constructs (they are in apache-arrow, protobuf, grpc, TensorlFow (also TF hosts its own RETURN_IF_ERROR macro), JAX) and self-hosting them in a small header is not of a big deal in my opinion.

@pitrou
Copy link
Member

pitrou commented Jan 23, 2025

A hard public API dependency on Abseil would be a very bad idea IMHO.

@ormandi
Copy link

ormandi commented Jan 23, 2025

A hard public API dependency on Abseil would be a very bad idea IMHO.

Was wondering why do you think so? (just curious)

@pitrou
Copy link
Member

pitrou commented Jan 23, 2025

A hard public API dependency on Abseil would be a very bad idea IMHO.

Was wondering why do you think so? (just curious)

Because any consumer of libiceberg would have to be careful about API/ABI conflicts with other Abseil-using libraries.

The same applies to Boost in some ways (though Boost, at least, is not often a public dependency).

@ormandi
Copy link

ormandi commented Jan 23, 2025

Because any consumer of libiceberg would have to be careful about API/ABI conflicts with other Abseil-using libraries.

Thank you for the response! Though, I think API-wise absl is fairly stable. Regarding ABI compatibility, I usually think in terms of static linking which - in my opinion - (pretty much) neglects this issue. On the other hand, if you want to link dynamically, yeah, I would not do that either.

zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
@zhjwpku
Copy link
Collaborator

zhjwpku commented Jan 27, 2025

I've created a PR #40 to backport std::expected, I'm looking forward to hearing all of your feedback.

zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Jan 31, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
zhjwpku added a commit to zhjwpku/iceberg-cpp that referenced this issue Feb 2, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same
APIs as std::expected, [1] provide some extra member functions
like `map` and `map_error`, [2] has different API names like
`then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when
we decide to move to c++23 some day, by backporting [0], we can
facilitate a smoother transition process.

We discussed about `Exceptions vs Expected` in apache#14, while
backporting [0], I had the feeling we shouldn't choose one over
the other, we can use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which
is a good sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

Signed-off-by: Junwang Zhao <[email protected]>
@pitrou
Copy link
Member

pitrou commented Feb 12, 2025

As a general data point for this discussion, I would mention the case of the Arrow C++ CSV parser. Arrow C++ uses a Status class (but std::expected would be similar) for error reporting. This implies that any potentially-failing function should return a Status (or, similarly, Result<T> or std::expected<T, E>).

However, a CSV parser is a state machine that must withhold tens or hundreds of millions of transitions per second. It turns out that reporting a Status in such a critical path can reduce performance by 20-30%. The Arrow C++ CSV parser tries various tricks to avoid such reporting, which can have other consequences on security (see current discussion in apache/arrow#45498).

Edit: let me qualify/moderate this a bit. The performance overhead is not only reporting the Status, it's also simply detecting the error and branching on it in the leaf functions.

Fokko pushed a commit that referenced this issue Feb 26, 2025
After studied [0], [1] and [2], I'm inclined to backport [0] to
iceberg-cpp, the main reason is that [0] has exactly the same APIs as
std::expected, [1] provide some extra member functions like `map` and
`map_error`, [2] has different API names like `then` and `thenOrThrow`.

We want users to use iceberg::expected the same way as std::expected,
and we will replace iceberg::expected with std::expected when we decide
to move to c++23 some day, by backporting [0], we can facilitate a
smoother transition process.

We discussed about `Exceptions vs Expected` in #14, while backporting
[0], I had the feeling we shouldn't choose one over the other, we can
use both approaches effectively.

expected provide monadic operations like `and_then`, `transform`,
`or_else`, `transform_error`, let us do method chaining, which is a good
sign.

[0] https://github.com/zeus-cpp/expected
[1] https://github.com/TartanLlama/expected
[2] https://github.com/facebook/folly/blob/main/folly/Expected.h
[3] https://en.cppreference.com/w/cpp/utility/expected

---------

Signed-off-by: Junwang Zhao <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants