Skip to content
This repository was archived by the owner on Nov 30, 2022. It is now read-only.

new literacy::{Read, Write} traits to handle std/no_std (alternative) #127

Closed
wants to merge 14 commits into from

Conversation

RCasatta
Copy link
Contributor

@RCasatta RCasatta commented May 6, 2021

Alternative to #126
see description there

@RCasatta RCasatta marked this pull request as draft May 6, 2021 11:34
src/literacy.rs Outdated
@@ -0,0 +1,123 @@
#[cfg(all(feature = "std", feature = "use-core2"))]
compile_error!("feature \"std\" and \"use-core2\" cannot be enabled together.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should "default" to core2 in this case - if we already are building against the core2 dep, we might as well just assume the user set core2/std and then the core2 impl is the same as doing the std impl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree but there are some implementation issue, in particular

we might as well just assume the user set core2/std

How so? with another feature use-core2-std = ["core2/std"]? This would cause other issues in feature combinations...
The other option, using std = ["core2/std"] will cause problem in default features compilation on rust 1.29

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like you said in chat....

ugh, I wish there were a way in cargo.toml to say "if core2 && std -> require core2/std"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? with another feature use-core2-std = ["core2/std"]? This would cause other issues in feature combinations...

We don't have to specify it ourselves for core2/std to be enabled by some other dependency path. Its probably neater to just impl for std and not core2 than to compile-error, and if users intended to use core2 they can always fix it by, themselves, directly requiring core2/std. This avoids the issue of having some crate that depends on A and B and A enables std and B enables core2, giving the user no way to fix the issue aside from patching either A or B.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, removed the compile_error and implementing std impl only if there is std but not core2

@TheBlueMatt
Copy link
Member

I think also we should have a not-std&&not-core2 impl block for Write into a Vec and Read from a slice, like @devrandom did in rust-bitcoin at https://github.com/rust-bitcoin/rust-bitcoin/pull/603/files#diff-76866598ce8fd16261a27ac58a84b2825e6e77fc37c163a6afa60f0f4477e569R199-R256

cargo clean
CC='clang -fsanitize=memory -fno-omit-frame-pointer' \
RUSTFLAGS='-Zsanitizer=memory -Zsanitizer-memory-track-origins -Cforce-frame-pointers=yes' \
cargo test --lib --all --features="$FEATURES" -Zbuild-std --target x86_64-unknown-linux-gnu
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be --all-features here and above, waiting to find a solution for defaulting to core2 if enabled (without conflicting with std)

@RCasatta
Copy link
Contributor Author

I think also we should have a not-std&&not-core2 impl block for Write into a Vec and Read from a slice, like @devrandom did in rust-bitcoin at https://github.com/rust-bitcoin/rust-bitcoin/pull/603/files#diff-76866598ce8fd16261a27ac58a84b2825e6e77fc37c163a6afa60f0f4477e569R199-R256

added in 4291b4b and also removed write_all blanket impl

Copy link
Member

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is basically almost there. Is it time to un-draft it?

src/lib.rs Outdated
@@ -25,7 +25,7 @@
#![deny(non_camel_case_types)]
#![deny(non_snake_case)]
#![deny(unused_mut)]
#![deny(missing_docs)]
//#![deny(missing_docs)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope you didn't intend to land this :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, and added doc where missing in 8bf2f9e

src/literacy.rs Outdated
@@ -0,0 +1,123 @@
#[cfg(all(feature = "std", feature = "use-core2"))]
compile_error!("feature \"std\" and \"use-core2\" cannot be enabled together.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so? with another feature use-core2-std = ["core2/std"]? This would cause other issues in feature combinations...

We don't have to specify it ourselves for core2/std to be enabled by some other dependency path. Its probably neater to just impl for std and not core2 than to compile-error, and if users intended to use core2 they can always fix it by, themselves, directly requiring core2/std. This avoids the issue of having some crate that depends on A and B and A enables std and B enables core2, giving the user no way to fix the issue aside from patching either A or B.

@RCasatta RCasatta marked this pull request as ready for review May 19, 2021 09:42
Copy link
Contributor

@sgeisler sgeisler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 8bf2f9e for the rust code, not so sure about the tests, needs some clarification at least imo.

contrib/test.sh Outdated
FEATURES="serde serde-std"
# Combination of features to test, should be every features combination but std and core2 together
# note std has a comma in the end so that following regex avoid matching serde-std
FEATURES=("" "std," "use-core2" "use-core2-std" "std,use-core2" "std,serde-std" "use-core2,serde-std")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to the serde feature? I assume serde without std is expected to work in no-std mode so it should be tested accordingly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serde added in 7b909c9, I forget about it because it's used but not listed in the Cargo.toml feature section...

Not sure if other combination are needed though

echo "--------------$feature----------------"
cargo build --no-default-features --features="$feature"
if [[ ${feature} =~ "std," ]] ; then
cargo test --no-default-features --features="$feature"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we only test with std and just build the rest of the time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because tests have std enabled

#![cfg_attr(all(not(test), not(feature = "std")), no_std)]

@devrandom
Copy link
Contributor

I tried to compile the following example, which uses a pattern seen in rust-bitcoin. Namely, the caller passes in a reference, but the function called expects a non-reference Write trait. In particular, this is seen when a hash engine is passed into consensus_encode.

extern crate bitcoin_hashes;

use bitcoin_hashes::literacy::Write;
use bitcoin_hashes::{sha256d, Hash, hash_newtype, hex_fmt_impl, index_impl, serde_impl, borrow_slice_impl};

hash_newtype!(Txid, sha256d::Hash, 32, doc="A bitcoin transaction hash/transaction ID.");

fn do_it<W: Write>(mut w: W) {
	let _ = w.write_all(&[0u8; 1]);
}

fn main() {
	let mut enc = Txid::engine();
	do_it(&mut enc);
}

But this fails to compile. I tried creating an impl<W: ::std::io::Write> Write for &mut W {, but that conflicted with the existing impl. Any ideas?

@TheBlueMatt
Copy link
Member

impl<W: ::std::io::Write> Write for &mut W {

What about impl<W: ::literacy::Write> ::literact::Write for &mut W {? That's closer to what std::io::Write does, though I have a feeling that will conflict, too, because of the blanket std::io::Write impl. I don't think its the end of the world to move rust-bitcoin to always taking a &mut W instead, though. @apoelstra ?

@devrandom
Copy link
Contributor

Yeah, the other way results in the same conflict.

I'm playing with taking &mut W instead of mut W, and that seems to work.

A further issue is that rust-bitcoin uses Read::take, which we don't implement here. Should we grab that from std?

And one more question - how exactly do we concretize the Error type param? For example, is this what we want for the Encodeable trait in rust-bitcoin (and the few dozen impls)?

pub trait Encodable {
    fn consensus_encode<W: io::Write<Error=io::Error>>(&self, writer: &mut W) -> Result<usize, io::Error>;
}

(where the io::Error type depends on std vs no-std)

@RCasatta
Copy link
Contributor Author

@afilini and I did some test for accepting both self and &mut self like std https://github.com/RCasatta/bitcoin_hashes/tree/literacy_associated_alek

we can't impl<W: ::std::io::Write> Write for &mut W { because we already have a generic impl (differently from std)

however, Alekos come up with one generic implementation that should support both self and &mut self

impl<W: BorrowMut<dyn (::std::io::Write)>> Write for W {

which works but the problem in @devrandom's example remains because engines implement directly the literacy trait so they are not getting the BorrowMut thing

@sgeisler
Copy link
Contributor

I don't really like the BorrowMut<dyn (::std::io::Write)> solution, do we know if it will be optimized away? Or will people unknowingly be using dynamic dispatch? Especially for something kinda low-level like io that would be annoying.

I think implementing literacy::Write and literacy::Read directly is actually an outlier (although very prominent in hashes). Afaik the remaining API of rust-bitcoin is more a consumer of objects implementing these traits. Since std implements impl<W: Write + ?Sized> Write for &mut W and we implement impl<W: ::std::io::Write> Write for W we already get literacy impls for most relevant mut borrows, making the hacky blanket BorrowMut impl unnecessary except for the few cases where literacy traits are implemented directly. I propose to instead use a macro to just impl literacy::Write for both HashEngine and &mut HashEngine (19b74b7).

@afilini
Copy link

afilini commented May 20, 2021

making the hacky blanket BorrowMut impl unnecessary except for the few cases where literacy traits are implemented directly.

Not even that, because implementing it that way made it only apply to types that already implemented std::io::Write and that we were "bridging" to literacy::Write, which as you pointed out already had an implementation for their references since the std trait is implemented on both automatically. We were just trying to come up with something cool but in the end I don't think it can be done automatically.

@TheBlueMatt
Copy link
Member

Yea, I think we definitely can't do anything that relies on dyn here, the optimization story of this type of deserialization logic is already pretty rough (and we've had users with significant performance bottlenecks in our deserialization before, it probably still is the bottleneck for electrs), we can't make it worse.

@RCasatta
Copy link
Contributor Author

I think 19b74b7 is the way to go, but there are some issues with lifetimes on 1.29 https://github.com/RCasatta/bitcoin_hashes/runs/2631403404?check_suite_focus=true, @sgeisler do you want to look at it?

@sgeisler
Copy link
Contributor

Sure, will do, the commit was merely to illustrate my point, I'll also remove the example again.

Using a macro makes the code more compact and allows us to easily add an impl for &mut HashEngine which could not be achieved through the type system otherwise.
@sgeisler
Copy link
Contributor

The embedded test broke somehow 😦 I'm not sure if it has anything to do with my commit. Did it increase the lib size so much that it doesn't fit into the flash anymore?

@RCasatta RCasatta force-pushed the literacy_associated branch from 7b909c9 to a37f2be Compare May 21, 2021 08:56
@RCasatta
Copy link
Contributor Author

A further issue is that rust-bitcoin uses Read::take, which we don't implement here. Should we grab that from std?

added in a37f2be

Another option without adding the fn to the trait and the associated type would have been to restore the CappedRead RCasatta/rust-bitcoin@24015e8, however I preferred like it's done a37f2be (inheriting std/core2 implementation, and the simple implementation for the slice, there should not be any other case where we implement Read in rust-bitcoin if I am not mistaken)

@devrandom
Copy link
Contributor

This question remains, how do we concretize Error for Encodable in rust-bitcoin?

One issue is that the hash engines already concretize Error to () in this crate, so that would require a bunch of error mapping if Encodable wants to concretize to io::Error.

And one more question - how exactly do we concretize the Error type param? For example, is this what we want for the Encodable trait in rust-bitcoin (and the few dozen impls)?

pub trait Encodable {
    fn consensus_encode<W: io::Write<Error=io::Error>>(&self, writer: &mut W) -> Result<usize, io::Error>;
}

(where the io::Error type depends on std vs no-std)

@RCasatta
Copy link
Contributor Author

This question remains, how do we concretize Error for Encodable in rust-bitcoin?

with the latest commits it should be:
(literacy::Error it's a re-export according to the activated features)

  pub trait Encodable {
      fn consensus_encode<W: literacy::Write<Error=literacy::Error>(&self, writer: &mut W) -> Result<usize, literacy::Error>;
  }

One issue is that the hash engines already concretize Error to () in this crate, so that would require a bunch of error mapping if Encodable wants to concretize to io::Error.

1f500f1 use literacy::Error also for engines (the () was conveying more the idea of the absence of error but this way should be more convenient to use and I added a comment)

src/literacy.rs Outdated
/// The error type returned in Result
type Error;
/// The type to implement limited reads
type Take;
Copy link

@afilini afilini May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a trait bound to literacy::Read here, it might help later on when working with generics

Copy link

@afilini afilini May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you could also add another trait specifically for the Take type that "extends" Read and forces a specific constructor-like method. With that you can then provide a default implementation for take() like std does, because you can always construct an object of type Self::Take using the constructor defined by its trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added trait bound in 6b7b28f

@RCasatta
Copy link
Contributor Author

RCasatta commented May 21, 2021

After adding the literacy::Error logic I realized it may be possible to remove the associated parameter Error, adding something like alloc::Box<dyn OurErrorTrait> where OurErrorTrait: Display + Debug + … {} for the default_impl...

But I would like to ear opinions before doing such changes

@TheBlueMatt
Copy link
Member

adding something like alloc::Box where OurErrorTrait: Display + Debug + … {} for the default_impl...

Why? Lets generally avoid dyn indirection unless we have a good reason to prefer it.

@TheBlueMatt
Copy link
Member

This question remains, how do we concretize Error for Encodable in rust-bitcoin?

How much do we actually add an error ourselves? I assume we should basically never error unless the underlying Write does, so could we not just return Result<usize, W::Error>?

@devrandom
Copy link
Contributor

devrandom commented May 21, 2021

Do we really want these verbose function signatures (illustrated for the std case):

pub trait Decodable: Sized {
    /// Decode an object with a well-defined format
    fn consensus_decode<D: io::Read<Error=encode::Error, Take=::std::io::Take<D>>>(d: &mut D) -> Result<Self, encode::Error>;
}

impl Decodable for OutPoint {
    fn consensus_decode<D: io::Read<Error=encode::Error, Take=::std::io::Take<D>>>(d: &mut D) -> Result<Self, encode::Error> {
        Ok(OutPoint {
            txid: Decodable::consensus_decode(d)?,
            vout: Decodable::consensus_decode(d)?,
        })
    }
}

@sgeisler
Copy link
Contributor

After adding the literacy::Error logic I realized it may be possible to remove the associated parameter Error, adding something like alloc::Box<dyn OurErrorTrait> where OurErrorTrait: Display + Debug + … {} for the default_impl...

Please don't, the entire point of this PR was to have the error be an associated type instead of a Box<dyn …> compared to #126 .

This question remains, how do we concretize Error for Encodable in rust-bitcoin?

My vision for the API is as follows:

pub trait Encodable {
    fn consensus_encode<W: literacy::Write>(&self, writer: W) -> Result<usize, W::Error>;
}

pub trait Decodable: Sized {
    fn consensus_decode<D: literacy::Read>(d: D) -> Result<Self, DecodeError<D::Error>>;
}

pub enum DecodeError<E: Display + Debug + …> {
    Io(E),
    Psbt(psbt::Error),}

The caller of one of these functions typically knows if they use std or core2, or if not they will just have to do the same generic trick that we are doing here. So there is no need to use a Box<dyn> for errors.

Do we really want these verbose function signatures (illustrated for the std case):

I think there are some misconceptions about having to name associated types in that context, you don't have to. You can access them, for every struct-trait pair these are well-defined, so no need to actually specify them anywhere here.

@RCasatta
Copy link
Contributor Author

@TheBlueMatt @sgeisler so it's good like it is?

src/literacy.rs Outdated
/// see [std::io::Read::read_exact]
fn read_exact(&mut self, buf: &mut [u8]) -> ::core::result::Result<(), Self::Error>;
/// see [std::io::Read::take]
fn take(self, limit: u64) -> Self::Take;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to quite work with the &mut pattern:

    fn consensus_decode<D: io::Read>(d: &mut D) -> Result<Self, encode::Error> {
        let mut d = d.take(MAX_VEC_SIZE as u64);

gives:

error[E0507]: cannot move out of `*d` which is behind a mutable reference

    let mut d = d.take(MAX_VEC_SIZE as u64);
                      ^ move occurs because `*d` has type `D`, which does not implement the `Copy` trait

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe removing take from the trait and using something like the CappedRead in RCasatta/rust-bitcoin@24015e8 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give this a try.

Copy link
Contributor

@devrandom devrandom May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work fine, let's take out the take.

rust-bitcoin/rust-bitcoin@4e2a451

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take removed in c6d4ca4

@sgeisler
Copy link
Contributor

I tried to migrate the consensus de/encoding API in rust-bitcoin to this PR and have to say it's not too pleasant 😆 some problems I noticed:

  • The generic error type is infecting way more parts of the code than I expected, partly because the encode::Error is not limited to io-related functionality. I think it would actually be a good thing to untangle it a bit, but that might turn out to be a very big task.
  • Some of our code actually uses the information in the io errors, e.g. there is StreamReader::read_next which tries to decode some data to see if it received enough bytes or if it needs more (being signaled by io::ErrorKind::UnexpectedEof).

This means we'd have to do a lot of work to make rust-bitcoin compatible. But I also noticed that the ErrorKind type from both std and core is the same. So a middle ground might be found by building our own error type that is slightly more intelligent than Box<dyn OurErrorTrait>:

struct IoError {
    kind: IoErrorKind,
    error: Box<dyn OurErrorTrait>, //
}

trait OurErrorTrait: Debug + Any {}

This would allow cheap conversion from both std and core2 errors, cheap superficial mathching of error causes and expensive deeper introspection when needed by downcasting the underlying error. It's not pretty, but probably a lot less work than restructuring the entire error handling of rust-bitcoin. What do you think?

@RCasatta
Copy link
Contributor Author

Would like to ear from @devrandom, @apoelstra and @TheBlueMatt about @sgeisler proposal before doing changes

@devrandom
Copy link
Contributor

It sounds plausible, but it's difficult to evaluate without trying it. If it can be quickly prototyped, I can try it out in rust-bitcoin.

BTW, I had the same experience as @sgeisler when trying to apply the current version to rust-bitcoin, which motivated some of my previous questions...

@TheBlueMatt
Copy link
Member

Without having dug into it too much, building on @sgeisler's suggestion, what about:

trait ErrTrait { fn kind(&self) -> ErrorKind; }
impl Write { type Error: ErrTrait; ... }
impl ErrTrait for std::io::Error { .. }

@RCasatta
Copy link
Contributor Author

So I worked on the ErrorKind in 38bb903, unfortunately, it is not in core so I had to duplicate the enum...

@apoelstra
Copy link
Member

I'm having trouble following all the discussion, especially as the PR is now 14 commits, many of which undo changes that earlier commits do. I agree with others saying that we should not use dyn anywhere. @sgeisler's suggested IoError still involves an allocation when constructing the error type which is far too much overhead for something that (conceptually) shouldn't need to be much more than a u16.

I'm also unconvinced of the merits of creating our own parallel implementation of the core2 crate. Our types which implement our traits will not be usable with other crates that expect the core2 traits or vice-versa, and as we've seen there is actually a significant amount of complexity involved in simulating Read/Wirte well enough to make this usable. This is an extra maintenance burden that we don't want to take on, and it's very difficult to reason about all the potential feature-flag interactions.

@TheBlueMatt
Copy link
Member

TheBlueMatt commented May 25, 2021

I'm also unconvinced of the merits of creating our own parallel implementation of the core2 crate

Ugh, yea, after all the work here, it just seems much more complicated than I was anticipating. Maybe if ErrorKind was exposed it could have gone down this path without the crazy amount of work here. Apologies for pushing the wrong way here.

@RCasatta
Copy link
Contributor Author

At least we tried... Somehow relieved to close this...

I remain with an open question though, is "deeper introspection when needed by downcasting the underlying error" really needed somewhere? or one could hypothetically go with just ErrorKind (not ideal losing error information but maybe good enough)?

@RCasatta RCasatta closed this May 26, 2021
@devrandom devrandom mentioned this pull request May 28, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants