Description
In the current implementation, the model is generic on the two internal types
of the nodes, ST and LT. However, once the small type ST is fixed, the
large type LT is automatically determined* (e.g. i8 and i32 respectively).
This means that in practice, the model is generic on the small type ST only. We can
therefore bind the small and large types in a single trait, `SmallNIO`, bounded by `Integral`. `SmallNIO` would contain an associated type `LargeNIO: Integral + From<ST> + TryInto<ST>` (infallible widening from the small type, fallible narrowing back to it), which corresponds to the large type LT determined by the small type ST.
The enum `QTypeArray` could be renamed to `NIOArray`, with the following definition:

```rust
pub enum NIOArray<ST: SmallNIO> {
    S(Tensor<ST>),
    L(Tensor<ST::LargeNIO>),
}
```

This would allow us to remove the LT type parameter and the conversion trait bounds from all the nodes, leading to much cleaner code.
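The proposal above can be sketched end to end. Note that `Integral` and `Tensor` below are minimal placeholder stand-ins for the crate's real trait and type (assumptions for this sketch, not the actual implementation):

```rust
use std::convert::TryInto;

// Placeholder stand-ins for the crate's real `Integral` trait and
// `Tensor` type (assumptions for this sketch).
pub trait Integral: Copy {}
impl Integral for i8 {}
impl Integral for i32 {}

pub struct Tensor<T>(pub Vec<T>);

// The small type determines its large companion through an associated type:
// widening is infallible (`From<Self>`), narrowing back is fallible
// (`TryInto<Self>`), matching the i8 <-> i32 round trip.
pub trait SmallNIO: Integral {
    type LargeNIO: Integral + From<Self> + TryInto<Self>;
}

impl SmallNIO for i8 {
    type LargeNIO = i32;
}

// Generic on the small type only; the LT parameter is gone.
pub enum NIOArray<ST: SmallNIO> {
    S(Tensor<ST>),
    L(Tensor<ST::LargeNIO>),
}

fn main() {
    // Widen an i8 tensor into its associated large type.
    let small = Tensor(vec![1i8, 2, 3]);
    let large: Tensor<<i8 as SmallNIO>::LargeNIO> =
        Tensor(small.0.iter().map(|&x| x.into()).collect());

    let arr: NIOArray<i8> = NIOArray::L(large);
    match arr {
        NIOArray::S(t) => println!("small tensor, len {}", t.0.len()),
        NIOArray::L(t) => println!("large tensor, len {}", t.0.len()),
    }
}
```

A node can then write `ST::LargeNIO` wherever it previously needed the second type parameter, and the `From`/`TryInto` bounds come for free from the trait definition.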
*edit (@Antonio95) To add some more background on the longer conversation César and I had about this: the reason "once the small type ST is fixed, the large type LT is automatically determined" is that the TF Lite specification is only defined for i8 and i32. Pairing i8 with i16 would be unworkably restrictive on BMM layers (the number of rows in the weight matrix would have to be 1), and i8 with i64 is overkill. The only natural next step for the small type is i16 with large type i64, keeping the x4 relation found in the TF Lite definition for i8. Note that even this step wouldn't be automatic, since TF Lite doesn't define the requantisation of compute parameters (scales -> multiplier + shift) at those sizes - but at least the transformation is conceivable.
In any case, the ST-to-LT relation is essentially fixed by the spec, and having one ST with several LTs seems like something we won't encounter.
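For concreteness, that conceivable (but, as noted, not spec-backed) next step would simply be a second impl of the trait. Everything below is a hypothetical illustration, with minimal stand-in traits re-declared so the snippet is self-contained:

```rust
use std::convert::TryInto;

// Minimal stand-ins, re-declared for self-containment
// (assumptions: not the crate's real traits).
pub trait Integral: Copy {}
impl Integral for i16 {}
impl Integral for i64 {}

pub trait SmallNIO: Integral {
    type LargeNIO: Integral + From<Self> + TryInto<Self>;
}

// Hypothetical i16 -> i64 pairing, keeping the x4 width ratio of i8 -> i32.
// TF Lite does not define requantised compute parameters at these sizes,
// so this is conceivable but not spec-backed.
impl SmallNIO for i16 {
    type LargeNIO = i64;
}

fn main() {
    let widened: <i16 as SmallNIO>::LargeNIO = 300i16.into();
    println!("widened to i64: {}", widened);
}
```

The point is that the one-to-one ST-to-LT mapping is exactly what an associated type expresses: each small type gets a single large companion, and a second LT for the same ST is unrepresentable by construction.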