Add Image Transformer Library #679
Conversation
// `base_options` to specifying the TFLite model and using
// `base_options.compute_settings.tflite_settings.cpu_settings.num_threads`,
// to configure the number of threads.
optional int32 num_threads = 7 [default = -1];
num_threads is configured through compute_settings, so it can be removed too.
f7310c0 to fe089a3
if (options.base_options().compute_settings().tflite_settings().cpu_settings().num_threads() == 0 ||
    options.base_options().compute_settings().tflite_settings().cpu_settings().num_threads() < -1) {
Is this the correct way to get num_threads?
Yes, that's correct. But you don't need to check num_threads specifically. It has been verified here when the base task is created.
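For reference, a minimal sketch of that configuration path, assuming an ImageTransformerOptions proto with a standard base_options field as in the other vision tasks (the option type name follows this PR and may differ):

// Rough sketch: threads are configured through compute_settings rather than a
// task-specific num_threads field; the setter chain mirrors the getter chain
// quoted above.
ImageTransformerOptions options;
options.mutable_base_options()->mutable_model_file()->set_file_name(
    "/path/to/model.tflite");  // placeholder path
options.mutable_base_options()
    ->mutable_compute_settings()
    ->mutable_tflite_settings()
    ->mutable_cpu_settings()
    ->set_num_threads(4);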
// TODO: Will the output be float and should be converted or directly available?
// The example had float and it had to be converted. Anyway, we're guaranteed to have uint8 as output.
has_uint8_outputs_ = (output_tensor->type == kTfLiteUInt8);
So here the model will most likely output kTfLiteFloat32, right? Then do we convert it to uint8_t explicitly in Preprocess? Or should every ImageTransformer model always have kTfLiteUInt8 as its output tensor type?
I think both situations will happen. If the output is float, it will most likely be in the same range as the input image, meaning if the input is normalized to [0, 1], you need to denormalize the output from [0, 1] to [0, 255]. We'll need to implement both. See the style transfer model here: https://www.tensorflow.org/lite/examples/style_transfer/overview#performance_benchmarks
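To illustrate, a rough sketch of that denormalization (not the final implementation; dst and num_pixels are hypothetical names for the destination uint8 buffer and its size):

// Rough sketch, assuming the model output is either float in [0, 1] or uint8.
if (output_tensor->type == kTfLiteFloat32) {
  const float* src = output_tensor->data.f;
  for (size_t i = 0; i < num_pixels; ++i) {
    float v = src[i] * 255.0f;                // denormalize [0, 1] -> [0, 255]
    v = std::min(std::max(v, 0.0f), 255.0f);  // clamp to the uint8 range
    dst[i] = static_cast<uint8_t>(v + 0.5f);  // round to nearest
  }
} else {
  std::memcpy(dst, output_tensor->data.uint8, num_pixels);  // already uint8
}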
Hey @lu-wang-g, we decided upon the tflite::support::StatusOr<std::unique_ptr<::tflite::task::vision::FrameBuffer>> Postprocess(); API. Tagging @xunkai55 @wangtz for their opinion as well. Cheers.
StatusOr<FrameBuffer> ImageTransformer::Postprocess(
    const std::vector<const TfLiteTensor*>& /*output_tensors*/,
    const FrameBuffer& /*frame_buffer*/, const BoundingBox& /*roi*/) {
Very well, I've come up with a compromise. For now I'm using the old API but not using any of the arguments. When the new API lands, one can just remove the arguments and it will work like a charm.
Also @lu-wang-g, it looks like we can't use std::unique_ptr<FrameBuffer> as the return type, so I'm returning a plain FrameBuffer.
For my comment, "StatusOr<unique_ptr> ImagePostprocessor::Postprocess();" in the doc, it was referring to creating an ImagePostprocessor class, similar to ImagePreprocessor. Then ImageTransformer::Postprocess can call ImagePostprocessor::Postprocess(), just like ImageEmbedder::Postprocess.
Agreed we should return StatusOr<FrameBuffer> from ImageTransformer::Postprocess and ImagePostprocessor::Postprocess.
You may have noticed that EmbeddingPostprocessor::Postprocess takes the output embedding object, instead of creating and returning a new embedding object. This is due to some legacy reasons, and we'll change it to the following API later:
StatusOr<Embeddings> EmbeddingPostprocessor::Postprocess()
Oh I see! Sorry I missed it before. So we should delegate the postprocess task to the ImagePostprocessor class, right? Will make that change in a flash.
@lu-wang-g For passing options, we can pass NormalizationOptions to the postprocessor. Sounds cool?
As discussed, that's acceptable.
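To summarize the agreed direction, a rough sketch of the delegation (signatures are approximate, not the final API):

// Rough sketch: ImageTransformer owns an ImagePostprocessor (created during
// initialization with the output/input tensor indices) and simply delegates,
// mirroring how ImageEmbedder delegates to EmbeddingPostprocessor.
tflite::support::StatusOr<FrameBuffer> ImageTransformer::Postprocess() {
  return postprocessor_->Postprocess();
}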
// results will be filled.
//
// An example of such model can be found at:
// https://tfhub.dev/bohemian-visual-recognition-alliance/lite-model/models/mushroom-identification_v1/1
You can use this model as an example for testing.
8a1274d to 26a9783
Thoughts on testing, @lu-wang-g?
Create(core::TfLiteEngine* engine,
       const std::initializer_list<int> output_indices,
       std::unique_ptr<vision::NormalizationOptions> options);
After a second thought, this API may be improved via the following two paths:
- (Most common use case) If the output tensor uses the same processing metadata as the input tensor, instead of passing in normalization params we can pass in the input indices. Then ImagePostprocessor can read whatever metadata is associated with the input tensor.
- If the output tensor uses different processing metadata than the input tensor, it should be populated in the metadata, and ImagePostprocessor should read it just like ImagePreprocessor does.
What do you think?
Okay, but won't this break the existing design pattern? To implement this, I think we can extract the input metadata from the image_tensor_specs utility by getting metadata_extractor() from our engine. Something like this could work:
// Gather metadata.
auto* output_metadata =
    engine_->metadata_extractor()->GetOutputTensorMetadata(output_index);
auto* input_metadata =
    engine_->metadata_extractor()->GetInputTensorMetadata(input_index);
// Use the input metadata for normalization as a fallback.
auto* processing_metadata =
    output_metadata != nullptr ? output_metadata : input_metadata;
absl::optional<vision::NormalizationOptions> normalization_options;
ASSIGN_OR_RETURN(normalization_options,
                 GetNormalizationOptionsIfAny(*processing_metadata));
Thoughts on handling NULL cases?
- InputMetaData shouldn't ever be NULL for this task, I guess, and NormalizationOptions shouldn't be NULL either.
- What if OutputMetadata exists but doesn't have NormalizationOptions?
"InputMetaData shouldn't ever be NULL for this task, I guess, and NormalizationOptions shouldn't be NULL either."
Those have been verified here. You can share a large amount of code with image_preprocessor (for both initialization and process), in that they both read metadata from a specified tensor (either input or output) and process the FrameBuffer according to that metadata.
"What if OutputMetadata exists but doesn't have NormalizationOptions?"
If the tensor is uint8, it's OK. But if the tensor is float, raise an error. See here.
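A rough sketch of that check, assuming tensor_type and normalization_options have already been read from the output tensor and its metadata:

// Rough sketch: uint8 outputs need no normalization params, but a float output
// without them cannot be mapped back to pixel values, so reject it.
if (tensor_type == kTfLiteFloat32 && !normalization_options.has_value()) {
  return tflite::support::CreateStatusWithPayload(
      absl::StatusCode::kInvalidArgument,
      "Float output tensors require NormalizationOptions in the metadata.");
}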
"Those have been verified here."
In BuildInputTensorSpecs, when InputMetaData is nullptr it just skips over and fills normalization_options as an empty absl::optional. But then you're accessing it inside ImagePreprocessor::Preprocess here. Is this undefined behaviour?
P.S. Never mind, figured it out! It checks the absl::optional at a later stage there. Nice! Can you have a look at the code again and see if it fits your description?
You can have an end-to-end test using ImageTransformer to validate the implementation. Try the style transfer model here: https://www.tensorflow.org/lite/examples/style_transfer/overview#performance_benchmarks
@lu-wang-g Correct me if I'm wrong, but this requires two input images, i.e. two tensors. Our entire library is currently based on a single input image, so what's the solution here?
ASSIGN_OR_RETURN(normalization_options,
                 GetNormalizationOptionsIfAny(*processing_metadata));
Since GetNormalizationOptionsIfAny is wrapped inside an unnamed namespace in image_tensor_specs.cc, we might need to copy-paste the code here, unfortunately.
Please share as much code as possible between ImagePostprocessor::Init and ImageTensorSpecs::BuildInputImageTensorSpecs. You can put a TODO and implement it in a follow-up PR.
Just so we're clear, do you mean copying code from BuildInputImageTensorSpecs when you said "share"?
"Share code" means ImagePostprocessor::Init
and ImageTensorSpecs::BuildInputImageTensorSpecs
use the same piece of code to do processing or validation.
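One way to read that, as a sketch: move GetNormalizationOptionsIfAny out of the unnamed namespace in image_tensor_specs.cc and declare it in a shared header, so both call sites use the same code (placement and exact return type are assumptions based on this thread, not the final refactor):

// Rough sketch of a shared declaration usable from both ImagePostprocessor::Init
// and ImageTensorSpecs::BuildInputImageTensorSpecs.
tflite::support::StatusOr<absl::optional<vision::NormalizationOptions>>
GetNormalizationOptionsIfAny(const tflite::TensorMetadata& tensor_metadata);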
public:
  static tflite::support::StatusOr<std::unique_ptr<ImagePostprocessor>>
  Create(core::TfLiteEngine* engine,
         const std::initializer_list<int> output_indices,
Use int instead of std::initializer_list for output_index, since it only supports one output tensor. Same for input_index.
Document what input_index is above Create().
Can't, because the Postprocessor class stores std::vector<int> output_indices_ as a state variable. I've continued the same pattern for input_indices. We can just access output_indices_.at(0). Sounds good?
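For clarity, a rough sketch of the factory with the index parameters documented (following this thread, not necessarily the final header):

// Rough sketch. `output_indices` / `input_indices` each hold a single tensor
// index (accessed internally as output_indices_.at(0) / input_indices_.at(0));
// the input index identifies the tensor whose metadata, e.g.
// NormalizationOptions, is reused when the output tensor carries none.
static tflite::support::StatusOr<std::unique_ptr<ImagePostprocessor>> Create(
    core::TfLiteEngine* engine,
    const std::initializer_list<int> output_indices,
    const std::initializer_list<int> input_indices);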
         const std::initializer_list<int> output_indices,
         const std::initializer_list<int> input_indices);
// Processes the provided vision::FrameBuffer and populate tensor values.
Update the comment accordingly to reflect the real functionality of Postprocess().
// is currently detected by checking if all output tensors data type is uint8.
bool has_uint8_outputs_;

absl::Status Init(std::unique_ptr<vision::NormalizationOptions> options);
Is "absl::Status Init(std::unique_ptrvision::NormalizationOptions options);" implemented anywhere?
No. Should be Init(const std::vector<int>& input_indices) instead. Updated it, thanks.
* Transformations => FrameBuffer
* No inputs to postprocess.
6699c77 to 08c536b
@lu-wang-g Looks like there were some changes in the API? I've updated the PR accordingly. I also filled in the model metadata using this script.
Thanks Sai for updating the code! Yes, some internal APIs changed, which could affect your code here. I'll take a look at the changes on Friday (catching up on a deadline today, and a team off-site event tomorrow; sorry about the delay!).
I'm getting a worse PSNR score than the one in the colab notebook. Either my PSNR implementation is wrong or I haven't taken the images correctly. I'll fix this. Another concern is, the
@lu-wang-g Hi, in this notebook, why does this happen?
def preprocess_image(image_path):
  ..
  ..
  hr_size = (tf.convert_to_tensor(hr_image.shape[:-1]) // 4) * 4
  hr_image = tf.image.crop_to_bounding_box(hr_image, 0, 0, hr_size[0], hr_size[1])
  hr_image = tf.cast(hr_image, tf.float32)
It is converting my (50 x 50) image into a (48 x 48) image. The inference engine is pretty strict about (50 x 50) as input.
This is the TF model, which is different from the TFLite model. I think here it tries to align the width/height to multiples of 4. You can ignore this in the TFLite implementation.
@lu-wang-g Reading the same file with the two decoders gives slightly different pixel values. They're off by 1-2, but since I'm comparing PSNR these differences are affecting it a lot. What do you think is the cause?
That's a known issue. It is due to different JPEG reader algorithms used in tf.io.decode_jpeg and DecodeImage. For testing purposes in the Task library, you can use
Okay, but in our test I wanted to test against a known PSNR value, with a known set of images. That's why I loaded the same image in colab (as in the figure above) and in our testing model. But since they're reading different values, the PSNR comes out very different. So, how do I perform the PSNR test? Or should I perform a PSNR test at all?
// Use a bi-cubically downsampled image as input to the model and compare
// the model output with the original image. Expected PSNR taken from:
// https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/image_enhancing.ipynb.
TEST_F(SuperResolutionTest, PSNRTest) {
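For reference, a minimal sketch of the PSNR computation such a test relies on (helper name and buffer handling are hypothetical; needs <cmath>, <cstdint>, <limits>):

// Rough sketch: PSNR between two uint8 image buffers of equal size,
// PSNR = 10 * log10(MAX^2 / MSE), with MAX = 255 for 8-bit images.
double ComputePsnr(const uint8_t* a, const uint8_t* b, size_t num_bytes) {
  double mse = 0.0;
  for (size_t i = 0; i < num_bytes; ++i) {
    const double diff = static_cast<double>(a[i]) - static_cast<double>(b[i]);
    mse += diff * diff;
  }
  mse /= static_cast<double>(num_bytes);
  if (mse == 0.0) return std::numeric_limits<double>::infinity();  // identical images
  return 10.0 * std::log10(255.0 * 255.0 / mse);
}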
@lu-wang-g, continuing the above comment, I was talking about this test.
What is your PSNR value? Is your output image clearer than the input image, i.e. is it enhanced?
It wasn't particularly bad, around 25.7081; the expected value is 29.799528. I haven't inspected the output image. How do I convert a FrameBuffer to JPEG?
After doing EncodeToPng(), the output PNG is something like this. Possible reasons could be:
This confirms the error happens during writing to PNG. Have a look at the notebook I created. [Model output from Python inference] [Model output from reading the FrameBuffer output via a txt file] They look almost the same, right? That means our framework works properly, but storing to PNG is causing this weird behaviour. I'm not sure why though; maybe because the input image dims are not divisible by 4? Let me know what you think @lu-wang-g
It is probably due to misuse of
But then height and width have the same value (200), so I'm not sure what the reason is. Let's discuss this over in #718, since this PR isn't to blame.
Yeah, I'll use the image I got from dumping and displaying in the ipynb. This should seal the deal on this PR.
@lu-wang-g I'm closing this PR and dividing it into two as discussed.
@lu-wang-g @wangtz
TODO
- ImageTransformerOptions.proto files and fill in BUILD.
- PostProcess logic using a class derived from PostProcessor.
- .h and .cc file for ImageTransformer.
- Postprocessor class to delegate the task.