Add Image Transformer Library #679


Closed
jonpsy wants to merge 42 commits

Conversation

jonpsy
Contributor

@jonpsy jonpsy commented Sep 9, 2021

@lu-wang-g @wangtz

TODO

  • Add ImageTransformerOptions.proto and fill in BUILD.
  • Implement PostProcess logic using a class derived from PostProcessor.
  • Ensure dynamic image support exists.
  • Add the .h and .cc files for ImageTransformer.
  • Use the Postprocessor class to delegate the task.
  • Add a test and confirm it works.
  • Write a demo app.

@google-cla google-cla bot added the cla: yes label Sep 9, 2021
// `base_options` to specifying the TFLite model and using
// `base_options.compute_settings.tflite_settings.cpu_settings.num_threads`,
// to configure the number of threads.
optional int32 num_threads = 7 [default = -1];
Member

num_threads is configured through compute_settings, so it can be removed too.
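
For illustration, a minimal sketch of what configuring threads through compute_settings could look like on the caller side (ImageTransformerOptions is the proto being added in this PR; the nested setters simply follow the base_options.compute_settings.tflite_settings.cpu_settings.num_threads path quoted above, so treat the exact names as an assumption):

  ImageTransformerOptions options;
  options.mutable_base_options()->mutable_model_file()->set_file_name("model.tflite");
  // Threads live on the shared compute settings rather than on a task-level field.
  options.mutable_base_options()
      ->mutable_compute_settings()
      ->mutable_tflite_settings()
      ->mutable_cpu_settings()
      ->set_num_threads(4);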

@jonpsy jonpsy force-pushed the image_transformer branch from f7310c0 to fe089a3 Compare October 7, 2021 11:19
Comment on lines 98 to 99
if (options.base_options().compute_settings().tflite_settings().cpu_settings().num_threads() == 0 ||
options.base_options().compute_settings().tflite_settings().cpu_settings().num_threads() < -1) {
Contributor Author

Is this the correct way to get num_threads?

Member

Yes, that's correct. But you don't need to check num_threads specifically. It has been verified here when the base task is created.

Comment on lines 170 to 172
// TODO: Will the output be float and should be converted or directly available?
// The example had float and it had to be converted. Anyway, we're guaranteed to have uint8 as output.
has_uint8_outputs_ = (output_tensor->type == kTfLiteUInt8);
Contributor Author

So here the model will most likely output kTfLiteFloat32, right? Then we convert it to uint8_t explicitly in Preprocess? Or should every ImageTransformer model always have kTfLiteUInt8 as its output tensor type?

Member

I think both situations will happen. If the output is float, it will most likely be in the same range as the input image, meaning if the input is normalized to [0, 1], you need to denormalize the output from [0, 1] back to [0, 255]. We'll need to implement both. See the style transfer model here: https://www.tensorflow.org/lite/examples/style_transfer/overview#performance_benchmarks
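
A rough sketch of the float path described above, assuming the output is a dense float tensor with values in [0, 1] (output_data and num_values stand for the tensor's float buffer and element count; this is illustrative, not the final Postprocess code):

  // Denormalize float pixel values from [0, 1] back to [0, 255] and clamp.
  std::vector<uint8_t> pixels(num_values);
  for (size_t i = 0; i < num_values; ++i) {
    const float v = output_data[i] * 255.0f;
    pixels[i] = static_cast<uint8_t>(std::max(0.0f, std::min(v, 255.0f)));
  }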

@jonpsy
Contributor Author

jonpsy commented Oct 27, 2021

Hey @lu-wang-g, we decided upon the

  tflite::support::StatusOr<std::unique_ptr<::tflite::task::vision::FrameBuffer>> Postprocess();

API for Postprocess, but it looks like the current BaseVisionTaskAPI is still using the old API. Any idea when it will be changed to the new one? Depending on that, I can make it work with the old API just fine.

Tagging @xunkai55 @wangtz for their opinion as well. Cheers.

Comment on lines +155 to +117
StatusOr<FrameBuffer> ImageTransformer::Postprocess(
const std::vector<const TfLiteTensor*>& /*output_tensors*/,
const FrameBuffer& /*frame_buffer*/, const BoundingBox& /*roi*/) {
Contributor Author

@jonpsy jonpsy Oct 27, 2021

Very well, I've come up with a compromise. For now I'm using the old API but not using any of its arguments. When the new API lands, one can just remove the arguments and it will work like a charm.

Also @lu-wang-g, it looks like we can't use std::unique_ptr<FrameBuffer> as the return type, so I'm returning a plain FrameBuffer.

Member

For my comment, "StatusOr<unique_ptr> ImagePostprocessor::Postprocess();" in the doc, it was referring to creating an ImagePostprocessor class, similar to ImagePreprocessor. Then ImageTransformer::Postprocess can call ImagePostprocessor::Postprocess(), just like ImageEmbedder::Postprocess.

Agreed that we should return StatusOr<FrameBuffer> from ImageTransformer::Postprocess and ImagePostprocessor::Postprocess.

You may have noticed that EmbeddingPostprocessor::Postprocess takes the output embedding object instead of creating and returning a new embedding object. This is due to some legacy reasons, and we'll change it to the following API later:

StatusOr<Embeddings> EmbeddingPostprocessor::Postprocess()
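
To make the proposed split concrete, here is a rough sketch of how an ImagePostprocessor mirroring ImagePreprocessor could look (the class name comes from the discussion above; the exact Create() parameters were still being decided, so treat the signatures as assumptions):

  class ImagePostprocessor : public Postprocessor {
   public:
    static tflite::support::StatusOr<std::unique_ptr<ImagePostprocessor>> Create(
        core::TfLiteEngine* engine, const std::initializer_list<int> output_indices);

    // Turns the model's output tensor back into a FrameBuffer, denormalizing if needed.
    tflite::support::StatusOr<vision::FrameBuffer> Postprocess();
  };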

Contributor Author

Oh, I see! Sorry, I missed it before. So we should delegate the postprocessing task to an ImagePostprocessor class, right? Will make that change in a flash.

Contributor Author

@jonpsy jonpsy Oct 29, 2021

@lu-wang-g For passing options, how about we pass NormalizationOptions to the postprocessor? Sounds good?

Contributor Author

As discussed, that's acceptable.


// results will be filled.
//
// An example of such model can be found at:
// https://tfhub.dev/bohemian-visual-recognition-alliance/lite-model/models/mushroom-identification_v1/1
Member

You can use this model as an example for testing.

@jonpsy jonpsy force-pushed the image_transformer branch from 8a1274d to 26a9783 Compare October 29, 2021 18:38
@jonpsy
Contributor Author

jonpsy commented Nov 1, 2021

Thoughts on testing @lu-wang-g ?

Comment on lines 41 to 43
Create(core::TfLiteEngine* engine,
const std::initializer_list<int> output_indices,
std::unique_ptr<vision::NormalizationOptions> options);
Member

On second thought, this API may be improved along the following two paths:

  1. (Most common use case) If the output tensor uses the same processing metadata as the input tensor, instead of passing in normalization params we can pass in the input_indices. Then ImagePostprocessor can read whatever metadata is associated with the input tensor.
  2. If the output tensor uses different processing metadata from the input tensor, it should be populated in the metadata, and ImagePostprocessor should read it just like ImagePreprocessor does.

What do you think?

Contributor Author

@jonpsy jonpsy Nov 2, 2021

Okay, but won't this break the existing design pattern? To implement this, I think we can extract the input metadata via the image_tensor_specs utility by getting metadata_extractor() from our engine. Something like this could work:

  // Gather metadata
  auto* output_metadata =
      engine_->metadata_extractor()->GetOutputTensorMetadata(output_index);
  auto* input_metadata =
      engine_->metadata_extractor()->GetInputTensorMetadata(input_index);

  // Use input metadata for normalization as fallback.
  auto* processing_metadata = output_metadata != nullptr ? output_metadata : input_metadata; 

  absl::optional<vision::NormalizationOptions> normalization_options;
  ASSIGN_OR_RETURN(normalization_options,
                     GetNormalizationOptionsIfAny(*processing_metadata));

Thoughts on handling NULL cases?

  1. InputMetaData shouldn't ever be NULL for this task I guess, and NormalizationOptions shouldn't be NULL either.
  2. What if OutputMetadata exists but doesn't have NormalizationOptions?

Member

  1. InputMetaData shouldn't ever be NULL for this task I guess, and NormalizationOptions shouldn't be NULL either.

Those have been verified here. You can share a large amount of code with image_preprocessor (for both initialization and processing), in that they both read metadata from a specified tensor (either input or output) and process the FrameBuffer according to that metadata.

  2. What if OutputMetadata exists but doesn't have NormalizationOptions?
    If the tensor is uint8, it's OK. But if the tensor is float, raise an error. See here.
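
A sketch of the check described in point 2 (illustrative only; the actual helper and error wording in the library may differ):

  // A float output tensor with no NormalizationOptions in the metadata cannot be
  // mapped back to pixel values, so fail initialization. uint8 outputs pass through.
  if (output_tensor->type == kTfLiteFloat32 && !normalization_options.has_value()) {
    return absl::InvalidArgumentError(
        "Float output tensor requires NormalizationOptions metadata to be "
        "converted back into an image.");
  }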

Contributor Author

@jonpsy jonpsy Nov 10, 2021

Those have been verified here.

In BuildInputTensorSpecs, when InputMetaData is nullptr it just skips over and fills normalization_options with an empty absl::optional. But then you're accessing it inside ImagePreprocessor::Preprocess here. Is this undefined behaviour?

P.S. Never mind, figured it out! It checks for the empty absl::optional at a later stage there. Nice! Can you have a look at the code again and see if it fits your description?

@lu-wang-g
Member

You can have an end-to-end test using ImageTransformer to validate the implementation. Try the style transfer model here: https://www.tensorflow.org/lite/examples/style_transfer/overview#performance_benchmarks

@jonpsy
Contributor Author

jonpsy commented Nov 10, 2021

You can have an end-to-end test using ImageTransformer to validate the implementation. Try the style transfer model here: https://www.tensorflow.org/lite/examples/style_transfer/overview#performance_benchmarks

@lu-wang-g Correct me if I'm wrong, but this requires two input images, i.e. two input tensors, while our entire library is currently based on a single input image. So what's the solution here?

Comment on lines +89 to +135
ASSIGN_OR_RETURN(normalization_options,
GetNormalizationOptionsIfAny(*processing_metadata));
Contributor Author

@jonpsy jonpsy Nov 10, 2021

Since GetNormalizationOptionsIfAny is wrapped inside an anonymous namespace in image_tensor_specs.cc, we might unfortunately need to copy-paste the code here.

Member

Please share as much code as possible between ImagePostprocessor::Init and ImageTensorSpecs::BuildInputImageTensorSpecs. You can put a TODO and implement it in a follow-up PR.

Contributor Author

Just so we're clear, do you mean copying the code from BuildInputImageTensorSpecs when you say "share"?

Member

"Share code" means ImagePostprocessor::Init and ImageTensorSpecs::BuildInputImageTensorSpecs use the same piece of code to do processing or validation.

public:
static tflite::support::StatusOr<std::unique_ptr<ImagePostprocessor>>
Create(core::TfLiteEngine* engine,
const std::initializer_list<int> output_indices,
Member

Use int instead of std::initializer_list for output_index, since it only supports one output tensor. Same for input_index.

Member

Document what input_index is above Create().

Contributor Author

Can't, because the Postprocessor class stores std::vector<int> output_indices_ as a state variable. I've continued the same pattern for input_indices. We can just access output_indices_.at(0). Sounds good?

const std::initializer_list<int> output_indices,
const std::initializer_list<int> input_indices);

// Processes the provided vision::FrameBuffer and populate tensor values.
Member

Update the comment to reflect the real functionality of Postprocess().

// is currently detected by checking if all output tensors data type is uint8.
bool has_uint8_outputs_;

absl::Status Init(std::unique_ptr<vision::NormalizationOptions> options);
Member

Is "absl::Status Init(std::unique_ptrvision::NormalizationOptions options);" implemented anywhere?

Contributor Author

@jonpsy jonpsy Nov 11, 2021

No. It should be Init(const std::vector<int>& input_indices) instead. Updated it, thanks.


@jonpsy
Contributor Author

jonpsy commented Nov 17, 2021

@lu-wang-g Looks like there were some changes in the API? I've updated the PR accordingly. I also filled in the model metadata using this script.

@lu-wang-g
Member

Thanks Sai for updating the code! Yes, some internal APIs have changed, which could affect your code here. I'll take a look at the changes on Friday (catching up on a deadline today, and a team off-site event tomorrow; sorry about the delay!).

@jonpsy
Contributor Author

jonpsy commented Nov 18, 2021

I'm getting a worse PSNR score than the one in the colab notebook. Either my PSNR implementation is wrong or I haven't prepared the images correctly; I'll fix this.

Another concern: ImageTransformer fails in CreateFromOptions when only InputMetadata is provided. Not sure why; I'll investigate further.

@jonpsy
Contributor Author

jonpsy commented Nov 23, 2021

@lu-wang-g Hi, in this notebook, why does this happen?

def preprocess_image(image_path):
.. 
..
  hr_size = (tf.convert_to_tensor(hr_image.shape[:-1]) // 4) * 4
  hr_image = tf.image.crop_to_bounding_box(hr_image, 0, 0, hr_size[0], hr_size[1])
  hr_image = tf.cast(hr_image, tf.float32)

It is converting my (50 x 50) image into a (48 x 48) image, while the inference engine is pretty strict about expecting (50 x 50) as input.

@lu-wang-g
Member


This is the TF model, which is different from the TFLite model. I think here it tries to align the width/height to multiples of 4 (e.g. (50 // 4) * 4 = 48, hence the (50 x 50) image becoming (48 x 48)). You can ignore this in the TFLite implementation.

@jonpsy
Contributor Author

jonpsy commented Nov 24, 2021

@lu-wang-g Reading the same file using tf.io.decode_jpg and DecodeImage gives different values.

[image]

[image]

They're off by 1-2, but since I'm comparing PSNR these differences affect the result a lot. What do you think is the cause?

@lu-wang-g
Member


That's a known issue. It is due to the different JPEG decoding algorithms used by tf.io.decode_jpg and DecodeImage. For testing purposes in the Task library, you can use DecodeImage to parse your image.

@jonpsy
Contributor Author

jonpsy commented Nov 25, 2021

Okay, but in our test I wanted to compare against a known PSNR value with a known set of images, which is why I loaded the same image in colab (as in the figure above) and in our test. But since they read different values, the PSNR comes out very different.

So, how do I perform the PSNR test? Or should I perform a PSNR test at all?
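
For reference, a generic PSNR computation between two equally sized uint8 buffers looks like this (a standard formula for 8-bit images, not the PR's actual test helper):

  #include <cmath>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // PSNR in dB for 8-bit images: 10 * log10(255^2 / MSE).
  // Note: identical buffers give MSE == 0, i.e. infinite PSNR.
  double ComputePsnr(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    double mse = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
      const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
      mse += d * d;
    }
    mse /= static_cast<double>(a.size());
    return 10.0 * std::log10(255.0 * 255.0 / mse);
  }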

// Use a bi-cubically downsampled image as input to the model and compare
// the model output with the original image. Expected PSNR taken from:
// https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/image_enhancing.ipynb.
TEST_F(SuperResolutionTest, PSNRTest) {
Contributor Author

@lu-wang-g, continuing the above comment, I was talking about this test.

Member

What is your PSNR value? Is your output image clearer than the input image, i.e. is it actually enhanced?

Contributor Author

@jonpsy jonpsy Nov 29, 2021

It wasn't particularly bad: around 25.7081, while the expected value is 29.799528. I haven't inspected the output image yet. How do I convert a FrameBuffer to JPEG?

@jonpsy
Contributor Author

jonpsy commented Dec 2, 2021

After calling EncodeToPng(), the output PNG looks something like this.

[image: husky_error]

Possible reasons could be:

  • Corrupted model: to rule this out, I used the ipynb notebook and fed the image to the same model, so this is not the case.
  • Incorrect postprocessing: to rule this out, I used ofstream to dump the output of the FrameBuffer result_or.value() to a .txt file and read it in the ipynb. Then I computed the RMSE between the image_transformer output and the Python inference with the same TFLite model; the loss was around ~2.222, which is not bad. So this is not the case either.

This suggests the error happens while writing the PNG. Have a look at the notebook I created.

Model output from Python inference
[image: husky_python]

Model output from reading the FrameBuffer output via the txt file
[image: husky_cpp]

Looks almost the same, right? That means our framework works properly, but saving to PNG is causing this weird behaviour. I'm not sure why, though; maybe because the input image dims are not divisible by 4? Let me know what you think @lu-wang-g
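
For what it's worth, the dump described above could look roughly like this (a fragment assuming an interleaved RGB FrameBuffer and the surrounding test code; the plane/stride accessors are my reading of the FrameBuffer API, so treat the details as an assumption):

  // Write the first plane of the output FrameBuffer as one integer per line so the
  // values can be re-read and visualized from the notebook.
  std::ofstream out("transformed_pixels.txt");
  const FrameBuffer& fb = result_or.value();
  const uint8_t* data = fb.plane(0).buffer;
  const int num_bytes = fb.plane(0).stride.row_stride_bytes * fb.dimension().height;
  for (int i = 0; i < num_bytes; ++i) {
    out << static_cast<int>(data[i]) << "\n";
  }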

@lu-wang-g
Member

It is probably due to misuse of EncodeToPng, i.e. the width and height are reversed. If you already have a working output image (as you pasted in the thread), just use that one in the test. Good to confirm that the pipeline works.

@jonpsy
Contributor Author

jonpsy commented Dec 3, 2021

It is probably due to misuse of EncodeToPng, i.e. the width and height are reversed.

But the height and width have the same value (200) here, so I'm not sure what the reason is. Let's discuss this in #718, since this PR isn't to blame.

If you already have a working output image (as you pasted in the thread), just use that one in the test. Good to confirm that the pipeline works.

Yeah, I'll use the image I got from dumping and displaying in the ipynb. That should seal the deal for this PR.

@jonpsy
Contributor Author

jonpsy commented Dec 3, 2021

@lu-wang-g I'm closing this PR and dividing it into two as discussed.

@jonpsy jonpsy closed this Dec 3, 2021
@jonpsy jonpsy mentioned this pull request Dec 3, 2021