-
Notifications
You must be signed in to change notification settings - Fork 4.1k
Enable load_from_bytes of model and ExtScorer #3331
base: master
Are you sure you want to change the base?
Conversation
No Taskcluster jobs started for this pull requestThe `allowPullRequests` configuration for this repository (in `.taskcluster.yml` on the
default branch) does not allow starting tasks for this pull request. |
this will be problematic for compatibility with some systems
Please introduce a new
I really don't see how this feature can work well with the tensorflow infra without forcing allocating huge strings |
native_client/tfmodelstate.cc
Outdated
MemmappedFileSystem::kMemmappedPackageDefaultGraphDef, | ||
&graph_def_); | ||
if (init_from_bytes){ | ||
// Need some help |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it makes any sense at all?
native_client/tflitemodelstate.cc
Outdated
std::cerr << "Error at reading model file " << model_path << std::endl; | ||
return DS_ERR_FAIL_INIT_MMAP; | ||
if (init_from_bytes){ | ||
fbmodel_ = tflite::FlatBufferModel::BuildFromBuffer(model_string.c_str(), model_string.size_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does that works, performance-wise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I see this, much alloc: https://github.com/tensorflow/tensorflow/blob/r2.3/tensorflow/lite/model_builder.cc#L104-L115
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't performed any benchmark, but I haven't noticed any slow down (for loading the pb model). When loading the Scorer, the test seems a bit slower due to reading/converting more than 900MB.
When running for our Language Model (size<20MB), it runs smoothly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if you use a 20MB scorer, yeah, it might not be too bad, but I guess it can also depend on use device you target. My fear is people using that without understanding the consequences and then complaining about memory usage.
@bernardohenz I fail to get the usecase where this can be useful for mmap-able file formats (PBMM, TFLite), given how much this is going to force TensorFlow to allocate some memory. |
Does it? From looking at the implem, I'm unsure this can work without trashing memory of the device: https://github.com/tensorflow/tensorflow/blob/r2.3/tensorflow/lite/model_builder.cc#L104-L115 |
Also, when you introduce that, please try and expose it in as much bindings as you can. |
That's a good point, maybe we need to modify TensorFlow to take ownership of the pointer, or assume it'll outlive the session. |
That's much more invasive and risky, except if we can upstream that, but we know how long it can take. Depending on @bernardohenz use-case, maybe it makes sense to just expose that for non protobuf mmap files ? or even only for TFLite models ? |
@bernardohenz I see you have code already for TFLite, can you describe what kind of help you need? |
@lissyx I will be addressing the implementation of new methods for the API. I have no answer for using |
I want to make sure we don't put ourselves in a position where people will misuse and have unrealistic expectations
This does impact how C++ code handles strings, so maybe it depends on your env and in your case it makes sense? If you can share more of a stack, for example. |
@bernardohenz can you:
I could take a look and run CI on that, so we can cross-check |
Maybe this should be warned about in the docs: "using this feature can have negative impact on the amount of memory consumed by your application" |
@bernardohenz Your PR needs rebasing, it seems. |
@bernardohenz Please don't change file's mode from 644 to 755. |
@@ -69,45 +71,60 @@ void Scorer::setup_char_map() | |||
} | |||
} | |||
|
|||
int Scorer::load_lm(const std::string& lm_path) | |||
int Scorer::load_lm(const std::string& lm_string, bool load_from_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either provide a default value, split the implementation in two load_lm
or update all call sites: this PR does not build because of load_lm
calls in generate_scorer_package.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've provided a default value (false) for the boolean
} | ||
|
||
int Scorer::load_trie(std::ifstream& fin, const std::string& file_path) | ||
int Scorer::load_trie(std::stringstream& fin, const std::string& file_path, bool load_from_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stringstream
is compatible with ifstream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I tested yes
native_client/deepspeech.cc
Outdated
} | ||
|
||
int | ||
DS_CreateModel_(const std::string &aModelString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this is implemented, we go straight from const char*
to std::string &
native_client/deepspeech.h
Outdated
* @return Zero on success, non-zero on failure. | ||
*/ | ||
DEEPSPEECH_EXPORT | ||
int DS_CreateModelFromBuffer(const std::string &aModelBuffer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string
here makes that C++
limited, the API is C
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be changing all the strings to char *
. But I'll have to pass the buffer size as parameter (I was having problems when a buffer encountered the \0
in the middle of a buffer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats perfectly fine.
* @return Zero on success, non-zero on failure (invalid arguments). | ||
*/ | ||
DEEPSPEECH_EXPORT | ||
int DS_EnableExternalScorerFromBuffer(ModelState* aCtx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, this is limiting to C++
callers when we expose C
API.
@@ -17,9 +17,13 @@ DontBhiksha::DontBhiksha(const void * /*base*/, uint64_t /*max_offset*/, uint64_ | |||
const uint8_t kArrayBhikshaVersion = 0; | |||
|
|||
// TODO: put this in binary file header instead when I change the binary file format again. | |||
void ArrayBhiksha::UpdateConfigFromBinary(const BinaryFormat &file, uint64_t offset, Config &config) { | |||
void ArrayBhiksha::UpdateConfigFromBinary(const BinaryFormat &file, uint64_t offset, Config &config, bool load_from_bytes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be upstreamed ?
4726e4a
to
8113f98
Compare
@bernardohenz You should now be able to run PRs yourself. |
I've fixed the API functions exposition, tested the loading for TFLite. I've performed a small time benchmark for running the client (both for loading a .b and a .tflite): Loading a dummy .pb
Loading a .tflite
|
@bernardohenz There are still failures related to |
Just for the sake of completeness, can you add figures for before your changes?
Good, so now it's just about:
Right? |
@bernardohenz I see some merges, can you make sure this branch is clean of merges ? |
@lissyx yep, I'm trying to squash the commits, sorry. |
e42c7cb
to
5a5ef46
Compare
Just writing down some notes from my talk with Bernardo last week on the status of this PR:
|
5a5ef46
to
df40e22
Compare
This PR allows the loading of model and ExtScorer through array of bytes (
string
)Some observations:
D_GLIBCXX_USE_CXX11_ABI=1
when compiling the lib--init_from_bytes
) in the client for testing the loadingDS_CreateModel
for instance)