Description
System Info
Currently the example TensorRT-LLM engine builder for BERT models simply ignores model weights if they are present in the model directory; it only reads the `config.json` file, making it essentially impossible to generate a working engine from a pretrained model.
A possible fix is available in #2187.
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Scenario 1 (simplest)
- Prepare a pre-trained model (i.e. have a directory with `config.json` and the weights file).
- Replace the weights file with some random content (e.g. any text file).
- Run `examples/bert/build.py --model_dir input_model`
Scenario 2 (use of the weights)
- Prepare a pre-trained model (i.e. have a directory with `config.json` and the weights file).
- Run `examples/bert/build.py --model_dir input_model`
- Execute the generated TensorRT LLM engine with some input and check the output tensor.
- Execute the input model with the same input and check the output tensor.
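The comparison in the last two steps can be sketched as below. The tensors here are placeholders standing in for the engine output and the reference model output; the variable names and tolerance values are illustrative, not taken from the repository:

```python
import numpy as np

# Placeholder outputs: in practice these would come from running the
# generated TensorRT-LLM engine and from running the original BERT model
# with the same input. FP16 inference justifies loose tolerances.
engine_output = np.array([0.1233, -1.5022, 2.0011], dtype=np.float32)
reference_output = np.array([0.1230, -1.5020, 2.0010], dtype=np.float32)

# With correctly loaded weights the two should agree element-wise
# within fp16-level tolerance.
close = np.allclose(engine_output, reference_output, rtol=1e-2, atol=1e-3)
print(close)
```

With the current builder, this check fails because the engine was built from randomly initialized parameters rather than the pretrained weights.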
Expected behavior
Scenario 1 (simplest)
`build.py` shall fail with an error message complaining about the invalid weights file.
Scenario 2 (use of the weights)
The output tensors shall be numerically close, element-wise.
Actual behavior
Scenario 1 (simplest)
`build.py` finishes successfully, generating `bert_outputs/config.json` and `bert_outputs/BertModel_float16_tp1_rank0.engine`.
Scenario 2 (use of the weights)
The output tensors are completely different from each other and appear unrelated.
Additional notes
The problem is that the script only loads the config and never loads the weights. The fix is available in #2187.
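The kind of step the builder is missing can be illustrated as follows. This is a minimal sketch, not the actual fix from #2187: it assumes the checkpoint has already been deserialized into a dict of named arrays, and the `model_params` mapping is an illustrative stand-in for the builder's in-memory model:

```python
import numpy as np

def load_weights(model_params, checkpoint):
    """Copy checkpoint tensors into the model's named parameters.

    model_params: dict mapping parameter name -> numpy array (illustrative
    stand-in for the builder's in-memory model parameters).
    checkpoint: dict mapping the same names -> trained weight arrays,
    e.g. obtained by deserializing the checkpoint file.
    """
    missing = set(model_params) - set(checkpoint)
    if missing:
        # Without a check like this, randomly initialized parameters
        # silently end up in the engine -- the bug described above.
        raise KeyError(f"checkpoint is missing parameters: {sorted(missing)}")
    for name, param in model_params.items():
        param[...] = checkpoint[name]  # in-place copy preserves dtype/shape
```

The key point is simply that some such step must run between reading `config.json` and serializing the engine; today the script skips it entirely.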