examples/bert/build.py does not use model weights #2197

Closed as not planned
@tkhanipov

Description

System Info

The example TensorRT LLM engine builder for BERT models currently ignores any model weights present in the model directory and reads only the config.json file. This makes it essentially impossible to generate a working engine from a pretrained model.

A possible fix is available in #2187

Who can help?

@byshiue

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Scenario 1 (simplest)

  1. Prepare a pre-trained model (i.e. have a directory with config.json and the weights file).
  2. Replace the weights file with some random content (e.g. any text file).
  3. Run examples/bert/build.py --model_dir input_model

Scenario 2 (use of the weights)

  1. Prepare a pre-trained model (i.e. have a directory with config.json and the weights file).
  2. Run examples/bert/build.py --model_dir input_model
  3. Execute the generated TensorRT LLM engine with some input and check the output tensor.
  4. Execute the input model with the same input and check the output tensor.

Expected behavior

Scenario 1 (simplest)

build.py should fail with an error message reporting that the weights file is invalid.
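
As an illustration of the kind of check meant here, the following is a minimal sketch that rejects an obviously invalid weights file before any engine is built. It assumes the weights are stored in the safetensors format (an 8-byte little-endian header length followed by a JSON header); the issue does not state the format. The function name `read_safetensors_header` is hypothetical and is not part of build.py or TensorRT LLM.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the parsed JSON header of a safetensors file, or raise ValueError.

    Hypothetical helper: assumes the safetensors layout, not any build.py API.
    """
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            raise ValueError(f"{path}: too short to be a safetensors file")
        # The first 8 bytes hold the JSON header length (unsigned 64-bit, little-endian).
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > file_size - 8:
            raise ValueError(f"{path}: implausible header length {header_len}")
        raw = f.read(header_len)
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"{path}: header is not valid JSON") from exc
    if not isinstance(header, dict):
        raise ValueError(f"{path}: header is not a JSON object")
    return header
```

With a check like this, the text file from step 2 of Scenario 1 would be rejected with a `ValueError` instead of being silently ignored. It validates only the header, not the tensor data itself.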

Scenario 2 (use of the weights)

The two output tensors should be numerically close, element by element.
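
To make "numerically close" concrete, here is a minimal sketch of an element-wise comparison, assuming both outputs have been flattened into plain sequences of floats. The helper name `outputs_close` and the tolerance values are assumptions chosen for illustration (loose, since the engine is built in float16); they are not taken from the example scripts. The criterion is the usual combined absolute and relative tolerance, the same form that `numpy.allclose` uses.

```python
def outputs_close(engine_out, reference_out, rtol=1e-2, atol=1e-2):
    """Return True if every element pair satisfies |x - y| <= atol + rtol * |y|.

    Hypothetical helper: engine_out and reference_out are flat sequences of floats.
    """
    if len(engine_out) != len(reference_out):
        raise ValueError("outputs have different numbers of elements")
    return all(
        abs(x - y) <= atol + rtol * abs(y)
        for x, y in zip(engine_out, reference_out)
    )
```

For example, `outputs_close([0.5, -1.25], [0.501, -1.251])` passes with these tolerances, while outputs that are unrelated to each other, as in Scenario 2, would fail.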

Actual behavior

Scenario 1 (simplest)

build.py finishes successfully and generates bert_outputs/config.json and bert_outputs/BertModel_float16_tp1_rank0.engine.

Scenario 2 (use of the weights)

The two output tensors differ completely and appear unrelated to each other.

Additional notes

The cause is that the script loads only the config and contains no code to load the weights. A fix is available in #2187.

Labels: bug, stale
