ONNX model 13049 outputs
Which pooling technique does the '13049' output use?
Maybe the answer exists, but I don't think it's obvious. When I manually mean-pool and normalize the ONNX embeddings, the results line up with HF/ST mean pooling, but they do not line up with 13049, so I assume 13049 does not use mean pooling?
As a side note, if mean pooling is the recommended technique, why not have the ONNX model output the mean-pooled embedding directly, rather than leaving the post-processing to us?
These questions may be ill-posed due to my lack of knowledge, but hopefully I get answers!
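For reference, the manual post-processing described above can be sketched as follows. This is a minimal NumPy sketch, assuming text_embeds has shape (batch, seq_len, 1024) and attention_mask marks real tokens with 1 and padding with 0. The function name mean_pool is only illustrative, and this is not necessarily the exact code HF/ST run internally.

```python
import numpy as np

def mean_pool(token_embeds, attention_mask):
    """Average token embeddings over non-padding positions, then L2-normalize."""
    # token_embeds: (batch, seq_len, dim); attention_mask: (batch, seq_len) of 0/1
    mask = attention_mask[..., None].astype(token_embeds.dtype)
    summed = (token_embeds * mask).sum(axis=1)        # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)    # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Tiny example: the third token is padding and must not affect the result.
token_embeds = np.array([[[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]]])
attention_mask = np.array([[1, 1, 0]])
pooled = mean_pool(token_embeds, attention_mask)
```

Masking before averaging matters for batched inputs: without it, padding positions would pull the mean away from the value obtained for the same sentence run on its own.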
Hi @here4data, what do you mean by 13049?
Hi @jupyterjazz, 13049 is the label given to one of the outputs of the ONNX model, along with text_embeds:
{'inputs': [{'name': 'task_id', 'datatype': 'INT64', 'shape': [1]},
            {'name': 'attention_mask', 'datatype': 'INT64', 'shape': [-1, -1]},
            {'name': 'input_ids', 'datatype': 'INT64', 'shape': [-1, -1]}],
 'outputs': [{'name': '13049', 'datatype': 'FP32', 'shape': [-1, 1024]},
             {'name': 'text_embeds', 'datatype': 'FP32', 'shape': [-1, -1, 1024]}]}
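To make the signature above concrete, here is a rough sketch of how the three inputs could be assembled with the listed names, dtypes, and shapes. The helper build_onnx_inputs is hypothetical, the token ids are made up, and which adapter a given task_id value selects is not stated in this thread, so the value 0 below is only a placeholder.

```python
import numpy as np

def build_onnx_inputs(input_ids, attention_mask, task_id):
    """Assemble the feed dict using the input names and dtypes listed above."""
    feed = {
        "input_ids": np.asarray(input_ids, dtype=np.int64),            # [batch, seq_len]
        "attention_mask": np.asarray(attention_mask, dtype=np.int64),  # [batch, seq_len]
        "task_id": np.asarray([task_id], dtype=np.int64),              # shape [1]
    }
    if feed["input_ids"].shape != feed["attention_mask"].shape:
        raise ValueError("input_ids and attention_mask must have the same shape")
    return feed

# Made-up token ids for illustration; real ids come from the model's tokenizer.
feed = build_onnx_inputs([[101, 7, 102]], [[1, 1, 1]], task_id=0)

# Usage with onnxruntime (not executed here; the model path is hypothetical):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx")
#   (text_embeds,) = session.run(["text_embeds"], feed)
```

Requesting only text_embeds in session.run skips the 13049 output, which still needs to be mean-pooled afterwards.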
I also wanted to ask about the task_id input being optional in HF/ST but mandatory in ONNX. Are the HF/ST approaches just using one of the LoRAs by default? And if so, which one?
Thanks!
Hi @here4data, thanks for clarifying the question. The 13049 output is a CLS embedding. However, we don't train our models on CLS embeddings, so it's redundant and should not be used.
I also wanted to ask about the task_id input being optional in HF/ST but mandatory in ONNX. Are the HF/ST approaches just using one of the LoRAs by default? And if so, which one?
HF/ST use no adapters by default. We went with a mandatory task type for ONNX because of implementation nuances. However, supporting a no-adapter option should not be difficult, so if you'd like to see it, let me know and I can update it sometime next week.
@jupyterjazz Thanks for the reply. A no-adapter option would be very helpful in the ONNX format. I appreciate it very much!
@jupyterjazz Is it necessary to prepend the input text with the task_instructions (for retrieval.query/retrieval.passage) when doing inference using ONNX, or is it somehow automatically prepended by the model during inference?
Hi @Nemphys, yes, you should prepend the instructions for the retrieval tasks.
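As a rough illustration of that step, the instruction can be prepended to each text before tokenization. The helper name and the instruction strings below are placeholders, not the model's actual task instructions; the real strings should be taken from the model's own configuration.

```python
def prepend_instruction(texts, instruction):
    """Return a new list with the task instruction prefixed to every text."""
    return [instruction + text for text in texts]

# Placeholder strings (assumptions): replace with the model's real instructions
# for retrieval.query and retrieval.passage.
QUERY_INSTRUCTION = "<retrieval.query instruction> "
PASSAGE_INSTRUCTION = "<retrieval.passage instruction> "

queries = prepend_instruction(["what is mean pooling?"], QUERY_INSTRUCTION)
passages = prepend_instruction(["Mean pooling averages token embeddings."], PASSAGE_INSTRUCTION)
```

The prefixed strings are then tokenized and passed to the ONNX model together with the task_id for the matching task.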
@jupyterjazz Thank you!
@jupyterjazz Have you had a chance to update the ONNX model with a no-adapter option?
No worries if there is too much to do currently!