[ML] SPLADE embedding support #131679
Pinging @elastic/ml-core (Team:ML)
@@ -121,11 +135,12 @@ public InferenceConfig apply(InferenceConfigUpdate update) {
         return new TextExpansionConfig(
             vocabularyConfig,
             configUpdate.tokenizationUpdate == null ? tokenization : configUpdate.tokenizationUpdate.apply(tokenization),
-            Optional.ofNullable(configUpdate.getResultsField()).orElse(resultsField)
+            Optional.ofNullable(configUpdate.getResultsField()).orElse(resultsField),
+            Optional.ofNullable(configUpdate.getExpansionType()).orElse(expansionType)
Changing the expansionType changes the way the output is processed and may not be compatible with the different types of model. Switching this value for the ELSER model would break the processing.
Fixing expansionType when the config is created and not allowing it to be overridden at inference is good enough. If the wrong expansionType value is set, the user will have to recreate the model.
@@ -24,22 +24,24 @@

 import static org.elasticsearch.xpack.core.ml.inference.trainedmodel.NlpConfig.RESULTS_FIELD;
 import static org.elasticsearch.xpack.core.ml.inference.trainedmodel.NlpConfig.TOKENIZATION;
+import static org.elasticsearch.xpack.core.ml.inference.trainedmodel.TextExpansionConfig.EXPANSION_TYPE;
Same comment as above regarding updating expansion type. I think the changes in this file can be reverted.
     }

     public TextExpansionConfig(StreamInput in) throws IOException {
         vocabularyConfig = new VocabularyConfig(in);
         tokenization = in.readNamedWriteable(Tokenization.class);
         resultsField = in.readOptionalString();
+        expansionType = in.readOptionalString();
In a mixed-version cluster, older nodes will not expect this new field to be serialised. We protect against this using TransportVersions. Please add a new version to TransportVersions.java at:
public static final TransportVersion ESQL_TOPN_TIMINGS = def(9_128_0_00);
-        expansionType = in.readOptionalString();
+        if (in.getTransportVersion().onOrAfter(TransportVersions.ML_EXPANSION_TYPE)) {
+            expansionType = in.readOptionalString();
+        }
     }

     @Override
     public void writeTo(StreamOutput out) throws IOException {
         vocabularyConfig.writeTo(out);
         out.writeNamedWriteable(tokenization);
         out.writeOptionalString(resultsField);
+        out.writeOptionalString(expansionType);
-        out.writeOptionalString(expansionType);
+        if (out.getTransportVersion().onOrAfter(TransportVersions.ML_EXPANSION_TYPE)) {
+            out.writeOptionalString(expansionType);
+        }
Overview
Elasticsearch supports sparse embeddings from models other than ELSER; this support was introduced in #116935.
A notable example of a sparse vector model is SPLADE, which is a reference model for ELSER. However, the output format of SPLADE is not the same as that of ELSER, so a post-processing step has to be implemented for it.
Interface change
This PR introduces a new expansion_type parameter to the trained model resource. expansion_type can be one of elser or splade. If it is not specified, it defaults to elser.
To use the SPLADE model, eland needs to be updated to support the expansion_type parameter: elastic/eland#802
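As an illustration of the defaulting rule only, here is a minimal sketch of how a missing expansion_type could resolve to elser. The ExpansionType enum and its fromString method are hypothetical names introduced for this example; the diff in this PR stores the value as an optional string, so this is not the code under review.

```java
import java.util.Locale;

// Hypothetical enum, not part of this PR: models the two allowed expansion_type values.
enum ExpansionType {
    ELSER,
    SPLADE;

    /** Resolves a configured value, falling back to ELSER when none is given. */
    public static ExpansionType fromString(String value) {
        if (value == null) {
            return ELSER; // documented default when expansion_type is not specified
        }
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "unknown expansion_type [" + value + "], expected one of [elser, splade]",
                e
            );
        }
    }

    /** Lower-case form, matching how the values are written in the description. */
    @Override
    public String toString() {
        return name().toLowerCase(Locale.ROOT);
    }
}
```

Rejecting unknown values when the config is created fits the review feedback that a wrong expansion type should surface early rather than at inference time.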
Logic for SPLADE
The SPLADE model outputs an embedding of shape [1, input_token_size, vocab_size]. The second dimension differs from ELSER's output, which has shape [1, chunk_size, vocab_size].
For SPLADE, we need to apply the saturation function to the output, which is
log(1 + relu(x))
, and then apply max pooling over the second dimension.
To be considered
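The SPLADE post-processing described under "Logic for SPLADE" can be sketched as follows. This is a standalone illustration, not the code in this PR: the class and method names are hypothetical, the leading batch dimension of size 1 is assumed to have been stripped already, and padding or attention masks are not handled.

```java
import java.util.Arrays;

// Hypothetical helper, not the PR's implementation.
final class SpladePoolingSketch {

    /** Saturation function from the description: log(1 + relu(x)). */
    public static double saturate(double x) {
        return Math.log1p(Math.max(0.0, x));
    }

    /**
     * Collapses logits of shape [input_token_size][vocab_size] into one weight
     * per vocabulary entry by taking the maximum saturated value over the tokens.
     */
    public static double[] maxPool(double[][] logits) {
        if (logits.length == 0) {
            throw new IllegalArgumentException("expected at least one token");
        }
        int vocabSize = logits[0].length;
        // Starts at zero; saturate() never returns a negative value, so zero is a safe floor.
        double[] pooled = new double[vocabSize];
        for (double[] tokenLogits : logits) {
            if (tokenLogits.length != vocabSize) {
                throw new IllegalArgumentException("all tokens must have vocab_size entries");
            }
            for (int v = 0; v < vocabSize; v++) {
                pooled[v] = Math.max(pooled[v], saturate(tokenLogits[v]));
            }
        }
        return pooled;
    }

    public static void main(String[] args) {
        // Two tokens, vocabulary of three entries.
        double[][] logits = { { -1.0, 0.0, 2.0 }, { 0.5, -3.0, 1.0 } };
        System.out.println(Arrays.toString(maxPool(logits)));
    }
}
```

Because the saturation function is monotonic, applying it before or after the max yields the same weights; the sketch applies it first to follow the order given in the description.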
Reference