- Personalization systems are used for tasks like click-through rate (CTR) prediction and ranking. Two perspectives shaped the current design of personalization and recommendation models:
- Content-filtering, where users selected their preferred categories and were matched on their preferences
- Collaborative filtering, where recommendations are based on past user behavior.
- Neighborhood methods that group users/products in a latent space were also deployed.
- Predictive models
- Embeddings: map each category of a categorical feature to a dense representation (a vector) in an abstract space.
- Matrix factorization:
- Factorization machine:
- Prediction function phi: R^n -> T, mapping an input datapoint x in R^n to a target label y in T.
- Multilayer perceptron:
- A series of fully connected layers, each followed by an activation function.
- Users and products are described by many continuous and categorical features. Categorical features are represented by embeddings. Continuous features are transformed by an MLP to yield a dense representation of the same length as the embedding vectors.
- Second-order interactions: take the dot product between all pairs of embedding vectors and the processed dense features. Concatenate these dot products with the original processed dense features, post-process with an output MLP, and feed the result into a sigmoid function to give a probability.
- Embeddings contribute the majority of the parameters, with several tables each requiring in excess of multiple GBs of memory.
- MLP parameters are smaller in memory but translate into sizeable amounts of compute.
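- Data parallelism is preferred for MLPs, since samples can be processed concurrently on different devices and communication is only needed when accumulating updates.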
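
The forward pass described in these notes (embedding lookup, an MLP over the continuous features, pairwise dot products, an output MLP, then a sigmoid) can be sketched roughly as below. This is a minimal pure-Python illustration, not a reference implementation: the function names (`linear`, `pairwise_dots`, `forward`, `make_params`), the layer sizes, the ReLU activation, and the random initialization are all assumptions made for the example.

```python
import math
import random

def linear(x, weights, bias):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out):
    """Hypothetical initializer: small random weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def pairwise_dots(vectors):
    """Dot product between every distinct pair of vectors (second-order interactions)."""
    return [sum(a * b for a, b in zip(vectors[i], vectors[j]))
            for i in range(len(vectors)) for j in range(i + 1, len(vectors))]

def forward(dense_x, cat_ids, params):
    # MLP over continuous features: output has the same length as an embedding.
    h = dense_x
    for weights, bias in params["bottom"]:
        h = relu(linear(h, weights, bias))
    # Embedding lookup: one table per categorical feature, one row per category.
    embs = [table[i] for table, i in zip(params["tables"], cat_ids)]
    # Second-order interactions among the processed dense vector and the embeddings.
    inter = pairwise_dots([h] + embs)
    # Concatenate with the processed dense features, then apply the output MLP.
    z = h + inter
    *hidden, last = params["top"]
    for weights, bias in hidden:
        z = relu(linear(z, weights, bias))
    (logit,) = linear(z, *last)  # final layer has a single output unit
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid gives a probability

def make_params(rng, n_dense, table_sizes, dim, hidden):
    n_vec = 1 + len(table_sizes)          # dense vector plus one embedding per table
    n_inter = n_vec * (n_vec - 1) // 2    # number of distinct pairs
    return {
        "bottom": [init_layer(rng, n_dense, hidden), init_layer(rng, hidden, dim)],
        "tables": [[[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]
                   for n in table_sizes],
        "top": [init_layer(rng, dim + n_inter, hidden), init_layer(rng, hidden, 1)],
    }

rng = random.Random(0)
params = make_params(rng, n_dense=3, table_sizes=[10, 5], dim=4, hidden=8)
# Three continuous features and two categorical features (category ids 7 and 2).
p = forward([0.2, -1.0, 0.5], [7, 2], params)
```

Here `p` plays the role of a predicted click probability for one sample. In this toy setup the tables are tiny; the memory concern in the notes arises when each table has millions of rows, which is why the embedding tables and the MLPs may call for different parallelization strategies.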