- Occurs when a model learns the training data too well, including noise, and fails to generalize to unseen data.
- Happens when a model is too simple to capture the underlying patterns in the data, leading to poor performance on both training and test sets.
- A phenomenon in which a neural network forgets previously learned tasks when fine-tuned on new tasks.
- Occurs when information from outside the training dataset is used to create the model, leading to overly optimistic performance.
- A metric for evaluating the quality of text generation tasks by comparing machine-generated text to human references using n-gram overlap.
- Uses contextual embeddings from BERT to evaluate the similarity between machine-generated text and reference text.
- Measures the overlap of n-grams, word sequences, and word pairs between machine-generated and reference texts, often used for summarization tasks.
- Evaluates the quality of probabilistic models by measuring how well they predict a sample. Lower perplexity indicates better performance.
- Evaluates text generation by aligning candidate and reference words using exact, stem, and synonym matches, weighting recall more heavily than precision.
- Measures the probability that at least one of k sampled attempts solves the task, commonly reported for code generation benchmarks (see the estimator sketch below).
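A small worked sketch of two of the generation metrics above: perplexity computed from per-token log-probabilities, and the unbiased pass@k estimator commonly used for code benchmarks. Function names and the example numbers are illustrative.

```python
import math
from math import comb

def perplexity(token_log_probs):
    # PPL = exp(-(1/N) * sum of log p(token)); lower is better.
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def pass_at_k(n, c, k):
    # Unbiased estimate of pass@k from n samples of which c are correct:
    # 1 - C(n - c, k) / C(n, k)
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(perplexity([-0.2, -1.5, -0.7, -0.1]))   # ~1.87
print(pass_at_k(n=20, c=3, k=5))              # ~0.60
```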
- The proportion of correctly classified instances out of the total instances.
- The proportion of relevant instances correctly identified out of all relevant instances.
- The proportion of relevant instances out of all instances identified as relevant.
- The harmonic mean of precision and recall, balancing both metrics.
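A minimal worked example of the four classification metrics above, computed from raw confusion-matrix counts (the counts themselves are made up for illustration).

```python
def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct predictions / all predictions
    precision = tp / (tp + fp)                   # predicted positives that are truly positive
    recall = tp / (tp + fn)                      # true positives that were actually found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# e.g. 80 true positives, 10 false positives, 20 false negatives, 90 true negatives
print(classification_metrics(80, 10, 20, 90))
# (0.85, 0.888..., 0.8, 0.842...)
```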
- Uses unlabeled data to learn representations by creating pseudo-labels or tasks.
- Predicts masked tokens in a sentence, often used in transformer models like BERT.
- Predicts whether two sentences are consecutive, used to improve understanding of sentence relationships.
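A minimal sketch of the masked-language-modeling input corruption described above. Real BERT-style masking also sometimes keeps or randomly replaces the selected tokens; that detail is omitted here.

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15):
    """Replace a random subset of tokens with a mask token and record the targets
    the model must recover at those positions."""
    corrupted, targets = [], []
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            corrupted.append(mask_token)
            targets.append((i, tok))
        else:
            corrupted.append(tok)
    return corrupted, targets

print(mask_tokens("the cat sat on the mat".split()))
```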
- Fine-tuning a model on labeled data to adapt it to specific tasks.
- Updating all model parameters on task-specific data.
- Fine-tunes a subset of parameters, reducing computational cost.
- Fine-tunes specific layers or modules of the model.
- Adds new parameters to the model for specific tasks while keeping original parameters fixed.
- Lightweight modules inserted into the model that can be fine-tuned for new tasks.
- Re-expresses the weight update in a smaller reparameterized form (e.g., low-rank factors) so that only this compact form is trained.
- Freezes the original weights and trains small low-rank update matrices injected alongside them, saving memory and computation (see the sketch after this group).
- A derivative of LoRA designed for better efficiency and scaling.
- Combines LoRA with quantized models for memory-efficient fine-tuning.
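A minimal PyTorch sketch of the LoRA idea above: the original weight is frozen and only a low-rank update B·A, scaled by alpha/rank, is trained. The class and attribute names are illustrative, not any particular library's API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense weight plus a trainable low-rank update (B @ A)."""

    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # trainable
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))        # trainable, zero-init
        self.scaling = alpha / rank

    def forward(self, x):
        # y = x W^T + (alpha/rank) * x A^T B^T; only A and B receive gradients
        return x @ self.weight.T + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T

layer = LoRALinear(64, 64)
print(layer(torch.randn(2, 64)).shape)   # torch.Size([2, 64])
```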
- Reduces model size by representing parameters with lower-bit precision, saving memory and improving efficiency.
- Applies quantization at multiple stages for further efficiency.
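A minimal sketch of weight quantization: symmetric per-tensor int8, storing 8-bit integers plus a single float scale. Real schemes typically quantize per-channel or per-block and may use 4-bit formats.

```python
import torch

def quantize_int8(w):
    """Map float weights to int8 with one shared scale."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.float() * scale

w = torch.randn(4, 4)
q, scale = quantize_int8(w)
print((w - dequantize(q, scale)).abs().max())   # small reconstruction error
```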
- Learnable vectors added to prompts for task-specific adaptation.
- Fine-tunes prompts instead of model parameters to adapt models to new tasks.
- Optimizes prefixes added to input sequences for task-specific performance.
- Enhances prompt tuning with continuous parameter optimization.
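A minimal sketch of the soft-prompt idea behind prompt and prefix tuning: a handful of trainable virtual-token embeddings are prepended to the frozen model's input embeddings, and only those vectors are optimized.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable virtual-token embeddings prepended to the input embeddings."""

    def __init__(self, n_virtual, embed_dim):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_virtual, embed_dim) * 0.02)

    def forward(self, input_embeds):             # (batch, seq_len, embed_dim)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

soft = SoftPrompt(n_virtual=10, embed_dim=32)
print(soft(torch.randn(2, 5, 32)).shape)         # torch.Size([2, 15, 32])
```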
- Fine-tunes a model on multiple tasks simultaneously to improve generalization.
- Fine-tunes a model to follow natural language instructions across tasks.
- Combines human feedback with reinforcement learning to fine-tune models for desired behavior.
- Fine-tunes models using predefined principles instead of human feedback.
- Fine-tunes models on diverse tasks with hierarchical or meta-contextual instructions.
- Adapts models for specific tasks with limited (few-shot) or no (zero-shot) labeled data.
- Extends instruction tuning to handle multiple modalities, like text and images.
- Fine-tunes models to generate intermediate reasoning steps for better problem-solving.
- Trains specialized sub-networks (experts) alongside a router that activates only a few of them per input, adding capacity without a proportional increase in compute (see the routing sketch after this group).
- Models evaluate and refine their outputs iteratively to improve performance.
- Combines neural networks with symbolic reasoning for better generalization and interpretability.
- Aligns models with specific values, societal norms, or operational goals.
- Fine-tunes models for understanding abstract or multi-task instructions.
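A toy sketch of the mixture-of-experts routing mentioned above, with top-1 routing and no load balancing: a small router scores the experts and each token is processed only by the expert it is routed to.

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    """Toy mixture-of-experts layer with top-1 routing."""

    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))

    def forward(self, x):                        # x: (tokens, dim)
        gate = self.router(x).softmax(dim=-1)    # routing probabilities per token
        chosen = gate.argmax(dim=-1)             # index of the selected expert
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = chosen == i
            if mask.any():
                out[mask] = expert(x[mask]) * gate[mask, i].unsqueeze(-1)
        return out

moe = Top1MoE(dim=16)
print(moe(torch.randn(8, 16)).shape)             # torch.Size([8, 16])
```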
- Creates a model to predict human preferences for fine-tuning with reinforcement learning.
- Uses reward modeling and reinforcement learning to align models with human feedback.
- A reinforcement learning algorithm commonly used in RLHF.
- Optimizes the model directly on preference pairs with a simple classification-style loss, avoiding a separately trained reward model and RL loop (see the loss sketch below).
- Extends RLHF by incorporating feedback from domain experts.
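A minimal sketch of the DPO loss on a single preference pair, assuming summed sequence log-probabilities from the trained policy and a frozen reference model; beta controls how far the policy may drift from the reference.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Push the policy to prefer the chosen response more strongly than the reference does."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin))

loss = dpo_loss(torch.tensor(-12.0), torch.tensor(-15.0),
                torch.tensor(-13.0), torch.tensor(-14.0))
print(loss)   # scalar loss for this single pair
```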
- Enables models to perform tasks by providing task examples in the input context (see the prompt sketch after this group).
- Dynamically adapts the context or examples during inference for better performance.
- Models generate or refine in-context examples during inference.
- Selects task-relevant examples dynamically based on a scoring mechanism.
- Uses meta-learning principles to improve example selection and adaptation.
- Includes contrasting examples to help the model distinguish patterns.
- Incorporates reinforcement learning to optimize example selection.
- Combines in-context learning with external retrieval or data augmentation.
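A minimal sketch of plain in-context learning from the entries above: labeled demonstrations are concatenated ahead of the new query, and the model is expected to continue the pattern. The task, labels, and formatting are illustrative.

```python
def build_few_shot_prompt(examples, query):
    """Concatenate labeled demonstrations followed by the unlabeled query."""
    lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

demos = [
    ("Great movie, I loved every minute.", "positive"),
    ("Terrible pacing and a weak plot.", "negative"),
]
print(build_few_shot_prompt(demos, "Surprisingly fun from start to finish."))
```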
- Iteratively refines prompts based on model outputs or feedback.
- Optimizes prompts using gradient-based methods for specific tasks.
- Encourages models to generate intermediate reasoning steps to improve task performance (see the prompt example at the end of this list).
- Models explore multiple reasoning paths, like a decision tree, for complex tasks.
- Represents reasoning as a graph structure, capturing complex relationships between steps.
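A minimal chain-of-thought prompt as described above: the worked demonstration spells out intermediate steps, nudging the model to reason before answering the new question. The questions are standard toy arithmetic examples.

```python
cot_prompt = (
    "Q: A cafeteria had 23 apples. It used 20 to make lunch and bought 6 more. "
    "How many apples does it have now?\n"
    "A: It started with 23 apples, used 20, leaving 3. It bought 6 more, so 3 + 6 = 9. "
    "The answer is 9.\n\n"
    "Q: Roger has 5 tennis balls. He buys 2 cans with 3 balls each. "
    "How many tennis balls does he have now?\n"
    "A:"
)
print(cot_prompt)
```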