[feat] support fine-tuning of reranker models #4671


Merged · 5 commits · Jun 24, 2025
3 changes: 2 additions & 1 deletion README.md
@@ -74,6 +74,7 @@ You can contact us and communicate with us by adding our group:


## 🎉 News
- 🎁 2025.06.23: Fine-tuning of reranker models is supported. Training scripts can be found here: [Reranker](https://github.com/modelscope/ms-swift/blob/main/examples/train/reranker/train_reranker.sh).
Collaborator comment: Please add this to the Chinese README as well.

- 🎁 2025.06.18: Support for accelerating the ms-swift [inference](https://github.com/modelscope/ms-swift/blob/main/examples/infer/sglang), deployment, evaluation, and UI modules using the [sglang](https://github.com/sgl-project/sglang) inference acceleration engine. Simply set `--infer_backend sglang` to enable it.
- 🎁 2025.06.15: Support for GKD training on both pure text large models and multimodal models. Training scripts can be found here: [Pure Text](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd), [Multimodal](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd).
- 🎁 2025.06.11: Support for using Megatron parallelism techniques for RLHF training. The training script can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/megatron/rlhf).
@@ -295,7 +296,7 @@ Supported Training Methods:
| ORPO Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | ✅ |
| Classification Model Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls/qwen2_5/sft.sh) | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls/qwen2_vl/sft.sh) |
| Embedding Model Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding/train_gte.sh) | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding/train_gme.sh) |
| Reranker Model Training | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |


Pre-training:
2 changes: 2 additions & 0 deletions README_CN.md
@@ -70,6 +70,7 @@
- **Model Quantization**: Supports quantized export with AWQ, GPTQ, and BNB; the exported models support accelerated inference with vLLM/SGLang/LmDeploy and can be trained further.

## 🎉 News
- 🎁 2025.06.23: Reranker model training is supported. For the training script, see [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/reranker/train_reranker.sh).
- 🎁 2025.06.18: The [sglang](https://github.com/sgl-project/sglang) inference acceleration engine can now be used to accelerate ms-swift's [inference](https://github.com/modelscope/ms-swift/blob/main/examples/infer/sglang)/deployment/evaluation/UI modules; simply set `--infer_backend sglang`.
- 🎁 2025.06.15: GKD training is supported for both plain-text LLMs and multimodal models. Training scripts: [plain text](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd), [multimodal](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd).
- 🎁 2025.06.11: RLHF training with Megatron parallelism is supported. For the training script, see [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/megatron/rlhf).
@@ -284,6 +285,7 @@ print(f'response: {resp_list[0].choices[0].message.content}')
| ORPO Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | ✅ |
| Classification Model Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls/qwen2_5/sft.sh) | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls/qwen2_vl/sft.sh) |
| Embedding Model Training | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding/train_gte.sh) | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding/train_gme.sh) |
| Reranker Model Training | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |


Pre-training:
104 changes: 104 additions & 0 deletions docs/source/BestPractices/Reranker训练.md
@@ -0,0 +1,104 @@
# Reranker Training

SWIFT supports Reranker model training. The currently supported models are:

1. modernbert reranker model
- [ModelScope](https://www.modelscope.cn/models/iic/gte-reranker-modernbert-base) [Hugging Face](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base)
2. qwen3-reranker model
- 0.6B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-0.6B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B)
- 4B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-4B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-4B)
- 8B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-8B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-8B)

## Implementation Methods

SWIFT currently supports two implementations of Reranker models, which differ significantly in architecture and loss computation:

### 1. Classification Reranker

**Applicable Models:** modernbert reranker models (e.g., gte-reranker-modernbert-base)

**Core Principles:**
- Based on a sequence-classification architecture, with a classification head added on top of the pre-trained model
- Input: a query-document pair; Output: a single relevance score


### 2. Generative Reranker

**Applicable Models:** qwen3-reranker models (0.6B/4B/8B)

**Core Principles:**
- Based on a generative language model architecture (CausalLM)
- Input: a query-document pair; Output: the probability of specific tokens (e.g., "yes"/"no")
- Classification is performed by comparing the logits of these specific tokens at the final position

## Loss Function Types

SWIFT supports multiple loss functions for training Reranker models:

### Pointwise Loss Functions
Pointwise methods turn ranking into a binary classification problem and treat each query-document pair independently:

- **Core idea:** binary classification of each query-document pair to determine whether the document is relevant to the query
- **Loss function:** binary cross-entropy
- **Use cases:** simple and efficient, suitable for large-scale training data

Environment variables:
- `GENERATIVE_RERANKER_POSITIVE_TOKEN`: positive token (default: "yes")
- `GENERATIVE_RERANKER_NEGATIVE_TOKEN`: negative token (default: "no")

### Listwise Loss Functions
Listwise methods turn ranking into a multi-class classification problem, selecting the positive example from multiple candidate documents:

- **Core idea:** multi-class classification over each query's candidate document group (1 positive + n negatives) to identify the positive document
- **Loss function:** multi-class cross-entropy
- **Use cases:** learns the relative ordering between documents, which better matches the real needs of information retrieval

Environment variables:
- `LISTWISE_RERANKER_TEMPERATURE`: softmax temperature (default: 1.0)
- `LISTWISE_RERANKER_MIN_GROUP_SIZE`: minimum group size (default: 2)
- `LISTWISE_GENERATIVE_RERANKER_TEMPERATURE`: listwise temperature (default: 1.0)
- `LISTWISE_GENERATIVE_RERANKER_MIN_GROUP_SIZE`: minimum group size (default: 2)

**Listwise vs Pointwise:**
- **Pointwise:** judges relevance independently; simple to train, but ignores the relative relationships between documents
- **Listwise:** learns relative ordering; better performance and a closer fit to the nature of ranking tasks

## Evaluation Metrics

SWIFT provides dedicated information-retrieval evaluation metrics for Reranker training:

### MRR (Mean Reciprocal Rank)
- **Definition:** the average of the reciprocal ranks over all queries
- **Calculation:** MRR = (1/|Q|) × Σ(1/rank_i), where rank_i is the rank of the positive document for the i-th query
- **Range:** [0, 1]; higher is better
- **Use cases:** focuses on the position of the positive document in the ranked results

### NDCG (Normalized Discounted Cumulative Gain)
- **Definition:** normalized discounted cumulative gain
- **Calculation:** NDCG = DCG / IDCG, which accounts for the effect of ranking position on relevance
- **Range:** [0, 1]; higher is better
- **Use cases:** an overall measure of ranking quality, more sensitive to relevance at top positions

**Notes on metric computation:**
- Metrics are computed per query group; each query group starts with the positive document, followed by negative documents
- Data format: `[1,0,0,1,0,0,0]` represents 2 queries: query1=[1,0,0], query2=[1,0,0,0]
- Query boundaries are detected automatically, metrics are computed for each query separately, and the results are averaged

The loss source code can be found [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss.py).

## Dataset Format

```json lines
{"query": "query", "positive": ["relevant_doc1", "relevant_doc2", ...], "negative": ["irrelevant_doc1", "irrelevant_doc2", ...]}
```

> See [MTEB/scidocs-reranking](https://www.modelscope.cn/datasets/MTEB/scidocs-reranking) for a reference dataset.

## Training Scripts

SWIFT provides four training script templates:

- [Pointwise Classification Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_reranker.sh)
- [Pointwise Generative Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_generative_reranker.sh)
- [Listwise Classification Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_reranker_listwise.sh)
- [Listwise Generative Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_generative_reranker_listwise.sh)
4 changes: 4 additions & 0 deletions docs/source/Customization/自定义数据集.md
@@ -119,6 +119,10 @@ query-response format:

Please refer to the [Embedding training documentation](../BestPractices/Embedding训练.md#数据集格式).

### Reranker

Please refer to the [Reranker training documentation](../BestPractices/Reranker训练.md#数据集格式).

### Multimodal

For multimodal datasets, the format is the same as for the tasks above. The difference is the addition of the `images`, `videos`, and `audios` keys, which hold the URLs or paths (absolute paths are recommended) of the multimodal resources. The `<image>`, `<video>`, and `<audio>` tags mark where images/videos/audios are inserted, and ms-swift supports multiple images/videos/audios. These special tokens are replaced during preprocessing; see [here](https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/qwen.py#L198). The four examples below show the data formats for plain text as well as for data containing images, videos, and audio.
4 changes: 4 additions & 0 deletions docs/source/Instruction/支持的模型和数据集.md
@@ -214,6 +214,9 @@
|[Qwen/Qwen3-Embedding-0.6B](https://modelscope.cn/models/Qwen/Qwen3-Embedding-0.6B)|qwen3_emb|qwen3_emb|-|&#x2718;|-|[Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)|
|[Qwen/Qwen3-Embedding-4B](https://modelscope.cn/models/Qwen/Qwen3-Embedding-4B)|qwen3_emb|qwen3_emb|-|&#x2718;|-|[Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B)|
|[Qwen/Qwen3-Embedding-8B](https://modelscope.cn/models/Qwen/Qwen3-Embedding-8B)|qwen3_emb|qwen3_emb|-|&#x2718;|-|[Qwen/Qwen3-Embedding-8B](https://huggingface.co/Qwen/Qwen3-Embedding-8B)|
|[Qwen/Qwen3-Reranker-0.6B](https://modelscope.cn/models/Qwen/Qwen3-Reranker-0.6B)|qwen3_reranker|qwen3_reranker|-|&#x2718;|-|[Qwen/Qwen3-Reranker-0.6B](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B)|
|[Qwen/Qwen3-Reranker-4B](https://modelscope.cn/models/Qwen/Qwen3-Reranker-4B)|qwen3_reranker|qwen3_reranker|-|&#x2718;|-|[Qwen/Qwen3-Reranker-4B](https://huggingface.co/Qwen/Qwen3-Reranker-4B)|
|[Qwen/Qwen3-Reranker-8B](https://modelscope.cn/models/Qwen/Qwen3-Reranker-8B)|qwen3_reranker|qwen3_reranker|-|&#x2718;|-|[Qwen/Qwen3-Reranker-8B](https://huggingface.co/Qwen/Qwen3-Reranker-8B)|
|[iic/gte_Qwen2-1.5B-instruct](https://modelscope.cn/models/iic/gte_Qwen2-1.5B-instruct)|qwen2_gte|dummy|-|&#x2718;|-|[Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct)|
|[iic/gte_Qwen2-7B-instruct](https://modelscope.cn/models/iic/gte_Qwen2-7B-instruct)|qwen2_gte|dummy|-|&#x2718;|-|[Alibaba-NLP/gte-Qwen2-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct)|
|[codefuse-ai/CodeFuse-QWen-14B](https://modelscope.cn/models/codefuse-ai/CodeFuse-QWen-14B)|codefuse_qwen|codefuse|-|&#x2718;|coding|[codefuse-ai/CodeFuse-QWen-14B](https://huggingface.co/codefuse-ai/CodeFuse-QWen-14B)|
@@ -552,6 +555,7 @@
|[answerdotai/ModernBERT-base](https://modelscope.cn/models/answerdotai/ModernBERT-base)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)|
|[answerdotai/ModernBERT-large](https://modelscope.cn/models/answerdotai/ModernBERT-large)|modern_bert|dummy|transformers>=4.48|&#x2718;|bert|[answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)|
|[iic/gte-modernbert-base](https://modelscope.cn/models/iic/gte-modernbert-base)|modern_bert_gte|dummy|transformers>=4.48|&#x2718;|bert, embedding|[Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)|
|[iic/gte-reranker-modernbert-base](https://modelscope.cn/models/iic/gte-reranker-modernbert-base)|modern_bert_gte_reranker|bert|transformers>=4.48|&#x2718;|bert, reranker|[Alibaba-NLP/gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base)|
|[iic/nlp_structbert_backbone_base_std](https://modelscope.cn/models/iic/nlp_structbert_backbone_base_std)|bert|dummy|-|&#x2718;|bert|-|
|[Shanghai_AI_Laboratory/internlm2-1_8b-reward](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-1_8b-reward)|internlm2_reward|internlm2_reward|transformers>=4.38|&#x2718;|-|[internlm/internlm2-1_8b-reward](https://huggingface.co/internlm/internlm2-1_8b-reward)|
|[Shanghai_AI_Laboratory/internlm2-7b-reward](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-7b-reward)|internlm2_reward|internlm2_reward|transformers>=4.38|&#x2718;|-|[internlm/internlm2-7b-reward](https://huggingface.co/internlm/internlm2-7b-reward)|
103 changes: 103 additions & 0 deletions docs/source_en/BestPractices/Reranker.md
@@ -0,0 +1,103 @@
# Reranker Training

SWIFT supports Reranker model training. Currently supported models include:

1. modernbert reranker model
- [ModelScope](https://www.modelscope.cn/models/iic/gte-reranker-modernbert-base) [Hugging Face](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base)
2. qwen3-reranker model
- 0.6B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-0.6B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B)
- 4B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-4B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-4B)
- 8B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Reranker-8B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Reranker-8B)

## Implementation Methods

SWIFT currently supports two implementation methods for Reranker models, which have significant differences in architecture and loss function computation:

### 1. Classification Reranker

**Applicable Models:** modernbert reranker models (e.g., gte-reranker-modernbert-base)

**Core Principles:**
- Based on sequence classification architecture, adding a classification head on top of pre-trained models
- Input: query-document pairs, Output: single relevance score
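
A minimal scoring sketch for this setup, assuming a single-logit sequence-classification head (as in gte-reranker-modernbert-base) and plain `transformers` usage rather than the exact ms-swift code path; the query and documents are made-up examples:

```python
# Sketch: scoring query-document pairs with a classification reranker.
# Assumes a single-logit sequence-classification head; not the ms-swift training code.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Alibaba-NLP/gte-reranker-modernbert-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

query = "what is a reranker"
docs = ["A reranker scores query-document pairs by relevance.",
        "Bananas are rich in potassium."]
inputs = tokenizer([query] * len(docs), docs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)  # one relevance logit per pair
print(scores.sigmoid())  # higher = more relevant
```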

### 2. Generative Reranker

**Applicable Models:** qwen3-reranker models (0.6B/4B/8B)

**Core Principles:**
- Based on generative language model architecture (CausalLM)
- Input: query-document pairs, Output: probability of specific tokens (e.g., "yes"/"no")
- Classification is performed by comparing logits of specific tokens at the final position
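
The sketch below illustrates only the yes/no logit comparison; the prompt shown is a simplified stand-in and is not the exact template used by Qwen3-Reranker or ms-swift:

```python
# Sketch: generative reranking by comparing the "yes"/"no" logits at the last position.
# The prompt is illustrative only; the real Qwen3-Reranker template is more elaborate.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Qwen/Qwen3-Reranker-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

yes_id = tokenizer.convert_tokens_to_ids("yes")  # cf. GENERATIVE_RERANKER_POSITIVE_TOKEN
no_id = tokenizer.convert_tokens_to_ids("no")    # cf. GENERATIVE_RERANKER_NEGATIVE_TOKEN

prompt = ("Judge whether the document answers the query. Answer only yes or no.\n"
          "Query: what is a reranker\n"
          "Document: A reranker scores query-document pairs by relevance.\n"
          "Answer:")
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    last_logits = model(**inputs).logits[:, -1, :]  # logits at the final position
relevance = torch.softmax(last_logits[:, [no_id, yes_id]], dim=-1)[:, 1]
print(relevance)  # probability mass assigned to the positive token
```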

## Loss Function Types

SWIFT supports multiple loss functions for training Reranker models:

### Pointwise Loss Functions
Pointwise methods transform the ranking problem into a binary classification problem, processing each query-document pair independently:

- **Core Idea:** Binary classification for each query-document pair to determine document relevance to the query
- **Loss Function:** Binary cross-entropy
- **Use Cases:** Simple and efficient, suitable for large-scale data training

Environment variable configuration:
- `GENERATIVE_RERANKER_POSITIVE_TOKEN`: Positive token (default: "yes")
- `GENERATIVE_RERANKER_NEGATIVE_TOKEN`: Negative token (default: "no")
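
As an illustration (not the exact implementation in `swift/plugin/loss.py`), the pointwise objective reduces to binary cross-entropy over one relevance logit per pair; for the generative variant, that logit comes from the positive/negative token comparison described above. The scores and labels below are made-up:

```python
# Sketch of the pointwise objective: binary cross-entropy over per-pair relevance logits.
import torch
import torch.nn.functional as F

logits = torch.tensor([2.1, -0.7, 0.3])  # model scores for three query-document pairs
labels = torch.tensor([1.0, 0.0, 0.0])   # 1 = relevant (positive), 0 = irrelevant (negative)
loss = F.binary_cross_entropy_with_logits(logits, labels)
print(loss)
```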

### Listwise Loss Functions
Listwise methods transform the ranking problem into a multi-class classification problem, selecting the positive example from multiple candidate documents:

- **Core Idea:** Multi-class classification over each query's candidate document group (1 positive + n negative examples) to identify the positive document
- **Loss Function:** Multi-class cross-entropy
- **Use Cases:** Learning relative ranking relationships between documents, better aligned with the actual needs of information retrieval

Environment variable configuration:
- `LISTWISE_RERANKER_TEMPERATURE`: Softmax temperature parameter (default: 1.0)
- `LISTWISE_RERANKER_MIN_GROUP_SIZE`: Minimum group size (default: 2)
- `LISTWISE_GENERATIVE_RERANKER_TEMPERATURE`: Listwise temperature parameter (default: 1.0)
- `LISTWISE_GENERATIVE_RERANKER_MIN_GROUP_SIZE`: Minimum group size (default: 2)
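
Conceptually, the listwise objective is a softmax cross-entropy over each group's scores with the positive document as the target class. The sketch below assumes the positive document sits at index 0 of its group and omits ms-swift's batching and grouping details:

```python
# Sketch of the listwise objective: softmax cross-entropy within each candidate group.
import torch
import torch.nn.functional as F

group_scores = torch.tensor([[1.8, 0.2, -0.5, 0.1]])  # one group: positive first, then 3 negatives
temperature = 1.0                                      # cf. LISTWISE_RERANKER_TEMPERATURE
targets = torch.zeros(group_scores.size(0), dtype=torch.long)  # positive document is class 0
loss = F.cross_entropy(group_scores / temperature, targets)
print(loss)
```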

**Listwise vs Pointwise:**
- **Pointwise:** Independent relevance judgment, simple training, but ignores relative relationships between documents
- **Listwise:** Learning relative ranking, better performance, more suitable for the essential needs of ranking tasks

## Evaluation Metrics

SWIFT provides dedicated information-retrieval evaluation metrics for Reranker training:

### MRR (Mean Reciprocal Rank)
- **Definition:** Average of reciprocal ranks across all queries
- **Calculation:** MRR = (1/|Q|) × Σ(1/rank_i), where rank_i is the rank of the positive document for the i-th query
- **Range:** [0, 1], higher is better
- **Use Cases:** Focus on the position of positive documents in ranking results
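
A small worked example with hypothetical ranks:

```python
# MRR over three queries whose positive documents were ranked 1st, 3rd, and 2nd.
ranks = [1, 3, 2]
mrr = sum(1.0 / r for r in ranks) / len(ranks)
print(mrr)  # (1 + 1/3 + 1/2) / 3 ≈ 0.611
```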

### NDCG (Normalized Discounted Cumulative Gain)
- **Definition:** Normalized discounted cumulative gain
- **Calculation:** NDCG = DCG / IDCG, considering the impact of ranking position on relevance
- **Range:** [0, 1], higher is better
- **Use Cases:** Comprehensive evaluation of ranking quality, more sensitive to relevance at top positions
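
A minimal sketch for the binary-relevance case, using gain = relevance; the exact DCG variant used in ms-swift may differ:

```python
# NDCG for a single ranked list with binary relevance labels.
import math

def ndcg(relevances):
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))  # rank = i + 1, discount = log2(rank + 1)
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg([0, 1, 0, 0]))  # positive ranked 2nd -> 1 / log2(3) ≈ 0.631
```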

**Metric Calculation Notes:**
- Metrics are calculated based on query grouping, with each query group starting with a positive document followed by negative documents
- Data format: `[1,0,0,1,0,0,0]` represents 2 queries: query1=[1,0,0], query2=[1,0,0,0]
- Automatically identifies query boundaries and calculates metrics for each query separately, then takes the average
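
A sketch of this grouping convention (the helper name here is illustrative, not an ms-swift API):

```python
# Split a flat label sequence into query groups: each group starts at a positive (1) label.
def split_into_query_groups(labels):
    groups, current = [], []
    for label in labels:
        if label == 1 and current:  # a new positive label opens the next query group
            groups.append(current)
            current = []
        current.append(label)
    if current:
        groups.append(current)
    return groups

print(split_into_query_groups([1, 0, 0, 1, 0, 0, 0]))  # [[1, 0, 0], [1, 0, 0, 0]]
```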

The loss function source code can be found [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss.py).

## Dataset Format

```json lines
{"query": "query", "positive": ["relevant_doc1", "relevant_doc2", ...], "negative": ["irrelevant_doc1", "irrelevant_doc2", ...]}
```

> Reference: [MTEB/scidocs-reranking](https://www.modelscope.cn/datasets/MTEB/scidocs-reranking)

## Training Scripts

SWIFT provides four training script templates:

- [Pointwise Classification Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_reranker.sh)
- [Pointwise Generative Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_generative_reranker.sh)
- [Listwise Classification Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_reranker_listwise.sh)
- [Listwise Generative Reranker](https://github.com/tastelikefeet/swift/blob/main/examples/train/reranker/train_generative_reranker_listwise.sh)
6 changes: 5 additions & 1 deletion docs/source_en/Customization/Custom-dataset.md
@@ -125,7 +125,11 @@ If `seq_kd` is enabled, the final round of the 'assistant' part is not required

### Embedding

Please refer to the [Embedding training document](../BestPractices/Embedding.md#dataset-format).

### Reranker

Please refer to the [Reranker training document](../BestPractices/Reranker.md#dataset-format).

### Multimodal
