
Commit

Merge pull request #1683 from XiangJinyu/main
Implement Self-Supervised Prompt Optimizer (SPO)
better629 authored Feb 13, 2025
2 parents 4954729 + 62dda60 commit e935b97
Showing 23 changed files with 1,230 additions and 0 deletions.
Binary file added docs/resources/spo/SPO-closed_task_figure.png
Binary file added docs/resources/spo/SPO-closed_task_table.png
Binary file added docs/resources/spo/SPO-logo.png
Binary file added docs/resources/spo/SPO-method.png
Binary file added docs/resources/spo/SPO-open_ended_task_figure.png
184 changes: 184 additions & 0 deletions examples/spo/README.md
@@ -0,0 +1,184 @@
# SPO | Self-Supervised Prompt Optimization <img src="../../docs/resources/spo/SPO-logo.png" width="60" height="60" style="vertical-align: middle; margin-left: 10px; position: relative; top: -5px;">

An automated prompt engineering tool for Large Language Models (LLMs), designed for universal domain adaptation.

A next-generation prompt engineering system implementing **Self-Supervised Prompt Optimization (SPO)**. It achieves state-of-the-art performance with 17.8-90.9× higher cost efficiency than conventional methods. 🚀

<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-method.png" alt="Framework of SPO" title="Framework of SPO <sub>1</sub>" width="80%"></a>
</p>

## ✨ Core Advantages

- 💸 **Ultra-Low Cost** - _$0.15 per task optimization_
- 🏷️ **Zero Supervision** - _No ground truth/human feedback required_
- ⚡ **Universal Adaptation** - _Closed & open-ended tasks supported_
- 🔄 **Self-Evolving** - _Auto-optimization via LLM-as-judge mechanism_

[Read our paper on arXiv](https://arxiv.org/pdf/2502.06855)

## 📊 Experiment

### Closed Tasks
<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_table.png" alt="SPO closed task table" title="SPO closed task table <sub>1</sub>" width="80%"></a>
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_figure.png" alt="SPO closed task figure" title="SPO closed task figure <sub>1</sub>" width="80%"></a>
</p>

*SPO demonstrates superior cost efficiency, requiring only 1.1% to 5.6% of the cost of state-of-the-art methods while maintaining competitive performance.*

### Open-ended Tasks
<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-open_ended_task_figure.png" alt="Open-ended task figure" title="Open-ended task figure <sub>1</sub>" width="80%"></a>
</p>

*SPO significantly improves model performance across all model configurations in open-ended tasks.*

## 🚀 Quick Start

### 1. Configure Your API Key ⚙️

Configure LLM parameters in `config/config2.yaml` (see `examples/spo/config2.example.yaml` for reference).
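
For instance, a minimal `config/config2.yaml` might look like this (mirroring the structure of `examples/spo/config2.example.yaml`; the base URL and key below are placeholders you must replace):

```yaml
models:
  "gpt-4o-mini":
    api_type: "openai"  # or azure / ollama / groq etc.
    base_url: "https://api.openai.com/v1"  # placeholder; use your provider's endpoint
    api_key: "sk-..."  # placeholder; use your own key
    temperature: 0
```
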
### 2. Define Your Iteration Template 📝

Create an iteration template file `metagpt/ext/spo/settings/task_name.yaml`:
```yaml
prompt: |
  Please solve the following problem.

requirements: |
  ...

count: None

faq:
  - question: |
      ...
    answer: |
      ...

  - question: |
      ...
    answer: |
      ...
```
Notes:
- `prompt`: Initial prompt for iteration
- `requirements`: Desired effects/outcomes (e.g., generate more thinking, use more humorous language)
- `count`: Target word count for the generated prompt (e.g., 50). Set to None for no limit
- `faq`: QA pairs used for iteration; include an appropriate number of pairs (typically 3)
- `question`: Questions from the dataset used for iteration
- `answer`: Corresponding answers. These can contain desired thinking patterns or responses rather than literal answers, or can be left empty. See `metagpt/ext/spo/settings/Navigate.yaml` for reference
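
As a concrete illustration, a minimal template for the default `Poem` task might look like this (the question and answer text below is invented for illustration; see `metagpt/ext/spo/settings/Navigate.yaml` for a real example):

```yaml
prompt: |
  Write a poem on the given topic.

requirements: |
  Use vivid imagery and keep a consistent tone.

count: None

faq:
  - question: |
      Write a poem about the ocean at dawn.
    answer: |
      A reflective free-verse poem that opens with a sensory image.
```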

### 3. Implement the PromptOptimizer 🔧

You have three ways to run the PromptOptimizer:

#### Option 1: Python Script

```python
from metagpt.ext.spo.components.optimizer import PromptOptimizer
from metagpt.ext.spo.utils.llm_client import SPO_LLM

if __name__ == "__main__":
    # Initialize LLM settings
    SPO_LLM.initialize(
        optimize_kwargs={"model": "claude-3-5-sonnet-20240620", "temperature": 0.7},
        evaluate_kwargs={"model": "gpt-4o-mini", "temperature": 0.3},
        execute_kwargs={"model": "gpt-4o-mini", "temperature": 0},
    )

    # Create and run optimizer
    optimizer = PromptOptimizer(
        optimized_path="workspace",  # Output directory
        initial_round=1,  # Starting round
        max_rounds=10,  # Maximum optimization rounds
        template="Poem.yaml",  # Template file
        name="Poem",  # Project name
    )
    optimizer.optimize()
```
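
Note that `SPO_LLM.initialize` takes separate model settings for the three roles SPO uses internally (optimization, evaluation, and execution). These map one-to-one onto the `--opt-*`, `--eval-*`, and `--exec-*` flags of the CLI below.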

#### Option 2: Command Line Interface

```bash
python -m examples.spo.optimize
```

Available command line options:
```
--opt-model Model for optimization (default: claude-3-5-sonnet-20240620)
--opt-temp Temperature for optimization (default: 0.7)
--eval-model Model for evaluation (default: gpt-4o-mini)
--eval-temp Temperature for evaluation (default: 0.3)
--exec-model Model for execution (default: gpt-4o-mini)
--exec-temp Temperature for execution (default: 0)
--workspace Output directory path (default: workspace)
--initial-round Initial round number (default: 1)
--max-rounds Maximum number of rounds (default: 10)
--template Template file name (default: Poem.yaml)
--name Project name (default: Poem)
```

For help:
```bash
python -m examples.spo.optimize --help
```
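
For example, a run against the `Navigate` template with a shorter optimization budget might look like:

```bash
python -m examples.spo.optimize \
  --opt-model claude-3-5-sonnet-20240620 \
  --exec-model gpt-4o-mini \
  --max-rounds 5 \
  --template Navigate.yaml \
  --name Navigate
```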

#### Option 3: Streamlit Web Interface

For a more user-friendly experience, you can use the Streamlit web interface to configure and run the optimizer.

First, install Streamlit:
```bash
pip install "streamlit~=1.42.0"
```

Then run the web interface:
```bash
python -m streamlit run metagpt/ext/spo/app.py
```

### 4. View Results
```
workspace
└── Project_name
└── prompts
├── results.json
├── round_1
│ ├── answers.txt
│ └── prompt.txt
├── round_2
│ ├── answers.txt
│ └── prompt.txt
├── round_3
│ ├── answers.txt
│ └── prompt.txt
├── ...
└── round_n
├── answers.txt
└── prompt.txt
```
- `results.json`: Stores whether each iteration round was judged successful and other related information
- `prompt.txt`: The optimized prompt for the corresponding round
- `answers.txt`: The output results generated using the prompt for the corresponding round
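
If you want to inspect the optimization trajectory programmatically, a minimal sketch might look like the following. The schema of `results.json` is not documented above, so the field names (`round`, `succeed`) are assumptions for illustration:

```python
import json
from pathlib import Path

# Hypothetical sketch: adjust field names to the actual results.json schema.
results_path = Path("workspace") / "Poem" / "prompts" / "results.json"
records = json.loads(results_path.read_text())

for record in records:
    # "round" and "succeed" are assumed field names, not confirmed by the docs.
    print(record.get("round"), record.get("succeed"))
```
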
## Citation
If you use SPO in your research, please cite our paper:
```
@misc{xiang2025spo,
      title={Self-Supervised Prompt Optimization},
      author={Jinyu Xiang and Jiayi Zhang and Zhaoyang Yu and Fengwei Teng and Jinhao Tu and Xinbing Liang and Sirui Hong and Chenglin Wu and Yuyu Luo},
      year={2025},
      eprint={2502.06855},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.06855},
}
```
12 changes: 12 additions & 0 deletions examples/spo/config2.example.yaml
@@ -0,0 +1,12 @@
models:
  "<model_name>": # model: "gpt-4-turbo" # or gpt-3.5-turbo
    api_type: "openai" # or azure / ollama / groq etc.
    base_url: "<your base url>"
    api_key: "<your api key>"
    temperature: 0
  "<model_name>":
    api_type: "openai"
    base_url: "<your base url>"
    api_key: "<your api key>"
    temperature: 0

49 changes: 49 additions & 0 deletions examples/spo/optimize.py
@@ -0,0 +1,49 @@
import argparse

from metagpt.ext.spo.components.optimizer import PromptOptimizer
from metagpt.ext.spo.utils.llm_client import SPO_LLM


def parse_args():
    parser = argparse.ArgumentParser(description="SPO PromptOptimizer CLI")

    # LLM parameters
    parser.add_argument("--opt-model", type=str, default="claude-3-5-sonnet-20240620", help="Model for optimization")
    parser.add_argument("--opt-temp", type=float, default=0.7, help="Temperature for optimization")
    parser.add_argument("--eval-model", type=str, default="gpt-4o-mini", help="Model for evaluation")
    parser.add_argument("--eval-temp", type=float, default=0.3, help="Temperature for evaluation")
    parser.add_argument("--exec-model", type=str, default="gpt-4o-mini", help="Model for execution")
    parser.add_argument("--exec-temp", type=float, default=0, help="Temperature for execution")

    # PromptOptimizer parameters
    parser.add_argument("--workspace", type=str, default="workspace", help="Path for optimized output")
    parser.add_argument("--initial-round", type=int, default=1, help="Initial round number")
    parser.add_argument("--max-rounds", type=int, default=10, help="Maximum number of rounds")
    parser.add_argument("--template", type=str, default="Poem.yaml", help="Template file name")
    parser.add_argument("--name", type=str, default="Poem", help="Project name")

    return parser.parse_args()


def main():
    args = parse_args()

    SPO_LLM.initialize(
        optimize_kwargs={"model": args.opt_model, "temperature": args.opt_temp},
        evaluate_kwargs={"model": args.eval_model, "temperature": args.eval_temp},
        execute_kwargs={"model": args.exec_model, "temperature": args.exec_temp},
    )

    optimizer = PromptOptimizer(
        optimized_path=args.workspace,
        initial_round=args.initial_round,
        max_rounds=args.max_rounds,
        template=args.template,
        name=args.name,
    )

    optimizer.optimize()


if __name__ == "__main__":
    main()
Empty file added metagpt/ext/spo/__init__.py
Empty file.
