Commit e4449d4 · Parent: e612968
Updating for falcon

assets/prompt-order-experiment.svg CHANGED

mermaid.md CHANGED
@@ -14,7 +14,7 @@ graph TD
     style F fill:#333,stroke:#FF9D00,color:#FFD21E

     subgraph Notebooks
-        NB0[00-poe-generate-
+        NB0[00-poe-generate-falcon-reasoning.ipynb]
         NB1[01-poe-dataset-creation.ipynb]
         NB2[02-autotrain.ipynb]
         NB3[03-poe-token-count-exploration.ipynb]
@@ -23,15 +23,15 @@ graph TD

     subgraph Models
         D[Fine-Tuned MODELS]
-        G[BASE_MODEL:
+        G[BASE_MODEL: tiiuae/Falcon3-7B-Instruct]
     end

     subgraph Datasets
         A[(layoric/labeled-multiple-choice-explained)]
-        B[(derek-thomas/labeled-multiple-choice-explained-
-        C[(derek-thomas/labeled-multiple-choice-explained-
+        B[(derek-thomas/labeled-multiple-choice-explained-falcon-reasoning)]
+        C[(derek-thomas/labeled-multiple-choice-explained-falcon-tokenized)]
         E[Deployment Config]
-        F[(derek-thomas/labeled-multiple-choice-explained-
+        F[(derek-thomas/labeled-multiple-choice-explained-falcon-results)]
     end

     A --> NB0
@@ -56,14 +56,14 @@ graph TD
     G --> NB4
     NB4 --> F

-    click NB0 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/00-poe-generate-
+    click NB0 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/00-poe-generate-falcon-reasoning.ipynb"
     click NB1 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/01-poe-dataset-creation.ipynb"
     click NB2 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/02-autotrain.ipynb"
     click NB3 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/03-poe-token-count-exploration.ipynb"
     click NB4 href "https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/04-poe-eval.ipynb"
-    click G href "https://huggingface.co/
+    click G href "https://huggingface.co/tiiuae/Falcon3-7B-Instruct"
     click A href "https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained"
-    click B href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-
-    click C href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-
-    click F href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-
+    click B href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-reasoning"
+    click C href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-tokenized"
+    click F href "https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-results"
 ```
prompt_order_exeriment/pages/index.py CHANGED
@@ -11,7 +11,7 @@ This experiment aims to explore various scenarios for **prompt fine-tuning** usi
 ## Scenarios
 We will evaluate the following prompt orders:

-### **Scenario 1: Q - AC - R - FA** (
+### **Scenario 1: Q - AC - R - FA** (Falcon and GPT3.5)

 This is the most natural order. The model generates reasoning before the final answer, providing the most information prior to making a selection. This order leverages decoding mechanics effectively.

@@ -35,7 +35,7 @@ This is our assistant message, you can see that we are forcing a JSON (note I ad
 ```
 </details>

-### **Scenario 2: Q - AC - FA - R** (
+### **Scenario 2: Q - AC - FA - R** (Falcon and GPT3.5)

 An awkward order, placing reasoning after the final answer. While it is faster, it assumes the model can "know" reasoning internally before generating it. This approach saves tokens but is a skeptical case worth testing.
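To make the two orders concrete, here is a minimal sketch of how the assistant target differs between Scenario 1 (reasoning before the final answer) and Scenario 2 (final answer before reasoning). The record fields and JSON keys are illustrative assumptions, not the exact schema used in the notebooks:

```python
import json

# Hypothetical example record; the real column names live in the notebooks.
record = {
    "question": "What gas do plants absorb during photosynthesis?",
    "answer_choices": ["A. Oxygen", "B. Carbon dioxide", "C. Nitrogen"],
    "reasoning": "Plants take in CO2 and release O2 during photosynthesis.",
    "final_answer": "B",
}

# Scenario 1 (Q - AC - R - FA): reasoning precedes the final answer,
# so the model conditions on its own explanation before choosing.
rfa_target = json.dumps({"reasoning": record["reasoning"],
                         "final_answer": record["final_answer"]})

# Scenario 2 (Q - AC - FA - R): the final answer comes first and the
# reasoning is generated afterwards, saving tokens at inference time.
far_target = json.dumps({"final_answer": record["final_answer"],
                         "reasoning": record["reasoning"]})

user_prompt = record["question"] + "\n" + "\n".join(record["answer_choices"])
print(user_prompt, rfa_target, far_target, sep="\n\n")
```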
prompt_order_exeriment/pages/overview.py CHANGED
@@ -3,9 +3,9 @@ import reflex as rx
 p2 = '''
 # Steps
 ### Dataset Selection
-We begin with the <a href="https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained" target="_blank">layoric/labeled-multiple-choice-explained</a> dataset, which includes reasoning provided by GPT-3.5-turbo. reasoning explanations serve as a starting point but may differ from
+We begin with the <a href="https://huggingface.co/datasets/layoric/labeled-multiple-choice-explained" target="_blank">layoric/labeled-multiple-choice-explained</a> dataset, which includes reasoning provided by GPT-3.5-turbo. These reasoning explanations serve as a starting point but may differ from Falcon's reasoning style.

-0. <i><a href="https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/00-poe-generate-
+0. <i><a href="https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/00-poe-generate-falcon-reasoning.ipynb" target="_blank">00-poe-generate-falcon-reasoning.ipynb</a></i>: To align with Falcon, we need to create a refined dataset: <a href="https://huggingface.co/datasets/derek-thomas/labeled-multiple-choice-explained-falcon-reasoning" target="_blank">derek-thomas/labeled-multiple-choice-explained-falcon-reasoning</a>.
 1. <i><a href="https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/01-poe-dataset-creation.ipynb" target="_blank">01-poe-dataset-creation.ipynb</a></i>: Then we need to create our prompt experiments.
 2. <i><a href="https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/02-autotrain.ipynb" target="_blank">02-autotrain.ipynb</a></i>: We generate autotrain jobs on spaces to train our models.
 3. <i><a href="https://huggingface.co/derek-thomas/prompt-order-experiment/blob/main/03-poe-token-count-exploration.ipynb" target="_blank">03-poe-token-count-exploration.ipynb</a></i>: We do some quick analysis so we can optimize our TGI settings.
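As a rough sketch of what steps 0 and 3 involve, the snippet below loads the refined Falcon-reasoning dataset and measures reasoning lengths with the base model's tokenizer. The `reasoning` column name is a guess; the real analysis lives in 03-poe-token-count-exploration.ipynb:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Dataset and base model names come from the diff above.
ds = load_dataset("derek-thomas/labeled-multiple-choice-explained-falcon-reasoning")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

# Token-count exploration informs TGI settings such as max input/total
# tokens; "reasoning" is a hypothetical column name.
lengths = [len(tokenizer.encode(row["reasoning"])) for row in ds["train"]]
print(f"max: {max(lengths)}, mean: {sum(lengths) / len(lengths):.1f}")
```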
prompt_order_exeriment/pages/results.py CHANGED
@@ -13,7 +13,7 @@ Make sure you explore what happeened between:
 """

 # Load the HF dataset
-dataset = load_dataset("derek-thomas/labeled-multiple-choice-explained-
+dataset = load_dataset("derek-thomas/labeled-multiple-choice-explained-falcon-results")

 # Convert the dataset to a Pandas DataFrame
 df = dataset['train'].to_pandas()
@@ -22,8 +22,8 @@ df = dataset['train'].to_pandas()
 cols_to_analyze = [
     "predictions_base",
     "predictions_FA",
-    "
-    "
+    "predictions_RFA_falcon",
+    "predictions_FAR_falcon",
     "predictions_RFA_gpt3_5",
     "predictions_FAR_gpt3_5",
 ]
@@ -32,8 +32,8 @@ cols_to_analyze = [
 model_names = {
     "predictions_base": "Base Model",
     "predictions_FA": "Final Answer",
-    "
-    "
+    "predictions_RFA_falcon": "Reasoning (Falcon) -> Final Answer",
+    "predictions_FAR_falcon": "Final Answer -> Reasoning (Falcon)",
     "predictions_RFA_gpt3_5": "Reasoning (GPT-3.5) -> Final Answer",
     "predictions_FAR_gpt3_5": "Final Answer -> Reasoning (GPT-3.5)",
 }
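Given the columns above, the results page presumably scores each prediction column against the dataset's ground-truth label. A minimal sketch under that assumption follows; the `answer_key` column name is hypothetical, not confirmed by the diff:

```python
from datasets import load_dataset
import pandas as pd

# Load the results dataset named in the diff and flatten to a DataFrame.
dataset = load_dataset("derek-thomas/labeled-multiple-choice-explained-falcon-results")
df = dataset["train"].to_pandas()

cols_to_analyze = [
    "predictions_base",
    "predictions_FA",
    "predictions_RFA_falcon",
    "predictions_FAR_falcon",
    "predictions_RFA_gpt3_5",
    "predictions_FAR_gpt3_5",
]

# "answer_key" is a guessed ground-truth column; substitute the real one.
accuracy = {col: (df[col] == df["answer_key"]).mean() for col in cols_to_analyze}
print(pd.Series(accuracy).sort_values(ascending=False))
```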