
Commit

chore: fix all EOF (#3852)
* fix all eof

* fix test

* fix test

* fix test

* typo

* fix sample

* fix sample

* add logs

* fix page_dynamic_result.txt
ZanSara authored Jan 16, 2023
1 parent 62935bd commit 3ffdb0a
Showing 514 changed files with 685 additions and 968 deletions.
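The lopsided count (968 deletions vs. 685 additions) reflects that most files were touched only at their last line: the commit adds the missing final newline that makes each file a well-formed POSIX text file, and drops stray trailing blank lines. A minimal sketch of that kind of sweep — an illustration of the idea, not the actual script behind this commit:

```python
def ensure_trailing_newline(path: str) -> bool:
    """Append a final newline to the file at `path` if it lacks one.

    Returns True when the file was modified, False when it already
    ended with a newline (or was empty).
    """
    # Read as bytes so the check works for any encoding.
    with open(path, "rb") as f:
        data = f.read()
    if not data or data.endswith(b"\n"):
        return False
    # Append exactly one newline without rewriting the whole file.
    with open(path, "ab") as f:
        f.write(b"\n")
    return True
```

In practice this kind of normalization is usually kept from regressing with a hook such as `end-of-file-fixer` from the pre-commit project, which performs essentially the same fix on every commit.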
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/config.yml
@@ -2,4 +2,4 @@ blank_issues_enabled: true
contact_links:
- name: Something unclear? Just ask :)
url: https://github.com/deepset-ai/haystack/discussions/new
-about: Start a Github discussion with your question
+about: Start a Github discussion with your question
2 changes: 1 addition & 1 deletion .github/labeler.yml
@@ -1,2 +1,2 @@
Proposal:
-- proposals/text/*
+- proposals/text/*
2 changes: 1 addition & 1 deletion .github/workflows/labeler.yml
@@ -12,4 +12,4 @@ jobs:
steps:
- uses: actions/labeler@v4
with:
-repo-token: "${{ secrets.GITHUB_TOKEN }}"
+repo-token: "${{ secrets.GITHUB_TOKEN }}"
2 changes: 1 addition & 1 deletion annotation_tool/README.md
@@ -66,4 +66,4 @@ The manual (of a slightly earlier version) can be found [here](https://drive.goo
- Please do not annotate this text
- You can write down what is missing, or the cause why you cannot label the text + the text number and title.
8. Which browser to use?
-- Please use the Chrome browser. The tool is not tested for other browsers.
+- Please use the Chrome browser. The tool is not tested for other browsers.
2 changes: 1 addition & 1 deletion code_of_conduct.txt
@@ -95,4 +95,4 @@ This Code of Conduct is adapted from the Contributor Covenant, version 2.0, avai
Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq.
-Translations are available at https://www.contributor-covenant.org/translations.
+Translations are available at https://www.contributor-covenant.org/translations.
4 changes: 2 additions & 2 deletions docker/README.md
@@ -42,7 +42,7 @@ HAYSTACK_VERSION=mybranch_or_tag BASE_IMAGE_TAG_SUFFIX=latest docker buildx bake
### Multi-Platform Builds

Haystack images support multiple architectures. But depending on your operating system and Docker
-environment, you might not be able to build all of them locally.
+environment, you might not be able to build all of them locally.

You may encounter the following error when trying to build the image:

@@ -68,4 +68,4 @@ other licenses (such as Bash, etc from the base distribution, along with any dir
indirect dependencies of the primary software being contained).

As for any pre-built image usage, it is the image user's responsibility to ensure that any
-use of this image complies with any relevant licenses for all software contained within.
+use of this image complies with any relevant licenses for all software contained within.
2 changes: 1 addition & 1 deletion docs/_src/api/_static/floating_sidebar.css
@@ -26,4 +26,4 @@ div.sphinxsidebar .logo img {

div.sphinxsidebar .download a img {
vertical-align: middle;
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/_templates/xxlayout.html
@@ -43,4 +43,4 @@
});
</script>
{#- endif #}
-{% endblock %}
+{% endblock %}
1 change: 0 additions & 1 deletion docs/_src/api/api/crawler.md
@@ -182,4 +182,3 @@ E.g. 1) crawler_naming_function=lambda url, page_content: re.sub("[<>:'/\\|?*\0
**Returns**:

Tuple({"paths": List of filepaths, ...}, Name of output edge)

1 change: 0 additions & 1 deletion docs/_src/api/api/document_classifier.md
@@ -184,4 +184,3 @@ Documents are updated in place.
**Returns**:

List of Documents or list of lists of Documents enriched with meta information.

3 changes: 1 addition & 2 deletions docs/_src/api/api/document_store.md
@@ -5804,7 +5804,7 @@ namespace (vectors) if it exists, otherwise the document namespace (no-vectors).

**Returns**:

-`None`:
+`None`:

<a id="pinecone.PineconeDocumentStore.delete_index"></a>

@@ -6056,4 +6056,3 @@ and UTC as default time zone.

This method cannot be part of WeaviateDocumentStore, as this would result in a circular import between weaviate.py
and filter_utils.py.

1 change: 0 additions & 1 deletion docs/_src/api/api/evaluation.md
@@ -163,4 +163,3 @@ https://huggingface.co/transformers/main_classes/model.html#transformers.PreTrai
**Returns**:

top_1_sas, top_k_sas, pred_label_matrix

1 change: 0 additions & 1 deletion docs/_src/api/api/extractor.md
@@ -194,4 +194,3 @@ This is a wrapper class to create a Pytorch dataset object from the data attribu

- `model_inputs`: The data attribute of the output from a HuggingFace tokenizer which is needed to evaluate the
forward pass of a token classification model.

1 change: 0 additions & 1 deletion docs/_src/api/api/file_classifier.md
@@ -42,4 +42,3 @@ Sends out files on a different output edge depending on their extension.
**Arguments**:

- `file_paths`: paths to route on different edges.

1 change: 0 additions & 1 deletion docs/_src/api/api/file_converter.md
@@ -734,4 +734,3 @@ in garbled text.
attributes. If you want to ensure you don't have duplicate documents in your DocumentStore but texts are
not unique, you can modify the metadata and pass e.g. `"meta"` to this field (e.g. [`"content"`, `"meta"`]).
In this case the id will be generated by using the content and the defined metadata.

1 change: 0 additions & 1 deletion docs/_src/api/api/generator.md
@@ -445,4 +445,3 @@ Example:
**Returns**:

Dictionary containing query and Answers.

1 change: 0 additions & 1 deletion docs/_src/api/api/other_nodes.md
@@ -136,4 +136,3 @@ well.
of values to group the `Document`s to. `Document`s whose metadata field is equal to the first value of the
provided list will be routed to `"output_1"`, `Document`s whose metadata field is equal to the second
value of the provided list will be routed to `"output_2"`, etc.

1 change: 0 additions & 1 deletion docs/_src/api/api/pipelines.md
@@ -1974,4 +1974,3 @@ def run_batch(document_ids: List[str], top_k: int = 5)

- `document_ids`: document ids
- `top_k`: How many documents id to return against single document

1 change: 0 additions & 1 deletion docs/_src/api/api/preprocessor.md
@@ -148,4 +148,3 @@ def split(document: Union[dict, Document],
Perform document splitting on a single document. This method can split on different units, at different lengths,
with different strides. It can also respect sentence boundaries. Its exact functionality is defined by
the parameters passed into PreProcessor.__init__(). Takes a single document as input and returns a list of documents.

3 changes: 1 addition & 2 deletions docs/_src/api/api/primitives.md
@@ -264,7 +264,7 @@ or, user-feedback from the Haystack REST API.
**Arguments**:

- `query`: the question (or query) for finding answers.
-- `document`:
+- `document`:
- `answer`: the answer object.
- `is_correct_answer`: whether the sample is positive or negative.
- `is_correct_document`: in case of negative sample(is_correct_answer is False), there could be two cases;
@@ -599,4 +599,3 @@ Loads the evaluation result from disk. Expects one csv file per node. See save()
This method uses different default values than pd.read_csv() for the following parameters:
header=0, converters=CONVERTERS
where CONVERTERS is a dictionary mapping all array typed columns to ast.literal_eval.

5 changes: 2 additions & 3 deletions docs/_src/api/api/pseudo_label_generator.md
@@ -33,14 +33,14 @@ For example:

**Notes**:


While the NLP researchers trained the default question
[generation](https://huggingface.co/doc2query/msmarco-t5-base-v1) and the cross
[encoder](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) models on
the English language corpus, we can also use the language-specific question generation and
cross-encoder models in the target language of our choice to apply GPL to documents in languages
other than English.

As of this writing, the German language question
[generation](https://huggingface.co/ml6team/mt5-small-german-query-generation) and the cross
[encoder](https://huggingface.co/ml6team/cross-encoder-mmarco-german-distilbert-base) models are
@@ -194,4 +194,3 @@ dictionary contains the following keys:
- pos_doc: Positive document for the given question.
- neg_doc: Negative document for the given question.
- score: The margin between the score for question-positive document pair and the score for question-negative document pair.

35 changes: 17 additions & 18 deletions docs/_src/api/api/query_classifier.md
@@ -35,33 +35,33 @@ and the further processing can be customized. You can define this by connecting
|pipe.add_node(component=SklearnQueryClassifier(), name="QueryClassifier", inputs=["Query"])
|pipe.add_node(component=elastic_retriever, name="ElasticRetriever", inputs=["QueryClassifier.output_2"])
|pipe.add_node(component=dpr_retriever, name="DPRRetriever", inputs=["QueryClassifier.output_1"])

|# Keyword queries will use the ElasticRetriever
|pipe.run("kubernetes aws")

|# Semantic queries (questions, statements, sentences ...) will leverage the DPR retriever
|pipe.run("How to manage kubernetes on aws")

```

Models:

Pass your own `Sklearn` binary classification model or use one of the following pretrained ones:
1) Keywords vs. Questions/Statements (Default)
query_classifier can be found [here](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/model.pickle)
query_vectorizer can be found [here](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/vectorizer.pickle)
output_1 => question/statement
output_2 => keyword query
[Readme](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/readme.txt)


2) Questions vs. Statements
query_classifier can be found [here](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier_statements/model.pickle)
query_vectorizer can be found [here](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier_statements/vectorizer.pickle)
output_1 => question
output_2 => statement
[Readme](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier_statements/readme.txt)

See also the [tutorial](https://haystack.deepset.ai/tutorials/pipelines) on pipelines.

<a id="sklearn.SklearnQueryClassifier.__init__"></a>
@@ -116,33 +116,33 @@ This node also supports zero-shot-classification.
|pipe.add_node(component=TransformersQueryClassifier(), name="QueryClassifier", inputs=["Query"])
|pipe.add_node(component=elastic_retriever, name="ElasticRetriever", inputs=["QueryClassifier.output_2"])
|pipe.add_node(component=dpr_retriever, name="DPRRetriever", inputs=["QueryClassifier.output_1"])

|# Keyword queries will use the ElasticRetriever
|pipe.run("kubernetes aws")

|# Semantic queries (questions, statements, sentences ...) will leverage the DPR retriever
|pipe.run("How to manage kubernetes on aws")

```

Models:

Pass your own `Transformer` classification/zero-shot-classification model from file/huggingface or use one of the following
pretrained ones hosted on Huggingface:
1) Keywords vs. Questions/Statements (Default)
model_name_or_path="shahrukhx01/bert-mini-finetune-question-detection"
output_1 => question/statement
output_2 => keyword query
[Readme](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier/readme.txt)


2) Questions vs. Statements
`model_name_or_path`="shahrukhx01/question-vs-statement-classifier"
output_1 => question
output_2 => statement
[Readme](https://ext-models-haystack.s3.eu-central-1.amazonaws.com/gradboost_query_classifier_statements/readme.txt)


See also the [tutorial](https://haystack.deepset.ai/tutorials/pipelines) on pipelines.

<a id="transformers.TransformersQueryClassifier.__init__"></a>
@@ -185,4 +185,3 @@ https://huggingface.co/transformers/main_classes/model.html#transformers.PreTrai
A list containing torch device objects and/or strings is supported (For example
[torch.device('cuda:0'), "mps", "cuda:1"]). When specifying `use_gpu=False` the devices
parameter is not used and a single cpu device is used for inference.

1 change: 0 additions & 1 deletion docs/_src/api/api/question_generator.md
@@ -83,4 +83,3 @@ Generates questions for a list of strings or a list of lists of strings.

- `texts`: List of str or list of list of str.
- `batch_size`: Number of texts to process at a time.

1 change: 0 additions & 1 deletion docs/_src/api/api/ranker.md
@@ -194,4 +194,3 @@ Returns lists of Documents sorted by (desc.) similarity with the corresponding q
- `documents`: Single list of Documents or list of lists of Documents to be reranked.
- `top_k`: The maximum number of documents to return per Document list.
- `batch_size`: Number of Documents to process at a time.

1 change: 0 additions & 1 deletion docs/_src/api/api/reader.md
@@ -1110,4 +1110,3 @@ of content_type ``'table'``.
**Returns**:

Dict containing query and answers

1 change: 0 additions & 1 deletion docs/_src/api/api/retriever.md
@@ -2124,4 +2124,3 @@ Generate formatted dictionary output with text answer and additional info
**Arguments**:

- `result`: The result of a SPARQL query as retrieved from the knowledge graph

1 change: 0 additions & 1 deletion docs/_src/api/api/summarizer.md
@@ -189,4 +189,3 @@ If set to "True", all docs of a document list will be joined to a single string
that will then be summarized.
Important: The summary will depend on the order of the supplied documents!
- `batch_size`: Number of Documents to process at a time.

1 change: 0 additions & 1 deletion docs/_src/api/api/translator.md
@@ -167,4 +167,3 @@ Run the actual translation. You can supply a single query, a list of queries or
- `queries`: Single query or list of queries.
- `documents`: List of documents or list of lists of documets.
- `batch_size`: Not applicable.

1 change: 0 additions & 1 deletion docs/_src/api/api/utils.md
@@ -388,4 +388,3 @@ prediction head. Each dictionary contains the metrics and reports generated duri

A tuple (stopprocessing, savemodel, eval_value) indicating if processing should be stopped
and if the current model should get saved and the evaluation value used.

2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.10.0rc0.json
@@ -1025,4 +1025,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.11.0rc0.json
@@ -1021,4 +1021,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.12.0rc0.json
@@ -1033,4 +1033,4 @@
"python"
]
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.2.0.json
@@ -831,4 +831,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.2.1rc0.json
@@ -824,4 +824,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.3.0.json
@@ -831,4 +831,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.3.1rc0.json
@@ -889,4 +889,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.4.0.json
@@ -889,4 +889,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.4.1rc0.json
@@ -890,4 +890,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.5.0.json
@@ -889,4 +889,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.5.1rc0.json
@@ -890,4 +890,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.6.0.json
@@ -890,4 +890,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.6.1rc0.json
@@ -883,4 +883,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.7.0.json
@@ -883,4 +883,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.7.1.json
@@ -883,4 +883,4 @@
}
}
}
-}
+}
2 changes: 1 addition & 1 deletion docs/_src/api/openapi/openapi-1.7.1rc0.json
@@ -883,4 +883,4 @@
}
}
}
-}
+}