28 Mar 08:52

22942ca

v1.15.0-rc2 Pre-release


28 Mar 06:59

silvanocerza

v1.15.0-rc1

4dc5abf

v1.15.0-rc1 Pre-release


28 Feb 13:59

vblagoje

v1.14.0

9b380cf


⭐ Highlights

PromptNode enhancements

PromptNode now supports prompt logging (through pipeline debug), run_batch, and model_kwargs. More updates to PromptNode and PromptTemplates are coming soon!

Shaper

We're introducing Shaper, PromptNode's helper. Shaper unlocks the full potential of PromptNode and ensures its seamless integration with Haystack. Shaper's scope and functionality are not limited to PromptNode, though; you can also use it independently.

IVF and Product Quantization support for OpenSearchDocumentStore

We've added support for IVF and IVF with Product Quantization to OpenSearchDocumentStore. You can train the IVF index by calling the train_index method (same as in FAISSDocumentStore) or by setting ivf_train_size when initializing OpenSearchDocumentStore.
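For readers unfamiliar with why an IVF index needs a training step, the following is a minimal pure-Python sketch of the general idea. It is not Haystack or OpenSearch code: the function names (train_centroids, build_index, search) and the n_probe parameter are hypothetical and chosen for illustration only. Training here means running k-means to find cell centroids; indexing assigns each vector to its nearest cell; searching scans only the closest cells instead of every vector.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_centroids(vectors, n_cells, n_iter=10, seed=0):
    """Plain k-means. This plays the role of the IVF 'training' step."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_cells)]
    for _ in range(n_iter):
        cells = [[] for _ in range(n_cells)]
        for v in vectors:
            nearest = min(range(n_cells), key=lambda c: l2(v, centroids[c]))
            cells[nearest].append(v)
        for c, members in enumerate(cells):
            if members:  # keep the previous centroid if a cell ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def build_index(vectors, centroids):
    """Map each cell id to the ids of the vectors assigned to that cell."""
    inverted_lists = {c: [] for c in range(len(centroids))}
    for doc_id, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        inverted_lists[nearest].append(doc_id)
    return inverted_lists


def search(query, vectors, centroids, inverted_lists, top_k=3, n_probe=1):
    """Scan only the n_probe cells whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:n_probe]
    candidates = [doc_id for c in probe for doc_id in inverted_lists[c]]
    return sorted(candidates, key=lambda d: l2(query, vectors[d]))[:top_k]
```

A small n_probe trades recall for speed, because vectors in unprobed cells are never compared against the query. Probing every cell reduces to exact brute-force search. Product Quantization, which this sketch leaves out, additionally compresses the stored vectors.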

What's Changed

Breaking Changes

refactor: Updated rest_api schema for tables to be consistent with Document.to_dict by @sjrl in #3872
feat: Support multiple document_ids in Answer object (for generative QA) by @tstadel in #4062
feat: Update OpenAIAnswerGenerator defaults and with learnings from PromptNode by @sjrl in #4038
build: cache nltk models into the docker image by @mayankjobanputra in #4118
feat: Add IVF and Product Quantization support for OpenSearchDocumentStore by @bogdankostic in #3850

Pipeline

feat: add frontmatter to meta in MarkdownConverter by @TuanaCelik in #3953
fix: removing code block in MarkdownConverter by @TuanaCelik in #3960
feat: Add page range support to PDF converters. by @danielbichuetti in #3965
fix: Update telemetry to not serialize Pipeline if disabled. by @sjrl in #4000
feat: add Shaper by @ZanSara in #3880
fix: Event sending for RayPipeline crashing Haystack by @zoltan-fedor in #3971
fix: document retrieval metrics for non-document_id document_relevance_criteria by @tstadel in #3885
fix: make the crawler more robust on Windows by @anakin87 in #4049
fix: use correct count of outgoing edges in RayPipeline by @zoltan-fedor in #4066
feat: Allow all training options for training a SentenceTransformers EmbeddingRetriever by @sjrl in #4026
refactor: replace mutable default arguments by @julian-risch in #4070
feat: Support multiple RayPipelines by @zoltan-fedor in #4078
Remove double batching in retrieve_batch by @sjrl in #4014
style: Update black by @silvanocerza in #4101
fix: Fix TableTextRetriever for input consisting of tables only by @jackapbutler in #4048
fix: Deduplicate same Documents in isolated evaluation of Reader by @bogdankostic in #4114
Docs: Fix code block formatting by @agnieszka-m in #4162
refactor: Remove the pin from the espnet module and fix the audio node tests. by @danielbichuetti in #4128
fix: change tiktoken fallback mechanism to support Windows amd64 by @danielbichuetti in #4175
feat: Add OpenAIError to retry mechanism by @sjrl in #4178

DocumentStores

refactor: use weaviate client to build BM25 query by @hsm207 in #3939
fix: fixed InMemoryDocumentStore.get_embedding_count to return correct number by @sjrl in #3980
fix: Add inner query for mysql compatibility by @julian-risch in #4068
feat: add support for custom headers by @hsm207 in #4040
feat: Add BM25 support for tables in InMemoryDocumentStore by @bogdankostic in #4090
refactor: InMemoryDocumentStore - manage documents without embedding & fix mypy errors by @anakin87 in #4113
refactor: complete the document stores test refactoring by @masci in #4125
feat: include testing facilities into haystack package by @masci in #4182

Documentation

Align with the docs install guide + correct lg by @agnieszka-m in #3950
docs: Update Crawler docstring for correct usage in Google colab by @silvanocerza in #3979
Docs: Update docstrings by @agnieszka-m in #4119
docs: Update Annotation Tool README.md by @bogdankostic in #4123
feat: Add model_kwargs option to PromptNode by @sjrl in #4151
fix: Remove logging statement of setting ID manually in Document by @bogdankostic in #4129
chore: Fixing PromptNode .prompt() docstring to include the PromptTemplate object as an option by @TuanaCelik in #4135
chore: de-couple the telemetry events for each tutorial from the dataset on AWS that is used by @TuanaCelik in #4155
feat: Implement run_batch for PromptNode by @sjrl in #4072

Other Changes

fix: add option to not override results by Shaper #4231
fix: Shaper store all outputs from function #4223
fix: allowing file-upload api to write files to disk #4221
fix: Fix bug in prompt template check of OpenAIAnswerGenerator #4220
feat: add top_k to PromptNode #4159
feat: Add JsonConverter node #4130
feat: adding secure loading of models by default for haystack by @mayankjobanputra in #3901
fix: add tiktoken fallback mechanism. by @danielbichuetti in #3929
fix: change model in distillation test by @ZanSara in #3944
feat: Expose output_variable in PromptNode result, adjust unit tests by @vblagoje in #3892
fix: Fix type in FARMReader's save_to_remote by @bogdankostic in #3952
refactor: Remove PromptNode hash and equality functions by @vblagoje in #3923
ci: Remove mypy deps install step in python_cache action by @silvanocerza in #3956
fix: overwrite params with environment variables even if there are no params in the pipeline definition; make mypy ignore REST API tests by @anakin87 in #3930
Docs: Update ImageToText docstrings by @agnieszka-m in #3963
Docs: Add TransformersImageToText API doc by @agnieszka-m in #3966
ci: Add Docker images testing by @silvanocerza in #3943
feat: Allow users to set a timeout for remote APIs by @danielbichuetti in #3949
ci: Fix docker image testing on release by @silvanocerza in #3976
Fix: Fix quotation marks by @agnieszka-m in #3973
fix: PromptNode doesn't have run_batch support (yet) by @vblagoje in #3972
chore: increased timeout for loading pipelines through API by @mayankjobanputra in #3977
Missing import for TransformersImageToText by @ZanSara in #3984
test: CI on py3.8 by @ZanSara in #3926
Simplifies and fix docker images tests on release by @silvanocerza in #3982
feat: Add use_prefiltering parameter to DeepsetCloudDocumentStore by @bogdankostic in #3969
ci: Delete Docker images after testing to prevent workflow failure by @silvanocerza in #4004
fix: Add a verbose option to PromptNode to let users understand the prompts being used #2 by @zoltan-fedor in #3898
fix: prevent posthog from sending errors to stderr by @julian-risch in #4008
fix: extend schema for prompt node results by @tstadel in #3891
proposal: TableCell by @sjrl in #3875
refactor: In PromptNode reuse tokenizer instead of loading new one for stop words by @sjrl in #4016
ci: Automate release on PyPi by @silvanocerza in https://github.co...

Contributors

masci, vblagoje, and 16 other contributors


22 Feb 17:20

vblagoje

v1.14.0rc2

4504d73

v1.14.0rc2 Pre-release


What's Changed

fix: add option to not override results by Shaper #4231
fix: Shaper store all outputs from function #4223
fix: allowing file-upload api to write files to disk #4221
fix: Fix bug in prompt template check of OpenAIAnswerGenerator #4220
feat: add top_k to PromptNode #4159
feat: Add JsonConverter node #4130


20 Feb 19:02

vblagoje

v1.14.0rc1

a5ef5b4

v1.14.0rc1 Pre-release


⭐ Highlights

PromptNode enhancements

PromptNode now supports prompt logging (through pipeline debug), run_batch, and model_kwargs. More updates to PromptNode and PromptTemplates are coming soon!

Shaper

IVF and Product Quantization support for OpenSearchDocumentStore