Python: Fix inaccurate split flag in TextChunker to prevent redundant re-splitting #12623

Closed
mohiuddin-khan-shiam wants to merge 3 commits into microsoft:main from mohiuddin-khan-shiam:main

Conversation

@mohiuddin-khan-shiam


Description

`semantic_kernel.text.text_chunker._split_str` always returned `input_was_split=False`, even after splitting, which caused higher-level routines to keep searching for separators and re-split text unnecessarily.
The function now sets `input_was_split=True` as soon as it performs the initial split and continues to propagate the flags from deeper recursive calls, improving performance and preserving the intended chunk boundaries.
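
To illustrate the behavior described above, here is a minimal sketch of a recursive splitter that reports whether its input was split. It is not the Semantic Kernel implementation: the function name `split_str`, its parameters, and the character-length limit are assumptions made for illustration only. The point it shows is setting the flag as soon as the first split happens, then OR-ing in the flags returned by deeper recursive calls.

```python
from typing import List, Tuple


def split_str(text: str, max_len: int, sep: str) -> Tuple[List[str], bool]:
    """Hypothetical, simplified stand-in for a recursive text splitter.

    Returns the chunks plus a flag saying whether `text` was split at all.
    """
    if len(text) <= max_len:
        return [text], False  # short enough already, so the flag stays False

    # Look for the separator occurrence closest to the midpoint.
    mid = len(text) // 2
    left = text.rfind(sep, 0, mid)
    right = text.find(sep, mid)
    candidates = [i for i in (left, right) if i != -1]
    if not candidates:
        return [text], False  # this separator cannot split the text

    cut = min(candidates, key=lambda i: abs(i - mid)) + len(sep)
    first, second = text[:cut], text[cut:]
    if not first or not second:
        return [text], False  # a split here would leave one side empty

    # Record the split as soon as it happens, before recursing.
    was_split = True

    chunks: List[str] = []
    for half in (first, second):
        sub_chunks, sub_split = split_str(half, max_len, sep)
        chunks.extend(sub_chunks)
        # Keep propagating flags from deeper recursive calls.
        was_split = was_split or sub_split
    return chunks, was_split
```

A caller that tries separators in order (for example newlines, then sentence endings, then spaces) can stop at the first separator for which the flag comes back `True`. If the flag were always `False`, such a caller would move on to the next separator and split the same text again.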

odiomarcelino and others added 2 commits June 29, 2025 18:46
…tting

Co-Authored-By: S. M. Mohiuddin Khan Shiam <147746955+mohiuddin-khan-shiam@users.noreply.github.com>
…tting

@mohiuddin-khan-shiam mohiuddin-khan-shiam requested a review from a team as a code owner June 29, 2025 12:49
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Jun 29, 2025
@github-actions github-actions Bot changed the title Fix inaccurate split flag in TextChunker to prevent redundant re-splitting Python: Fix inaccurate split flag in TextChunker to prevent redundant re-splitting Jun 29, 2025
@moonbox3

Collaborator

Hi @mohiuddin-khan-shiam, thanks for the contribution. Can you please have a look at the failing unit tests?

@moonbox3

Collaborator

Hello @mohiuddin-khan-shiam, please have a look at the failing unit tests. Thanks.

@moonbox3

Collaborator

Please re-open the PR when ready to move forward, thanks.

@moonbox3 moonbox3 closed this Jul 15, 2025
4 participants