Skip to content

Skip page should also support skip dict page #7409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 16, 2025

Conversation

zhuqi-lucas
Copy link
Contributor

Which issue does this PR close?

Closes part of #7363

Rationale for this change

We hit the bug during the benchmark testing for the cache page branch.

The bug is we skip page not skipping the dict page which will make the corner case dead lock for the benchmark testing.

What changes are included in this PR?

Fix the skip page also support skip dict page.

Are there any user-facing changes?

No, the skip dict page use case only happen for the cache page branch.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and nice sleuthing @zhuqi-lucas

What is weird is that we haven't hit this in production given how widely used this parquet-rs reader is. There are several systems I know of (like coralogix, pydantic and influxdb_iox that have filter pushdown enabled for years now) and skipping pages is done all the time via statistics.

Is there any way to trigger this issue from the public APIs (the unit test in this PR is using the low level APIs (like skip_next_page directly) 🤔

@zhuqi-lucas
Copy link
Contributor Author

Thank you @alamb for review and good question, i try to explain:

  1. Before the cache page PR, we will not use the low level skip_next_page to trigger the dict page bug, because the only place we use the low level api is:
    /// Skips over `num_records` records, where records are delimited by repetition levels of 0
    ///
    /// # Returns
    ///
    /// Returns the number of records skipped
    pub fn skip_records(&mut self, num_records: usize) -> Result<usize> {
        let mut remaining_records = num_records;
        while remaining_records != 0 {
            if self.num_buffered_values == self.num_decoded_values {
                let metadata = match self.page_reader.peek_next_page()? {
                    None => return Ok(num_records - remaining_records),
                    Some(metadata) => metadata,
                };

                // If dictionary, we must read it
                if metadata.is_dict {
                    self.read_dictionary_page()?;
                    continue;
                }

                // If page has less rows than the remaining records to
                // be skipped, skip entire page
                let rows = metadata.num_rows.or_else(|| {
                    // If no repetition levels, num_levels == num_rows
                    self.rep_level_decoder
                        .is_none()
                        .then_some(metadata.num_levels)?
                });

                if let Some(rows) = rows {
                    if rows <= remaining_records {
                        self.page_reader.skip_next_page()?;
                        remaining_records -= rows;
                        continue;
                    }
                }
            ..............
}

And the logic here, we already exclude the dict page case for above high level API, so we will not hit the bug:

               // If dictionary, we must read it
                if metadata.is_dict {
                    self.read_dictionary_page()?;
                    continue;
                }

So the following code will not trigger the dict page skip:

              if let Some(rows) = rows {
                    if rows <= remaining_records {
                        self.page_reader.skip_next_page()?;
                        remaining_records -= rows;
                        continue;
                    }
                }
  1. But we will use the low level API after the cache page PR which will hit the skip dict page bug.

@tustvold
Copy link
Contributor

tustvold commented Apr 13, 2025

I'm not sure this is correct, skip_next_page skips the next page regardless of its type? It seems especially odd as there isn't equivalent logic in the non-offset index case.

Is this possibly a bug in whatever benchmark/cache logic you're referring to?

@zhuqi-lucas
Copy link
Contributor Author

zhuqi-lucas commented Apr 13, 2025

Thank you @tustvold for review:

More details in this ticket:
#7363

We are trying to add benchmark for row filter push down:
#7401

And also try to benchmark with the page cache PR:
#6921

The benchmark will work well for main branch, but stuck for #6921 page cache branch, so i try to debug it and found the low level skip page API will not skip dict page. And the page cache PR will use this low level API.

And main branch will also use this API, but it will exclude the dict page before use it:
#7409 (comment)

  1. Before the cache page PR, we will not use the low level skip_next_page to trigger the dict page bug, because the only place we use the low level api is:
    /// Skips over `num_records` records, where records are delimited by repetition levels of 0
    ///
    /// # Returns
    ///
    /// Returns the number of records skipped
    pub fn skip_records(&mut self, num_records: usize) -> Result<usize> {
        let mut remaining_records = num_records;
        while remaining_records != 0 {
            if self.num_buffered_values == self.num_decoded_values {
                let metadata = match self.page_reader.peek_next_page()? {
                    None => return Ok(num_records - remaining_records),
                    Some(metadata) => metadata,
                };

                // If dictionary, we must read it
                if metadata.is_dict {
                    self.read_dictionary_page()?;
                    continue;
                }

                // If page has less rows than the remaining records to
                // be skipped, skip entire page
                let rows = metadata.num_rows.or_else(|| {
                    // If no repetition levels, num_levels == num_rows
                    self.rep_level_decoder
                        .is_none()
                        .then_some(metadata.num_levels)?
                });

                if let Some(rows) = rows {
                    if rows <= remaining_records {
                        self.page_reader.skip_next_page()?;
                        remaining_records -= rows;
                        continue;
                    }
                }
            ..............
}

And the logic here, we already exclude the dict page case for above high level API, so we will not hit the bug:

               // If dictionary, we must read it
                if metadata.is_dict {
                    self.read_dictionary_page()?;
                    continue;
                }

So the following code will not trigger the dict page skip:

              if let Some(rows) = rows {
                    if rows <= remaining_records {
                        self.page_reader.skip_next_page()?;
                        remaining_records -= rows;
                        continue;
                    }
                }
  1. But we will use the low level API after the cache page PR which will hit the skip dict page bug.

@tustvold
Copy link
Contributor

tustvold commented Apr 13, 2025

Right so this is a bug in the cache page PR which is using this API incorrectly, this is not a bug in skip_next_page, it is behaving correctly as documented.

@zhuqi-lucas
Copy link
Contributor Author

Got it @tustvold thanks, so you mean we don't need to make the skip page support dict page, it's the right behaviour. If so, we can fix it in the page cache PR.

@zhuqi-lucas
Copy link
Contributor Author

zhuqi-lucas commented Apr 13, 2025

But when we call the get next page, we will also include the dict page for low level, so i was wandering if we can make the skip page consistent, and it will not make main branch any error.

impl<R: ChunkReader> PageReader for SerializedPageReader<R> {
    fn get_next_page(&mut self) -> Result<Option<Page>> {
        loop {
            let page = match &mut self.state {
                SerializedPageReaderState::Values {
                    offset,
                    remaining_bytes: remaining,
                    next_page_header,
                    page_ordinal,
                    require_dictionary,
                } => {
                    ......
                    page
                }
                SerializedPageReaderState::Pages {
                    page_locations,
                    dictionary_page,
                    ..
                } => {
                    let front = match dictionary_page
                        .take()
                        .or_else(|| page_locations.pop_front())
                    {
                        Some(front) => front,
                        None => return Ok(None),
                    };
               }
       }
}

@tustvold
Copy link
Contributor

tustvold commented Apr 13, 2025

Yes, skip_next_page skips the next page regardless of its type, as documented. TBC we could change this, but it would be a breaking API change that we'd need to discuss and evaluate.

@tustvold
Copy link
Contributor

Oh wait I think I misread this PR, standby

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, for some reason I read this as skipping two pages at once

@zhuqi-lucas
Copy link
Contributor Author

Got it, thank you @tustvold for double check.

@XiangpengHao
Copy link
Contributor

This is a nice catch, thank you @zhuqi-lucas

@alamb alamb merged commit b07d8ca into apache:main Apr 16, 2025
16 checks passed
@alamb
Copy link
Contributor

alamb commented Apr 16, 2025

Thank you @zhuqi-lucas and @tustvold and @XiangpengHao

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants