[CBRD-25997] Skip end_one_iteration when it's possible in parallel heap scan #6091

xmilex-git · 2025-04-05T13:41:55Z

http://jira.cubrid.org/browse/CBRD-25997

Purpose

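Previously, after a parallel heap scan, the main thread received results from each worker thread one row at a time and appended them to xasl->list_id. As the attached graph shows, this approach causes a significant performance loss.

This change lets the temporary result files created by the parallel heap scan threads be returned to the client, or used as the query result, as they are, without a separate merge step.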

When a parallel heap scan meets all of the following conditions, post-processing such as end_one_iteration is skipped and the temporary result files created by the parallel heap scan threads are returned directly as the final result.

The XASL is TOP_MOST_XASL or an uncorrelated subquery
Parallel heap scan is possible
There is no if_pred (if_pred is used by objfetch proc and connect by proc, so it does not exist whenever parallel heap scan is possible)
topn_sort is not performed
hash_aggregate (GROUP BY, etc.) is not performed
The select-list contains no set, multiset, or sequence
(to be supported later) The select-list fetched in end_one_iteration contains no arithmetic expression, function expression, or sp expression
buildvalue proc has no agg_list (count(*), etc.)
There is no instnum_val such as rownum
(to be supported later) There is no instnum_pred such as LIMIT 1
(to be supported later) No interpolation functions are used (median, percentile_cont, percentile_disc, cume_dist, percent_rank, etc.)
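
The conditions above amount to a single all-or-nothing check. The following is a minimal sketch of such a check, not the checker added in this PR: `QueryTraits`, its field names, `GatherMethod`, and `choose_gather_method` are all hypothetical names chosen for illustration, and the fallback is labelled row by row here even though some commits call it page by page.

```cpp
// Hypothetical summary of the query properties listed above. These fields
// are illustrative only and do not mirror CUBRID's actual XASL structures.
struct QueryTraits {
  bool top_most_or_uncorrelated = false;     // TOP_MOST_XASL or uncorrelated subquery
  bool parallel_heap_scan_possible = false;
  bool has_if_pred = false;
  bool uses_topn_sort = false;
  bool uses_hash_aggregate = false;
  bool select_list_has_collection = false;   // set, multiset, sequence
  bool select_list_has_expression = false;   // arithmetic, function, sp expression
  bool has_agg_list = false;                 // buildvalue proc aggregates
  bool has_instnum_val = false;              // e.g. rownum
  bool has_instnum_pred = false;             // e.g. LIMIT 1
  bool uses_interpolation = false;           // e.g. median
};

// Assumed names for the two ways of gathering worker results.
enum class GatherMethod { RowByRow, ListMerge };

// Returns ListMerge only when every listed condition holds; any single
// violation falls back to the row-by-row path through end_one_iteration.
GatherMethod choose_gather_method(const QueryTraits &q) {
  const bool eligible = q.top_most_or_uncorrelated &&
                        q.parallel_heap_scan_possible &&
                        !q.has_if_pred &&
                        !q.uses_topn_sort &&
                        !q.uses_hash_aggregate &&
                        !q.select_list_has_collection &&
                        !q.select_list_has_expression &&
                        !q.has_agg_list &&
                        !q.has_instnum_val &&
                        !q.has_instnum_pred &&
                        !q.uses_interpolation;
  return eligible ? GatherMethod::ListMerge : GatherMethod::RowByRow;
}
```

Keeping the decision in one predicate means that lifting a restriction marked "to be supported later" would only require removing the corresponding term.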

Typical use cases

Simple full-table-scan queries without joins, or uncorrelated subqueries
Queries that perform post-processing such as ORDER BY, GROUP BY, or aggregate functions after a full table scan

Implementation

In scan_next_parallel_heap_scan(), when a parallel heap scan manager that handles results with the LIST_MERGE method exists, it waits until all child threads have finished, performs list merging, and returns S_END, which skips end_one_iteration and any further pred evaluation.
At that point the list stored in xasl->list_id is replaced with the merged result list.
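
The control flow described here (wait for the workers, merge their lists, replace the output list, return S_END) can be sketched as follows. This is an illustrative model under several assumptions, not the code in this PR: `ListMergeManager`, `scan_next_sketch`, and the `ScanCode` enum are hypothetical, a `std::vector<int>` stands in for a temporary list file, and rows are handed to workers in a simple striped pattern rather than by heap page.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for the scan return codes; only S_END is named in the PR text.
enum class ScanCode { S_SUCCESS, S_END };

using Row = int;
using ListFile = std::vector<Row>;  // stand-in for a temporary result file

// Hypothetical manager: each worker fills its own private list.
struct ListMergeManager {
  std::vector<std::thread> workers;
  std::vector<ListFile> partial_lists;

  // 'heap' must stay alive until join_and_merge() has returned.
  void start(const std::vector<Row> &heap, std::size_t n_workers) {
    partial_lists.assign(n_workers, ListFile{});
    for (std::size_t w = 0; w < n_workers; ++w) {
      workers.emplace_back([this, &heap, w, n_workers] {
        // Worker w touches only partial_lists[w], so no locking is needed.
        for (std::size_t i = w; i < heap.size(); i += n_workers) {
          partial_lists[w].push_back(heap[i]);
        }
      });
    }
  }

  // Waits for every worker, then concatenates the per-worker lists.
  ListFile join_and_merge() {
    for (auto &t : workers) {
      t.join();
    }
    workers.clear();
    ListFile merged;
    for (const auto &part : partial_lists) {
      merged.insert(merged.end(), part.begin(), part.end());
    }
    return merged;
  }

  ~ListMergeManager() {
    for (auto &t : workers) {
      if (t.joinable()) t.join();
    }
  }
};

// Replaces the caller's list with the merged result and reports S_END, so
// the caller has no rows left to push through per-row post-processing.
ScanCode scan_next_sketch(ListMergeManager &mgr, ListFile &xasl_list) {
  xasl_list = mgr.join_and_merge();
  return ScanCode::S_END;
}
```

Because the first call already returns S_END, the caller's row loop body never runs, which is how the per-row work is avoided in this model.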

Remarks

When end_one_iteration is skipped, the topn_sort feature is restricted.
The parallel heap scan trace output changes (it now prints parallelism, min, max, and the gather method).
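
As a rough illustration of the min and max values in the new trace output, the sketch below summarizes one per-worker counter. `WorkerSpread` and `summarize` are hypothetical names, and which counters the real trace aggregates is not stated here. The accumulators are seeded from the first worker's value rather than from 0, because a minimum seeded with 0 would always report 0 for non-negative counters.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical summary of one per-worker counter (for example, rows read).
struct WorkerSpread {
  std::uint64_t min = 0;
  std::uint64_t max = 0;
  double avg = 0.0;
};

// Computes min, max, and average over the per-worker values.
// An empty input yields an all-zero summary.
WorkerSpread summarize(const std::vector<std::uint64_t> &per_worker) {
  WorkerSpread s;
  if (per_worker.empty()) {
    return s;
  }
  // Seed from the first element so the minimum is not stuck at 0.
  s.min = per_worker.front();
  s.max = per_worker.front();
  std::uint64_t total = 0;
  for (std::uint64_t v : per_worker) {
    s.min = std::min(s.min, v);
    s.max = std::max(s.max, v);
    total += v;
  }
  s.avg = static_cast<double>(total) / static_cast<double>(per_worker.size());
  return s;
}
```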

http://jira.cubrid.org/browse/CBRD-25447
To improve the performance of heap scan, we plan to add a parallel heap scan feature to CUBRID. The following improvements that have not yet been completed will be handled in separate improvement issues/PRs.
• Trace output
• Modify the query to allow specifying the number of threads using a query hint, e.g., select /*+ parallel(4) */ * from t1;
• Store the results separately in individual list files for each parallel task thread, to be later merged or used as needed

xmilex-git and others added 12 commits April 2, 2025 16:28

feat: add list_merger, mergable_list, and modify to default

b1bd42d

fix: add qlist count to avoid assert

e216e15

refactor: virtualize the manager; split the code into a manager that receives results page_by_page and a manager that merges the final result files

9374e68

feat: add code in the checker that chooses between the page by page and list merge methods

4dfdb33

feat&refactor: print in perf_monitor whether list_merge or row by row was used; refactor to print average values

ee2b44d

fix: change trace_stat to not use parallel heap; fall back to row by row when an arithmetic expression exists in outptr_list

3d8bbe8

fix: remove ACCESS_SPEC_FLAG_MERGED_LIST when page by page is used

b9d6950

refactor: min..max trace

0638451

fix: the min value always appearing as 0 in the JSON output

ab134bd

fix: segmentation fault in trace print

24f664c

refactor: add a domain resolve routine for groupby and analytic (originally performed in end_one_iteration)

bb42e0d

xmilex-git requested review from HyunukLee, youngjinj, Hamkua and sohee-dgist April 5, 2025 13:41

xmilex-git self-assigned this Apr 5, 2025

xmilex-git requested review from shparkcubrid and beyondykk9 as code owners April 5, 2025 13:41

xmilex-git changed the base branch from feature/parallel_heap_scan to feature/parallel_query April 7, 2025 03:04

xmilex-git requested a review from hornetmj as a code owner April 7, 2025 03:04

Merge remote-tracking branch 'upstream/feature/parallel_query' into phs_list_merge_feature

627a251

xmilex-git removed the request for review from hornetmj April 7, 2025 03:21

xmilex-git added 6 commits April 7, 2025 13:11

refactor: fetch the dbvalue values for outptrlist only when GROUP BY is present

6186527

fix: unfix->pgbuf_set_dirty

53a97a0

fix: interpolation -> row_by_row

d89e4df

fix: instnum -> page by page

1f25ba0

fix: bug about topnsort trace

3700c6a

fix: build error

9cc99a6
