[opt](nereids) optimize limit on distinct aggregate #47570

Open
wants to merge 2 commits into
base: master

@englefly (Contributor) commented Feb 7, 2025

What problem does this PR solve?

If the output of an aggregation contains no aggregate functions (that is, it only produces the distinct group-by keys), the limit can be pushed down into the aggregate, including its local phase.
Example: select a from T group by a limit 1
Original plan: limit(1) -> aggregate(global, limit=1) -> aggregate(local) -> scan
Optimized plan: limit(1) -> aggregate(global, limit=1) -> aggregate(local, limit=1) -> scan

This is implemented in PhysicalPlanTranslator.
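
To illustrate why the pushdown described above is safe, here is a minimal Python sketch. It is not the Doris code: `local_aggregate` and `run` are hypothetical names, and it assumes a simplified model in which each aggregate is a blocking hash aggregate over a list of group-by keys. With no aggregate functions in the output, any distinct key is a valid result row, so the local phase can stop as soon as it holds `limit` keys.

```python
from typing import List, Optional, Tuple


def local_aggregate(keys: List[str], limit: Optional[int] = None) -> Tuple[List[str], int]:
    """Hypothetical blocking hash aggregate over group-by keys only.

    Returns the distinct keys in first-seen order and the number of input
    rows consumed. With a limit, it stops once `limit` distinct keys are held.
    """
    seen = {}  # dict keeps insertion order, standing in for the hash table
    consumed = 0
    for key in keys:
        consumed += 1
        seen.setdefault(key, None)
        if limit is not None and len(seen) >= limit:
            break
    return list(seen), consumed


def run(partitions: List[List[str]], limit: int, push_to_local: bool) -> Tuple[List[str], int]:
    """Model limit -> aggregate(global, limit) -> aggregate(local[, limit]) -> scan."""
    local_limit = limit if push_to_local else None
    shuffled: List[str] = []
    rows_consumed = 0
    for part in partitions:
        keys, consumed = local_aggregate(part, local_limit)
        shuffled.extend(keys)
        rows_consumed += consumed
    global_keys, _ = local_aggregate(shuffled, limit)  # global phase already had the limit
    return global_keys[:limit], rows_consumed          # top-level limit


partitions = [["a", "b", "a", "c"], ["b", "d", "b"]]
print(run(partitions, 1, push_to_local=False))  # -> (['a'], 7)
print(run(partitions, 1, push_to_local=True))   # -> (['a'], 2)
```

Both plans return one distinct key, but the local phase consumes far fewer rows in this model when it carries the limit. The same shortcut would be wrong if the output contained an aggregate function such as count(*), because a truncated local phase would produce incomplete aggregate values.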

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

@Thearas (Contributor) commented Feb 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly (Contributor, Author) commented Feb 7, 2025

run buildall

@englefly changed the title from "opt limit push agg" to "[opt](nereids) optimize limit on distinct aggregate" on Feb 7, 2025
@doris-robot

TPC-H: Total hot run time: 31725 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c1d847fbdc2ae60e196e82e4789469915306bbb6, data reload: false

------ Round 1 ----------------------------------
q1	17573	5106	5068	5068
q2	2047	298	166	166
q3	10409	1224	739	739
q4	10201	1020	537	537
q5	7508	2342	2367	2342
q6	182	163	136	136
q7	896	739	591	591
q8	9293	1220	1046	1046
q9	4741	4873	4770	4770
q10	6859	2315	1901	1901
q11	474	270	252	252
q12	347	354	215	215
q13	17804	3670	3113	3113
q14	229	222	214	214
q15	510	473	466	466
q16	628	612	579	579
q17	553	862	337	337
q18	7186	6475	6311	6311
q19	1363	960	532	532
q20	320	320	192	192
q21	2736	2107	1914	1914
q22	366	326	304	304
Total cold run time: 102225 ms
Total hot run time: 31725 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5118	5112	5118	5112
q2	234	328	234	234
q3	2142	2657	2338	2338
q4	1395	1798	1369	1369
q5	4222	4133	4132	4132
q6	215	164	124	124
q7	1900	1830	1683	1683
q8	2610	2628	2496	2496
q9	7203	7250	7096	7096
q10	3050	3201	2808	2808
q11	569	518	504	504
q12	703	760	619	619
q13	3433	3927	3335	3335
q14	278	299	267	267
q15	508	460	468	460
q16	635	689	659	659
q17	1136	1624	1329	1329
q18	7578	7415	7342	7342
q19	792	773	798	773
q20	1974	1988	1878	1878
q21	5460	5107	4789	4789
q22	613	570	549	549
Total cold run time: 51768 ms
Total hot run time: 49896 ms
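
The report does not label its numeric columns. The sketch below assumes they are, in order, the cold run, two hot runs, and the faster of the two hot runs. This reading is inferred from the numbers and is not stated by the report. The helper names `parse` and `totals` are hypothetical, and only three Round 1 rows are copied in to keep the example short.

```python
from typing import List, Tuple

Row = Tuple[str, int, int, int, int]

# Three rows copied from Round 1; the column meanings are an assumption.
ROWS = """
q1 17573 5106 5068 5068
q2 2047 298 166 166
q5 7508 2342 2367 2342
"""


def parse(report: str) -> List[Row]:
    """Split each non-empty line into (query, cold, hot1, hot2, best_hot)."""
    rows: List[Row] = []
    for line in report.splitlines():
        if not line.strip():
            continue
        name, cold, hot1, hot2, best = line.split()
        rows.append((name, int(cold), int(hot1), int(hot2), int(best)))
    return rows


def totals(rows: List[Row]) -> Tuple[int, int]:
    """Sum the cold column and the best-hot column."""
    return sum(r[1] for r in rows), sum(r[4] for r in rows)


rows = parse(ROWS)
# Under the assumed layout, the last column is the minimum of the two hot runs.
assert all(best == min(hot1, hot2) for _, _, hot1, hot2, best in rows)
print(totals(rows))  # -> (27128, 7576)
```

Summing the same two columns over all 22 Round 1 rows gives 102225 ms and 31725 ms, which matches the reported cold and hot totals.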

@doris-robot

TPC-DS: Total hot run time: 190829 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c1d847fbdc2ae60e196e82e4789469915306bbb6, data reload: false

query1	1358	960	932	932
query2	6310	1946	1932	1932
query3	11138	4678	4614	4614
query4	53927	24912	23557	23557
query5	5192	583	503	503
query6	352	233	177	177
query7	4922	498	302	302
query8	309	252	244	244
query9	5824	2518	2491	2491
query10	445	321	261	261
query11	15169	15261	14892	14892
query12	166	111	106	106
query13	1064	508	399	399
query14	10631	6371	6497	6371
query15	200	212	175	175
query16	7085	648	495	495
query17	1093	756	584	584
query18	1541	420	318	318
query19	202	210	176	176
query20	128	133	130	130
query21	209	128	107	107
query22	4389	4497	4445	4445
query23	34030	33408	33410	33408
query24	5817	2452	2462	2452
query25	459	484	390	390
query26	717	295	164	164
query27	1766	467	339	339
query28	2744	2403	2379	2379
query29	570	556	418	418
query30	210	185	155	155
query31	867	865	792	792
query32	75	64	67	64
query33	482	359	297	297
query34	817	867	494	494
query35	836	858	782	782
query36	955	1006	928	928
query37	120	95	70	70
query38	4196	4357	4336	4336
query39	1545	1438	1456	1438
query40	214	113	102	102
query41	50	49	49	49
query42	121	105	109	105
query43	519	518	483	483
query44	1295	800	802	800
query45	181	177	165	165
query46	899	1091	666	666
query47	1859	1917	1788	1788
query48	387	426	309	309
query49	696	498	423	423
query50	705	773	415	415
query51	4262	4344	4305	4305
query52	111	105	97	97
query53	229	258	197	197
query54	472	485	405	405
query55	87	80	77	77
query56	276	272	246	246
query57	1203	1206	1154	1154
query58	257	246	250	246
query59	2860	3027	2653	2653
query60	286	278	261	261
query61	121	117	118	117
query62	772	757	687	687
query63	235	198	200	198
query64	1838	1037	667	667
query65	3226	3120	3133	3120
query66	727	382	291	291
query67	15994	15674	15491	15491
query68	5590	777	523	523
query69	525	301	264	264
query70	1195	1133	1056	1056
query71	430	289	306	289
query72	6333	3661	3689	3661
query73	1266	750	353	353
query74	8924	9223	8847	8847
query75	3273	3167	2677	2677
query76	3840	1195	739	739
query77	531	379	278	278
query78	10030	10141	9339	9339
query79	2426	810	596	596
query80	696	541	489	489
query81	512	272	237	237
query82	405	134	95	95
query83	176	174	158	158
query84	287	91	79	79
query85	735	358	311	311
query86	370	330	301	301
query87	4457	4469	4405	4405
query88	3585	2173	2154	2154
query89	391	311	282	282
query90	1733	194	193	193
query91	129	141	108	108
query92	70	59	58	58
query93	2678	1016	593	593
query94	693	370	298	298
query95	349	276	265	265
query96	475	558	268	268
query97	2769	2875	2785	2785
query98	235	206	210	206
query99	1342	1382	1288	1288
Total cold run time: 294192 ms
Total hot run time: 190829 ms

@doris-robot

ClickBench: Total hot run time: 30.88 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c1d847fbdc2ae60e196e82e4789469915306bbb6, data reload: false

query1	0.03	0.03	0.04
query2	0.07	0.04	0.03
query3	0.24	0.06	0.06
query4	1.62	0.10	0.10
query5	0.44	0.43	0.41
query6	1.17	0.65	0.66
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.58	0.52	0.53
query10	0.58	0.60	0.57
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.62	0.60	0.60
query14	2.69	2.68	2.70
query15	0.92	0.85	0.85
query16	0.38	0.38	0.38
query17	1.03	1.04	1.05
query18	0.22	0.20	0.20
query19	1.90	1.76	2.03
query20	0.02	0.01	0.01
query21	15.35	0.90	0.53
query22	0.77	1.16	0.67
query23	14.95	1.31	0.58
query24	7.22	1.71	1.17
query25	0.48	0.30	0.09
query26	0.58	0.17	0.14
query27	0.06	0.05	0.05
query28	10.09	0.84	0.43
query29	12.53	4.00	3.29
query30	0.25	0.08	0.06
query31	2.82	0.57	0.38
query32	3.23	0.55	0.47
query33	2.98	3.05	3.02
query34	15.65	5.14	4.52
query35	4.60	4.57	4.55
query36	0.66	0.51	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.03	0.02
query40	0.16	0.13	0.13
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 105.55 s
Total hot run time: 30.88 s
