[feature](cloud)Impl file cache microbench #47563

Merged
merged 23 commits into from
Mar 11, 2025

deardeng (Contributor) commented Feb 6, 2025

What problem does this PR solve?

Implement a microbenchmark for quickly identifying problems in the Doris file cache, including performance problems. End-to-end tests that stress a whole Doris system are a poor way to evaluate the file cache component: they are slow to reach the desired eviction rate and cache pressure, and their results can be affected by unrelated factors such as compaction. This PR therefore tests file cache functionality and performance directly at the IO layer (S3FileWriter + CachedRemoteReader), exercising the cache during upload and download operations.
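To illustrate why driving a cache directly reaches eviction pressure quickly, here is a minimal sketch in Python. It is not the benchmark added by this PR and does not use any Doris API: `ToyBlockCache`, `run_microbench`, and the fake remote read are hypothetical stand-ins, and plain LRU eviction is an assumption made only for the example.

```python
import random
import time
from collections import OrderedDict


class ToyBlockCache:
    """Hypothetical LRU block cache; a stand-in, not Doris's file cache."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def read(self, block_id, fetch_remote):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.blocks[block_id]
        self.misses += 1
        data = fetch_remote(block_id)  # simulated remote (e.g. object store) read
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
            self.evictions += 1
        return data


def run_microbench(capacity_blocks, working_set_blocks, num_reads, seed=0):
    """Issue random block reads straight at the cache and report counters."""
    rng = random.Random(seed)
    cache = ToyBlockCache(capacity_blocks)

    def fake_remote(block_id):
        return bytes(16)  # placeholder payload instead of real remote I/O

    start = time.perf_counter()
    for _ in range(num_reads):
        cache.read(rng.randrange(working_set_blocks), fake_remote)
    elapsed = time.perf_counter() - start
    return {
        "hits": cache.hits,
        "misses": cache.misses,
        "evictions": cache.evictions,
        "hit_rate": cache.hits / num_reads,
        "elapsed_s": elapsed,
    }


if __name__ == "__main__":
    # A working set 10x larger than the cache forces evictions almost immediately.
    print(run_microbench(capacity_blocks=100, working_set_blocks=1000, num_reads=10000))
```

Because the working set and cache size are chosen by the caller, the eviction rate is under direct control, which is the property the end-to-end tests lack.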

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas (Contributor) commented Feb 6, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

deardeng marked this pull request as ready for review on February 26, 2025, 11:45

deardeng (Contributor Author) commented

run buildall

deardeng (Contributor Author) commented

run buildall

deardeng (Contributor Author) commented

run buildall

deardeng (Contributor Author) commented

run buildall

doris-robot commented

TPC-H: Total hot run time: 31918 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 47eba2d49f2b47f42928ab13ec5821e097943088, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17622	5256	5086	5086
q2	2052	320	189	189
q3	10440	1318	715	715
q4	10224	1037	525	525
q5	8117	2507	2373	2373
q6	188	166	133	133
q7	922	733	598	598
q8	9290	1310	1146	1146
q9	4997	4728	4908	4728
q10	6851	2315	1911	1911
q11	507	275	255	255
q12	344	357	228	228
q13	17784	3695	3132	3132
q14	230	236	207	207
q15	506	463	457	457
q16	625	621	599	599
q17	586	898	344	344
q18	6737	6297	6246	6246
q19	1740	992	550	550
q20	316	322	190	190
q21	2858	2217	1979	1979
q22	372	345	327	327
Total cold run time: 103308 ms
Total hot run time: 31918 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5185	5168	5107	5107
q2	248	337	234	234
q3	2151	2713	2348	2348
q4	1479	1849	1377	1377
q5	4269	4155	4201	4155
q6	208	162	123	123
q7	1896	1877	1813	1813
q8	2660	2724	2639	2639
q9	7280	7090	7116	7090
q10	3015	3193	2779	2779
q11	571	497	483	483
q12	669	761	624	624
q13	3531	3857	3236	3236
q14	266	300	287	287
q15	511	466	471	466
q16	639	666	667	666
q17	1197	1646	1329	1329
q18	7665	7264	7319	7264
q19	815	798	858	798
q20	2015	2028	1877	1877
q21	5462	5102	4907	4907
q22	653	598	590	590
Total cold run time: 52385 ms
Total hot run time: 50192 ms
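The totals in the TPC-H report can be reproduced from the per-query rows. The sketch below assumes, based on the numbers themselves rather than on the tpch-tools scripts, that the first timing column is the cold run, the next two are hot runs, and the last is the smaller of the two hot runs; `summarize` is a hypothetical helper written for this example.

```python
# Round 1 rows copied from the report: query, cold, hot run 1, hot run 2, best hot (ms).
ROUND_1_ROWS = """
q1 17622 5256 5086 5086
q2 2052 320 189 189
q3 10440 1318 715 715
q4 10224 1037 525 525
q5 8117 2507 2373 2373
q6 188 166 133 133
q7 922 733 598 598
q8 9290 1310 1146 1146
q9 4997 4728 4908 4728
q10 6851 2315 1911 1911
q11 507 275 255 255
q12 344 357 228 228
q13 17784 3695 3132 3132
q14 230 236 207 207
q15 506 463 457 457
q16 625 621 599 599
q17 586 898 344 344
q18 6737 6297 6246 6246
q19 1740 992 550 550
q20 316 322 190 190
q21 2858 2217 1979 1979
q22 372 345 327 327
"""


def summarize(report_text):
    """Return (cold_total_ms, hot_total_ms) for whitespace-separated report rows."""
    cold_total = 0
    hot_total = 0
    for line in report_text.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        cold, hot1, hot2, best = int(cold), int(hot1), int(hot2), int(best)
        # The last column matches the faster of the two hot runs in every row.
        assert best == min(hot1, hot2), name
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


if __name__ == "__main__":
    print(summarize(ROUND_1_ROWS))  # (103308, 31918)
```

The result matches the reported "Total cold run time: 103308 ms" and "Total hot run time: 31918 ms" for Round 1.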

doris-robot commented

TPC-DS: Total hot run time: 183917 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 47eba2d49f2b47f42928ab13ec5821e097943088, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	977	403	374	374
query2	6538	1956	1909	1909
query3	6790	208	204	204
query4	26863	23405	23399	23399
query5	4369	668	480	480
query6	296	200	194	194
query7	4608	495	297	297
query8	282	229	217	217
query9	8602	2595	2596	2595
query10	448	295	253	253
query11	15341	15186	14953	14953
query12	152	106	102	102
query13	1658	504	387	387
query14	9092	6621	6078	6078
query15	210	191	181	181
query16	7058	607	476	476
query17	899	716	535	535
query18	1931	391	294	294
query19	186	182	148	148
query20	116	111	117	111
query21	205	118	101	101
query22	4296	4305	4588	4305
query23	34294	33550	33007	33007
query24	7716	2346	2375	2346
query25	538	444	419	419
query26	1244	294	152	152
query27	2119	479	315	315
query28	3975	2435	2383	2383
query29	761	543	414	414
query30	232	187	158	158
query31	928	847	762	762
query32	71	63	60	60
query33	555	356	301	301
query34	792	856	503	503
query35	789	814	758	758
query36	956	969	889	889
query37	113	95	78	78
query38	4165	4162	4069	4069
query39	1432	1386	1390	1386
query40	205	112	104	104
query41	58	53	49	49
query42	120	98	100	98
query43	506	505	486	486
query44	1269	773	764	764
query45	171	168	162	162
query46	854	1018	635	635
query47	1732	1811	1721	1721
query48	378	422	296	296
query49	772	491	418	418
query50	700	731	404	404
query51	4225	4219	4151	4151
query52	109	100	91	91
query53	217	264	183	183
query54	482	478	405	405
query55	82	77	77	77
query56	262	272	231	231
query57	1104	1114	1060	1060
query58	250	237	238	237
query59	2875	2817	2527	2527
query60	273	273	292	273
query61	122	138	140	138
query62	805	713	667	667
query63	224	184	181	181
query64	4565	1090	756	756
query65	3228	3112	3132	3112
query66	1146	411	312	312
query67	15857	15472	15354	15354
query68	8088	868	497	497
query69	468	302	262	262
query70	1211	1133	1096	1096
query71	415	297	273	273
query72	5652	3539	3791	3539
query73	749	744	351	351
query74	8989	8980	8915	8915
query75	3408	3186	2676	2676
query76	3316	1174	729	729
query77	777	379	278	278
query78	9959	10127	9257	9257
query79	2837	817	594	594
query80	708	512	470	470
query81	485	284	236	236
query82	676	129	95	95
query83	211	171	153	153
query84	285	140	73	73
query85	779	348	306	306
query86	332	308	273	273
query87	4538	4570	4394	4394
query88	3482	2223	2205	2205
query89	399	318	287	287
query90	1938	196	195	195
query91	134	139	109	109
query92	80	60	55	55
query93	1588	1029	567	567
query94	694	414	293	293
query95	353	271	252	252
query96	476	574	275	275
query97	3348	3417	3276	3276
query98	228	204	204	204
query99	1444	1386	1247	1247
Total cold run time: 272235 ms
Total hot run time: 183917 ms

doris-robot commented

ClickBench: Total hot run time: 30.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 47eba2d49f2b47f42928ab13ec5821e097943088, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.07	0.03	0.04
query3	0.24	0.07	0.07
query4	1.60	0.10	0.10
query5	0.55	0.54	0.55
query6	1.19	0.71	0.73
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.58	0.55	0.52
query10	0.58	0.57	0.57
query11	0.15	0.10	0.10
query12	0.14	0.11	0.12
query13	0.61	0.60	0.62
query14	2.81	2.69	2.72
query15	0.93	0.85	0.86
query16	0.39	0.38	0.40
query17	1.02	1.04	1.01
query18	0.21	0.20	0.20
query19	1.86	1.83	1.97
query20	0.01	0.01	0.02
query21	15.36	0.89	0.55
query22	0.75	1.17	0.79
query23	14.82	1.41	0.64
query24	6.74	1.94	0.45
query25	0.52	0.23	0.14
query26	0.58	0.16	0.15
query27	0.05	0.05	0.05
query28	9.96	0.90	0.42
query29	12.56	4.12	3.33
query30	0.25	0.10	0.06
query31	2.82	0.58	0.40
query32	3.27	0.55	0.46
query33	2.99	3.04	3.02
query34	15.86	5.14	4.48
query35	4.54	4.56	4.55
query36	0.67	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.03
query42	0.03	0.02	0.03
query43	0.04	0.03	0.02
Total cold run time: 105.27 s
Total hot run time: 30.69 s

hello-stephen (Contributor) commented

BE UT Coverage Report

Increment line coverage 0.00% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 44.98% (11994/26666)
Line Coverage 34.47% (100741/292283)
Region Coverage 33.65% (51609/153362)
Branch Coverage 29.40% (26100/88770)

dataroaring (Contributor) left a comment

LGTM

github-actions bot (Contributor) commented Mar 4, 2025

PR approved by at least one committer and no changes requested.

github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Mar 4, 2025

github-actions bot (Contributor) commented Mar 4, 2025

PR approved by anyone and no changes requested.

gavinchou merged commit 72c2889 into apache:master on Mar 11, 2025
25 of 27 checks passed
Labels
approved (indicates a PR has been approved by one committer), dev/3.0.x, dev/3.0.x-conflict, reviewed
6 participants