[feat](binlog) Speed binlog gc by locked binlogs #47547

Merged
merged 1 commit into from
Feb 8, 2025

Conversation

@w41ter w41ter commented Feb 6, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #46887, https://github.com/selectdb/ccr-syncer/pull/399/files

Problem Summary:

This is the second PR for locking binlogs.

To reduce the cost of maintaining binlogs, an API named lockBinlog has been added. Users call this API to mark which binlogs must not be garbage-collected.

The binlog GC recycles all binlogs up to the locked one. To stay compatible with the old behavior, if no binlog is locked, it falls back to the previous behavior: all binlogs are kept until they expire.
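
The rule described above can be illustrated with a minimal Python sketch. It is not the Doris implementation: `Binlog`, `select_for_gc`, `commit_seq`, and `ttl_seconds` are hypothetical names chosen for illustration, and the sketch assumes that the locked binlog itself is retained and that a lock is expressed as a single sequence number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Binlog:
    # Hypothetical record: a monotonically increasing sequence number
    # and a creation time in seconds.
    commit_seq: int
    timestamp: float


def select_for_gc(
    binlogs: List[Binlog],
    locked_commit_seq: Optional[int],
    now: float,
    ttl_seconds: float,
) -> List[Binlog]:
    """Return the binlogs that may be recycled.

    Assumption: when a lock exists, everything strictly before the locked
    sequence number is recyclable, regardless of age. When no lock exists,
    only expired binlogs are recyclable (the previous behavior).
    """
    if locked_commit_seq is not None:
        return [b for b in binlogs if b.commit_seq < locked_commit_seq]
    return [b for b in binlogs if b.timestamp + ttl_seconds <= now]


logs = [Binlog(1, 0.0), Binlog(2, 10.0), Binlog(3, 20.0)]

# With binlog 3 locked, binlogs 1 and 2 are recyclable even though none has expired.
recyclable_locked = select_for_gc(logs, 3, now=5.0, ttl_seconds=100.0)

# Without a lock, nothing is recycled until the TTL has elapsed.
recyclable_unlocked = select_for_gc(logs, None, now=5.0, ttl_seconds=100.0)
```

The point of the sketch is the branch: a lock lets the GC advance immediately up to the locked position, while the absence of a lock leaves expiry as the only trigger.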

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@w41ter

w41ter commented Feb 6, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32413 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 956d58ca807905b7da48cbf2c5267138dcc911e1, data reload: false

------ Round 1 ----------------------------------
q1	17578	5522	5468	5468
q2	2059	317	185	185
q3	10370	1259	785	785
q4	10216	1032	555	555
q5	7548	2406	2431	2406
q6	193	166	137	137
q7	936	773	599	599
q8	9311	1414	1185	1185
q9	4891	4728	4876	4728
q10	6866	2342	1885	1885
q11	464	274	260	260
q12	356	375	222	222
q13	17772	3723	3095	3095
q14	226	234	207	207
q15	517	472	464	464
q16	615	626	574	574
q17	598	889	351	351
q18	6647	6318	6256	6256
q19	1402	966	579	579
q20	336	334	192	192
q21	3090	2299	1977	1977
q22	365	335	303	303
Total cold run time: 102356 ms
Total hot run time: 32413 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5255	5155	5162	5155
q2	247	335	232	232
q3	2154	2721	2293	2293
q4	1497	1840	1381	1381
q5	4246	4156	4222	4156
q6	214	166	127	127
q7	1876	1852	1759	1759
q8	2669	2665	2660	2660
q9	7293	7159	7085	7085
q10	3047	3274	2827	2827
q11	575	514	487	487
q12	716	791	622	622
q13	3427	3930	3271	3271
q14	306	294	273	273
q15	519	458	471	458
q16	640	682	632	632
q17	1178	1621	1344	1344
q18	7579	7303	7378	7303
q19	867	939	1008	939
q20	1994	2039	1874	1874
q21	5494	5098	4841	4841
q22	659	588	537	537
Total cold run time: 52452 ms
Total hot run time: 50256 ms

@doris-robot

TPC-DS: Total hot run time: 189694 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 956d58ca807905b7da48cbf2c5267138dcc911e1, data reload: false

query1	1306	963	930	930
query2	6158	1839	1851	1839
query3	11140	4546	4607	4546
query4	55674	25390	22997	22997
query5	5079	522	482	482
query6	363	204	197	197
query7	4972	510	294	294
query8	319	256	239	239
query9	6038	2475	2510	2475
query10	430	319	248	248
query11	15088	15115	14824	14824
query12	159	107	110	107
query13	1094	518	377	377
query14	10712	6858	6422	6422
query15	190	202	180	180
query16	7088	661	494	494
query17	1089	736	581	581
query18	1543	422	309	309
query19	197	204	170	170
query20	133	128	117	117
query21	251	120	100	100
query22	4664	4826	4348	4348
query23	33927	33126	33452	33126
query24	5606	2447	2475	2447
query25	471	477	417	417
query26	672	282	154	154
query27	1700	496	327	327
query28	2901	2403	2392	2392
query29	595	577	424	424
query30	219	194	164	164
query31	885	884	801	801
query32	76	69	63	63
query33	440	366	314	314
query34	778	842	503	503
query35	803	835	757	757
query36	966	989	907	907
query37	121	103	78	78
query38	4298	4370	4255	4255
query39	1521	1464	1442	1442
query40	217	122	113	113
query41	51	48	50	48
query42	119	104	111	104
query43	492	519	486	486
query44	1286	827	801	801
query45	179	179	165	165
query46	865	1047	646	646
query47	1843	1881	1837	1837
query48	394	410	320	320
query49	697	505	421	421
query50	733	733	424	424
query51	4264	4277	4203	4203
query52	110	110	100	100
query53	249	260	191	191
query54	509	504	419	419
query55	89	94	78	78
query56	273	276	260	260
query57	1165	1195	1108	1108
query58	256	249	265	249
query59	2778	2817	2782	2782
query60	287	274	290	274
query61	121	123	114	114
query62	716	752	663	663
query63	236	196	192	192
query64	1494	1055	718	718
query65	3281	3226	3141	3141
query66	721	389	307	307
query67	15843	15636	15368	15368
query68	5431	783	499	499
query69	497	304	264	264
query70	1215	1117	1089	1089
query71	431	289	278	278
query72	6325	3730	3758	3730
query73	1298	772	350	350
query74	8893	9170	8920	8920
query75	3246	3137	2686	2686
query76	3907	1156	714	714
query77	543	362	287	287
query78	10030	10086	9383	9383
query79	2039	812	586	586
query80	770	532	461	461
query81	526	283	239	239
query82	415	150	211	150
query83	172	169	150	150
query84	293	89	81	81
query85	753	354	300	300
query86	410	306	288	288
query87	4482	4657	4374	4374
query88	3226	2191	2171	2171
query89	395	304	281	281
query90	1833	184	187	184
query91	132	135	109	109
query92	77	61	58	58
query93	2728	1002	571	571
query94	676	412	304	304
query95	361	264	261	261
query96	484	555	271	271
query97	2730	2846	2827	2827
query98	252	206	203	203
query99	1288	1376	1246	1246
Total cold run time: 294792 ms
Total hot run time: 189694 ms

@doris-robot

ClickBench: Total hot run time: 30.59 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 956d58ca807905b7da48cbf2c5267138dcc911e1, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.03	0.03
query3	0.24	0.06	0.06
query4	1.61	0.11	0.10
query5	0.43	0.42	0.41
query6	1.15	0.66	0.67
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.53	0.52
query10	0.58	0.59	0.56
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.62	0.59	0.60
query14	2.69	2.70	2.73
query15	0.93	0.85	0.85
query16	0.39	0.37	0.36
query17	1.01	1.08	1.06
query18	0.22	0.19	0.20
query19	1.90	1.84	1.97
query20	0.01	0.02	0.01
query21	15.35	0.90	0.56
query22	0.75	1.21	0.68
query23	14.89	1.38	0.64
query24	6.78	1.45	0.72
query25	0.51	0.33	0.09
query26	0.63	0.16	0.14
query27	0.06	0.05	0.04
query28	9.60	0.88	0.42
query29	12.57	4.00	3.30
query30	0.27	0.10	0.06
query31	2.82	0.60	0.38
query32	3.21	0.56	0.46
query33	3.01	3.05	3.02
query34	15.50	5.16	4.48
query35	4.57	4.57	4.55
query36	0.66	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.14	0.14
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.54 s
Total hot run time: 30.59 s

@w41ter w41ter requested a review from dataroaring February 7, 2025 02:03

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 8, 2025

github-actions bot commented Feb 8, 2025

PR approved by at least one committer and no changes requested.


github-actions bot commented Feb 8, 2025

PR approved by anyone and no changes requested.


@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 6974c0a into apache:master Feb 8, 2025
30 of 31 checks passed
@w41ter w41ter deleted the feat_lock_binlog branch February 8, 2025 06:13