Pausing the collector mid-collection leaves an incorrect file offset in the checkpoint #1657
Unanswered
samtangweicheng
asked this question in
Help
Replies: 1 comment
-
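In the normal flow, when the collector is stopped in step 3, logtail automatically dumps the checkpoint to local disk, and after the restart it reads that checkpoint and resumes collection on its own, so there is no need to run touch again to update the target file's modification time. Complete logs from before and after the restart would probably make this easier to analyze.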
-
We built a custom flusher on top of flusher_opentelemetry to report data.
![image](https://private-user-images.githubusercontent.com/8476721/354553372-810abd97-1971-4eb3-8244-f600715f90e7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg5MTQ0NzEsIm5iZiI6MTczODkxNDE3MSwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MzM3Mi04MTBhYmQ5Ny0xOTcxLTRlYjMtODI0NC1mNjAwNzE1ZjkwZTcucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMDdUMDc0MjUxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MmJiNzViNmM4ZmE1NTBkZjE1YmE5MjYxYjg0NDQ0ZDhkMzUyNGZlOTk5YWQwODUxMTBhMTAwMDhiOTViNjM3MyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.hiMwW4EtVCNsTZ695wqIVx-0OsByhm8IDoveAA4xXnA)
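The flush method of the new flusher_xxx_otlp reads the "file_offset" tag of each logRecord and prints it.
During testing, we found that some logs are missed after the collector is restarted:
1. Collection config: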
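To make the bookkeeping concrete, here is a minimal sketch of tracking the largest "file_offset" seen in a batch. It is not the actual flusher code (which is not shown in this thread); the record shape (a plain mapping of tag name to string value) and the helper name `max_file_offset` are assumptions for illustration only.

```python
from typing import Iterable, Mapping, Optional


def max_file_offset(records: Iterable[Mapping[str, object]],
                    tag: str = "file_offset") -> Optional[int]:
    """Return the largest integer value of `tag` across the records.

    Records without the tag, or with a non-numeric value, are skipped.
    Returns None when no record carries a usable offset.
    """
    best: Optional[int] = None
    for record in records:
        raw = record.get(tag)
        if raw is None:
            continue
        try:
            value = int(raw)
        except (TypeError, ValueError):
            continue  # ignore malformed offsets rather than failing the flush
        if best is None or value > best:
            best = value
    return best


if __name__ == "__main__":
    batch = [{"file_offset": "5229000"}, {"file_offset": "5229336"}, {"other": "x"}]
    print(max_file_offset(batch))
```

Logging this value once per flushed batch gives a number that can be compared directly against the offset stored in the checkpoint.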
{ "enable" : true, "global" : { "EnableTimestampNanosecond" : true }, "inputs" : [ { "Type" : "input_file", "FilePaths" : [ "/root/zhl/log_file1.log4" ], "MaxDirSearchDepth" : 5, "ExcludeFilePaths" : [ ], "TailSizeKB" : 10485760, "AppendingLogPositionMeta" : true, "AllowingIncludedByMultiConfigs" : true } ], "flushers" : [ { "Type" : "flusher_xxx_otlp", "Logs" : { "Endpoint" : "xxxxxx.xxxxxx.cn:12345", "Timeout" : 10000, "WaitForReady" : true, "Compression" : "gzip" } } ] }
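2. After starting the collector, run touch to update the target file's modification time so that the collector starts collecting. /root/xxx/log_file1.log2 is a file of roughly 20 MB.
3. After 1-2 seconds, send SIGTERM to the collector to stop it.
4. Start the collector again, then run touch once more to update the target file's modification time so that the collector continues collecting.
After step 3, the ilogtail.LOG log file looks as follows; it shows that collection started from offset 0: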
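For anyone trying to reproduce this, steps 2 and 3 can be scripted roughly as below. This is only a sketch: the collector command line is passed in by the caller (the real launch command is not given in this thread), and the helper names and the startup delay are assumptions.

```python
import os
import signal
import subprocess
import time
from typing import Sequence


def touch(path: str) -> None:
    """Set the file's access and modification time to now (file must exist)."""
    os.utime(path, None)


def run_then_sigterm(cmd: Sequence[str], log_path: str,
                     run_seconds: float = 1.5,
                     startup_seconds: float = 0.0) -> int:
    """Start `cmd`, touch `log_path`, wait `run_seconds`, then send SIGTERM.

    Returns the process exit code (negative signal number on POSIX if the
    process was terminated by the signal without handling it).
    """
    proc = subprocess.Popen(list(cmd))
    try:
        time.sleep(startup_seconds)   # give the collector time to load its config
        touch(log_path)               # step 2: trigger collection
        time.sleep(run_seconds)       # let it collect for a short while
        proc.send_signal(signal.SIGTERM)  # step 3: request a graceful stop
        return proc.wait(timeout=30)
    finally:
        if proc.poll() is None:       # still running after the timeout
            proc.kill()
            proc.wait()
```

Running the same helper a second time corresponds to step 4. Saving a copy of the checkpoint file between the two runs makes it possible to compare it with the offsets in the logs.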
![image](https://private-user-images.githubusercontent.com/8476721/354550203-195d3c3a-fdc2-493e-b9a9-8edae5b9e843.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg5MTQ0NzEsIm5iZiI6MTczODkxNDE3MSwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MDIwMy0xOTVkM2MzYS1mZGMyLTQ5M2UtYjlhOS04ZWRhZTViOWU4NDMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMDdUMDc0MjUxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NThlOWQ4NGZkZDg2MTBkNGM3NTVkMzhlZmJkOTBhZjA5MTQ2NWM4NzY4M2Y0MDI0ZDAxYWQ3YjNmNDJkNTMzNSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.aQPAJN1Jw7baHav2-nY7GrlzZg_OCv5DV4voYsAM3wQ)
logtail_plugin.LOG also shows that the largest offset in the last data packet sent was 5229336:
![image](https://private-user-images.githubusercontent.com/8476721/354551273-f896931b-354e-4e34-83a9-7696cd1e583a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg5MTQ0NzEsIm5iZiI6MTczODkxNDE3MSwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MTI3My1mODk2OTMxYi0zNTRlLTRlMzQtODNhOS03Njk2Y2QxZTU4M2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMDdUMDc0MjUxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OGJlMzdkMGY0ZjJlZjc1YTRmYTFmNTVmMjQ5MGVlNWI0ZTQ3NjVhZjNhN2FmYWY1YWRiN2YzMjU0ODI4MzA4YiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.hMAe4N9x6etohZFzpmOG_NkhT4djt7ZLo6Plqk56d8E)
However, the checkpoint file at this point records an offset of 8388480:
![企业微信截图_17225013855869](https://private-user-images.githubusercontent.com/8476721/354551538-72ebd417-bdc7-4a0b-8614-fa89e90481a5.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg5MTQ0NzEsIm5iZiI6MTczODkxNDE3MSwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MTUzOC03MmViZDQxNy1iZGM3LTRhMGItODYxNC1mYTg5ZTkwNDgxYTUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMDdUMDc0MjUxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZmQ1OTVkMzlkNWVmMDRlYjcwMzE1YjViMzMzMGIyODUyNDY3ZDFjZTg1NWZmZTk2ZmUwMjg3ZWZhNzAxOTA3ZiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.kzH8N1683o7NYCJfDA_JW8fCmD44Rwv6PqRiVqGD70Y)
After the restart, ilogtail.LOG shows collection resuming from 8388480:
![企业微信截图_17225014931361](https://private-user-images.githubusercontent.com/8476721/354552248-adf5e9f8-d034-4a01-ace5-2a0bb50605ca.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg5MTQ0NzEsIm5iZiI6MTczODkxNDE3MSwicGF0aCI6Ii84NDc2NzIxLzM1NDU1MjI0OC1hZGY1ZTlmOC1kMDM0LTRhMDEtYWNlNS0yYTBiYjUwNjA1Y2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMDdUMDc0MjUxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZWMzNzQ1Yjk4OWMzZjE4OGE4YWU1OTkyZGZlZmIyY2UxYzhkOTA4Y2Y5ZmZlMGM0NWY2Njc4OWVkMWIxMTQzYiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.16fwTD-lQ-2J1vTqLBDxOLonMTpSJgcXO6igGNuR5YY)
The logs between offsets 5229336 and 8388480 were never collected.
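To put a number on the missing range, the two offsets above can be subtracted directly. The sketch below is only that arithmetic; the function name is made up for illustration. The thread does not say whether "file_offset" marks the start or the end of a record, so the result is off by at most the length of one log record.

```python
def uncollected_gap(last_flushed_offset: int, checkpoint_offset: int) -> int:
    """Bytes between the last offset seen by the flusher and the checkpoint.

    Returns 0 when the checkpoint is not ahead of the flusher, i.e. when a
    restart would re-read data rather than skip it.
    """
    return max(0, checkpoint_offset - last_flushed_offset)


if __name__ == "__main__":
    gap = uncollected_gap(5229336, 8388480)
    print(f"{gap} bytes (~{gap / 2**20:.2f} MiB) skipped")
    # prints: 3159144 bytes (~3.01 MiB) skipped
```

So about 3 MiB of the roughly 20 MB file had been read and recorded in the checkpoint, but had not reached the flusher when the process exited.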