fix: collect files in PV when another container restart or stop #2010
base: main
dcd68aa to 3e5407b
3e5407b to 6af14e0
6af14e0 to bfd097c
"file inode", reader->GetDevInode().inode)("file size", reader->GetFileSize()));
ForceReadLogAndPush(reader);
reader->CloseFilePtr();
// update container info one more time, ensure file is hold by same cotnainer
cotnainer -> container
@@ -206,6 +206,7 @@ void CheckPointManager::LoadFileCheckPoint(const Json::Value& root) {
string realFilePath;
int32_t fileOpenFlag = 0; // default, we close file ptr
int32_t containerStopped = 0;
string containerID;
Can this scenario be reproduced in an E2E test, and is there a corresponding test case?
@@ -206,6 +206,7 @@ void CheckPointManager::LoadFileCheckPoint(const Json::Value& root) {
string realFilePath;
int32_t fileOpenFlag = 0; // default, we close file ptr
int32_t containerStopped = 0;
string containerID;
int32_t lastForceRead = 0;
int32_t idxInReaderArray = LogFileReader::CHECKPOINT_IDX_OF_NEW_READER_IN_ARRAY;
For the UT and E2E tests in particular, please ask 迅飞 to review.
if (discoveryConfig.first == nullptr) {
return false;
}
ContainerInfo* containerInfo = discoveryConfig.first->GetContainerPathByLogPath(mHostLogPathDir);
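This may be a problem. If one config collects the PV from the host and another collects it from inside a container, the host-side config never updates container info. In that case, can the stopped container still be reset to empty?
So isn't the config information still needed, rather than relying on the container ID alone?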
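Problem
In LoongCollector's previous design, each directory corresponds to exactly one container. This breaks down when collecting from a volume (PV, hostPath, and so on):
When multiple containers collect from the same volume and one of them stops, or the same container restarts, every reader collecting that directory is marked as container stopped. In reality those readers are collecting files that belong to other containers, so the "reading from a stopped container" alarm keeps firing afterwards and logs are truncated.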
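The failure mode above can be illustrated with a minimal sketch. It is not LoongCollector's actual code: `Reader`, `MarkStoppedByDir`, and `MarkStoppedByContainer` are hypothetical names, and the only assumption taken from this PR is that a reader can carry a container ID (the diff adds a `containerID` field to the checkpoint). The first function flags every reader under a directory, the second flags only the readers owned by the stopped container.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified reader state (not LoongCollector's real class).
struct Reader {
    std::string logDir;       // directory being collected, e.g. a shared volume
    std::string containerID;  // container that owns the file this reader follows
    bool containerStopped;    // set when the owning container is considered stopped
};

// Directory-scoped marking: every reader under `dir` is flagged, even those
// whose files belong to containers that are still running.
inline int MarkStoppedByDir(std::vector<Reader>& readers, const std::string& dir) {
    int marked = 0;
    for (auto& r : readers) {
        if (r.logDir == dir) {
            r.containerStopped = true;
            ++marked;
        }
    }
    return marked;
}

// Container-scoped marking: only readers under `dir` that are owned by
// `stoppedID` are flagged; readers of other containers are left alone.
inline int MarkStoppedByContainer(std::vector<Reader>& readers,
                                  const std::string& dir,
                                  const std::string& stoppedID) {
    int marked = 0;
    for (auto& r : readers) {
        if (r.logDir == dir && r.containerID == stoppedID) {
            r.containerStopped = true;
            ++marked;
        }
    }
    return marked;
}
```

With two readers on the same directory owned by containers `c1` and `c2`, stopping `c1` flags both readers in the directory-scoped version but only the `c1` reader in the container-scoped one.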
Fix
Test cases
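Three events are involved, and their relative order matters: the new container starting, the old container stopping, and the file being written (modify).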
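As a rough aid for thinking about coverage, the possible orderings of these three events can be enumerated. The sketch below is an assumption-laden illustration only: the `Event` enum and `AllOrderings` are hypothetical names, and the PR does not state which orderings its tests actually cover.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical event labels for the three events named above.
enum class Event { NewContainerStart, OldContainerStop, FileModify };

// Human-readable name for an event, useful when printing a test matrix.
inline const char* Name(Event e) {
    switch (e) {
        case Event::NewContainerStart: return "new container start";
        case Event::OldContainerStop:  return "old container stop";
        case Event::FileModify:        return "file modify";
    }
    return "unknown";
}

// Enumerate every ordering of the three events. The initial vector is in
// ascending enum order, so std::next_permutation visits all 3! = 6 orderings.
inline std::vector<std::vector<Event>> AllOrderings() {
    std::vector<Event> events = {Event::NewContainerStart,
                                 Event::OldContainerStop,
                                 Event::FileModify};
    std::vector<std::vector<Event>> out;
    do {
        out.push_back(events);
    } while (std::next_permutation(events.begin(), events.end()));
    return out;
}
```

Each ordering is a candidate scenario; whether all six are reachable or meaningful in practice depends on how container events and file events are delivered, which is not described here.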