
feat: support host monitor #1890

Open
wants to merge 17 commits into
base: main

Conversation

Abingcbc
Collaborator

  1. Connect the end-to-end collection pipeline for process metadata.

TODO:

  1. Support more fields
  2. Add observability metrics

@Abingcbc force-pushed the host_monitor branch 6 times, most recently from 6150e52 to bfdd9c2 (November 19, 2024 06:27)

int readCount = 0;
WalkAllProcess(PROCESS_DIR, [&](const std::string& dirName) {
if (++readCount > mProcessSilentCount) {
Collaborator

Does the count limit also apply to entries that are served from the cache? Shouldn't it only limit direct interactions with the operating system?

Collaborator Author

The cache exists to compute incremental CPU usage. The top-N processes may differ between the previous collection and the current one, so every process has to be kept in the cache.
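
To illustrate why the cache has to cover every process, here is a minimal sketch of delta-based CPU usage. The names `CpuSample` and `CpuUsageCache`, and the idea of feeding in cumulative ticks plus the elapsed time, are assumptions made for illustration and are not taken from this PR.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical snapshot of one process's cumulative CPU time, in clock ticks
// (for example utime + stime from /proc/<pid>/stat on Linux).
struct CpuSample {
    uint64_t totalTicks = 0;
};

// Hypothetical cache keyed by pid. It keeps the previous sample of ALL
// processes, because a process outside the top N in the last round may
// enter the top N in this round and still needs a baseline to diff against.
class CpuUsageCache {
public:
    // Returns the CPU usage (in cores) over the elapsed interval, or -1.0
    // when no usable previous sample exists for this pid.
    double Update(int pid, uint64_t totalTicks, double elapsedSeconds, long ticksPerSecond) {
        double usage = -1.0;
        auto it = mPrev.find(pid);
        if (it != mPrev.end() && elapsedSeconds > 0 && ticksPerSecond > 0
            && totalTicks >= it->second.totalTicks) {
            uint64_t delta = totalTicks - it->second.totalTicks;
            usage = static_cast<double>(delta) / (elapsedSeconds * ticksPerSecond);
        }
        mPrev[pid].totalTicks = totalTicks; // always refresh the baseline
        return usage;
    }

private:
    std::unordered_map<int, CpuSample> mPrev;
};
```

With such a cache, the first call for a pid yields no usage value; only the second and later calls can report a rate, which is why dropping non-top-N entries would lose data for processes that become busy later.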

ThreadPool mThreadPool;

mutable std::shared_mutex mRegisteredCollectorMapMutex;
std::unordered_map<std::string, bool> mRegisteredCollectorMap;
Collaborator

std::unordered_map<std::string, bool> mRegisteredCollectorMap;
std::unordered_map<std::string, std::shared_ptr<BaseCollector>> mCollectorInstanceMap;

These two variables are a bit awkward, for several reasons:
1. The way HostMonitorInputRunner::GetCollector is used. Validity checking should be self-contained.
2. The locking relationship between the two maps.
3. RegisterCollector operates on mCollectorInstanceMap, yet the variable it does not touch is the one named "registered".

Possible improvements:
1. Keep the current logic, but avoid spreading one feature across several variables: merge the two into a single map whose value is a struct, and tidy up the overall implementation.
2. Register the ProcessEntityCollector instance only in UpdateCollector, which saves one variable.
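
The first suggestion could look roughly like the sketch below: one map, one lock, and a struct value that holds both the instance and its enabled flag. `CollectorEntry`, `CollectorRegistry`, and the stub `BaseCollector` are hypothetical stand-ins, not the types in this PR.

```cpp
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Stub standing in for the real BaseCollector; only what the sketch needs.
class BaseCollector {
public:
    virtual ~BaseCollector() = default;
};

// Hypothetical value type: the instance and its enabled flag live together.
struct CollectorEntry {
    std::shared_ptr<BaseCollector> instance;
    bool enabled = false;
};

// Hypothetical registry guarded by a single shared_mutex (requires C++17).
class CollectorRegistry {
public:
    void Register(const std::string& name, std::shared_ptr<BaseCollector> collector) {
        std::unique_lock<std::shared_mutex> lock(mMutex);
        mCollectors[name].instance = std::move(collector);
    }

    // Returns false if the name was never registered.
    bool SetEnabled(const std::string& name, bool enabled) {
        std::unique_lock<std::shared_mutex> lock(mMutex);
        auto it = mCollectors.find(name);
        if (it == mCollectors.end()) {
            return false;
        }
        it->second.enabled = enabled;
        return true;
    }

    // Returns the instance only if it is both registered and enabled,
    // so callers do not need a separate validity check.
    std::shared_ptr<BaseCollector> Get(const std::string& name) const {
        std::shared_lock<std::shared_mutex> lock(mMutex);
        auto it = mCollectors.find(name);
        if (it == mCollectors.end() || !it->second.enabled) {
            return nullptr;
        }
        return it->second.instance;
    }

private:
    mutable std::shared_mutex mMutex;
    std::unordered_map<std::string, CollectorEntry> mCollectors;
};
```

Because both pieces of state sit behind one mutex, a reader can never observe an enabled name without an instance, which addresses the locking concern in point 2 above.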



const std::string ProcessEntityCollector::sName = "process_entity";

ProcessEntityCollector::ProcessEntityCollector() : mProcessSilentCount(INT32_FLAG(process_collect_silent_count)) {
Collaborator

[image]

Tasks are executed asynchronously.
What is the impact of a timeout?
Different task types (for example Prometheus, host entity, process) must not affect one another.

Collaborator Author

@Abingcbc Jan 6, 2025

  1. Submitting to the thread pool does not block. Once the runner picks up a task, it can hand it to the thread pool immediately.
  2. The next task is enqueued only after the current one has finished. If a task overruns, the next execution time is computed from the previous execution time by adding n intervals until it reaches the next time point that lies in the future.
    nextExecTime = execTime + n * interval
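
The rescheduling rule in point 2 can be sketched as follows. This is only an illustration of the formula: the function name `NextExecTime`, the use of `std::chrono::steady_clock`, and the handling of a non-positive interval are assumptions, not the code in this PR.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Returns the smallest execTime + n * interval (n >= 1) that is strictly
// after `now`, so periods missed by an overrunning task are skipped.
inline Clock::time_point NextExecTime(Clock::time_point execTime,
                                      Clock::duration interval,
                                      Clock::time_point now) {
    if (interval <= Clock::duration::zero()) {
        return now; // assumption: a degenerate interval means "run right away"
    }
    if (now < execTime + interval) {
        return execTime + interval; // finished on time: n == 1
    }
    // Overran by at least one full period: jump to the next future slot.
    auto n = (now - execTime) / interval + 1;
    return execTime + n * interval;
}
```

For example, with a 5 s interval and a task that started at t0 and finished at t0 + 12 s, n is 3 and the next run lands at t0 + 15 s, keeping executions aligned to the original schedule instead of piling up.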

// process entity
const std::string DEFAULT_CONTENT_VALUE_ENTITY_TYPE_ECS_PROCESS = "acs.ecs.process";
const std::string DEFAULT_CONTENT_VALUE_ENTITY_TYPE_HOST_PROCESS = "infra.host.process";
const std::string DEFAULT_CONTENT_KEY_PROCESS_PID = "pid";
Collaborator

How are these fields kept consistent with the ones used on the security side?


namespace logtail {

HostMonitorInputRunner::HostMonitorInputRunner() : mThreadPool(ThreadPool(3)) {
Collaborator

What is the rationale for 3?


@@ -61,6 +61,8 @@ enum class EventGroupMetaKey {
PROMETHEUS_STREAM_ID,
PROMETHEUS_STREAM_TOTAL,

HOST_MONITOR_COLLECT_INTERVAL,
Collaborator

Every place that uses this key is within the input's control, so the value can be passed through HostMonitorInputRunner::UpdateCollector instead.

Collaborator

{"process_entity"},

RegisterCollector<ProcessEntityCollector>();
}

void HostMonitorInputRunner::UpdateCollector(const std::vector<std::string>& newCollectors,
Collaborator

Keep variable naming consistent with its meaning: this should be collectornames.

std::string& entityType,
std::string& hostEntityID,
std::string& hostEntityType) {
ECSMeta metaObj = HostIdentifier::Instance()->GetECSMeta();
Collaborator

Required before merging: the design and implementation around the use of GetECSMeta still need to be refined.


4 participants