[usm] add regular and raw tracepoints /sched_process_exit #33943

Open
wants to merge 28 commits into
base: main
Changes from 3 commits
Commits
3832d0c
[usm] add regular and raw tracepoints /sched_process_exit
yuri-lipnesh Feb 11, 2025
d9e1a32
[usm] #if PREBUILT or CORE or (RUNTIME and kernel>4, 17, 0): raw_trac…
yuri-lipnesh Feb 12, 2025
e2c3963
[usm] ebpfProgram.init() exclude raw_tracepoint if kernel<4.17.0
yuri-lipnesh Feb 12, 2025
a58dd1b
[usm] ebpfProgram.init(), exclude 'tracepoint__sched__sched_process_e…
yuri-lipnesh Feb 13, 2025
9a74005
[usm] remove #if conditon for SEC(raw_tracepoint/sched_process_exit)
yuri-lipnesh Feb 13, 2025
cf2ceac
[usm] sharedlibraries.Watcher, periodically clean dead pids from maps…
yuri-lipnesh Feb 14, 2025
4b84b67
[usm] sharedlibraries.Watcher, fix linter error, check err=emap.Delete()
yuri-lipnesh Feb 14, 2025
d7728e5
[usm] unit tests, cleaner of the map 'ssl_read_ex_args'
yuri-lipnesh Feb 14, 2025
9fba5d2
[usm] fix linter errors in TestSSLReadArgsMaps()
yuri-lipnesh Feb 14, 2025
1e141f3
[usm] TestSSLReadArgsMaps, test used map 'ssl_read_args' or 'ssl_read…
yuri-lipnesh Feb 18, 2025
cb81ffa
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 18, 2025
e0aef65
[usm] adjust type SslReadExArgs
yuri-lipnesh Feb 18, 2025
c576481
[usm] simplify in SSL maps cleaner test in tlsSuite.
yuri-lipnesh Feb 18, 2025
f1a6d7e
[usm] remove auto-generated TestCgoAlignment_SslReadExArgs
yuri-lipnesh Feb 18, 2025
d63cc2a
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 18, 2025
186ea0d
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 19, 2025
e4a5338
[usm] add auto-generated TestCgoAlignment_SslReadExArgs
yuri-lipnesh Feb 19, 2025
1fffb43
[usm] clean maps on thread exit: ssl_write_args, ssl_write_ex_args, s…
yuri-lipnesh Feb 19, 2025
946fde8
[usm] moved exit tracepoint setup and ssl maps cleanup to 'sslProgram'
yuri-lipnesh Feb 21, 2025
e8e0c7e
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 21, 2025
d541a5b
[usm] add probe 'sched_process_exit' to openSSLProbes.
yuri-lipnesh Feb 21, 2025
ea736fa
[usm] move TestSSLMapsCleaner() to ebpf_ssl_test.go
yuri-lipnesh Feb 24, 2025
37b688c
[usm] fix linter error in ebpf_ssl.go
yuri-lipnesh Feb 24, 2025
a16cbd6
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 24, 2025
51cf027
[usm] enhanced UT TestSSLMapsCleaner()
yuri-lipnesh Feb 25, 2025
f666e46
[usm] call m.Put(unsafe.Pointer(&key), unsafe.Pointer(&value)) in Tes…
yuri-lipnesh Feb 25, 2025
df55c29
[usm] drop = nil from declaration var cleanerCB func().
yuri-lipnesh Feb 25, 2025
32f36b1
Merge remote-tracking branch 'origin/main' into yuri.l/USMON-1411_ssl…
yuri-lipnesh Feb 26, 2025
12 changes: 0 additions & 12 deletions pkg/collector/corechecks/ebpf/probe/ebpfcheck/probe.go
@@ -965,15 +965,3 @@ func hashMapNumberOfEntriesWithHelper(mp *ebpf.Map, mapid ebpf.MapID, mphCache *

return int64(res), nil
}

// HashMapNumberOfEntries returns the number of entries in the map
func HashMapNumberOfEntries(mp *ebpf.Map) (int64, error) {
if isPerCPU(mp.Type()) {
return -1, fmt.Errorf("unsupported map type: %s", mp.String())
}
buffers := entryCountBuffers{
keysBufferSizeLimit: 0, // No limit
valuesBufferSizeLimit: 0, // No limit
}
return hashMapNumberOfEntriesWithIteration(mp, &buffers, 1)
}
19 changes: 7 additions & 12 deletions pkg/network/ebpf/c/protocols/flush.h
@@ -28,9 +28,7 @@ int tracepoint__net__netif_receive_skb(void *ctx) {
return 0;
}

SEC("tracepoint/sched/sched_process_exit")
int tracepoint__sched__sched_process_exit(void *ctx) {
CHECK_BPF_PROGRAM_BYPASSED()
static __always_inline void delete_pid_in_maps() {
u64 pid_tgid = bpf_get_current_pid_tgid();

bpf_map_delete_elem(&ssl_read_args, &pid_tgid);
@@ -39,22 +37,19 @@ int tracepoint__sched__sched_process_exit(void *ctx) {
bpf_map_delete_elem(&ssl_write_ex_args, &pid_tgid);
bpf_map_delete_elem(&ssl_ctx_by_pid_tgid, &pid_tgid);
bpf_map_delete_elem(&bio_new_socket_args, &pid_tgid);
}

SEC("tracepoint/sched/sched_process_exit")
int tracepoint__sched__sched_process_exit(void *ctx) {
CHECK_BPF_PROGRAM_BYPASSED()
delete_pid_in_maps();
return 0;
}

SEC("raw_tracepoint/sched_process_exit")
int raw_tracepoint__sched_process_exit(void *ctx) {
CHECK_BPF_PROGRAM_BYPASSED()
u64 pid_tgid = bpf_get_current_pid_tgid();

bpf_map_delete_elem(&ssl_read_args, &pid_tgid);
bpf_map_delete_elem(&ssl_read_ex_args, &pid_tgid);
bpf_map_delete_elem(&ssl_write_args, &pid_tgid);
bpf_map_delete_elem(&ssl_write_ex_args, &pid_tgid);
bpf_map_delete_elem(&ssl_ctx_by_pid_tgid, &pid_tgid);
bpf_map_delete_elem(&bio_new_socket_args, &pid_tgid);

delete_pid_in_maps();
return 0;
}

37 changes: 0 additions & 37 deletions pkg/network/usm/ebpf_main.go
@@ -34,7 +34,6 @@ import (
"github.com/DataDog/datadog-agent/pkg/network/tracer/offsetguess"
"github.com/DataDog/datadog-agent/pkg/network/usm/buildmode"
"github.com/DataDog/datadog-agent/pkg/network/usm/utils"
"github.com/DataDog/datadog-agent/pkg/util/kernel"
"github.com/DataDog/datadog-agent/pkg/util/log"
manager "github.com/DataDog/ebpf-manager"
)
@@ -150,29 +149,6 @@ func newEBPFProgram(c *config.Config, connectionProtocolMap *ebpf.Map) (*ebpfPro
}
}

if rawTracepointsSupported() {
// use a raw tracepoint on a supported kernel to intercept terminated threads and clear the corresponding maps.
mgr.Probes = append(mgr.Probes, []*manager.Probe{
{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: "raw_tracepoint__sched_process_exit",
UID: probeUID,
},
TracepointName: "sched_process_exit",
},
}...)
} else {
// use a regular tracepoint to intercept terminated threads.
mgr.Probes = append(mgr.Probes, []*manager.Probe{
{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: "tracepoint__sched__sched_process_exit",
UID: probeUID,
},
},
}...)
}

program := &ebpfProgram{
Manager: ddebpf.NewManager(mgr, "usm", &ebpftelemetry.ErrorsTelemetryModifier{}),
cfg: c,
@@ -486,14 +462,6 @@ func (e *ebpfProgram) init(buf bytecode.AssetReader, options manager.Options) er
}
}

if rawTracepointsSupported() {
// exclude regular tracepoint if kernel supports raw tracepoint
options.ExcludedFunctions = append(options.ExcludedFunctions, "tracepoint__sched__sched_process_exit")
} else {
//exclude a raw tracepoint if kernel does not support it.
options.ExcludedFunctions = append(options.ExcludedFunctions, "raw_tracepoint__sched_process_exit")
}

err := e.InitWithOptions(buf, &options)
if err != nil {
cleanup()
@@ -506,11 +474,6 @@ func (e *ebpfProgram) init(buf bytecode.AssetReader, options manager.Options) er
return err
}

func rawTracepointsSupported() bool {
kversion, err := kernel.HostVersion()
return err == nil && kversion >= kernel.VersionCode(4, 17, 0)
}
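
The version gate in `rawTracepointsSupported` compares a packed kernel version code against 4.17.0. A minimal sketch of that comparison follows, assuming the packing matches the Linux `KERNEL_VERSION(a, b, c)` macro; `versionCode` and `supportsRawTracepoints` are hypothetical stand-ins, not the agent's `kernel.VersionCode` or `kernel.HostVersion`.

```go
package main

import "fmt"

// versionCode packs a kernel version into one integer, following the layout
// of the Linux KERNEL_VERSION macro: (major << 16) + (minor << 8) + patch.
// Hypothetical helper; assumed (not confirmed) to mirror kernel.VersionCode.
func versionCode(major, minor, patch uint32) uint32 {
	return (major << 16) + (minor << 8) + patch
}

// supportsRawTracepoints reports whether a packed host version is at least
// 4.17.0, the threshold used by the version check in this diff.
func supportsRawTracepoints(host uint32) bool {
	return host >= versionCode(4, 17, 0)
}

func main() {
	for _, v := range [][3]uint32{{4, 14, 0}, {4, 17, 0}, {5, 6, 0}} {
		code := versionCode(v[0], v[1], v[2])
		fmt.Printf("%d.%d.%d raw tracepoints: %v\n", v[0], v[1], v[2], supportsRawTracepoints(code))
	}
}
```

Because the major number occupies the highest bits, a plain integer comparison orders versions correctly as long as each component fits in its field.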

func getAssetName(module string, debug bool) string {
if debug {
return fmt.Sprintf("%s-debug.o", module)
111 changes: 106 additions & 5 deletions pkg/network/usm/ebpf_ssl.go
@@ -9,6 +9,7 @@ package usm

import (
"bytes"
"errors"
"fmt"
"io"
"os"
@@ -20,8 +21,8 @@ import (
"time"
"unsafe"

manager "github.com/DataDog/ebpf-manager"
"github.com/cilium/ebpf"
"github.com/cilium/ebpf/features"
"github.com/davecgh/go-spew/spew"

ddebpf "github.com/DataDog/datadog-agent/pkg/ebpf"
@@ -38,6 +39,7 @@ import (
"github.com/DataDog/datadog-agent/pkg/util/log"
"github.com/DataDog/datadog-agent/pkg/util/safeelf"
ddsync "github.com/DataDog/datadog-agent/pkg/util/sync"
manager "github.com/DataDog/ebpf-manager"
Contributor: move back to line 23

Contributor Author: shouldn't it be part of Datadog section?

Contributor Author: moved back

)

const (
@@ -69,6 +71,9 @@ const (
gnutlsRecordSendRetprobe = "uretprobe__gnutls_record_send"
gnutlsByeProbe = "uprobe__gnutls_bye"
gnutlsDeinitProbe = "uprobe__gnutls_deinit"

rawTracepointSchedProcessExit = "raw_tracepoint__sched_process_exit"
oldTracepointSchedProcessExit = "tracepoint__sched__sched_process_exit"
)

var openSSLProbes = []manager.ProbesSelector{
@@ -416,6 +421,16 @@ var opensslSpec = &protocols.ProtocolSpec{
EBPFFuncName: gnutlsDeinitProbe,
},
},
{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: rawTracepointSchedProcessExit,
},
},
{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: oldTracepointSchedProcessExit,
},
},
},
}

@@ -424,6 +439,7 @@ type sslProgram struct {
watcher *sharedlibraries.Watcher
istioMonitor *istioMonitor
nodeJSMonitor *nodeJSMonitor
ebpfManager *manager.Manager
}

func newSSLProgramProtocolFactory(m *manager.Manager) protocols.ProtocolFactory {
@@ -477,6 +493,7 @@ func newSSLProgramProtocolFactory(m *manager.Manager) protocols.ProtocolFactory
watcher: watcher,
istioMonitor: istio,
nodeJSMonitor: nodejs,
ebpfManager: m,
}, nil
}
}
@@ -487,7 +504,7 @@ func (o *sslProgram) Name() string {
}

// ConfigureOptions changes map attributes to the given options.
func (o *sslProgram) ConfigureOptions(_ *manager.Manager, options *manager.Options) {
func (o *sslProgram) ConfigureOptions(m *manager.Manager, options *manager.Options) {
Contributor: you don't need to get m as you already have that in o.ebpfManager

Contributor Author: fixed


options.MapSpecEditors[sslSockByCtxMap] = manager.MapSpecEditor{
MaxEntries: o.cfg.MaxTrackedConnections,
EditorFlag: manager.EditMaxEntries,
@@ -496,12 +513,12 @@ func (o *sslProgram) ConfigureOptions(_ *manager.Manager, options *manager.Optio
MaxEntries: o.cfg.MaxTrackedConnections,
EditorFlag: manager.EditMaxEntries,
}
o.addProcessExitProbe(m, options)
Contributor: you don't need to get m as you already have that in o.ebpfManager

Contributor Author: fixed


}

// PreStart is called before the start of the provided eBPF manager.
func (o *sslProgram) PreStart(m *manager.Manager) error {
o.watcher.Start()
o.watcher.SetEbpfManager(m)
func (o *sslProgram) PreStart(*manager.Manager) error {
o.watcher.Start(o.cleanupDeadPids)
o.istioMonitor.Start()
o.nodeJSMonitor.Start()
return nil
@@ -806,3 +823,87 @@ func getUID(lib utils.PathIdentifier) string {
func (*sslProgram) IsBuildModeSupported(buildmode.Type) bool {
return true
}

// addProcessExitProbe adds a raw or regular tracepoint program depending on which is supported.
func (o *sslProgram) addProcessExitProbe(mgr *manager.Manager, options *manager.Options) {
if features.HaveProgramType(ebpf.RawTracepoint) == nil {
// use a raw tracepoint on a supported kernel to intercept terminated threads and clear the corresponding maps
p := &manager.Probe{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: rawTracepointSchedProcessExit,
UID: probeUID,
},
TracepointName: "sched_process_exit",
}
mgr.Probes = append(mgr.Probes, p)
options.ActivatedProbes = append(options.ActivatedProbes, &manager.ProbeSelector{ProbeIdentificationPair: p.ProbeIdentificationPair})
// exclude regular tracepoint
options.ExcludedFunctions = append(options.ExcludedFunctions, oldTracepointSchedProcessExit)
} else {
// use a regular tracepoint to intercept terminated threads
p := &manager.Probe{
ProbeIdentificationPair: manager.ProbeIdentificationPair{
EBPFFuncName: oldTracepointSchedProcessExit,
UID: probeUID,
},
}
mgr.Probes = append(mgr.Probes, p)
options.ActivatedProbes = append(options.ActivatedProbes, &manager.ProbeSelector{ProbeIdentificationPair: p.ProbeIdentificationPair})
// exclude a raw tracepoint
options.ExcludedFunctions = append(options.ExcludedFunctions, rawTracepointSchedProcessExit)
}
}
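
The selection logic in `addProcessExitProbe` boils down to attaching one of two programs and excluding the other. The sketch below isolates that decision from the eBPF manager so it can be read and tested on its own; `selectExitProbe` is a hypothetical helper, and the boolean stands in for the result of the `features.HaveProgramType(ebpf.RawTracepoint) == nil` check.

```go
package main

import "fmt"

const (
	rawTracepointSchedProcessExit = "raw_tracepoint__sched_process_exit"
	oldTracepointSchedProcessExit = "tracepoint__sched__sched_process_exit"
)

// selectExitProbe returns the eBPF function to attach and the one to exclude
// from loading. Hypothetical helper: rawSupported stands in for the outcome
// of the kernel feature probe.
func selectExitProbe(rawSupported bool) (active, excluded string) {
	if rawSupported {
		return rawTracepointSchedProcessExit, oldTracepointSchedProcessExit
	}
	return oldTracepointSchedProcessExit, rawTracepointSchedProcessExit
}

func main() {
	for _, supported := range []bool{true, false} {
		active, excluded := selectExitProbe(supported)
		fmt.Printf("raw supported=%v: attach %s, exclude %s\n", supported, active, excluded)
	}
}
```

Keeping the choice in one place means exactly one of the two programs is active for any kernel, which is the invariant both branches of the function above maintain.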

var sslPidKeyMaps = []string{
"ssl_read_args",
"ssl_read_ex_args",
"ssl_write_args",
"ssl_write_ex_args",
"ssl_ctx_by_pid_tgid",
"bio_new_socket_args",
}

// cleanupDeadPids clears maps of terminated processes.
func (o *sslProgram) cleanupDeadPids(alivePIDs map[uint32]struct{}) {
if o.ebpfManager != nil {
err := o.deleteDeadPidsInMaps(sslPidKeyMaps, alivePIDs)
if err != nil {
log.Debugf("SSL maps cleanup error: %v", err)
}
}
}

// deleteDeadPidsInMaps deletes dead processes in maps with the key 'pid_tgid'
func (o *sslProgram) deleteDeadPidsInMaps(mapNames []string, alivePIDs map[uint32]struct{}) error {
var errs []error
for _, n := range mapNames {
err := deleteDeadPidsInMap(o.ebpfManager, n, alivePIDs)
if err != nil {
errs = append(errs, err)
}
}
return errors.Join(errs...)
}
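
The maps listed in `sslPidKeyMaps` are keyed by the value of `bpf_get_current_pid_tgid()`, which holds the process ID (tgid) in the upper 32 bits and the thread ID in the lower 32 bits. A minimal sketch of how stale keys can be picked out given a set of alive PIDs follows; `pidFromKey`, `makeKey`, and `deadKeys` are hypothetical names used only for illustration, and a plain slice stands in for iterating a real eBPF map.

```go
package main

import "fmt"

// pidFromKey extracts the process ID (tgid) from the upper 32 bits of a
// pid_tgid key.
func pidFromKey(pidTgid uint64) uint32 {
	return uint32(pidTgid >> 32)
}

// makeKey builds a pid_tgid key the way bpf_get_current_pid_tgid() lays it
// out: tgid in the high half, thread ID in the low half.
func makeKey(pid, tid uint32) uint64 {
	return uint64(pid)<<32 | uint64(tid)
}

// deadKeys returns the keys whose owning process is absent from alivePIDs.
func deadKeys(keys []uint64, alivePIDs map[uint32]struct{}) []uint64 {
	var dead []uint64
	for _, k := range keys {
		if _, ok := alivePIDs[pidFromKey(k)]; !ok {
			dead = append(dead, k)
		}
	}
	return dead
}

func main() {
	alive := map[uint32]struct{}{100: {}}
	keys := []uint64{makeKey(100, 101), makeKey(200, 200), makeKey(300, 305)}
	for _, k := range deadKeys(keys, alive) {
		fmt.Printf("stale key: pid=%d tid=%d\n", pidFromKey(k), uint32(k))
	}
}
```

Note that the check is per process, not per thread: entries left behind by an exited thread of a still-running process are not treated as stale by this filter.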

// deleteDeadPidsInMap finds a map by name and deletes dead processes.
func deleteDeadPidsInMap(manager *manager.Manager, mapName string, alivePIDs map[uint32]struct{}) error {
emap, _, err := manager.GetMap(mapName)
if err != nil {
return fmt.Errorf("dead process cleaner failed to get map: %q error: %s", mapName, err)
}

var keysToDelete []uint64
var key uint64
value := make([]byte, emap.ValueSize())
iter := emap.Iterate()

for iter.Next(unsafe.Pointer(&key), unsafe.Pointer(&value)) {
pid := uint32(key >> 32)
if _, exists := alivePIDs[pid]; !exists {
keysToDelete = append(keysToDelete, key)
}
}
_, err = emap.BatchDelete(keysToDelete, nil)
Contributor: The BatchDelete API is supported from kernel 5.6. How did you ensure this code runs only when it can?

Contributor Author: fixed

Contributor (@guyarb, Feb 25, 2025): There's a back and forth here. The original comment asked why you don't use the batch-delete API when possible. This comment highlighted that you didn't handle kernels older than 5.6, and now you have gone back to deleting one key at a time. I'm missing the reasoning for why you didn't use the batch-delete API when it is possible.

Contributor Author: system-probe calls this function only if the kernel is older than 4.17, in which case batch deletion is not supported. A unit test may call it on any kernel, but unit tests are not so time critical. Am I missing something?


return err
}
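
One way to reconcile the two positions in the review thread is to try a batch delete first and fall back to per-key deletion when the kernel lacks the batch API. The sketch below illustrates that pattern against an in-memory fake; `keyDeleter`, `fakeMap`, `deleteKeys`, and `errNotSupported` are all hypothetical, and the sentinel only stands in for whatever error a real map library returns when batch operations are unavailable. It is not the approach the PR settled on.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotSupported is a hypothetical sentinel for "batch API unavailable".
var errNotSupported = errors.New("batch operations not supported")

// keyDeleter is a hypothetical abstraction over a map keyed by pid_tgid.
type keyDeleter interface {
	BatchDelete(keys []uint64) (int, error)
	Delete(key uint64) error
}

// deleteKeys prefers a single batch call and falls back to deleting one key
// at a time only when the batch API reports that it is unsupported.
func deleteKeys(m keyDeleter, keys []uint64) error {
	if len(keys) == 0 {
		return nil
	}
	_, err := m.BatchDelete(keys)
	if err == nil || !errors.Is(err, errNotSupported) {
		return err
	}
	var errs []error
	for _, k := range keys {
		if err := m.Delete(k); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// fakeMap is an in-memory stand-in used to exercise both code paths.
type fakeMap struct {
	entries        map[uint64]struct{}
	batchSupported bool
}

func (f *fakeMap) BatchDelete(keys []uint64) (int, error) {
	if !f.batchSupported {
		return 0, errNotSupported
	}
	for _, k := range keys {
		delete(f.entries, k)
	}
	return len(keys), nil
}

func (f *fakeMap) Delete(key uint64) error {
	if _, ok := f.entries[key]; !ok {
		return fmt.Errorf("key %d not found", key)
	}
	delete(f.entries, key)
	return nil
}

func main() {
	m := &fakeMap{entries: map[uint64]struct{}{1: {}, 2: {}, 3: {}}}
	err := deleteKeys(m, []uint64{1, 2})
	fmt.Println("remaining entries:", len(m.entries), "error:", err)
}
```

With this shape the caller does not need to know the kernel version; the cost is one failed batch call per cleanup on kernels that lack the API.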