George Starcher edited this page Mar 17, 2018 · 1 revision

Reference Link:

Summary:

If you are already working with Python outside of Splunk, the HTTP Event Collector (HEC) is the easiest way to get your data into Splunk.

These days most APIs of other systems return a list of JSON dicts (a list of records) that you want to clean up and then index into Splunk as time series events.

Take the relevant timestamp, make sure it is in epoch time (seconds since 1970-01-01 UTC), and use it for the "time" field of the HEC payload.
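
As a rough illustration of the timestamp step, here is a minimal sketch that converts an ISO 8601 string to epoch seconds using only the standard library. The helper name `to_epoch`, the record layout, and the choice to treat timestamps without an offset as UTC are all assumptions made for this example; adjust them to whatever your source API actually returns.

```python
from datetime import datetime, timezone


def to_epoch(timestamp: str) -> float:
    """Convert an ISO 8601 timestamp string to epoch seconds."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()


# Hypothetical record as it might come back from some other system's API.
record = {"created_at": "2018-03-17T12:00:00+00:00", "user": "alice"}

# The epoch value goes into the "time" field of the HEC payload.
payload = {"time": to_epoch(record["created_at"])}
```

Being explicit about the timezone matters here: a naive local timestamp converted without an offset will land your events at the wrong time in Splunk.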

Put all event data into the "event" field of the HEC payload. This gives you the _raw event in Splunk, and the JSON fields are handled by search time extractions. If your event is very long, you should consider assigning KV_MODE = json to the sourcetype for the data you are sending in.
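
To make the payload shape concrete, here is a small sketch that turns a list of records into HEC payloads with the whole record under "event". The function name `build_payloads`, the `epoch` key on each record, and the sourcetype value are hypothetical; the sketch only builds and serializes the payloads and does not send anything.

```python
import json


def build_payloads(records, sourcetype):
    """Wrap each record in a HEC payload, keeping the full record as the event."""
    payloads = []
    for record in records:
        payloads.append({
            "time": record["epoch"],   # assumption: already converted to epoch
            "sourcetype": sourcetype,  # one sourcetype per data source
            "event": record,           # becomes _raw in Splunk
        })
    return payloads


# Hypothetical records from some other system's API.
records = [
    {"epoch": 1521288000, "user": "alice", "action": "login"},
    {"epoch": 1521288060, "user": "bob", "action": "logout"},
]

payloads = build_payloads(records, "myapp:audit")
# Each payload serializes to the JSON body you would hand to your HEC sender.
bodies = [json.dumps(p) for p in payloads]
```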

I strongly encourage setting "popNullFields = True" on the HEC class object you set up in your code. This helps treat null values properly. See the blog post Splunk Null Thinking.
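
To show the idea behind dropping null fields, here is a standalone sketch. It is not the class's own implementation: the helper name `drop_null_fields` is hypothetical, and treating only `None` and empty strings as null is an assumption for this example.

```python
def drop_null_fields(event):
    """Return a copy of the event without fields that carry no value."""
    # Assumption: "null" means None or an empty string. Zero and False are kept
    # because they are real values.
    return {
        key: value
        for key, value in event.items()
        if value is not None and value != ""
    }


raw_event = {"user": "alice", "src_ip": None, "note": "", "count": 0}
clean_event = drop_null_fields(raw_event)
```

A field that is missing entirely and a field that is present but empty behave differently in searches, which is why it helps to remove the empty ones before indexing.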

Index Time Extractions:

A note on using sourcetype=_json, from the topic Index Time Field Extractions: I recommend avoiding index time extractions as much as possible. At large scale they can consume a lot of additional storage. Also, at large volumes, having all your event data in the same sourcetype can make searches consume a lot of resources, so break data sources into their own sourcetypes. The sourcetype _json is a predefined sourcetype with index time extraction enabled, meant just for JSON formatted data. You can enable index time extraction on your own sourcetype names if you really must have it.

Using index time extractions also requires that you put the event fields into a payload field called "fields". You must also use the JSON HEC mode, not RAW, for events with index time extractions.
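
If you do need index time fields, the payload might look like the sketch below. The function name `build_indexed_payload`, the record, and the choice of which keys to index are hypothetical. The sketch keeps "fields" flat and stringifies the values, which is my understanding of what HEC expects there.

```python
def build_indexed_payload(record, epoch, indexed_keys, sourcetype):
    """Build a JSON-mode HEC payload with selected keys copied into "fields"."""
    # Keep "fields" flat: only simple key/value pairs, no nested dicts.
    fields = {key: str(record[key]) for key in indexed_keys if key in record}
    return {
        "time": epoch,
        "sourcetype": sourcetype,
        "event": record,   # the full event still goes under "event"
        "fields": fields,  # only these become index time fields
    }


record = {"user": "alice", "action": "login", "status": 200}
payload = build_indexed_payload(record, 1521288000, ["user", "status"], "myapp:audit")
```

Copying only a few keys into "fields" keeps the extra storage cost small compared to indexing every field of every event.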

Gotchas:

Watch how many events you submit with the same epoch time. Splunk does not like more than 100,000 events in the same time bucket when you are searching. This is also where using different sourcetypes helps your searches be more precise.
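
One way to catch this before sending is to count payloads per epoch second. This is a minimal sketch; the helper name `crowded_seconds` is hypothetical, and it assumes each payload already carries a numeric "time" field.

```python
from collections import Counter


def crowded_seconds(payloads, limit=100_000):
    """Return {epoch_second: count} for seconds holding more than `limit` events."""
    counts = Counter(int(payload["time"]) for payload in payloads)
    return {second: count for second, count in counts.items() if count > limit}


# Tiny demonstration with a low limit so the result is easy to see.
sample = [{"time": 100.2}, {"time": 100.9}, {"time": 100.0}, {"time": 101.5}]
too_busy = crowded_seconds(sample, limit=2)
```

If a second comes back crowded, check whether the source really produced that many events at once or whether you are using a coarse timestamp, such as a batch export time, instead of the per-event time.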

This class does NOT implement event indexing acknowledgements. This is frankly a pain to implement, and most people don't bother. Instead, they put enough resources into ensuring their HEC receiving layer is large enough for their data volume. See Scaling HEC.
