All Splunk Knowledge Objects (KOs) generated at search time are held in memory. Every field name extracted from the data, and every eventtype and tag applied to an event, takes up a little bit of memory – our wonderful “schema on the fly” concept! Any one KO, in and of itself, doesn’t take up much memory. However, that minuscule amount adds up as the number of events returned by a search increases. Run a search that returns 10 million events, and the KOs generated and held in memory scale with that same 10 million count!
An eventtype is a filter applied to a specific dataset to aid in searching, reporting and dashboard generation for that dataset. An eventtype definition is supposed to identify a particular subset of a dataset, often for the purpose of applying ‘tags’ to that subset. Both the eventtype and the tags subsequently applied are KOs, and therefore consume some amount of memory. An eventtype definition that is configured too loosely gets applied to data unrelated to its designed purpose and causes unnecessary memory usage. The key to preventing this is writing clear, concise and restrictive eventtype definitions.
The Splunk Technology Add-on (TA) called Splunk Add-on for Unix and Linux (https://splunkbase.splunk.com/app/833/) is one particular TA that causes unnecessary, bloated eventtype and tag generation in datasets not related to Unix, Linux or any other *nix data. The definitions used for (4) eventtypes in this TA are so broad that they falsely create eventtypes and subsequent tags in searches of some non-*nix datasets.
This post provides guidance on refining (4) specific eventtypes within the Splunk Add-on for Unix and Linux. The same concepts can be applied to other TAs.
Structure of an Eventtype in Splunk
Each eventtype has (2) required elements, plus an optional third and fourth element. The (2) required elements are the ‘declaration’ and the ‘definition’. The optional elements (always commented out and therefore not part of the active configuration) list the potential tags and datamodels associated with the declared eventtype.
The following is an example of an eventtype, configured in eventtypes.conf:
[sample_signal]
search = sourcetype=foo signal=bar corrupted
# tags = operations services configuration
# datamodel = change
In the above sample, the first line is the declaration element (enclosed in square brackets ‘[’ and ‘]’); that is, it ‘declares’ the eventtype title to be “sample_signal”. It is best to use a descriptive title that clearly and concisely identifies the subset of the overall dataset that the eventtype applies to.
The second line is the definition of the specific search. This is the filter applied to the selected dataset to identify a subset of that data. The terms of the definition need to be met for the underlying event to have this eventtype assigned (possibly with an accompanying tag) and KOs generated in memory. In the example, only events within a sourcetype of foo that also have a signal field with a value of bar and contain the word corrupted will be identified by the eventtype filter sample_signal.
Lines 3 and 4 are optional comments that identify the potential tag KOs to be generated for this event and the appropriate data models this event may support.
Eventtypes in the Splunk Add-on for Unix and Linux
The declaration, or title, of the eventtypes in the *nix TA is not usually an issue. The titles describe the kind of event the filter is meant to match, such as [login_authentication], [passwd-auth-failure] or [sshd_authentication].
Most, but not all, of the eventtype definitions in the TA are specific enough to identify exactly the events the eventtype KO should apply to. One way to narrow a definition is to include the sourcetype or index in it. A good number of the eventtypes in this TA specify the sourcetype. Additionally, the newest version of the TA (8.10 released June 24, 2020) even uses the punct field, identifying specific punctuation in an event as a definition filter. This makes it a VERY precise filter for that type of event!
There are (4) eventtypes in the TA that still use vague definitions, the end result being that incorrect events/datasets get an eventtype assigned to them, increasing unneeded memory usage.
Example One: the eventtype does not use the optional lines, but simply has a declaration and its definition:
[nix_configs]
search = source="/etc/*" OR source="*.conf" OR source="*.cfg"
The issue with this eventtype is that the definition is too broad. ANY event whose source ends in .conf or .cfg gets assigned this eventtype. Events from sources that are Splunk configuration files (which use .conf) would be assigned the eventtype “nix_configs”. Clearly this would be an incorrect assignment of an eventtype to a dataset.
The other (3) eventtype definitions in the Splunk Add-on for Unix and Linux present similar issues; their definitions are too broad and are too easily applied, inadvertently, to events in other datasets.
[nix_errors]
search = (NOT sourcetype=stash) error OR critical OR failure OR fail OR failed OR fatal
#tags = error
[nix_kernel_attached]
search = (NOT sourcetype=stash) kernel
#tags = os unix kernel
[nix-all-logs]
search = source="*.log" OR source="*.log.*" OR source="*/log/*" OR source="/var/adm/*" OR source="access*" OR source="*error*" OR sourcetype="syslo*" NOT source=usersWithLoginPrivs NOT sourcetype=lastlog
Solution: There are multiple ways in which to resolve this unnecessary expansion of eventtype KOs in memory.
Method 1: Add the index and/or sourcetype to every eventtype definition in the TA.
For example, in the first eventtype, change
[nix_configs]
search = source="/etc/*" OR source="*.conf" OR source="*.cfg"
to
[nix_configs]
search = index=linux source="/etc/*" OR source="*.conf" OR source="*.cfg"
Note: As we are only talking about (4) eventtypes in this particular TA, this is the simplest and most straightforward way to ensure these eventtypes get applied ONLY to *nix data.
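Applied to all four eventtypes, Method 1 might look like the sketch below. It assumes the *nix data lives in an index named linux (the index name used in the example above) and that the overrides are placed in the TA's local directory (typically $SPLUNK_HOME/etc/apps/Splunk_TA_nix/local/eventtypes.conf) rather than in default, so that a TA upgrade does not overwrite them. The parentheses around the OR terms are added only for readability; check the default definitions in your installed TA version before copying anything.

```ini
# local/eventtypes.conf (sketch; "linux" is an assumed index name)

[nix_configs]
search = index=linux (source="/etc/*" OR source="*.conf" OR source="*.cfg")

[nix_errors]
search = index=linux (NOT sourcetype=stash) (error OR critical OR failure OR fail OR failed OR fatal)

[nix_kernel_attached]
search = index=linux (NOT sourcetype=stash) kernel

[nix-all-logs]
search = index=linux (source="*.log" OR source="*.log.*" OR source="*/log/*" OR source="/var/adm/*" OR source="access*" OR source="*error*" OR sourcetype="syslo*") NOT source=usersWithLoginPrivs NOT sourcetype=lastlog
```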
Method 2: This method is useful when many eventtypes have to be modified. It entails creating an additional eventtype that defines the index or sourcetype, and then referencing that eventtype in the definition of every other eventtype.
For example, add the index definition eventtype:
[nix_index]
search = index=linux
Then, in every subsequent eventtype definition, add this eventtype into it:
[nix_configs]
search = eventtype=nix_index source="/etc/*" OR source="*.conf" OR source="*.cfg"
Note: This effectively adds an index name to every eventtype definition, narrowing the scope of which events the eventtype can be applied to. The number of eventtypes generated and memory used will be reduced.
Method 3: This method makes use of a macro within Splunk. This entails the generation of a macro to define the index name or sourcetype, similar to the eventtype index naming method. The macro would be called within every eventtype definition search.
For example, define the macro in macros.conf:
[nix-indexes]
definition = index=linux
Then, in eventtypes.conf, for every eventtype definition, add this macro:
[nix_configs]
search = `nix-indexes` source="/etc/*" OR source="*.conf" OR source="*.cfg"
Note: This effectively adds an index name to every eventtype definition via the macro definition, again, narrowing the scope of which events the eventtype can be applied to.
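One practical benefit of the macro approach is that the scope lives in a single place. As a sketch, if the *nix data were spread over more than one index, only the macro would need to change. The second index name below (os) is a hypothetical placeholder, and the parentheses keep the OR terms grouped when the macro is expanded inside a longer search.

```ini
# macros.conf (sketch; "os" is a hypothetical second index)
[nix-indexes]
definition = (index=linux OR index=os)

# eventtypes.conf: the eventtype definitions themselves stay unchanged
[nix_kernel_attached]
search = `nix-indexes` (NOT sourcetype=stash) kernel
```

It is worth checking that the macro's sharing permissions match those of the eventtypes, so that the macro resolves wherever the eventtypes are evaluated.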
If you’ve ever searched a dataset not related to a *nix OS, looked at the listing of eventtypes and seen nix_configs, nix_errors or nix-all-logs in it, now you know why. Poorly restricted eventtype definitions are the cause. The lesson also carries over to any custom homegrown add-ons you might develop – keep your eventtype definitions clear, concise and restrictive to the applicable dataset to avoid unnecessary KO generation and memory usage.
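To check whether the four eventtypes are leaking into other datasets in your own environment, a search along these lines can help. It is a rough sketch that assumes the *nix data is in an index named linux; run it over a short time range, since it scans every other index.

```spl
index=* NOT index=linux (eventtype=nix_configs OR eventtype=nix_errors OR eventtype=nix_kernel_attached OR eventtype="nix-all-logs")
| stats count by index, sourcetype
```

Any index and sourcetype pairs in the output are datasets outside the assumed *nix index that currently pick up at least one of these eventtypes.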
- Splunk Tips & Tricks: Save some memory from those pesky extra eventtypes - October 2, 2020