What is Post Process Searching?
Post process searching is a technique for optimizing dashboards in Splunk. A general rule of thumb is that every running search occupies one CPU core on the server for as long as it runs. A dashboard with 15, 25, or even 30 panels, each firing its own search, is therefore extremely resource intensive, especially when it is widely used and several people open it at the same time.
Post process searching helps by sharing a single common base search across multiple dashboard panels. This can cut the number of individual searches a dashboard fires by 30-40% (or more in some cases). It is a critical technique for your Splunk users to learn, especially as your environment grows and more people consume resources for alerts, ad hoc searches, dashboards, and more.
There are two major hindrances to Splunk performance: the first is hardware allocation, and the second is bad user behavior. Teaching your users techniques like this one goes a long way toward keeping your environment running smoothly.
How do I know if I can use post process searching?
When you or your users develop dashboards, you may notice that many panels start from the same root search and only differ in how they present the data (for example, the same results shown both as a timechart and as a single value). Other panels may share the root search but apply different stats aggregations to different fields. These are the cases where post process searching can help you optimize your dashboards.
Post-process searches perform additional processing on results from a base search. A base search can be a global search or any other search within a dashboard. Use the base attribute in a post-process <search> to indicate the base search id.
You can use a single post-process search to generate results or you can chain multiple post-process searches together.
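Chaining can be sketched in Simple XML as follows. This is a minimal illustration, not a complete dashboard: the ids base_status and errors_only are made-up names, the query over splunkd's own logs is only an assumption chosen because component and log_level exist there, and in a real dashboard the last search would sit inside a panel's visualization element.

```xml
<!-- Base search: one transforming search shared by everything below -->
<search id="base_status">
  <query>index=_internal sourcetype=splunkd | stats count by component, log_level</query>
</search>

<!-- First post-process: filters the base results and gets its own id -->
<search id="errors_only" base="base_status">
  <query>search log_level=ERROR</query>
</search>

<!-- Second post-process: chained onto the first one rather than the base -->
<search base="errors_only">
  <query>stats sum(count) as errors by component | sort - errors</query>
</search>
```

Only base_status is dispatched as a search job. The two searches that carry a base attribute work on the results it has already returned.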
Use these best practices to make sure that post-process searches work as expected.
- Use a transforming base search. A base search should be a transforming search that returns results formatted as a statistics table.
- Non-transforming base search issues. Non-transforming base searches can cause the following search result and timeout issues. If you observe these issues in a dashboard, check the base search to make sure that it is a transforming search.
  - No results returned. If the base search is a non-transforming search, you must explicitly state in the base search which fields the post-process search will use, with the | fields command. For example, if your post-process search looks for the top selling buttercup game categories over time, you would use a command similar to the following: | fields _time, categoryId, action
  - Event retention. If the base search is a non-transforming search, the Splunk platform retains only the first 500,000 events that it returns. A post-process search does not process events beyond this limit and silently ignores them, which can produce incomplete data. This retention limit matches the max_count setting in limits.conf, which defaults to 500,000.
  - Client timeout. If the post-processing operation takes too long, it can exceed the Splunk Web client timeout value of 30 seconds.
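The difference between the two kinds of base search can be sketched as follows. The ids and the sourcetype=access_* action=purchase filter are assumptions borrowed from the buttercup games example above, not settings from any real deployment.

```xml
<!-- Non-transforming base search: every field the post-process needs
     must be listed with | fields, and at most 500,000 events are kept -->
<search id="purchases_raw">
  <query>sourcetype=access_* action=purchase | fields _time, categoryId, action</query>
</search>
<search base="purchases_raw">
  <query>timechart count by categoryId</query>
</search>

<!-- Transforming alternative: the base search returns a small statistics
     table, and the post-process sums the pre-computed hourly counts -->
<search id="purchases_by_hour">
  <query>sourcetype=access_* action=purchase | bin _time span=1h | stats count by _time, categoryId</query>
</search>
<search base="purchases_by_hour">
  <query>timechart span=1h sum(count) by categoryId</query>
</search>
```

The second form hands the post-process search one row per hour and category instead of one row per event, which keeps it well clear of the event retention limit and makes the client timeout less likely.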
How do I implement post process searching?
Note that post process searching must be set up in the source of the dashboard: you have to open the dashboard's XML to add your base search. Post process example:
<!-- My parent search; _time is kept so the timechart below has a time axis -->
<search id="xyz">
  <query>index=_internal | bin _time span=1h | stats count by _time destIp destPort eventType</query>
</search>
<!-- post processing reference -->
<search base="xyz">
  <query>stats dc(destIp) as "Distinct Count of Hosts"</query>
</search>
<search base="xyz">
  <query>timechart span=1h sum(count) by destPort</query>
</search>
Notice in the above example that, at the top of our XML, we name our base search “xyz” via the id attribute of the <search> element. The base search contains a transforming command, stats, which tables out the fields of interest that the panels further down the dashboard reference. Each panel then applies its own transforming command, depending on how we want to visualize the data. Practiced wherever possible throughout a Splunk environment, this can greatly reduce load on indexers and search heads.
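For context, here is a sketch of how such searches might sit inside a complete Simple XML dashboard. The label, time range, panel titles, and chart type are illustrative assumptions. The base search in this sketch also buckets and keeps _time so that the timechart panel has a time axis to plot, which is why that panel sums the pre-computed counts.

```xml
<dashboard>
  <label>Post process example (sketch)</label>
  <!-- Base search: dispatched once and shared by both panels -->
  <search id="xyz">
    <query>index=_internal | bin _time span=1h | stats count by _time destIp destPort eventType</query>
    <earliest>-24h@h</earliest>
    <latest>now</latest>
  </search>
  <row>
    <panel>
      <single>
        <title>Distinct Count of Hosts</title>
        <!-- Post-process: runs against the base search results -->
        <search base="xyz">
          <query>stats dc(destIp) as "Distinct Count of Hosts"</query>
        </search>
      </single>
    </panel>
    <panel>
      <chart>
        <title>Events by destination port</title>
        <search base="xyz">
          <query>timechart span=1h sum(count) by destPort</query>
        </search>
        <option name="charting.chart">line</option>
      </chart>
    </panel>
  </row>
</dashboard>
```

Both panels render from a single search job, so adding further panels on the same base search does not add further searches.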
Aditum’s Splunk Professional Services consultants can assist your team with best practices to optimize your Splunk deployment and get more from Splunk.
Our certified Splunk Architects and Splunk Consultants manage successful Splunk deployments, environment upgrades and scaling, dashboard, search, and report creation, and Splunk Health Checks. Aditum also has a team of accomplished Splunk Developers that focus on building Splunk apps and technical add-ons.
Contact us directly to learn more.