SAP Performance Tuning: The RHONDOS Way Part 2 (July 2024)
This article was authored by Benjamin Dare, Sales Engineer at RHONDOS.
Hundreds (if not thousands) of use cases can be implemented by using PowerConnect to send data to large data platforms such as Splunk, Devo, Elastic, and Dynatrace. Because PowerConnect sends time-stamped, full-fidelity data to these platforms, you can use any number of powerful visualizations and alerts to not only slash your mean time to detect (MTTD) but also crush previous mean time to resolve (MTTR) numbers and deliver root cause analyses (RCAs) more quickly than ever.
SAP landscapes can be massive: multiple interconnected applications, and, depending on function and end-user activity, each application can have many servers to balance the load. In the early days of SAP, you needed a large team of Basis folks to monitor, maintain, and patch all those systems. As Solution Manager Centralized Monitoring took off, that meant configuring and maintaining CCMS alerts and relying on proactive notifications to act. Relying on alerts has its downside, however: you lose the intimate relationship with your SAP systems. There are no more manual system checks; instead, you spend your time reducing alert noise and adjusting CCMS thresholds. What if you could have an intimate monitoring relationship with your systems and dynamic, ML-generated thresholds to reduce both the noise and the time spent on threshold maintenance?
The manual effort required to check the health of each app server in SAP can be daunting. For example, look at an SAP transaction like ST06 (OS Monitoring).
The information provided in ST06 is helpful for knowing what is happening right now, but it leaves much to be desired. First, the display is static: to see updated information, you must hit “refresh.” Second, the historical information is almost useless; it shows a table of CPU values for previous hours, which is not granular enough to tell whether there was an intermittent spike or sustained activity that needs investigation. Finally, this data represents only a single server. As mentioned previously, SAP landscapes can be huge, and you would need to check this (somewhat helpful) information for each server in your environment. What if there were a way to get detailed information for all servers at once, in an easy-to-read format, with visualizations even the bean counters could understand?
PowerConnect extracts static ST06 data from SAP and slaps a time stamp on it so the data can be visualized over time, enabling trend analysis in near real-time. As you can see above, we can view multiple KPIs across many servers in a single dashboard. Visualizations automatically change color based on SAP KPI benchmarks so bottlenecks can be easily identified.
By now, I’m sure you are wondering, “How did he do it? It must have taken a long time to learn how to build these incredible dashboards.” Honestly, it’s a lot easier than you think. To dive deeper, let’s take a close look at one of the panels I created above.
In this single dashboard panel, we evaluate RAM utilization across three servers. A “sparkline” shows the history over a chosen window of time, and it is this history that dictates the color indicator for each server. As a reminder, this is just one example of how you can present the data. Another approach that might be useful is a single GREEN traffic light for ALL of your production systems: should the RAM utilization on any one server reach a critical threshold, the light changes to yellow, and clicking into that event displays RAM for all servers so you can see exactly where the problem lies. With that in mind, let’s look at the search behind the scenes that was used to create these panels:
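Reconstructed from the breakdown below, the search looks roughly like the following. Treat it as an illustrative sketch rather than the verbatim production query: the sourcetype string, the " MB" characters trimmed from PHYS_MEM and FREE_MEM, and the rounding are assumptions about how PowerConnect delivers the data in this particular environment.

```
sourcetype="sap-abap(ST06)" source="*" INSTANCE_NAME="*"
| eval splunk_total_memory = tonumber(trim(PHYS_MEM, " MB"))
| eval splunk_free_memory = tonumber(trim(FREE_MEM, " MB"))
| eval splunk_mem_percent_used = round(((splunk_total_memory - splunk_free_memory) / splunk_total_memory) * 100, 1)
| timechart span=30m max(splunk_mem_percent_used) AS "Percent Memory Used" by INSTANCE_NAME
```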
I know what you’re thinking… It looks complex, and you want to turn and run back to Solution Manager or Cloud ALM. Don’t do it. You’ll be sorry. Trust me. It’s not as difficult as it looks once we break it down one line at a time.
As mentioned in part one, the power comes from Splunk ideally ingesting data from your entire IT landscape: the network, VMware, third-party servers, and SAP. Since Splunk holds data from many sources, we first need to identify the data source we will search: `sap-abap(ST06)`. There is a lot of data inside Splunk, so narrowing the search this way speeds things up. Since I want to search all of my SAP systems, we use `source="*"` and `INSTANCE_NAME="*"`, where SOURCE correlates to my SID and INSTANCE_NAME correlates to the individual servers associated with each SID. Easy-peasy, right?
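On its own, that first line of the sketch above is just a filter (written here assuming `sap-abap(ST06)` is used as the sourcetype for PowerConnect's ST06 data; your index and sourcetype names may differ):

```
sourcetype="sap-abap(ST06)" source="*" INSTANCE_NAME="*"
```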
If you started reading ahead, you noticed that in the next step we use eval statements to perform calculations on a few fields. Before I break down those calculations, we first need to address where the interesting fields come from.
Another power of Splunk is that it can ingest “any data, structured or unstructured, from any source.” In Splunk, "interesting fields" are automatically extracted from the data and deemed useful or relevant based on the data set and the searches being performed. These fields are highlighted in the Splunk search interface to help users quickly identify key pieces of information in their data. Here’s how interesting fields are identified and where they come from:
Automatic Field Extraction:
· When data is ingested into Splunk, it uses built-in parsers and predefined rules to automatically extract fields from the raw data.
· Fields can be identified based on common log formats, key-value pairs, delimiters, or other recognizable patterns in the data.
Search-Time Field Extraction:
· Splunk applies field extraction rules at search time, meaning the extraction process happens dynamically when you perform a search query.
· These rules can include regular expressions, Splunk’s built-in extraction methods, or custom extraction rules defined by the user (a brief example follows this list).
Field Discovery During Searches:
· When you run a search, Splunk analyzes the data returned by the search to identify fields that frequently appear in the results.
· Fields that are found in a significant number of events in the search results are considered "interesting."
User Interactions and Configuration:
· Users can define and configure custom field extractions using the Splunk Web interface or by editing configuration files.
· Field extraction settings can be fine-tuned to ensure that specific fields are always extracted and highlighted as interesting for certain data sources.
Knowledge Objects:
· Splunk allows the creation of knowledge objects such as field extractions, lookups, and event types that can enhance field discovery.
· These knowledge objects can help standardize and consistently extract fields across similar data sets.
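To make the search-time extraction point concrete, here is a minimal, hypothetical example using Splunk’s `rex` command. The field name and pattern are invented purely for illustration; in the ST06 data discussed in this article, fields like PHYS_MEM already show up as interesting fields without any manual work.

```
sourcetype="sap-abap(ST06)" | rex field=_raw "PHYS_MEM=(?<phys_mem_raw>\S+)"
```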
Now that you understand interesting fields and their sources, we can inspect the calculations performed on the data to generate our dashboard panels. After identifying our data source (`sap-abap(ST06)` with `source="*"` and `INSTANCE_NAME="*"`), we use the pipe character (“|”) to send the results of that line to the next line. Our interesting field in this example is PHYS_MEM. Its value comes over as text with some extra characters in it, so we need to clean that up: using the “eval” command, we create a new field called “splunk_total_memory” by converting the text in PHYS_MEM to a number (tonumber) and trimming off the extra characters (trim).
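In the sketch above, that step looks something like this (the " MB" characters passed to trim are an assumption about how PHYS_MEM arrives; adjust them to match your data):

```
| eval splunk_total_memory = tonumber(trim(PHYS_MEM, " MB"))
```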
On the following line, we’re doing the same thing with the FREE_MEM interesting field. Since we are trying to display the percentage of memory consumed, we need to do some further calculations.
As you learned in introductory algebra, to get consumed memory as a percentage we subtract the free memory from the total memory and divide the result by the total memory. Since a percentage is easier to read as a whole number, we multiply the result by 100 to get our final value, which we store in a new field called “splunk_mem_percent_used”.
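Expressed as eval statements, using the field names from the sketch (again, the trimmed characters and the rounding are illustrative assumptions):

```
| eval splunk_free_memory = tonumber(trim(FREE_MEM, " MB"))
| eval splunk_mem_percent_used = round(((splunk_total_memory - splunk_free_memory) / splunk_total_memory) * 100, 1)
```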
The following line formats the data into a table in case a deeper investigation is required. We simply display a time chart with a span of 30 minutes between measurements, showing the maximum percentage of memory used in a column titled “Percent Memory Used,” and we split it “by INSTANCE_NAME” to ensure all of our servers are listed.
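In the sketch, that final line is the timechart; the AS clause supplies the “Percent Memory Used” label, and the BY clause splits the results into one series per app server:

```
| timechart span=30m max(splunk_mem_percent_used) AS "Percent Memory Used" by INSTANCE_NAME
```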
One fantastic fact about this solution is that it’s fully scalable. If you add an application server, you don’t need to rewrite the SPL behind your dashboards: the interesting field INSTANCE_NAME is extracted automatically at ingest, so new app servers are picked up as soon as they start sending data. Below is the result of our SPL query and the formatting we applied above.
In my opinion, selecting the type of graphic to be displayed in your dashboard panel is a lot of fun and allows you to express your creative side. To select the type of visualization you’d like to use for your dashboard panel, click the “Visualization” tab.
In this example, I chose the “Single Value” visualization and added a sparkline below the number to see how RAM utilization has been trending. Because I used the MAX value of the percentage of RAM used over each 30-minute period, I don’t have to worry about overlooking quick bursts in consumption. I’ve always followed the strategy that you should tune your SAP systems for maximum expected workloads, and because I’m capturing the MAX value, I should be safe. Once I get the panel the way I like it, I save it. Now it’s out there, and I can add it to any dashboard I choose. Think of it like object-oriented dashboard design.
I will admit: as someone who has spent their entire career focused on SAP, I found the first time I encountered Splunk and SPL a bit daunting. Like anything, the more I play with it, the easier it gets. I’ve also discovered that SPL is a very logical language, and the Splunk interface is intuitive and easy to navigate. Another fantastic fact is that there is a TON of information on the web to help you.
A great place to start down your path of building dashboards in Splunk is to check out their free classes online. You can find a link HERE. There are enough free classes to earn two certifications (Splunk Core Certified User and Splunk Core Certified Power User). I earned both of these certifications shortly after joining RHONDOS. These certifications give you a solid foundation and are the cornerstone of many other certifications, including Splunk Enterprise Certified Admin (which I also earned 😊).