Loading

Friday, October 5, 2012

I LOVE LogParser

Recently I wrote about some code improvements we deployed that had a huge impact, causing our web services calls to be answered much quicker and saving us a fair amount of money each month. In that post I talked about using Cacti to graph server statistics (like CPU and IIS stats), and service response times. That post also included this spreadsheet showing how we lowered service response times considerably.

Figure 1 - Spreadsheet comparing average load time before and after code deployment.
This post is about how I use Microsoft's LogParser to extract very useful information from my IIS web server logs. About LogParser Microsoft says:
Log parser is a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows® operating system such as the Event Log, the Registry, the file system, and Active Directory®. You tell Log Parser what information you need and how you want it processed. The results of your query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart.
Most software is designed to accomplish a limited number of specific tasks. Log Parser is different... the number of ways it can be used is limited only by the needs and imagination of the user. The world is your database with Log Parser.
I use LogParser for a variety of tasks, but this is all about getting some good summary data from my IIS logs. (See Log Parser Rocks! More than 50 Examples! for some great examples of using LogParser - I've used this as a reference a lot!)

Over night when load to my IIS servers is considerably lower than during the day I have scheduled jobs that summarize information from the logs, then zips them & writes them to S3 for longer-term storage. For a little perspective my IIS servers handle somewhere around a cumulative total of 500,000,000 to 600,000,000 (yes, over 1/2 a BILLION requests) per day.  Since every request is logged (and I log every possible parameter in IIS) that's over 1/2 a billion log lines in total. Each and every day. Each of my IIS servers logs between 10-25 million requests per day - these log files come in at about 5-10 GB per server per day.

Figure 2 - Windows Explorer view showing rather large IIS log files on one server.
I guess I should address the question of why I log all requests and why I spend the time to analyze the logs. Besides the fact that I'm a bit of a data junkie, and that it gives me bragging rights that I collect so much data, the main reason is that it gives me a lot of very useful information. These summaries tell me number of requests, bandwidth served, average response time (both by hour and by minute - see spreadsheet above), number of and distribution of server responses (I.E. HTTP responses like 200, 302, 404, 500, 503, etc.), website referrers and much more. Then I use this summary data to compare servers - I can compare them to one another for a day, and over time. Of major significance is the fact that I can get a fairly quick and granular view of the impact to our systems of changes like code deployments (see my post, "Improving Web Server Performance in a Big Way With Small Changes" for more on this).

I'm not going to bore you with all my logparser commands. After all I'm not doing anything special & what most of them do has been well documented elsewhere. I am going to show a couple of things I do with logparser, both in my daily summarization of logs and on-demand to view the immediate effect of changes.

Note: I'm using line breaks in the following commands for readability.

Hits By Hour
logparser -i:W3C -o:CSV
"SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time), 3600)) AS DateTime,
count(*) as Requests,
MUL(PROPCOUNT(*), 100) AS Percentage,
SUM(sc-bytes) AS TotalBytesSent,
MIN(time-taken) AS MinTime,
MAX(time-taken) AS MaxTime,
AVG(time-taken) AS AvgTime,
Div(TotalBytesSent,Requests) AS BytesPerReq
INTO D:\LogFiles\IIS\Reports\%IISTYPE%\%IISTYPE%_%IISDT%_%COMPUTERNAME%_%RTIME%_HitsByHourSummary.csv
FROM D:\LogFiles\IIS\%IISSITE%\ex%IISDT%*.log
GROUP BY DateTime ORDER BY DateTime"
Hits By Minute
logparser -i:W3C -o:CSV
"SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time), 60)) AS DateTime,
count(*) as Requests,
MUL(PROPCOUNT(*), 100) AS Percentage,
SUM(sc-bytes) AS TotalBytesSent,
MIN(time-taken) AS MinTime,
MAX(time-taken) AS MaxTime,
AVG(time-taken) AS AvgTime,
Div(TotalBytesSent,Requests) AS BytesPerReq
INTO D:\LogFiles\IIS\Reports\%IISTYPE%\%IISTYPE%_%IISDT%_%COMPUTERNAME%_%RTIME%_HitsByMinuteSummary.csv
FROM D:\LogFiles\IIS\%IISSITE%\ex%IISDT%*.log
GROUP BY DateTime ORDER BY DateTime"
Breakdown
  • logparser -i:W3C -o:CSV - this tells logparser the input file is W3C and to output results as CSV
  • SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time), 60)) AS DateTime - this little gem breaks down the results per minute (using 60) or per hour (3600); you could specify any number of seconds for this - at times when I'm feeling randy I'll break it out into 5 minute intervals with 300
  • count(*) as Requests - this gives me a total count of the log lines
  • MUL(PROPCOUNT(*), 100) AS Percentage - assigns a percentage of total calls to the specified time period
  • SUM(sc-bytes) AS TotalBytesSent - total bytes sent (server to client) from server
  • MIN(time-taken) AS MinTime - lowest time (in miliseconds) for a response
  • MAX(time-taken) AS MaxTime - highest time (in miliseconds) for a response
  • AVG(time-taken) AS AvgTime - average time (in miliseconds) for a response (this is the one I'm really after with this command - more on this in a minute...)
  • Div(TotalBytesSent,Requests) AS BytesPerReq - bytes per request or how much data was sent to the client on average for each request (another very useful one)
  • INTO D:\LogFiles\IIS\Reports\%IISTYPE%\%IISTYPE%_%IISDT%_%COMPUTERNAME%_%RTIME%_HitsByMinuteSummary.csv - into summary file (variables are used in scheduled job and should be self explanitory. One to note, however, is the %IISDT% [DT is for date] variable which I've documented in Get Yesterday's date in MS DOS Batch file)
  • FROM D:\LogFiles\IIS\%IISSITE%\ex%IISDT%*.log - from file (once again variables used in batch process)
  • GROUP BY DateTime ORDER BY DateTime - finally group by and order by clauses
Results
Here's a look at the results of this query (the second one, hits by hour) with a summary showing the particularly useful average time per request.

Figure 3 - results of logparser hits by hour query.
Wait, There's More
I use the hits by minute query for both summary data and to take a near real-time look at a server's performance. This is particularly useful to check the impact of load or code changes. In fact, just today we had to push out some changes to a couple SQL stored procedures and to determine if these changes had a negative (or positive) impact on the web servers I ran the following:
logparser -i:W3C -o:TSV "SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time), 60)) AS DateTime, count(*) as Requests, SUM(sc-bytes) AS TotalBytesSent, Div(TotalBytesSent,Requests) AS BytesPerReq, MIN(time-taken) AS MinTime, MAX(time-taken) AS MaxTime, AVG(time-taken) AS AvgTime FROM D:\LogFiles\IIS\W3SVC1\u_ex%date:~12,2%%date:~4,2%%date:~7,2%*.log GROUP BY DateTime ORDER BY DateTime"
NOTES
  • First, notice I'm outputting it as a TSV and not specifying an output (with the INTO command) file. This is because I want the results displayed to the command prompt where I'm running it and want it tab separated to make it easier to view.
  • Second, and I'm quite proud of this, is "u_ex%date:~12,2%%date:~4,2%%date:~7,2%*.log". These variables parse today's date and format it in the default YYMMDD used by IIS logs. See the results of echo u_ex%date:~12,2%%date:~4,2%%date:~7,2%*.log command.
Figure 4 - results of echo u_ex%date:~12,2%%date:~4,2%%date:~7,2%*.log
So, when we pushed the code change today I ran the above command on each of my front-end servers to determine if there was a negative (or positive) impact. By breaking the results, particularly the average time per response, out by minute I could see if there was any impact immediately after the code was pushed. Which there wasn't - at least that's what I determined after a little while of thinking there had been a negative impact of the code push....

Notice the lines at 13:25 and 13:39 where the average time per response increased to 149 and 273 respectively. It turns out the minute or so while logparser was running (even though while watching task manager there didn't seem to be much of a noticeable CPU load hit) the average response time climbed quite a bit.
Figure 5 - results of logparser by minute command
So, by using logparser I'm able to summarize hundreds of millions of IIS log lines each day, and on-demand when needed to keep a good pulse on just what my servers are doing.


No comments:

Post a Comment