The Exchange Server 2010 protocol logs can reveal some interesting information to you about the number of SMTP errors that are occurring for connections to or from your server.
This information is useful for scenarios such as checking whether your outbound mail may experiencing a high rate of failure, or discovering whether someone is continually trying to relay through your email server (in which case you could then look at the top sender IP addresses to investigate further).
Remember that protocol logging is not enabled by default on Exchange 2010, and must be enabled on a per-connector basis if you want to use it.
For SMTP error statistics we can use Log Parser to check the protocol logs for instances of SMTP status codes starting with “5″ (which are permanent failures). The Log Parser query for this is as follows:
SELECT data as [Status Code], Count(*) as Hits FROM *.log WHERE data LIKE '5%' GROUP BY data ORDER BY Hits DESC
When run from the folder containing the protocol logs it will be as follows:
"C:\Program Files (x86)\Log Parser 2.2\logparser.exe" "SELECT data as [Status Code],Count(*) as Hits FROM *.log WHERE data LIKE '5%' GROUP BY data ORDER BY Hits DESC" -i:CSV -nSkipLines:4 -rtp:-1
This will give you output similar to this:
Status Code Hits ------------------------------------------------- ---- 500 5.3.3 Unrecognized command 3632 501 5.1.3 Invalid address 1411 501 5.1.7 Invalid address 200 554 5.4.6 Hop count exceeded - possible mail loop 100 503 5.5.2 Send hello first 20 503 5.5.2 Sender already specified 4 501 5.5.4 Unrecognized parameter 2 504 5.7.4 Unrecognized authentication type 1 Statistics: ----------- Elements processed: 13113294 Elements output: 8 Execution time: 150.15 seconds (00:02:30.15)
You can also report on the number of SMTP errors occurring each day to get a baseline, which will help when you need to identify whether a sudden spike in SMTP errors has occurred. This Log Parser query will be as follows:
SELECT TO_LOCALTIME(TO_TIMESTAMP(EXTRACT_PREFIX(TO_STRING([#Fields: date-time]),0,'T'), 'yyyy-MM-dd')) AS Date, COUNT(*) AS Hits FROM *.log WHERE (data LIKE '5%') GROUP BY Date ORDER BY Date ASC
When run from the protocol log directory it will be as follows:
"C:\Program Files (x86)\Log Parser 2.2\logparser.exe" "SELECT TO_LOCALTIME(TO_TIMESTAMP(EXTRACT_PREFIX(TO_STRING([#Fields: date-time]),0,'T'), 'yyyy-MM-dd')) AS Date, COUNT(*) AS Hits from *.log where (data LIKE '5%') GROUP BY Date ORDER BY Date ASC" -i:CSV -nSkipLines:4 -rtp:-1
And of course, provided you have the Office Web Components installed, you can generate a graph of this same data.
"C:\Program Files (x86)\Log Parser 2.2\logparser.exe" "SELECT TO_LOCALTIME(TO_TIMESTAMP(EXTRACT_PREFIX(TO_STRING([#Fields: date-time]),0,'T'), 'yyyy-MM-dd')) AS Date, COUNT(*) AS [SMTP Errors] INTO DailySMTPErrors.gif from *.log where (data LIKE '5%') GROUP BY Date ORDER BY Date ASC" -i:CSV -nSkipLines:4 -chartType:Column3D
Which will give you a graph similar to this:





Hi Paul- thanks for the great write-up. I have one question. My output has some odd things in it, I think it’s picking up numbers in the data column that are not indicative of SMTP errors, many of them are 6 digits. My output is below. Am I misunderstanding or is logparser picking up 5′s that are not SMTP errors?
Status Code Hits
—————————————————————————— —-
550003 803
547710 753
554271 620
556009 557
556436 404
552224 217
550725 128
559783 57
519658271A49FFA3E619EE416620EF524552103F 47
51F7900806F0783F09D45D5017A89322AFEB3FC3 39
558912 23
555756 23
550 5.1.1 Host 63.250.236.234 error: B1.1.1 20
556487 13
559347 12
559342 11
556250 9
557094 8
554 rejected due to spam content 8
559324 8
559246 8
550 5.7.1 Recipients have complained about included content (B-URL) 7
564148 6
559625 6
559433 6
557062 6
562798 6
559628 6
557161 6
557431 6
534E5E0F2AE22351F9A6D5D12931B5081D60FD47 6
563211 6
563217 5
550 No Such User Here 5
550 #5.1.0 Address rejected. 5
556914 5
550 5.1.1 User unknown 5
561213 5
561798 5
503 need MAIL FROM first 5
559114 5
559746 5
559328 5
558814 4
559758 4
556824 4
556868 4
500 5.3.3 Unrecognized command 4
550-5.1.1 The email account that you tried to reach does 4
550-5.1.1 unnecessary spaces. Learn more at 4
Statistics:
———–
Elements processed: 140056
Elements output: 50
Execution time: 0.17 seconds
I think it’s picking up on lines such as these:
2012-08-28T13:28:25.935Z,To_Internet,08CF52B5572C1755,14,192.168.84.3:49949,REMOTE_IP:25,*,563579,sending message
2012-08-28T13:28:30.185Z,To_Internet,08CF52B5572C1757,34,192.168.84.3:49950,REMOTE_IP:25,*,563581,sending message
2012-08-28T13:30:57.064Z,To_Internet,08CF52B5572C1780,34,192.168.84.3:49976,REMOTE_IP:25,*,563595,sending message
I guess so. My own output was pretty clean so I guess you’ve got some stuff in your logs that I don’t
In my examples, the bad lines are all 6 digit codes or longer that start with 5. Is there a way to tell logparser to only pick up codes that start with 5 and are 3 digits long?
thanks for the awesome post Paul… like always
, is it possible to fetch the failed count for specific time, e.g. for past one day or one hour,
thanks in advance.