Using Log Parser’s EXTRACT_PREFIX and EXTRACT_SUFFIX Functions to Split Strings

Exchange Server Pro forums member Nate-n8 has a question regarding Log Parser. Nate has a query that returns long strings, and he wants to trim them so that only the useful information appears in his report.

To demonstrate, consider this query, which is run against the RPC Client Access logs on an Exchange 2010 server.

logparser "SELECT DISTINCT client-name as User FROM *.log" -nSkipLines:4 -i:CSV

What we get is a bunch of results that look similar to this:

User
--------------------------------------------------------------------------------
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=HO-EX2010-MB1/cn=Microsoft System Attendant
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Administrator
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Alan.Reid
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Alannah.Shaw
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Diane.Hall
/O=EXCHANGE SERVER PRO/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=CONFIGURATION/CN=SERVERS/CN=HO-EX2010-MB1/cn=Microsoft System Attendant
/o=Exchange Server Pro/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=Alex.Heyne

Nate is only interested in the user name, i.e. “Alan.Reid”. So how do we return just that portion of the string with the Log Parser query?

The solution to this type of problem lies in two functions available in Log Parser:

  • EXTRACT_PREFIX
  • EXTRACT_SUFFIX

EXTRACT_PREFIX and EXTRACT_SUFFIX perform a similar function in slightly different ways. Both let you nominate a separator and the number (or instance) of that separator at which the string should be split, with instances numbered from 0.

  • EXTRACT_PREFIX splits the string at the specified instance of the separator (counting from the left-hand side) and returns the sub-string before (i.e. on the left-hand side of) the separator.
  • EXTRACT_SUFFIX splits the string at the specified instance of the separator (counting from the right-hand side) and returns the sub-string after (i.e. on the right-hand side of) the separator.

Which one you use depends on which side of the separator contains the sub-string you’re interested in: EXTRACT_PREFIX for the left, EXTRACT_SUFFIX for the right.
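
To make the counting rules concrete, here is a minimal Python sketch that models the two functions as they are described above. It is not Log Parser code: the names extract_prefix and extract_suffix are stand-ins, only non-negative indexes are handled, and returning None when the string has too few separators is an assumption of this sketch rather than confirmed Log Parser behaviour.

```python
def extract_prefix(text, index, separator):
    """Model: text before the index-th separator, counting from the left (0-based)."""
    parts = text.split(separator)
    # Assumption of this sketch: give up if that separator instance does not exist.
    if index < 0 or index >= len(parts) - 1:
        return None
    return separator.join(parts[: index + 1])


def extract_suffix(text, index, separator):
    """Model: text after the index-th separator, counting from the right (0-based)."""
    parts = text.split(separator)
    # Same assumption as above for a missing separator instance.
    if index < 0 or index >= len(parts) - 1:
        return None
    return separator.join(parts[len(parts) - 1 - index:])


dn = ("/o=Exchange Server Pro/ou=Exchange Administrative Group "
      "(FYDIBOHF23SPDLT)/cn=Recipients/cn=Alan.Reid")

print(extract_suffix(dn, 0, "="))  # Alan.Reid
print(extract_suffix(dn, 1, "="))  # Recipients/cn=Alan.Reid
print(extract_prefix(dn, 1, "/"))  # /o=Exchange Server Pro
```

Raising the index moves the split point one separator further in from the side each function counts from, which is why index 1 keeps more of the string than index 0.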

Example of EXTRACT_SUFFIX in Log Parser

For example, EXTRACT_SUFFIX in this case lets us specify “=” as the separator and 0 as the instance. Counting from the right-hand side, instance 0 is the last “=” in the string, which makes “Alan.Reid” the sub-string to its right.

In the query itself it would be:

logparser "SELECT DISTINCT EXTRACT_SUFFIX(client-name,0,'=') as User FROM *.log" -nSkipLines:4 -i:CSV

This returns a neat sub-string containing just the user name:

User
--------------------------
Microsoft System Attendant
Administrator
Alan.Reid
Alannah.Shaw
Diane.Hall
Alex.Heyne
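
The same result can be reproduced outside Log Parser as a rough check. The Python sketch below takes the seven sample rows shown earlier, keeps the text after the last “=” in each (the equivalent of instance 0 counted from the right), and drops repeats to mimic DISTINCT. The variable names are illustrative, and the ordering simply follows the input rows rather than anything Log Parser guarantees.

```python
group = "Exchange Administrative Group (FYDIBOHF23SPDLT)"
rows = [
    f"/o=Exchange Server Pro/ou={group}/cn=Configuration/cn=Servers/cn=HO-EX2010-MB1/cn=Microsoft System Attendant",
    f"/o=Exchange Server Pro/ou={group}/cn=Recipients/cn=Administrator",
    f"/o=Exchange Server Pro/ou={group}/cn=Recipients/cn=Alan.Reid",
    f"/o=Exchange Server Pro/ou={group}/cn=Recipients/cn=Alannah.Shaw",
    f"/o=Exchange Server Pro/ou={group}/cn=Recipients/cn=Diane.Hall",
    f"/O=EXCHANGE SERVER PRO/OU={group.upper()}/CN=CONFIGURATION/CN=SERVERS/CN=HO-EX2010-MB1/cn=Microsoft System Attendant",
    f"/o=Exchange Server Pro/ou={group}/cn=Recipients/cn=Alex.Heyne",
]

users = []
for dn in rows:
    # Split once at the right-most "=" and keep what follows it.
    name = dn.rsplit("=", 1)[-1]
    # Mimic DISTINCT: keep only the first occurrence of each name.
    if name not in users:
        users.append(name)

for name in users:
    print(name)
```

Seven input rows become six names because the two System Attendant entries differ only in the part of the string that is discarded.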

Example of EXTRACT_PREFIX in Log Parser

An example of EXTRACT_PREFIX is shown in my article on reporting top sender IP addresses using Log Parser and protocol logs.

EXTRACT_PREFIX is used here to separate the IP address from the port number of the remote endpoint that is logged by Exchange protocol logging.

The query in that example looks like this:

SELECT EXTRACT_PREFIX(remote-endpoint,0,':') as IP,
	REVERSEDNS(EXTRACT_PREFIX(remote-endpoint,0,':')) as Name,
	Count(*) as Hits
FROM *.log
WHERE data LIKE '%EHLO%'
GROUP BY IP
ORDER BY Hits DESC
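
As a rough illustration of what this query does with each value, the Python sketch below splits an “address:port” string at the first “:” and counts hits per address. The endpoint and data values are invented for the example (they use documentation-reserved address ranges), the EHLO filter is a simple substring test, and the reverse DNS lookup is left out.

```python
from collections import Counter

# Hypothetical (remote-endpoint, data) pairs standing in for protocol log rows.
rows = [
    ("192.0.2.10:4125", "EHLO mail.example.com"),
    ("192.0.2.10:4190", "EHLO mail.example.com"),
    ("192.0.2.10:4190", "MAIL FROM:<someone@example.com>"),
    ("198.51.100.7:52311", "EHLO relay.example.net"),
]

hits = Counter(
    endpoint.split(":", 1)[0]   # text before the first ":" (instance 0 from the left)
    for endpoint, data in rows
    if "EHLO" in data           # rough stand-in for WHERE data LIKE '%EHLO%'
)

# most_common() orders by count, highest first, like ORDER BY Hits DESC.
for ip, count in hits.most_common():
    print(ip, count)
```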

Summary

As you can see, EXTRACT_PREFIX and EXTRACT_SUFFIX are two very useful functions in Log Parser that can help you turn a variety of strings into easy-to-read reporting data, and they can certainly be used to solve Nate’s problem here.

About Paul Cunningham

Paul is a Microsoft Exchange Server MVP and publisher of Exchange Server Pro. He also holds several Microsoft certifications including for Exchange Server 2007, 2010 and 2013.

Comments

  1. Great article. I ran into an issue when supplying multiple source log paths. It seems that the unique parameter was being applied per source, so if usernames were in multiple source folders they would be returned in the results multiple times. To get around this I guess you need one source, or to run another sweep over the result to remove duplicates again. I’m new to Log Parser and it’s fairly frustrating in some cases; using -recurse, for instance, doesn’t work for CSV. Silly.
