How To
Archived Posts from this Category
Archived Posts from this Category
Posted by Kevin Powe on 10 Aug 2009 | Tagged as: Tuxedo

So the last time we talked about Tuxedo, we talked about the ULOG and TMTRACE. This time around, we’re going to look at a great tool for basic performance profiling.
The great thing with all of the tools we’re looking at, just as with the stuff we looked at in the last post, is that they’re built into the Tuxedo platform. Tuxedo provides a tool out of the box that gives you log files of service execution times at millisecond precision. It’s called txrpt (as per the title for the post) and you enable it for individual servers by specifying CLOPT particular CLOPT settings for your server.
Remember that there are two halves to your CLOPT (command-line) options for a Tuxedo server: general Tuxedo parameters on the left-hand side of the ‘–’ marker, and server-specific (the ones you code for yourself) on the right hand side of the ‘–’ marker. We’re specifying general Tuxedo parameters here. The ones we need to set are:
We need both of these together, because txrpt logging is sent to the stderr stream. So we want to control where that txrpt logging goes. By indicating a particular file, we can isolate the performance results we’re interested in.
So, if we were configuring txrpt logging for our old standby, the simpserv server that comes as part of simpapp, the configuration in our UBBconfig file might look like this:
| simpserv | SRVID=1 |
| SRVGRP=SpeedyServers | |
| MIN=1 | |
| MAX=1 | |
| CLOPT=”-A -r -e simpapp_txrpt.log — “ |
Because we’re only specifying standard Tuxedo CLOPT options, we didn’t have to put the “–” marker in our CLOPT options, but it’s there to reinforce exactly what we’re doing there. Whenever we request a service advertised by simpserv now, a record will be logged in simpapp_txrpt.log, giving us start and stop times for that service.
That’s why this is a great feature for profiling performance, but not for production environments: the overhead in disk I/O for a high volume system can be quite high. So in attempting to profile performance, you degrade it horribly.
So far, so good, but how do we get at this delicious information?
The easy way to do this is via the txrpt utility. Given a file like our simpapp_txrpt.log containing txrpt logging (and potentially a date and time range) txrpt will produce a report giving you a low-resolution indication of overall performance. Like this:
kevinpowe@yamamoto:~/tux10_0/tuxedo10.0/samples/atmi/simpapp$ txrpt -d 08/09 < toupper_txprt
| SERVICE SUMMARY REPORT | ||||
| SVCNAME | 16p-17p | 18p-19p | TOTALS | |
| Num/Avg | Num/Avg | Num/Avg | ||
| --------------- | ------- | ------- | ------- | |
| TOUPPER | 1/0.00 | 1/0.00 | 2/0.00 | |
| --------------- | ------- | ------- | ------- | |
| TOTALS | 1/0.00 | 1/0.00 | 2/0.00 | |
So you get the number of executions in that timeframe, broken down into hourly intervals, and with a handy total. A get a high-level view of what your application usage pattern is over time, at an individual service level. You also get service execution time, down to 100ths of a second. That takes us a long way for not a lot of effort, and may give you everything you need.
If you did want to go for extra credit though, you might be interested to know that the txrpt log file contains execution times at a millisecond level of precision. Here’s a snippet of a txrpt log file:
@CheckExpDate 6529 1014617897 225306420 1014617897 225306421
@UsrAuthenticate 6591 1014617897 225306417 1014617897 225306421
@GetErrorMesg 6366 1014617897 225306423 1014617897 225306424
@GetUsrPermission 6591 1014617897 225306422 1014617897 225306424
@GetMyLocation 6591 1014617897 225306424 1014617897 225306426
The columns in the file, from left to right respectively, are:
For each service request, you can get the service execution time in milliseconds by subtracting the start time from the end time. If you want a handy online converter for sanity-checking start and end dates online, you can find an easy to use conversion tool here.
If you were the kind of person who found parsing log files amusing, you could get each individual service call at a millisecond-level resolution, allowing you to get laser precision on what’s going on in your server over time.
Don’t forget though: You’ll be getting all the information logged to stderr for that server in your txrpt file. As well as txrpt logging, you’ll get any error text generated. You only want the lines prefixed by a @
So that’s txrpt. It gives you a great tool to go from no idea to a clear picture of what’s going on inside your application. It’s a great tool to use in pre-production performance profiling, and I have used it with clients before in production environments where they were desperate to get an understanding of how their application was being used.
Next we’ll look at the .TMIB, which is the runtime store of everything Tuxedo knows about everything. We’ll look at how we can get access to that information via SNMP monitoring, and also how we can use a command line tool called the unit driver (ud or ud32) to access this awesome store, and any other service that fulfills some basic requirements.
Posted by Kevin Powe on 19 May 2009 | Tagged as: Gobbledygook, Tuxedo

Things have been a little quiet around here, my dear imaginary audience, and for that I apologise profusely. I’ve been finding most of my spare time taken over by a pet programming project which I’m perched on the edge of revealing and quite excited about, despite its tiny scope. (more on that soon, I promise!)
As I find myself for the first time in quite a long time on a plane without the girl as company, it seemed high time to furnish you with a blog post. Post-production has taken a LOT longer than I’d hoped in making this available post-flight, but here we are finally.
My last post was about the general principle of reading the log files, so I thought I’d jump into specifics about where to find useful information with Tuxedo. I had intended this to be one complete post, but due to the wonderful fractal quality of explaining things, I’ll need to break this up into a series.
For those of you unfamiliar with Tuxedo, this post is unlikely to have much interest or benefit. Before we possibly part ways though, I’ll take a moment to explain exactly what Tuxedo is, as it’s somewhat rarified knowledge nowadays, but the market seems to be pushing to make it and similar products popular again. From my less cynical vantage point, it looks at the moment like a healthy step backwards from making the solution to EVERYTHING a web service, or SOA.
Tuxedo is what’s referred to as an Online Transaction Processing system. (OLTP, if you’re looking for a sexy acronym for your new tattoo) It’s also commonly referred to as a Transaction Processisng system, but that’s a shorter version of the same thing fundamentally.That means that it processes a huge volume of small operations. Because those operations are things that need to either work or fail consistently, we call those transactions. Transactions can be anything from processing a credit card charge, to updating the location of a package in transit for delivery, to transferring funds between accounts.
One of the big advantages of Tuxedo in particular is that it offers clients a set of available services, each with a unique name. Those services can be provided by a large, complex group of interconnected machines, or all of those services can be provided by one single machine. Clients don’t know and don’t care about the exact location of an available service. They connect to a central point, and request services by name. The exact process that provides that service is unknown to them, which lets you be much more flexible with how you structure the internals of a Tuxedo application.
Because OLTPs are so good at processing these high volumes of transactions (and typically have other technological benefits that make them attractive) they’re used by some big companies, often in preference to newer, ‘cooler’ technology like web services or JEE applications, for example. Just to drop two names, last time I heard both American Express and Fedex were Tuxedo users.
Thus ends the general part of the post. Thanks for listening!
Now… how to get more useful information out of Tuxedo?
The pimp-mack-super-daddy of information in Tuxedo is the ULOG file. This is where Tuxedo logs all of its own information, and also where any calls to the APPDIR environment variable. If you haven’t set APPDIR explicitly, then the ULOG file lives in the same directory as your TUXCONFIG file.
The ULOG file is actually a series of files, one for each day. The format of ULOG files is as below:
ULOG.mmddyywhere mm is the current month, dd is the current day of the month, and yy is the last two digits of the current year. Frustratingly, that leaves those files just short of sorting perfectly in chronological order. But I’m sure there’s a solid historical American reason for that convention, possibly involving orange cheese. As it stands, they’ll sort fine inside one year.
You can also control the location of ULOG files with the ULOGPFX environment variable. If you want to keep log files somewhere specific, you can specify a directory and filename like the following:
export ULOGPFX=/var/log/tuxedo/VERYSPECIALWith that convention, you’d get a file created on the 12th of April 2009 called /var/log/tuxedo/VERYSPECIAL.041209.
Every message that Tuxedo places in the ULOG file comes from a message catalog. You can find a starting point for the Tuxedo message catalog reference here. Nine times out of ten, if you have some cryptic message in the ULOG file, the message catalog documentation will shed some light on what’s going on, or at least give a hint of the general area of the problem. Unfortunately, one time out of ten, you’ll get a “Contact BEA Support” message, or a repeat of exactly what you’re seeing in the ULOG file.
One thing that gets overlooked sometimes when troubleshooting behaviour between a client and server is that the client gets their own ULOG file too. The same variables (APPDIR and ULOGPFX) control where it’s created. Oftentimes, especially with connectivity issues or situations where nothing appears to be happening in the Tuxedo application itself, it’s the logs for the client and server together that form the complete picture.
Now, the standard information in the ULOG file is often enough to solve simple problems. But, sometimes a problem comes along that’s complex enough that you need more. That’s where tmtrace comes in. It’s a facility in Tuxedo that allows you to provide more detailed information about what’s happening at runtime. It’s a very lightly documented feature, and I have to confess that my understanding of it borders on hoodoo. But I’ve found that even that hoodoo level of understanding is tremendously useful.
It works on a trace specification, which essentially allows you to specify three things:
I’ll explain each of those in a little more detail to shed some light:
The trace category can be as broad as a number of key areas that the documentation provides, or specific as an individual function. Personally, I tend to use the broader areas that are documented, because it means not having to adjust trace specifications constantly. I’m a big fan of the firehose.
The documented categories are:
atmi – Specific ATMI functions
iatmi – work done implicitly on behalf of an ATMI function, or as part of administration. This is a superset of the atmi category.
xa – work done to manage XA transactions
trace – work done related to managing trace functionality
The first three areas tend to be the most useful. In fact, you can just set the firehose to pulverize, use ‘*’, and get everything.
In terms of destinations for trace information, there’s an exciting new change with the more recent versions of Tuxedo: you can now specify utrace as a destination, which can send trace information to a custom destination. That was certainly new to me. The documentation is a little unclear on the specific limits of utrace, and seems to indicate that it can only be used with tmtrace information.
Now, dyeing behaviour. I can set trace configuration as surgically as on a server by server basis. So individual processes can have either their own or no current trace configuration. They could be the quiet, introspective server that lives next door.
Because Tuxedo works on the concept of services, the work done in order to process my service call to process a credit card charge might not be as simple as I think. The one server that takes my service call might make two service calls of its own. One of those service calls might turn into another three service calls. The flow could end up looking like this:

If we only want to use tracing for a specific flow of logic, to find out what happens with the full flow of processing, using dye allows us to follow processing not just for server A (where we’d set trace) but all the way through to servers D, E and F.
You can set trace configuration either using the TMTRACE environment variable, or using changetrace from inside tmadmin. The simplest way to enable tracing is flick everything on, using changetrace on, or you could just enable tracing for ATMI functions and send trace info to the ULOG file, as follows:
changetrace "atmi:ulog"
For more information on trace specification formats, the best reference is the documentation for tmtrace.
So that’s the ULOG file, and TMTRACE. Both of these together will form the first and and best line of investigation when something is going wrong with your Tuxedo application. Typically, your ULOG file is kind enough to provide some idea of what is going wrong with your application without any additional work. If not, you can coax more useful information out via TMTRACE.
Next up, we’ll have a look at txrpt, and the .TMIB service in Tuxedo – both great tools that provide us with the basics of performance and runtime monitoring in Tuxedo.
Posted by Kevin Powe on 06 Feb 2009 | Tagged as: How To, Plain English

I don’t claim to be a Top Gun consultant by any stretch of the imagination. Sadly, the Val Kilmer of IT consulting (whoever that might be) is unlikely to offer to let me ride his tail any time. (seriously, how could you deliver that dialogue with a straight face? Even in the 80s?)
But, I have gotten a lot of wins on client site by following one deceptively simple rule: read the log files. When strange things are happening intermittently, the first and best port of call is the log files for the system you’re debugging. Now, I don’t claim to be a genius in having discovered this rule. But it’s somewhere between surprising and depressing the number of times I’ve worked with really smart people and get a blank stare when I’ve asked ‘what do the log files say?’
And it’s wisdom I’ve come by the hard way myself. One of my first jobs was installing a HRMS system on client sites, and while troubleshooting a major hiccup with an install on client site, I was on the phone to our guru in the office. He asked if I’d checked the log files. I realised… I didn’t even know where the log files were. It turned out, if I’d checked them, I would have saved myself and the client a lot of grief, including a day’s work restoring from an old backup.
Log files are the forensic evidence of technical troubleshooting. And if we want to be the Grissoms of our own CSI, then there’s a few pointers it’s worth following.
It might sound like an obvious rule of thumb, but cases often have more than one crime scene, and the most important one might not be particularly obvious. You might be driving requests to an application through an interface that’s not giving you much more than an ‘Error occurred’, but chances are a log file somewhere is dutifully scribing reams of text about the problem that’s occurring.
For JEE applications, for example, you’ve got your default server log file, and application-specific log files at the very least. In the case of WebLogic Server, you’ve got your domain log file as well, and if you’re having problems with your back-end database, then the real answer might be in XA trace logs on your database server.
Once you’ve established where your log files are, understand where you can go to get additional context on problems that are occurring, as well. If you’re fortunate enough to have a unique message id associated with the error that’s occurring, understand where to find the specific documentation for your product stack so you can get more information on that specific error. And if that doesn’t give you a satisfactory answer, your next step might be a good solid Googling.
If there’s one thing my years of intensive crime show watching have taught me, it’s that forensics are all about understanding the problem that’s there, not the one you think you have. Almost invariably, if you search on a text string to try and troubleshoot, you’ll miss something.
A bit part of troubleshooting is understanding specific problems or errors in the context of anything else that might be happening at runtime, as well. So searching for keywords or text strings will catapult you past pages of logging that may contain that vital clue that will crack the case.
Individual error messages by themselves may not mean much, but combining error messages and application behaviour can be a telling sign. It could well be that a specific server-side component is causing database connection leaks. Or, maybe it’s accessing a particular page in your web application that is causing all of those JNI crashes.
Understanding the patterns of behaviour in an application (if you’re lucky enough to have several days of log file information to debug from) can be crucial to finding the core problem.
Once you’ve gotten a fairly clear idea of what the problem is, knowing how to configure more verbose levels of debug information can be very important. Keeping a reference handy on the debug switches supported for the product you’re troubleshooting is never a bad idea. In the case of WebLogic Server, for example, from WebLogic Server 9.0 onward, a number of configurable debug switches can be selected from inside the Administration Console. Handy! (it’s important to note though that these switches are far from the complete list you can use)
I’d like to talk about some rules of thumb that I think are good to follow to make a log file to be proud of as an application developer, and also how to troubleshoot some specific technologies, but that’s enough for now. I’ll come back to the other stuff.
Got any log file experiences? Bizarre error messages involving processes being sent insane by zombie children or the like? Times where you’ve saved the day and come out a hero, all thanks to a few choice log file messages? Share them in the comments, or just drop me a line and let me know what you think!