Classifications

Archived Posts from this Category

The Good Stuff Becomes Transparent

Posted by Kevin Powe on 01 Jul 2009 | Tagged as: Nerd Thoughts, Plain English

While I’m closing out the next Tuxedo post, this is almost more of a note to myself more than anyone else. It’s about useability and acclimatisation, and may well be blindingly obvious to you. I had an experience this week that really drilled home how we acclimatise to interfaces, internalise the positive stuff, and only really think consciously about what’s left visible and apparent – the negative.

My Hordemobile

I drive a 2004 Holden Astra hatch. It is not a glamorous or luxury car. I will never drive it down winding forest roads or through oil-black rainslick streets while my eerily coiffed European model wife sleeps, elegantly arranged next to me. (cue sweeping strains of classical music) But it does me just fine, thank you very much. I like the car so much that, after someone killed my last black Holden Astra, I brought pretty much the same car again. Then someone tried to kill this one in Melbourne, but that’s another story entirely

Recently, my car was kind enough to suggest that a new fuel pump might be in order, by way of killing the old one. While I was procrastinating over getting the car fixed, my partner was kind enough to let me do a substantial amount of driving in her car. I’ve driven her car around a number of times before, as you do in regular life. But this was a long enough stretch to acclimatise to her car as my regular driving experience. And it is a neat, practical car – it’s compact, corners fantastically, has more pick-up than mine in first gear, and has awesome cup-holders. And for coffee freaks, cup-holders are important.

But the experience when I got my car fixed was like driving a completely new car again. I saw my car through the eyes I had when I first bought it, and remember why I love the act of driving it, and why its been a great companion relocating interstate twice. Why I like sitting a little closer to the road, how comfortable the seats really are, how the interior design is put together in a way to melt away when you’re driving at night, and the sense of space and comfort it gives, even though it’s a smallish four-door hatch.

The lesson for myself here, which you’re welcome to take away too if it resonates, is this: when we use products, we often assume the positive parts of the experience as a given, and they become transparent – it’s difficult to appreciate them on a day to day basis. We can still see the negative though, and focus on that. It’s only when we move away from a product that the positive becomes clear, often by contrast. So I’m going to do my damnedest to remember that, if the only feedback for something is negative, it doesn’t necessarily mean it’s bad, if it’s still doing its job and being used widely.

It could just mean that the good stuff has become transparent.

Getting More Out of Oracle Tuxedo Part One: The ULOG And TMTRACE

Posted by Kevin Powe on 19 May 2009 | Tagged as: Gobbledygook, Tuxedo

Things have been a little quiet around here, my dear imaginary audience, and for that I apologise profusely. I’ve been finding most of my spare time taken over by a pet programming project which I’m perched on the edge of revealing and quite excited about, despite its tiny scope. (more on that soon, I promise!)

As I find myself for the first time in quite a long time on a plane without the girl as company, it seemed high time to furnish you with a blog post. Post-production has taken a LOT longer than I’d hoped in making this available post-flight, but here we are finally.

My last post was about the general principle of reading the log files, so I thought I’d jump into specifics about where to find useful information with Tuxedo. I had intended this to be one complete post, but due to the wonderful fractal quality of explaining things, I’ll need to break this up into a series.

For those of you unfamiliar with Tuxedo, this post is unlikely to have much interest or benefit. Before we possibly part ways though, I’ll take a moment to explain exactly what Tuxedo is, as it’s somewhat rarified knowledge nowadays, but the market seems to be pushing to make it and similar products popular again. From my less cynical vantage point, it looks at the moment like a healthy step backwards from making the solution to EVERYTHING a web service, or SOA.

So what is Tuxedo?

Tuxedo is what’s referred to as an Online Transaction Processing system. (OLTP, if you’re looking for a sexy acronym for your new tattoo) It’s also commonly referred to as a Transaction Processisng system, but that’s a shorter version of the same thing fundamentally.That means that it processes a huge volume of small operations. Because those operations are things that need to either work or fail consistently, we call those transactions. Transactions can be anything from processing a credit card charge, to updating the location of a package in transit for delivery, to transferring funds between accounts.

One of the big advantages of Tuxedo in particular is that it offers clients a set of available services, each with a unique name. Those services can be provided by a large, complex group of interconnected machines, or all of those services can be provided by one single machine. Clients don’t know and don’t care about the exact location of an available service. They connect to a central point, and request services by name. The exact process that provides that service is unknown to them, which lets you be much more flexible with how you structure the internals of a Tuxedo application.

Because OLTPs are so good at processing these high volumes of transactions (and typically have other technological benefits that make them attractive) they’re used by some big companies, often in preference to newer, ‘cooler’ technology like web services or JEE applications, for example. Just to drop two names, last time I heard both American Express and Fedex were Tuxedo users.

Thus ends the general part of the post. Thanks for listening!

Now… how to get more useful information out of Tuxedo?

ULOG

The pimp-mack-super-daddy of information in Tuxedo is the ULOG file. This is where Tuxedo logs all of its own information, and also where any calls to the function write data. if you’re having problems with Tuxedo, this should always be your first port of call. By default, the ULOG file lives in the directory pointed to by the APPDIR environment variable. If you haven’t set APPDIR explicitly, then the ULOG file lives in the same directory as your TUXCONFIG file.

The ULOG file is actually a series of files, one for each day. The format of ULOG files is as below:

    ULOG.mmddyy

where mm is the current month, dd is the current day of the month, and yy is the last two digits of the current year. Frustratingly, that leaves those files just short of sorting perfectly in chronological order. But I’m sure there’s a solid historical American reason for that convention, possibly involving orange cheese. As it stands, they’ll sort fine inside one year.

You can also control the location of ULOG files with the ULOGPFX environment variable. If you want to keep log files somewhere specific, you can specify a directory and filename like the following:

    export ULOGPFX=/var/log/tuxedo/VERYSPECIAL

With that convention, you’d get a file created on the 12th of April 2009 called /var/log/tuxedo/VERYSPECIAL.041209.

Every message that Tuxedo places in the ULOG file comes from a message catalog. You can find a starting point for the Tuxedo message catalog reference here. Nine times out of ten, if you have some cryptic message in the ULOG file, the message catalog documentation will shed some light on what’s going on, or at least give a hint of the general area of the problem. Unfortunately, one time out of ten, you’ll get a “Contact BEA Support” message, or a repeat of exactly what you’re seeing in the ULOG file.

One thing that gets overlooked sometimes when troubleshooting behaviour between a client and server is that the client gets their own ULOG file too. The same variables (APPDIR and ULOGPFX) control where it’s created. Oftentimes, especially with connectivity issues or situations where nothing appears to be happening in the Tuxedo application itself, it’s the logs for the client and server together that form the complete picture.

TMTRACE: The Firehose

Now, the standard information in the ULOG file is often enough to solve simple problems. But, sometimes a problem comes along that’s complex enough that you need more. That’s where tmtrace comes in. It’s a facility in Tuxedo that allows you to provide more detailed information about what’s happening at runtime. It’s a very lightly documented feature, and I have to confess that my understanding of it borders on hoodoo. But I’ve found that even that hoodoo level of understanding is tremendously useful.

It works on a trace specification, which essentially allows you to specify three things:

  • A trace category, identifying what area of Tuxedo you want more information on.
  • Where you want information to go, which has to be the ULOG file.
  • Whether you want your particular service request to ‘dye’, meaning that all work done as part of your service request will use the same trace configuration you’ve specified, or just use the current configuration for that server.

I’ll explain each of those in a little more detail to shed some light:

The trace category can be as broad as a number of key areas that the documentation provides, or specific as an individual function. Personally, I tend to use the broader areas that are documented, because it means not having to adjust trace specifications constantly. I’m a big fan of the firehose.

The documented categories are:

  • atmi – Specific ATMI functions
  • iatmi – work done implicitly on behalf of an ATMI function, or as part of administration. This is a superset of the atmi category.
  • xa – work done to manage XA transactions
  • trace – work done related to managing trace functionality

The first three areas tend to be the most useful. In fact, you can just set the firehose to pulverize, use ‘*’, and get everything.

In terms of destinations for trace information, there’s an exciting new change with the more recent versions of Tuxedo: you can now specify utrace as a destination, which can send trace information to a custom destination. That was certainly new to me. The documentation is a little unclear on the specific limits of utrace, and seems to indicate that it can only be used with tmtrace information.

Now, dyeing behaviour. I can set trace configuration as surgically as on a server by server basis. So individual processes can have either their own or no current trace configuration. They could be the quiet, introspective server that lives next door.

Because Tuxedo works on the concept of services, the work done in order to process my service call to process a credit card charge might not be as simple as I think. The one server that takes my service call might make two service calls of its own. One of those service calls might turn into another three service calls. The flow could end up looking like this:

One service calls five others in turn

If we only want to use tracing for a specific flow of logic, to find out what happens with the full flow of processing, using dye allows us to follow processing not just for server A (where we’d set trace) but all the way through to servers D, E and F.

You can set trace configuration either using the TMTRACE environment variable, or using changetrace from inside tmadmin. The simplest way to enable tracing is flick everything on, using changetrace on, or you could just enable tracing for ATMI functions and send trace info to the ULOG file, as follows:

    changetrace "atmi:ulog"

For more information on trace specification formats, the best reference is the documentation for tmtrace.

So that’s the ULOG file, and TMTRACE. Both of these together will form the first and and best line of investigation when something is going wrong with your Tuxedo application. Typically, your ULOG file is kind enough to provide some idea of what is going wrong with your application without any additional work. If not, you can coax more useful information out via TMTRACE.

Next up, we’ll have a look at txrpt, and the .TMIB service in Tuxedo – both great tools that provide us with the basics of performance and runtime monitoring in Tuxedo.

Read The Log Files…

Posted by Kevin Powe on 06 Feb 2009 | Tagged as: How To, Plain English

I don’t claim to be a Top Gun consultant by any stretch of the imagination. Sadly, the Val Kilmer of IT consulting (whoever that might be) is unlikely to offer to let me ride his tail any time. (seriously, how could you deliver that dialogue with a straight face? Even in the 80s?)

But, I have gotten a lot of wins on client site by following one deceptively simple rule: read the log files. When strange things are happening intermittently, the first and best port of call is the log files for the system you’re debugging. Now, I don’t claim to be a genius in having discovered this rule. But it’s somewhere between surprising and depressing the number of times I’ve worked with really smart people and get a blank stare when I’ve asked ‘what do the log files say?’

And it’s wisdom I’ve come by the hard way myself. One of my first jobs was installing a HRMS system on client sites, and while troubleshooting a major hiccup with an install on client site, I was on the phone to our guru in the office. He asked if I’d checked the log files. I realised… I didn’t even know where the log files were. It turned out, if I’d checked them, I would have saved myself and the client a lot of grief, including a day’s work restoring from an old backup.

Log files are the forensic evidence of technical troubleshooting. And if we want to be the Grissoms of our own CSI, then there’s a few pointers it’s worth following.

Know where your log files are

It might sound like an obvious rule of thumb, but cases often have more than one crime scene, and the most important one might not be particularly obvious. You might be driving requests to an application through an interface that’s not giving you much more than an ‘Error occurred’, but chances are a log file somewhere is dutifully scribing reams of text about the problem that’s occurring.

For JEE applications, for example, you’ve got your default server log file, and application-specific log files at the very least. In the case of WebLogic Server, you’ve got your domain log file as well, and if you’re having problems with your back-end database, then the real answer might be in XA trace logs on your database server.

Once you’ve established where your log files are, understand where you can go to get additional context on problems that are occurring, as well. If you’re fortunate enough to have a unique message id associated with the error that’s occurring, understand where to find the specific documentation for your product stack so you can get more information on that specific error. And if that doesn’t give you a satisfactory answer, your next step might be a good solid Googling.

Don’t search on text strings

If there’s one thing my years of intensive crime show watching have taught me, it’s that forensics are all about understanding the problem that’s there, not the one you think you have. Almost invariably, if you search on a text string to try and troubleshoot, you’ll miss something.

A bit part of troubleshooting is understanding specific problems or errors in the context of anything else that might be happening at runtime, as well. So searching for keywords or text strings will catapult you past pages of logging that may contain that vital clue that will crack the case.

Look for patterns

Individual error messages by themselves may not mean much, but combining error messages and application behaviour can be a telling sign. It could well be that a specific server-side component is causing database connection leaks. Or, maybe it’s accessing a particular page in your web application that is causing all of those JNI crashes.

Understanding the patterns of behaviour in an application (if you’re lucky enough to have several days of log file information to debug from) can be crucial to finding the core problem.

Know how to turn on the fire hose

Once you’ve gotten a fairly clear idea of what the problem is, knowing how to configure more verbose levels of debug information can be very important. Keeping a reference handy on the debug switches supported for the product you’re troubleshooting is never a bad idea. In the case of WebLogic Server, for example, from WebLogic Server 9.0 onward, a number of configurable debug switches can be selected from inside the Administration Console. Handy! (it’s important to note though that these switches are far from the complete list you can use)

I’d like to talk about some rules of thumb that I think are good to follow to make a log file to be proud of as an application developer, and also how to troubleshoot some specific technologies, but that’s enough for now. I’ll come back to the other stuff.

Got any log file experiences? Bizarre error messages involving processes being sent insane by zombie children or the like? Times where you’ve saved the day and come out a hero, all thanks to a few choice log file messages? Share them in the comments, or just drop me a line and let me know what you think!

Next Page »