Javamex

Java discussion forum to accompany the Javamex web site.

Neil - I think you've done an impressive job with your thread-profiling package. Let me relay my experience on the subject, which spans plenty of languages, though not yet Java.

Ever play one of those arcade games where something pops up and you need to try to shoot it? That's how I do performance tuning. I take samples of the stack manually with the pause button. As soon as I see a line of code appear on more than one stack, it is a potential target. If it is clearly a bottleneck, I shoot it down (fix it and start over). If I'm not sure, I continue sampling until I do see such a target.

Where this differs from your technique is that you have the concept of an "interesting target", such as the first line in the user's own code. What I've found is that the bigger the code is, the deeper the stack is, and the more likely I am to find things mid-stack that are easily shot down.

The reason it works is that if a line of code appears on a sample, that slice of time is being spent because that line is there, and would not be spent if it weren't. The amount of time the line is responsible for is roughly estimated by the fraction of samples that contain it - if it shows up on 7 of 10 samples, knocking it out would save roughly 70% of the time - and rough is good enough.

-Mike

P.S. Here's a rant which, even if it's a rant, is true:
http://stackoverflow.com/questions/1777556/alternatives-to-gprof/17...

Replies to This Discussion

Hi Mike,

So your method of effectively taking "manual pot shots" and seeing if you can eyeball a pattern is definitely applicable to Java as well. At any time you can press Ctrl+Break in the console window, or send a QUIT signal (kill -3) to the VM process under Unix systems, and the VM will dump the current stack traces of all threads to the console. So if you do this periodically, say 10 times, and in 9 of those you notice your program "stuck" in a place that you didn't expect to take 9/10 of the time, then you've potentially found the pot of gold at the end of the rainbow, and no further programming or complicated methods are required.
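
For what it's worth, that kind of "pot shot" dump can also be taken from inside the VM with a few lines of code using Thread.getAllStackTraces() (available since Java 5). The sketch below is only an illustration - the class name and sampling parameters are arbitrary - but it gives you the same sort of output to eyeball as a Ctrl+Break dump:

    import java.util.Map;

    // Periodically dumps every thread's stack trace to the console, mimicking
    // what Ctrl+Break / kill -3 would give you, but from inside the program.
    public class PotShotSampler implements Runnable {

        private final int samples;
        private final long intervalMillis;

        public PotShotSampler(int samples, long intervalMillis) {
            this.samples = samples;
            this.intervalMillis = intervalMillis;
        }

        public void run() {
            try {
                for (int i = 0; i < samples; i++) {
                    Thread.sleep(intervalMillis);
                    System.out.println("=== Sample " + (i + 1) + " ===");
                    Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
                    for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
                        System.out.println("Thread: " + entry.getKey().getName());
                        for (StackTraceElement frame : entry.getValue()) {
                            System.out.println("    at " + frame);
                        }
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

You would typically start it on a daemon thread alongside the application, e.g. new Thread(new PotShotSampler(10, 1000)) with setDaemon(true) so it doesn't keep the VM alive.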

The power of being able to conduct profiling programmatically comes in cases where you can't spot such patterns/bottlenecks simply by eyeballing the stack traces. For example, you may be able to programmatically spot that a particular call takes a long time precisely when a particular request parameter is being passed to your server. Making this kind of inference just from taking "pot shots" is likely to be more difficult.
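
To make that concrete - and this is purely a hypothetical sketch with names invented for illustration, not code from my package - the request-handling code could publish the parameter it is currently working on into a shared map, so that a sampler can record it alongside each stack trace and the slow stacks can later be grouped by parameter:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Records stack samples of worker threads together with whatever request
    // parameter each thread was handling at the time (hypothetical sketch).
    public class TaggedSampler {

        // Request-handling code would put/remove its current parameter here
        // at the start and end of each request (an assumed hook, not a real API).
        public static final ConcurrentMap<Thread, String> CURRENT_REQUEST_PARAM =
                new ConcurrentHashMap<Thread, String>();

        public static class TaggedSample {
            public final String requestParam;          // may be null if no request in progress
            public final StackTraceElement[] stack;
            TaggedSample(String requestParam, StackTraceElement[] stack) {
                this.requestParam = requestParam;
                this.stack = stack;
            }
        }

        private final List<TaggedSample> samples = new ArrayList<TaggedSample>();

        // Take one sample of the given worker thread, tagged with its current parameter.
        public void sample(Thread workerThread) {
            samples.add(new TaggedSample(CURRENT_REQUEST_PARAM.get(workerThread),
                                         workerThread.getStackTrace()));
        }

        public List<TaggedSample> getSamples() {
            return samples;
        }
    }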

In my example, I introduce the notion of picking a single "interesting line" from each stack trace. But that's really just a decision made for simplicity, and it's sufficient in many cases. There would be nothing to stop you taking statistics on chains of calls rather than individual calls if the additional complexity were worth it for your particular profiling needs.
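
As a rough illustration of what that might look like (again, just a sketch rather than code from the package): instead of recording a single interesting line per sample, you could key the counts on the top few frames of each sampled stack.

    import java.util.HashMap;
    import java.util.Map;

    // Counts how often each short call chain (the top N frames of a sampled
    // stack) occurs across samples; illustrative only.
    public class CallChainStats {

        private final Map<String, Integer> chainCounts = new HashMap<String, Integer>();
        private final int chainDepth;

        public CallChainStats(int chainDepth) {
            this.chainDepth = chainDepth;
        }

        // Record the top 'chainDepth' frames of one sampled stack trace.
        public void recordSample(StackTraceElement[] trace) {
            StringBuilder key = new StringBuilder();
            for (int i = 0; i < Math.min(chainDepth, trace.length); i++) {
                if (i > 0) {
                    key.append(" <- ");
                }
                key.append(trace[i]);
            }
            String chain = key.toString();
            Integer count = chainCounts.get(chain);
            chainCounts.put(chain, count == null ? 1 : count + 1);
        }

        public Map<String, Integer> getChainCounts() {
            return chainCounts;
        }
    }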

However, I should say that I don't think I disagree in principle with anything you're saying, either here or in your "rant". All I'm proposing is an "extra weapon in the armoury" for those cases where profiling from within the program itself is useful. There are certainly times when it's overkill. And I present an example of profiling in one particular way, but the whole point of the programmatic method is that you can extend it in all sorts of ways.

Neil

Thanks for your reply. What I would have it do is consider as interesting every line of user code that showed up in the samples (in the interval of interest). There shouldn't be a fantastic number of them, because there should be a lot of repetition and the number of samples does not need to be huge. Then the percentage of samples containing a given line directly measures what removing that line would save.

I don't think there's any problem that can be missed this way. However, the user still has to decide which of the "popular" lines of code is something he/she can "punch out" - in other words, the amount of time to be saved by knocking out a line of code is known, but the user has to figure out whether they _can_ knock it out, and that requires context to know if it's really necessary. Sometimes knowing who called it and why, by looking up a representative stack sample, is enough information. Sometimes more information about the state at the time of the sample is needed.
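
In Java terms - which I haven't actually tried, so treat this as a sketch, and the package prefix is just a stand-in for "the user's own code" - the bookkeeping might look something like this:

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // For each line of user code seen anywhere in a sample, counts the
    // fraction of samples containing it (a sketch of the bookkeeping above).
    public class LinePopularity {

        private final Map<String, Integer> samplesContainingLine = new HashMap<String, Integer>();
        private int totalSamples = 0;

        // Record one sample; each distinct user-code line counts at most once per sample.
        public void recordSample(Iterable<StackTraceElement[]> allStacksInSample) {
            totalSamples++;
            Set<String> linesInThisSample = new HashSet<String>();
            for (StackTraceElement[] stack : allStacksInSample) {
                for (StackTraceElement frame : stack) {
                    // "User code" filter: the package prefix is an assumption for illustration.
                    if (frame.getClassName().startsWith("com.mycompany.")) {
                        linesInThisSample.add(frame.getClassName() + ":" + frame.getLineNumber());
                    }
                }
            }
            for (String line : linesInThisSample) {
                Integer count = samplesContainingLine.get(line);
                samplesContainingLine.put(line, count == null ? 1 : count + 1);
            }
        }

        // Estimated fraction of time a given line is responsible for.
        public double fractionForLine(String line) {
            Integer count = samplesContainingLine.get(line);
            return count == null ? 0.0 : (double) count / totalSamples;
        }
    }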

My most recent favorite example of this is the C# app my group is working on, which was spending roughly 40-50% of its startup time extracting strings out of resource files and converting them into string objects, for the purpose of internationalization. There was no way to guess this, but sampling found it immediately. There was nothing obviously wasteful about the code (which involved about 10 layers of method calls). However, if I simply asked, on a sample, what the exact string being extracted was, it turned out to be something so blasé and universal that nobody would ever internationalize it, like a version number. Most of them were like that. So just being able to ask "why" on representative samples enabled the optimization to be done: the sampling told us where to focus, and the extra state information finished the job. By making those strings constants in the code, all that extraction no longer gets done, making those code lines much less popular and saving most of the time they represented.
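
In Java terms the before/after would be something like the following (our app is C#, so this is only an analogue, and the bundle name, key and version value here are made up):

    import java.util.ResourceBundle;

    public class VersionInfo {

        // Before: the string came through the whole resource-extraction machinery at startup.
        static String versionFromResources() {
            return ResourceBundle.getBundle("app.Strings").getString("version.number");
        }

        // After: a plain constant, since nobody was ever going to internationalize it anyway.
        public static final String VERSION = "2.4.1";
    }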

I am hopeful that with efforts like yours the world will learn to think about optimization in ways that are more fruitful.
