Friday, August 27, 2010

JPMP (Java's Poor Man's Profiler)

I recently came across PMP and found it surprisingly useful to troubleshoot and analyze live production systems, despite it being so ridiculously simple and naive. Unfortunately for me, I've had to deal with quite a bit of Java lately. So here we go, JPMP:
$ cat jpmp
#!/bin/bash
pid=$1
nsamples=$2
sleeptime=$3

for x in $(seq 1 $nsamples)
do
jstack $pid
sleep $sleeptime
done | \
awk 'BEGIN { s = "" }
/^"/ { if (s) print s; s = "" }
/^ at / { sub(/\([^)]*\)?$/, "", $2); sub(/^java/, "j", $2);
if (s) s = s "," $2; else s = $2 }
END { if(s) print s }' | \
sort | uniq -c | sort -rnk1
Output:
$ jpmp 28881 5 0
120 sun.misc.Unsafe.park,j.util.concurrent.locks.LockSupport.park,j.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await,j.util.concurrent.LinkedBlockingQueue.take,j.util.concurrent.ThreadPoolExecutor.getTask,j.util.concurrent.ThreadPoolExecutor$Worker.run,j.lang.Thread.run
15 sun.nio.ch.EPollArrayWrapper.epollWait,sun.nio.ch.EPollArrayWrapper.poll,sun.nio.ch.EPollSelectorImpl.doSelect,sun.nio.ch.SelectorImpl.lockAndDoSelect,sun.nio.ch.SelectorImpl.select,org.simpleframework.transport.reactor.ActionDistributor.distribute,org.simpleframework.transport.reactor.ActionDistributor.execute,org.simpleframework.transport.reactor.ActionDistributor.run,j.lang.Thread.run
5 sun.nio.ch.EPollArrayWrapper.epollWait,sun.nio.ch.EPollArrayWrapper.poll,sun.nio.ch.EPollSelectorImpl.doSelect,sun.nio.ch.SelectorImpl.lockAndDoSelect,sun.nio.ch.SelectorImpl.select,org.apache.zookeeper.ClientCnxn$SendThread.run
5 sun.misc.Unsafe.park,j.util.concurrent.locks.LockSupport.park,j.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await,j.util.concurrent.LinkedBlockingQueue.take,org.apache.zookeeper.ClientCnxn$EventThread.run
5 sun.misc.Unsafe.park,j.util.concurrent.locks.LockSupport.park,j.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await,j.util.concurrent.DelayQueue.take,org.simpleframework.util.lease.LeaseCleaner.clean,org.simpleframework.util.lease.LeaseCleaner.run,j.lang.Thread.run
5 j.lang.Thread.sleep,org.simpleframework.util.buffer.FileManager.run,j.lang.Thread.run
5 j.lang.Object.wait,j.lang.ref.ReferenceQueue.remove,j.lang.ref.ReferenceQueue.remove,j.lang.ref.Finalizer$FinalizerThread.run
5 j.lang.Object.wait,j.lang.Object.wait,j.lang.ref.Reference$ReferenceHandler.run

Java NIO and overengineering

In an earlier post about some of the most horrible APIs in the JDK (the date related APIs), I mentioned how the Java's IO APIs were equally (if not more) retarded. Today I had the joy to dig in the code of Java NIO try to understand why NIO doesn't have "built-in" proxy support like OIO through -D socksProxyHost and -DsocksProxyPort. I wanted to see if I could find a way to work around this limitation, possibly by doing evil things with reflection.

The problems boils down to the fact that with NIO, you use a SocketChannel, and under the hood, some kind of a Socket must be used. Here the fun starts, because as the javadoc says, "The actual work of the socket is performed by an instance of the SocketImpl class", which is itself an abstract class (it's an abstract Impl, yeah, right...).

So let's see how the code goes from a SocketChannel to a Socket or a SocketImpl:
  1. People typically create a SocketChannel like so: SocketChannel.open()
  2. SocketChannel.open() does return SelectorProvider.provider().openSocketChannel();
  3. SelectorProvider.provider() checks to see if the system property java.nio.channels.spi.SelectorProvider is set and a few other things. Unless you specifically do something manually, none of that stuff will end up finding a provider, so the code will end up returning sun.nio.ch.DefaultSelectorProvider.create();.
  4. Now here things get a little bit blurry. I think we end up in here (on Linux) doing return new sun.nio.ch.EPollSelectorProvider();
  5. So in step 2., when openSocketChannel() is called on the provider, we end up in here, doing return new SocketChannelImpl(this);
  6. SocketChannelImpl then does all the work for NIO sockets, mostly using sun.nio.ch.Net which is chiefly made of native methods.
  7. If you want to view a SocketChannelImpl as a Socket by calling socket(), the SocketChannelImpl wraps itself in a SocketAdaptor, but the adaptor simply transforms all the calls to calls on the underlying SocketChannelImpl which uses the native functions in sun.nio.ch.Net, so there's nothing really you can do to make it use a SOCKS proxy.
So the bottom line of the story is that there's no way to use a SOCKS proxy with NIO unless you implement the SOCKS proxy protocol yourself (and if you do, make sure you use enough abstract factories implementations and other layers of indirections, so that your code will fit well in the Java IO philosophy).

Also, just for the fun of it, I was curious to see how many objects were involved under the hood of a single SocketChannel. The NIO code is so bloated that I was expecting quite a bit of overhead. I used the following not-very-scientific method (which is kinda similar to PMP) to try to get an answer:
$ cat t.java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

final class t {
  public static void main(String[]a) throws Exception {
    System.out.println("ready");
    Thread.sleep(5000);
    SocketChannel socket = SocketChannel.open();
    InetSocketAddress addr = new InetSocketAddress("www.google.com", 80);
    socket.connect(addr);
    System.out.println("connected");
    Thread.sleep(5000);
  }
}
then I run this code and use jmap -histo:live right after seeing the "ready" message and right after seeing the "connected" message. In between both of those messages, 2933 objects occupying 371744 bytes of RAM (on my MBP with the JDK 1.6.0_20-b02-279-10M3065 and HotSpot JVM 16.3-b01-279, both shipped by Apple with OS X 10.6.4). I don't know if I'm supposed to laugh or cry.
Class name                         Diff:       # instances Memory used (bytes)
<methodKlass>                                         +615 +74264
<constMethodKlass>                                    +615 +72416
<constantPoolKlass>                                    +65 +47288
<instanceKlassKlass>                                   +65 +46264
<symbolKlass>                                         +705 +33312
<constantPoolCacheKlass>                               +67 +32024
java.lang.Class                                        +69 +12696
[C                                                     +85 +10664
[[I                                                   +121  +8760
[I                                                     +73  +5928
[S                                                     +96  +5232
[B                                                     +58  +4600
java.lang.String                                       +94  +3760
java.util.LinkedList$Entry                             +65  +2600
<objArrayKlassKlass>                                    +4  +2336
<methodDataKlass>                                       +4  +1800
java.net.URL                                           +15  +1560
java.util.HashMap$Entry                                +29  +1392
[Ljava.util.HashMap$Entry;                              +8  +1344
java.util.LinkedHashMap$Entry                          +15   +960
java.util.LinkedList                                   +23   +920
sun.misc.URLClassPath$JarLoader                         +6   +432
java.util.HashMap                                       +6   +384
java.io.ExpiringCache$Entry                            +11   +352
java.util.jar.JarFile$JarFileEntry                      +3   +336
java.lang.reflect.Constructor                           +2   +240
[Ljava.lang.Object;                                     +2   +208
java.net.Inet4Address                                   +5   +200
sun.nio.ch.SocketChannelImpl                            +1   +176
java.util.LinkedHashMap                                 +2   +160
java.lang.Object                                        +8   +128
java.lang.ThreadLocal                                   +5   +120
java.lang.ClassLoader$NativeLibrary                     +2    +96
[Ljava.net.InetAddress;                                 +2    +88
java.util.ArrayList                                     +2    +80
sun.misc.JarIndex                                       +2    +80
[Ljava.lang.String;                                     +2    +64
java.net.InetAddress$CacheEntry                         +2    +64
java.net.InetAddress$Cache                              +2    +64
java.net.InetAddress$Cache$Type                         +2    +64
sun.misc.URLClassPath                                   +1    +56
java.lang.ref.SoftReference                             +1    +56
[Ljava.lang.ThreadLocal;                                +1    +48
java.util.Stack                                         +1    +40
sun.reflect.NativeConstructorAccessorImpl               +1    +40
java.net.InetSocketAddress                              +1    +40
[Ljava.net.InetAddress$Cache$Type;                      +1    +40
[Ljava.lang.reflect.Constructor;                        +1    +32
java.net.Inet6AddressImpl                               +1    +32
sun.reflect.DelegatingConstructorAccessorImpl           +1    +24
java.util.HashMap$KeySet                                +1    +24
java.nio.channels.spi.AbstractInterruptibleChannel$1    +1    +24
[Ljava.lang.Class;                                      +1    +24
java.net.InetAddress$1                                  +1    +16
sun.net.www.protocol.jar.Handler                        +1    +16
sun.nio.ch.KQueueSelectorProvider                       +1    +16
sun.nio.ch.SocketDispatcher                             +1    +16
java.io.FileDescriptor                                  -1    -24
java.util.Vector                                        -1    -40
java.io.FileInputStream                                 -2    -64
java.util.jar.JarFile                                   -1    -80
java.util.zip.ZStreamRef                                -4    -96
java.util.zip.Inflater                                  -4   -192
java.util.zip.ZipFile$ZipFileInputStream               -12   -672
java.lang.ref.Finalizer                                -17  -1088
Er.. OK I guess that wasn't entirely fair, since 2286 objects (taking 328928 bytes of RAM) have been created by the VM for classes that have been loaded and compiled. So if we discount those, we're still left with 647 objects taking 42816 bytes of RAM... Ahem... Just for a little SocketChannel.

PS: I'm half-ashamed but to produce the output above, I used what could possibly be the most horrible "one-liner" I ever wrote in Python. I put the output of jmap in /tmp/a for the first run and /tmp/b for the second run:
python -c 'load = lambda path: dict((klass, (int(instances), int(bytes))) for (num, instances, bytes, klass) in (line.split() for line in open(path) if line[4] == ":")); a = load("/tmp/a"); b = load("/tmp/b"); print "\n".join("%s\033[55G%+4d\t%+6d" % (klass, diffinst, diffbytes) for (klass, diffinst, diffbytes) in sorted(((klass, b[klass][0] - a.get(klass, (0, 0))[0], b[klass][1] - a.get(klass, (0, 0))[1]) for klass in b), key=lambda i: i[2], reverse=True) if diffbytes != 0).replace(">", "&lt;").replace(">", "&gt;")'