Wednesday, February 6, 2013
Google uses captcha to improve StreetView image recognition
These captchas, which contain what look like Street View photos of house numbers, showed up on some Blogger blogs. Looks like Google is using captchas to help improve StreetView's address extraction quality.
Sunday, January 27, 2013
Using debootstrap with grsec
If you try to run debootstrap with grsec (more specifically with a kernel compiled with CONFIG_GRKERNSEC_CHROOT_MOUNT=y), you may see it bail out with this error:

W: Failure trying to run: chroot path/to/root mount -t proc proc /proc

One way to work around this is to bind-mount procfs into the new chroot. Just apply the following patch before running debootstrap:

--- /usr/share/debootstrap/functions.orig	2013-01-27 02:05:55.000000000 -0800
+++ /usr/share/debootstrap/functions	2013-01-27 02:06:39.000000000 -0800
@@ -975,12 +975,12 @@
 		umount_on_exit /proc/bus/usb
 		umount_on_exit /proc
 		umount "$TARGET/proc" 2>/dev/null || true
-		in_target mount -t proc proc /proc
+		sudo mount -o bind /proc "$TARGET/proc"
 		if [ -d "$TARGET/sys" ] && \
 		   grep -q '[[:space:]]sysfs' /proc/filesystems 2>/dev/null; then
 			umount_on_exit /sys
 			umount "$TARGET/sys" 2>/dev/null || true
-			in_target mount -t sysfs sysfs /sys
+			sudo mount -o bind /sys "$TARGET/sys"
 		fi
 		on_exit clear_mtab
 		;;

As a side note, a minbase chroot of Precise (12.04 LTS) takes only 142MB of disk space.
Friday, November 9, 2012
Sudden large increases in MySQL slave lag caused by clock drift
A MySQL slave's replication lag (as reported by Seconds_Behind_Master in SHOW SLAVE STATUS) would sometimes suddenly jump to 7 hours and then come back, jump again, and come back. Turns out, the machine's clock was off by 7 hours and no one had noticed! After fixing NTP synchronization the issue remained; I suspect that MySQL keeps a base timestamp in memory that was still off by 7 hours.
The fix was to STOP SLAVE; START SLAVE;
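To catch this kind of thing earlier, here's a minimal monitoring sketch (my own, using pymysql; the host and credentials are placeholders, not real config) that flags sudden jumps in Seconds_Behind_Master:

# Minimal sketch: alert on sudden jumps in Seconds_Behind_Master.
# Uses pymysql; host/credentials below are placeholders.
import time
import pymysql

JUMP_THRESHOLD = 3600  # flag any jump of more than an hour between two polls

def seconds_behind_master(conn):
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute("SHOW SLAVE STATUS")
        row = cur.fetchone()
        return row["Seconds_Behind_Master"] if row else None

conn = pymysql.connect(host="slave.example.com", user="monitor", password="...")
previous = None
while True:
    lag = seconds_behind_master(conn)
    if previous is not None and lag is not None and abs(lag - previous) > JUMP_THRESHOLD:
        print("Suspicious lag jump: %s -> %s (check for clock drift?)" % (previous, lag))
    previous = lag
    time.sleep(60)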
Thursday, October 18, 2012
Python's screwed up exception hierarchy
We all know we're not supposed to write code like this:

try:
  # some code
except Exception, e:  # Bad
  log.error("Uncaught exception!", e)

Yet you need to do something like that, typically in the event loop of an application server, or when one library is calling into another library and needs to make sure that no exception escapes from the call, or that all exceptions are re-packaged in another type of exception.
The reason the above is bad is that Python badly screwed up its standard exception hierarchy.

__builtin__.object
    BaseException
        Exception
            StandardError
                ArithmeticError
                AssertionError
                AttributeError
                BufferError
                EOFError
                EnvironmentError
                ImportError
                LookupError
                MemoryError
                NameError
                    UnboundLocalError
                ReferenceError
                RuntimeError
                    NotImplementedError
                SyntaxError
                    IndentationError
                        TabError
                SystemError
                TypeError
                ValueError

Meaning, if you try to catch all Exceptions, you're also hiding real problems like syntax errors (!!), typoed imports, etc. But then what are you gonna do? Even if you wrote something silly such as:

try:
  # some code
except (ArithmeticError, ..., ValueError), e:
  log.error("Uncaught exception!", e)

you still wouldn't catch the many cases where people define new types of exceptions that inherit directly from Exception. So it looks like your only option is to catch Exception and then filter out the things you really don't want to catch, e.g.:

try:
  # some code
except Exception, e:
  if isinstance(e, (AssertionError, ImportError, NameError, SyntaxError, SystemError)):
    raise
  log.error("Uncaught exception!", e)

But then nobody does this. And pylint still complains.
Unfortunately it looks like Python 3.0 didn't fix the problem :( – they only moved SystemExit, KeyboardInterrupt, and GeneratorExit to be subclasses of BaseException, but that's all. They should have introduced another separate level of hierarchy for those errors that you generally don't want to catch because they are programming errors or internal errors (i.e. bugs) in the underlying Python runtime.
Saturday, October 6, 2012
Perforce killed my productivity. Again.
Anyways, after a 3 year break during which I happily forgot my struggle with Perforce, I am now back to using it. Sigh. Now what's 'funny' is that Arista has the same problem as Google: they locked themselves in through tools. When you have a large code base of tools built on top of an SCM, it's really, really hard to migrate to something else.
Arista, like Google, literally has tens of thousands of lines of code of tools built around Perforce. It's kind of ironic that Perforce, the company, doesn't appear to have done anything actively evil to lock the customers in. The customers got locked in by themselves. Also note that in both of these instances the companies started quite a few years ago, back when Git didn't exist, or barely existed in Arista's case, so Perforce was a reasonable choice at the time (provided you had the $$$, that is) given that the only other options then were quite brain damaging.
Now I could go on and repeat all the things that have been written many times all over the web about why Perforce sucks. Yes it's slow, yes you can't work offline, yes you can't do anything that doesn't make it wanna talk to the server, yes it makes all your freaking files read-only and it forces you to tell the server that you're going to edit a file, etc.
But Perforce has its own advantages too. It has quasi-decent branching / merging capabilities (merging is often more painful than with Git IMO). It gives you a flexible way to compose your working copy, what's in it, where it comes from. It's more forgiving for organizations that like to dump a lot of random crap in their SCM. This seems fairly common, people just find it convenient to commit binaries and such. It is convenient indeed if you lack better tools, but that doesn't mean it's right.
So what's my gripe with Perforce? It totally ruins my workflow. This makes my life as a software engineer utterly miserable. I always work on multiple things at the same time. Most of the time they're related. I may be working on a big change, and I want to break it down into many small incremental steps. And I often like to revisit these steps. Or I just wanna go back and forth between a few somewhat related things as I work on an idea and sort of wander into connected ideas. And I want to get my code reviewed. Before it gets upstream.
This means that I use git rebase very, very extensively. And git stash. I find that this is the hardest thing to explain to people who don't know Git. But once it clicks in your mind, and you understand how powerful git rebase is, you realize it's the best Swiss army knife to manipulate your changes and their history. When it comes to writing code, it's literally my best friend after vim.
Git, as a tool to manipulate changes made to files, is several orders of magnitude better and more convenient. It's so simple to select what goes into what commit, undo, redo, squash, split, swap, drop, amend changes. I always feel like I can manipulate my code and commits effortlessly, that it's malleable, flexible. I'm removing some lint around some code I'm refactoring? No problem, git commit -p to select hunk-by-hunk what goes into the refactoring commit and what goes into the "small clean up" commit. Perforce on the other hand doesn't offer anything but "mark this file for add/edit/delete" and "put these files in a change" and "commit the change". This isn't the 1990s anymore, but it sure feels like it.
With Perforce you have to serialize your workflow, you have to accept to commit things that will require subsequent "fix previous commit" commits, and thus you tend to commit fewer bigger changes because breaking up a change in smaller chunks is a pain in the ass. And when you realize you got it wrong, you can't go back, you just have to fix it up with another change. And your project history is all fugly. I've used the patch command more over the past 2 months than in the previous 3 years combined. I'm back to the stone age.
Oh and you can't switch back and forth between branches. At all. Like, you just can't. Period. This means you have to maintain multiple workspaces and try to parallelize your work across them. I already have 8 workspaces across 2 servers at Arista, each of which contains mostly-the-same copy of several GB of code. The overhead to go back and forth between them is significant, so I end up switching a lot less than when I just do git checkout somebranch. And of course creating a new branch/workspace is extremely time consuming, as in we're talking minutes, so you really don't wanna do it unless you know you're going to amortize the cost over the next several days.
I think the fact that P4 coerces you into a workflow that sucks shows in Perforce's marketing material and product strategy too. Now they're rolling out this Git integration, dubbed Perforce Git Fusion, that essentially makes the P4 server speak Git so that you can work with Git but still use P4 on the server. They sell it as "improving the Git experience". That must be the best joke of the year. But I think the reality is that engineers don't want to deal with the bullshit way of doing things Perforce imposes, and they want to work with Git. Anyways this integration sounds great, I would love to use it to stop the pain, only you have to be on a recent enough version of Perforce to be able to use it, and if you're not you "just" need to pay an arm and a fucking leg to upgrade.
My lame workaround: overlay a Git repo on top of my P4 workspace, p4 edit the files I want to work on, maintain the changes in Git until I'm ready to push them upstream. Still a royal PITA, but at least I can manipulate the files in my workspace.
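To make the workaround concrete, here's a minimal sketch of the kind of glue involved (my own illustration, not Arista tooling; it assumes git and p4 are on the PATH and that the Git overlay lives at the root of the P4 workspace):

#!/usr/bin/env python
# Hypothetical glue for the Git-overlay-on-P4 workaround: open every file
# modified in the Git overlay for edit in Perforce, so the workspace and the
# pending changelist stay in sync.  Commands and workflow are assumptions.
import subprocess

def git_modified_files():
    """Files changed in the Git overlay relative to HEAD."""
    out = subprocess.check_output(["git", "diff", "--name-only", "HEAD"])
    return [f for f in out.decode().splitlines() if f]

def p4_edit(files):
    """Tell Perforce we intend to edit these files (clears the read-only bit)."""
    for f in files:
        subprocess.check_call(["p4", "edit", f])

if __name__ == "__main__":
    p4_edit(git_modified_files())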
And then, of course, there is the problem that I'm impatient. I can't stand waiting more than 500ms at a prompt. It's quite rare to be able to p4 edit a file in less than a second or two. At 1:30am on Saturday, after a dozen p4 edits in a row, I was able to get the latency down to 300-500ms (yes it really took a dozen edits/reverts in a row to reliably get lower latency). It often takes several minutes to trace the history of a file or a branch, or to blame a file ... when that's useful at all with Perforce.
We're in 2012, soon 2013, running on 32 core 128GB RAM machines hooked to 10G/40G networks with an RTT of less than 60µs. Why would I ever need to wait more than a handful of milliseconds for any of these mundane things to happen?
So, you know what Perforce, (╯°□°)╯︵ ┻━┻
Edit: despite the fact that Arista uses Perforce, which is a bummer, I love that place, love the people I work with and what we're building. So you should join!
Saturday, April 14, 2012
How Apache Hadoop is molesting IOException all day
Apache Hadoop has a nasty habit of shoving everything into an IOException. I'm not even going to debate how checked exceptions are like communism (good idea in theory, totally fails in practice). Even if people don't get that, I wish they at least stopped the madness with this poor little IOException.
Let's review again what IOException is for: "Signals that an I/O exception of some sort has occurred. This class is the general class of exceptions produced by failed or interrupted I/O operations."

In Hadoop everything is an IOException. Everything. Some assertion fails, IOException. A number exceeds the maximum allowed by the config, IOException. Some protocol versions don't match, IOException. Hadoop needs to fart, IOException.
How are you supposed to handle these exceptions? Everything is declared as throws IOException, and everything is catching, wrapping, re-throwing, logging, eating, and ignoring IOExceptions. Impossible. No matter what goes wrong, you're left clueless. And it's not like there is a nice exception hierarchy to help you handle them. No, virtually everything is just a bare IOException.
Because of this, it's not uncommon to see code that inspects the message of the exception (a bare String) to try to figure out what's wrong and what to do about it. A friend of mine was recently explaining to me how Apache Kafka was "stringly typed" (a new cutting-edge paradigm whereby you show the middle finger to the type system and stuff everything in Strings). Well, Hadoop has invented better than checked exceptions: they have stringed exceptions. Unfortunately, half of the time you can't even leverage this awesome new idiom because the message of the exception itself is useless. For example, when a MapReduce job chokes on a corrupted file, it will just throw an IOException without telling you the path of the problematic file. This way it's more fun: once you nail it down (with a binary search, of course), you feel like you accomplished something. Or you'll get messages like "IOException: Split metadata size exceeded 10000000.". Figuring out what the actual value was is left as an exercise to the reader.
So, seriously Apache folks... leave IOException alone!
Hadoop (0.20.2) currently has a whopping 1300+ lines of code creating bare IOExceptions. HBase (0.92.1) has over 400. Apache committers should consider every single one of these lines a code smell that's begging to be fixed. Please introduce a new base exception type, and create a sound exception hierarchy.
Updates:
- Apr 15: There is now an issue for HBase to fix their abuse of IOException (HBASE-5796).
- Will update if someone from Hadoop/HDFS/MapReduce files a similar issue on their side.
Monday, February 6, 2012
Devirtualizing method calls in Java
const (C/C++) and final (Java/Scala) are truly here to help the compiler help you. Many things aren't supposed to change. References in a given scope are often never re-pointed to another object, various methods aren't supposed to be overridden, most classes aren't designed to be subclassed, etc. In C/C++, const also helps avoid unintentional pointer arithmetic. So when something isn't supposed to happen, stating it explicitly allows the compiler to catch and report any violation of this otherwise implicit assumption.
The other aspect of const correctness is that you also help the compiler itself. Often the extra bit of information enables it to produce more efficient code. In Java especially, final plays an important role in thread safety, and when used on Strings as well as built-in types. Here's an example of the latter:
 1 final class concat {
 2   public static void main(final String[] _) {
 3     String a = "a";
 4     String b = "b";
 5     System.out.println(a + b);
 6     final String X = "X";
 7     final String Y = "Y";
 8     System.out.println(X + Y);
 9   }
10 }

Which gets compiled to:
public static void main(java.lang.String[]);
  Code:
   0:  ldc           #2;  //String a
   2:  astore_1
   3:  ldc           #3;  //String b
   5:  astore_2
   6:  getstatic     #4;  //Field java/lang/System.out:Ljava/io/PrintStream;
   9:  new           #5;  //class java/lang/StringBuilder
   12: dup
   13: invokespecial #6;  //Method java/lang/StringBuilder."<init>":()V
   16: aload_1
   17: invokevirtual #7;  //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   20: aload_2
   21: invokevirtual #7;  //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   24: invokevirtual #8;  //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   27: invokevirtual #9;  //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   30: getstatic     #4;  //Field java/lang/System.out:Ljava/io/PrintStream;
   33: ldc           #10; //String XY
   35: invokevirtual #9;  //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   38: return
}

In the original code, lines 3-4-5 are identical to lines 6-7-8 modulo the presence of two final keywords. Yet lines 3-4-5 get compiled to 14 byte code instructions (offsets 0 through 27), whereas 6-7-8 turn into only 3 (offsets 30 through 35). I find it kind of amazing that the compiler doesn't even bother optimizing such a simple piece of code, even when used with the -O flag which, most people say, is almost a no-op as of Java 1.3 – at least I checked in OpenJDK6, and it's truly a no-op there, the flag is only accepted for backwards compatibility. OpenJDK6 has a -XO flag instead, but the Sun Java install that comes with Mac OS X doesn't recognize it...
There was another thing that I thought was a side effect of final. I thought any method marked final, or any method in a class marked final, would allow the compiler to devirtualize method calls. Well, it turns out that I was wrong. Not only does the compiler not do this, but the JVM considers this compile-time optimization downright illegal! Only the JIT compiler is allowed to do it.
All method calls in Java are compiled to an invokevirtual byte code instruction, except:
- Constructors and private methods use invokespecial.
- Static methods use invokestatic.
- Virtual method calls on objects whose static type is an interface use invokeinterface.
Anyway, I always imagined that having a final method meant that the compiler would compile all calls to it using invokespecial instead of invokevirtual, to "devirtualize" the method calls, since it already knows for sure at compile time where to transfer execution. Doing this at compile time seems like a trivial optimization, while leaving it up to the JIT is far more complex. But no, the compiler doesn't do this. It's not even legal to do it!
interface iface {
  int foo();
}

class base implements iface {
  public int foo() {
    return (int) System.nanoTime();
  }
}

final class sealed extends base {
  // Implies that foo is final.
}

final class sealedfinal extends base {
  public final int foo() {  // Redefine it to be sure / help the compiler.
    return super.foo();
  }
}

public final class devirt {
  public static void main(String[] a) {
    int n = 0;
    final iface i = new base();
    n ^= i.foo();   // invokeinterface
    final base b = new base();
    n ^= b.foo();   // invokevirtual
    final sealed s = new sealed();
    n ^= s.foo();   // invokevirtual
    final sealedfinal sf = new sealedfinal();
    n ^= sf.foo();  // invokevirtual
  }
}

A simple Caliper benchmark also shows that in practice all 4 calls above have exactly the same performance characteristics (see the full microbenchmark). This seems to indicate that the JIT compiler is able to devirtualize the method calls in all these cases.
To try to manually devirtualize one of the last two calls, I applied a binary patch (courtesy of xxd) on the .class generated by javac. After doing this, javap correctly shows an invokespecial instruction. To my dismay, the JVM then rejects the byte code:

Exception in thread "main" java.lang.VerifyError: (class: devirt, method: timeInvokeFinalFinal signature: (I)I) Illegal use of nonvirtual function call
I find the wording of the JLS slightly ambiguous as to whether or not this is truly illegal, but in any case the Sun JVM rejects it, so it can't be used anyway.
The moral of the story is that javac really only translates Java code into pre-parsed Java code. Nothing interesting happens at all in the "compiler", which should really be called the pre-parser. It doesn't even bother doing any kind of trivial optimization. Everything is left up to the JIT compiler. Also, Java byte code is bloated, but then that's normal, it's Java :)
Saturday, October 8, 2011
Hardware Growler for Mac OS X Lion
Tuesday, September 13, 2011
ext4 2x faster than XFS?
- CPU: 2 x Intel L5630 (Westmere microarchitecture, so 2x4x2 = 16 hardware threads and lots of caches)
- RAM: 2 x 6 x 8GB = 96GB DDR3 ECC+Reg Dual-Rank DIMMs
- Disks: 12 x Western Digital (WD) RE4 (model: WD2003FYYS – 2TB SATA 7200rpm)
- RAID controllers: Adaptec 51645 and LSI MegaRaid 9280-16i4e
The benchmark was SysBench, using O_DIRECT on 64 files, for a total of 100GB of data.
Some observations:
- Formatting XFS with the optimal values for sunit and swidth doesn't lead to much better performance. The gain is about 2%, except for sequential writes, where it actually makes things worse. (Yes, there was no partition table; the whole array was formatted directly as one single big filesystem.) See the sketch after this list for how these values are typically derived.
- Creating more allocation groups in XFS than physical threads doesn't lead to better performance.
- XFS has much better random write throughput at low concurrency levels, but quickly degrades to the same performance level as ext4 with more than 8 threads.
- ext4 has consistently better random read/write throughput and latency, even at high concurrency levels.
- Similarly, for random reads ext4 also has much better throughput and latency.
- By default XFS creates too few allocation groups, which artificially limits its performance at high concurrency levels. It's important to create as many AGs as hardware threads. ext4, on the other hand, doesn't really need any tuning as it performs well out of the box.
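For reference, here's a minimal sketch (my own, not part of the original benchmark setup) of how sunit/swidth are typically derived from the RAID geometry for mkfs.xfs; the stripe size and disk count in the example are assumptions:

# Hypothetical helper: derive mkfs.xfs sunit/swidth from the RAID geometry.
# Both values are expressed in 512-byte sectors.
def xfs_stripe_options(stripe_unit_kib, data_disks):
    sunit = stripe_unit_kib * 1024 // 512   # stripe unit in sectors
    swidth = sunit * data_disks             # full stripe width in sectors
    return "-d sunit=%d,swidth=%d" % (sunit, swidth)

# Example with an assumed geometry: 256KiB stripe unit, 10 data disks
# (e.g. a 12-disk RAID-6) -> "-d sunit=512,swidth=5120"
print(xfs_stripe_options(256, 10))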
Saturday, August 27, 2011
Hitachi 7K3000 vs WD RE4 vs Seagate Constellation ES
It's not clear what disks sold with the Enterprise™©® label really do to justify the big price difference. Often it seems like the hardware is exactly the same, but the firmware behaves differently, notably to report errors faster. In desktop environments, you want the disk to try hard to read bad sectors, but in RAID arrays it's better to give up quickly and let the RAID controller know, otherwise the disks might timeout from the controller's point of view, and the whole disk might be incorrectly considered dead and trigger a spurious rebuild.
So I recently benchmarked the Hitachi 7K3000 against two other "enterprise" disks, the Western Digital RE4 and the Seagate Constellation ES.
The line up
- Hitachi 7K3000 model: HDS723020BLA642 – the baseline
- Western Digital (WD) RE4 model: WD2003FYYS
- Seagate Constellation ES model: ST2000NM0011
Both enterprise disks cost about $190, so about 90% more (almost double the price) than the Hitachi. Are they worth the extra money?
The test
I ended up using SysBench to compare the drives. I had all 3 drives connected to the motherboard of the same machine, a dual L5630 with 96GB of RAM, running Linux 2.6.32. Drives and OS were using their default config, except the "deadline" IO scheduler was in effect (whereas vanilla Linux uses CFQ by default since 2.6.18). SysBench used O_DIRECT for all its accesses. Each disk was formatted with ext4 – no partition table, the whole disk was used directly. Default formatting and mount options were used. SysBench was told to use 64 files, for a total of 100GB of data. Every single test was repeated 4 times and then averages were plotted. Running all the tests takes over 20h.
SysBench produces some kind of a free-form output which isn't very easy to use. So I wrote a Python script to parse the results and a bit of JavaScript to visualize them. The code is available on GitHub: tsuna/sysbench-tools.
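To give an idea of what that parsing involves, here's a minimal sketch (not the actual tsuna/sysbench-tools code) that extracts the request rate and 95th-percentile latency from a sysbench fileio run; the regexes assume sysbench 0.4.x-style output and may need tweaking for other versions:

# Minimal sketch of parsing sysbench fileio output -- an approximation,
# not the real parser.  Assumes sysbench 0.4.x output format.
import re
import sys

REQUESTS_PER_SEC = re.compile(r"([\d.]+)\s+Requests/sec executed")
LATENCY_95PCT_MS = re.compile(r"approx\.\s+95 percentile:\s*([\d.]+)ms")

def parse(text):
    results = {}
    m = REQUESTS_PER_SEC.search(text)
    if m:
        results["requests_per_sec"] = float(m.group(1))
    m = LATENCY_95PCT_MS.search(text)
    if m:
        results["latency_95pct_ms"] = float(m.group(1))
    return results

if __name__ == "__main__":
    print(parse(sys.stdin.read()))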
Results
A picture is worth a thousand words, so take a look at the graphs. Overall the WD RE4 is a clear winner for me, as it outperforms its 2 buddies on all tests involving random accesses. The Seagate doesn't seem worth the money. Although it's the best at sequential reads, the Hitachi is pretty much on par with it while costing almost half as much.

So I'll buy the Hitachi 7K3000 for everything, and pay the extra premium for the WD RE4 for MySQL servers, because MySQL isn't a cheap bastard and needs every drop of performance it can get out of the IO subsystem. No, I don't want to buy ridiculously expensive and power-hungry 15k RPM SAS drives, thank you.
The raw outputs of SysBench are available here: http://tsunanet.net/~tsuna/benchmarks/7K3000-RE4-ConstellationES