Saturday, October 6, 2012

Perforce killed my productivity. Again.

I've used Perforce for 2 years at Google.  Google got a lot of things right, but Perforce has always been a pain in the ass to deal with, despite the huge amount of tooling Google built on top.  I miss a lot of things from my days at Google, but Perforce is definitely not on the list.  Isn't it ironic that for a company that builds large distributed systems on commodity machines, their P4 server had to be by far the beefiest, most expensive server?  Oh and guess what ended up happening to P4 at Google?

Anyways, after a 3 year break during which I happily forgot my struggle with Perforce, I am now back to using it.  Sigh.  Now what's 'funny' is that Arista has the same problem as Google: they locked themselves in through tools.  When you have a large code base of tools built on top of an SCM, it's really, really hard to migrate to something else.

Arista, like Google, literally has tens of thousands of lines of code of tools built around Perforce.  It's kind of ironic that Perforce, the company, doesn't appear to have done anything actively evil to lock the customers in.  The customers got locked in by themselves.  Also note that in both of these instances the companies started quite a few years ago, back when Git didn't exist, or barely existed in Arista's case, so Perforce was a reasonable choice at the time (provided you had the $$$, that is) given that the only other options then were quite brain damaging.

Now I could go on and repeat all the things that have been written many times all over the web about why Perforce sucks.  Yes it's slow, yes you can't work offline, yes you can't do anything that doesn't make it wanna talk to the server, yes it makes all your freaking files read-only and it forces you to tell the server that you're going to edit a file, etc.

But Perforce has its own advantages too.  It has quasi-decent branching / merging capabilities (merging is often more painful than with Git IMO).  It gives you a flexible way to compose your working copy, what's in it, where it comes from.  It's more forgiving for organizations that like to dump a lot of random crap in their SCM.  This seems fairly common, people just find it convenient to commit binaries and such.  It is convenient indeed if you lack better tools, but that doesn't mean it's right.

Used to be a productive software engineer, took a P4 arrow in the knee

So what's my gripe with Perforce?  It totally ruins my workflow.  This makes my life as a software engineer utterly miserable.  I always work on multiple things at the same time.  Most of the time they're related.  I may be working on a big change, and I want to break it down into many small incremental steps.  And I often like to revisit these steps.  Or I just wanna go back and forth between a few somewhat related things as I work on an idea and sort of wander into connected ideas.  And I want to get my code reviewed.  Before it gets upstream.

This means that I use git rebase very, very extensively.  And git stash.  I find that this is the hardest thing to explain to people who don't know Git.  But once it clicks in your mind, and you understand how powerful git rebase is, you realize it's the best Swiss army knife to manipulate your changes and their history.  When it comes to writing code, it's literally my best friend after vim.

Git, as a tool to manipulate changes made to files, is several orders of magnitude better and more convenient.  It's so simple to select what goes into what commit, undo, redo, squash, split, swap, drop, amend changes.  I always feel like I can manipulate my code and commits effortlessly, that it's malleable, flexible.  I'm removing some lint around some code I'm refactoring?  No problem, git commit -p to select hunk-by-hunk what goes into the refactoring commit and what goes into the "small clean up" commit.  Perforce on the other hand doesn't offer anything but "mark this file for add/edit/delete" and "put these files in a change" and "commit the change".  This isn't the 1990s anymore, but it sure feels like it.
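
To make the hunk-by-hunk idea concrete, here is a minimal sketch in Python of what selecting hunks amounts to.  It is not how git commit -p is implemented (Git works on diff hunks with context lines and an index); it just uses difflib to find the changed regions between two versions of a file and applies only the ones you pick.  The function name apply_selected_hunks and the sample file contents are made up for illustration.

```python
from difflib import SequenceMatcher


def apply_selected_hunks(old, new, selected):
    """Return `old` with only the chosen changed regions from `new` applied.

    old, new: lists of lines.
    selected: set of 0-based hunk indices, counting changed regions in file order.
    """
    result = []
    hunk = 0
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend(old[i1:i2])
            continue
        # A changed region: take the new side if selected, else keep the old side.
        result.extend(new[j1:j2] if hunk in selected else old[i1:i2])
        hunk += 1
    return result


# Hypothetical file: one whitespace cleanup (hunk 0) and one real change (hunk 1).
old = ["import os", "x=1", "def f():", "    return x"]
new = ["import os", "x = 1", "def f():", "    return x + 1"]

cleanup_only = apply_selected_hunks(old, new, {0})
refactor_only = apply_selected_hunks(old, new, {1})
```

Here cleanup_only carries just the whitespace fix and refactor_only just the behavior change, which is the split I want between the "small clean up" commit and the refactoring commit.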

With Perforce you have to serialize your workflow, you have to accept committing things that will require subsequent "fix previous commit" commits, and thus you tend to commit fewer bigger changes because breaking up a change into smaller chunks is a pain in the ass.  And when you realize you got it wrong, you can't go back, you just have to fix it up with another change.  And your project history is all fugly.  I've used the patch command more over the past 2 months than in the previous 3 years combined.  I'm back to the stone age.

Oh and you can't switch back and forth between branches.  At all.  Like, you just can't.  Period.  This means you have to maintain multiple workspaces and try to parallelize your work across them.  I already have 8 workspaces across 2 servers at Arista, each of which contains mostly-the-same copy of several GB of code.  The overhead to go back and forth between them is significant, so I end up switching a lot less than when I just do git checkout somebranch.  And of course creating a new branch/workspace is extremely time consuming, as in we're talking minutes, so you really don't wanna do it unless you know you're going to amortize the cost over the next several days.

I think the fact that P4 coerces you into a workflow that sucks shows in Perforce's marketing material and product strategy too.  Now they're rolling out this Git integration, dubbed Perforce Git Fusion, that essentially makes the P4 server speak Git so that you can work with Git but still use P4 on the server.  They sell it as "improving the Git experience".  That must be the best joke of the year.  But I think the reality is that engineers don't want to deal with the bullshit way of doing things Perforce imposes, and they want to work with Git.  Anyways this integration sounds great, I would love to use it to stop the pain, only you have to be on a recent enough version of Perforce to be able to use it, and if you're not you "just" need to pay an arm and a fucking leg to upgrade.

My lame workaround: overlay a Git repo on top of my P4 workspace, p4 edit the files I want to work on, maintain the changes in Git until I'm ready to push them upstream.  Still a royal PITA, but at least I can manipulate the files in my workspace.
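
Roughly, the overlay workaround boils down to a sequence like the one sketched below in Python.  This is an assumption-laden outline, not my exact scripts: the function name overlay_plan, the file path and the commit messages are hypothetical, and the order of the baseline steps is just one plausible way to do it.  The sketch only builds and prints the commands, it doesn't run anything.

```python
import shlex


def overlay_plan(files, message):
    """Build the command sequence for one change made through a Git overlay.

    Nothing is executed; the caller decides whether to run or print the plan.
    """
    plan = [["git", "init"]]  # one-time: create the overlay repo in the workspace root
    for path in files:
        plan.append(["p4", "edit", path])  # open the file for edit so it becomes writable
        plan.append(["git", "add", path])  # stage the pristine copy as a baseline
    plan.append(["git", "commit", "-m", "baseline from p4"])
    # Local iteration with git commit / rebase / stash happens here.
    plan.append(["p4", "submit", "-d", message])  # send the final state upstream
    return plan


# Hypothetical file and description, for illustration only.
plan = overlay_plan(["src/foo.c"], "Fix foo")
for argv in plan:
    print(shlex.join(argv))
```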

And then, of course, there is the problem that I'm impatient.  I can't stand waiting more than 500ms at a prompt.  It's quite rare to be able to p4 edit a file in less than a second or two.  At 1:30am on Saturday, after a dozen p4 edits in a row, I was able to get the latency down to 300-500ms (yes it really took a dozen edits/reverts in a row to reliably get lower latency).  It often takes several minutes to trace the history of a file or a branch, or to blame a file ... when that's useful at all with Perforce.
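
If you want to check this kind of latency yourself, a small timing harness is enough.  The sketch below is in Python and times a stand-in command (the Python interpreter doing nothing) so that it runs anywhere; swapping in something like ["p4", "edit", "some/file"] is left to you, and that path is a placeholder.  The helper name time_command is made up for this example.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # we only care about elapsed time, not the exit status
        )
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


# Stand-in command so the sketch runs without a Perforce server.
samples = time_command([sys.executable, "-c", "pass"], runs=3)
print(f"median {statistics.median(samples):.1f} ms, worst {max(samples):.1f} ms")
```

Looking at the median and the worst case rather than a single run matters here, since the first few calls were consistently slower than the later ones.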

We're in 2012, soon 2013, running on 32 core 128GB RAM machines hooked to 10G/40G networks with an RTT of less than 60µs.  Why would I ever need to wait more than a handful of milliseconds for any of these mundane things to happen?

So, you know what Perforce, (╯°□°)╯︵ ┻━┻

Edit: despite the fact that Arista uses Perforce, which is a bummer, I love that place, love the people I work with and what we're building.  So you should join!


Warren said...

You missed the single WORST thing about perforce. No "ignore file". The java plugin for eclipse has "ignore" support, but not the p4 gui or command line.


Anonymous said...

I used to be subjected to the p4 evil... but rather than laying a git repo on top of P4 as you appear to be doing, I just use git-p4.

the only downside to git-p4 is that you have to have one git repo per perforce branch. BUT, aside from that restriction it's way better than overlaying a git repo on top of a perforce repo.

Definitely give it a shot.

Anonymous said...

I'm certainly not saying Perforce is in any way good, but it in fact does have an ignore file as of the 2012.x versions. Use that, change your workspace options from the "noallwrite" default to "allwrite" (which makes all files writable), and use the new "p4 status" and "p4 reconcile" commands to check for changes and stage them for commit, and it's ALMOST productive - at least on a single branch.

Anonymous said...

I've worked at an organization that used perforce and refused to switch to git. It was horrendous.

Arista is great and I have friends beyond high intelligence who work there. But, since you have the same problem, a perforce version history (and I consider it a plague), I wouldn't work there. Good version control is that important to me.

tsuna said...

@Anonymous the primary reason why I decided to join Arista is that I get to work with incredibly bright people. And I wasn't disappointed. I feel like I felt when I worked at Google.

This means that people here are very much willing to evolve if the alternative is technically superior. It's just that we need to take the time to adjust our tools, have a transition period, etc. Because we're in a fast growth mode and have to focus primarily on shipping our products, we have to find a balance between improving tools/workflows and making the product move forward.

Thankfully Ken Duda (one of the three founders and our CTO) puts a strong emphasis on building and improving tools, and he's very open minded. So with this sort of support across the entire organization, it's very much possible that things change in the future. It's just a matter of time.

Anonymous said...

I absolutely agree with you that Perforce is a pain in the ass.
Perforce is sh*t, a nightmare.
It makes my productivity very low. I take a couple of minutes to fix a few lines of code and Perforce takes me several hours to submit.

Before Perforce, I used SVN. SVN was perfect for my work.

How can Perforce survive?

Notetaker said...

Perforce sucks, but there absolutely is ignore file support. Environment variable P4IGNORE contains name of file with list of patterns to ignore. How could you not know that??

Anonymous said...

I will probably be in the minority here, but after working with perforce and git for a few years, I would say P4 rocks. You cannot take just a part of a project in Git, and sometimes at big companies you just need that. If you want to run a build with unit tests that runs for 20 minutes, you will need another clone in git, which is worse than having another client view in p4. If you need to have too many enlistments in p4 because of many branches - that means that your company's processes are terrible, which has nothing to do with p4. Access control for parts of the repository - such a thing just does not exist in git and it cannot :-(
Also, saying that a thing like rebase does not exist in perforce means you did not work with it enough - please rtfm for integrate command and see :-)
And by the way, people who only go with changes via GitHub or the likes, do not really differ from p4 users. So all those religious wars again....

tsuna said...

Git is certainly far from being perfect, but it's seriously hard to get good productivity with Perforce for the use cases you mentioned. Sure, getting the initial clone with Git may be more expensive, but once you have it, you can switch between branches more easily than with Perforce, which constantly has to re-download everything from the server (and that also makes it impossible to work offline).

Also Perforce's locking strategy is a disaster, and any multi-hundred employee company experiences significant productivity loss because of this alone.

Anonymous said...

My view is the opposite, used to have Perforce but my company switched to Mercurial three years ago.

It's been a hell!

I never spent more than 3-4 minutes a day with Perforce. Now I spend at least an hour with mercurial queues, rollback, cloning, pulling/updating, authentication problems, forest extension. It takes forever.

Maybe if the project is small (<100K LOC) git/hg works well, but if it's large (>10 M LOC) it's such a pain.

Gary Fry said...

Good post- P4 can suck in some ways of working.

New workflow paradigms have evolved, making P4 somewhat obsolete in contrast.

With this view in mind, some people would be quite resolute about not working at P4 sites. I see their logic - it's easy to assume where a company is at by looking at their practices and toolsets...

However, even this assumption is riddled with dragons. Many organisations claim to be bleeding edge based on the fact that they are using bleeding edge technologies. Yet it doesn't mean in any way that they're adopting this tech properly!

Perhaps, it's nothing more than a ruse to attract the more picky engineers out there? Nothing wrong with that; all is fair in love and war I guess... however nobody likes to be tricked into something, right?

So my point is, be diligent with the things you care about during the interview process...

That said, whatever the reality, it's always up to the professional engineer to leave things in a better state than when they arrived... so make that change!

Anonymous said...

Excellent post!

As a point of record, I believe p4 is evil but it's just a tool, albeit an evil one.

I work on git -> p4 migration tools in the same vein as git-p4 but with better bidirectional support.

At the end of the day it's just two different takes on a files storage system.

What's really interesting is that git's blob based key value database is a really awesome invention, especially compared to p4's old file system approach which is mired in old RCS stuff.

To the people promoting perforce, please understand you have a big disconnect with respect to git versus p4. P4 is usually one huge mass of files, and ALL KINDs of CHAOS can happen in those files, and it's central, so changes are blasted to everyone like a git force push.

For well organized git projects, repositories should be small, at about one repo per build or artifact, and then you use a binary storage tool like Artifactory to share components.

So you create access control at a much finer level than p4, at the repo level. In contrast, access controls are actually much harder in p4. (So anonymous got that backwards.)

If you put 10M LOC in one component git will work but slowly, and I'd be quite surprised if p4 would scale without very careful client specs.

Looking at GitHub on the other hand shows that a git based solution (git core + something) can be a super scalable solution.

The p4 ecosystem simply hasn't kept up. While you have p4 + homegrown, the real world has git + ecosystem.

This is the type of scenario that crushed RIM while Android is the clear winner. The same is true for git, the ecosystem is so much bigger than p4 that in the end perforce really has very little chance of survival.