Wednesday, July 18, 2007

Learning git-svn in 5min

If you are an SVN user and you don't have time to learn new things, here is a 5min course to get started with Git and git-svn.
  1. Import your SVN repository in Git:
    git svn clone -s https://svn.foo.com/svn/proj
  2. Make your own Git branch:
    git checkout -b work trunk
  3. git add the files you changed.
  4. git commit
  5. Want to sync with the remote master SVN repository?
    git svn dcommit

There you go!  And guess what, svn-wrapper supports Git!

Some more details now:
  1. The -s argument simply tells git svn that you use the standard SVN layout (trunk/branches/tags).
  2. You must not work in a remote branch.  That's why the 2nd step is to set up a local branch where your work will happen.  I checked out trunk, but you could also check out branch "1.0" or whatever.
  3. This is the only notable difference with SVN for basic usage.  With Subversion, you svn add your files once and then SVN tracks them automatically.  Git does not track files but content.  I know this might sound weird and hard to digest when you start using Git, but that's how Git is and there are many reasons why things are such.  So for now, just don't forget that you must git add the files you want to commit.  Alternatively, you can use git commit -a to commit every file Git already tracks that was modified or deleted in the meantime.  Be warned though that -a does not pick up new files: you still have to git add them first.
  4. Don't forget that the commit will be done in your local repository only.
  5. git-svn will push each commit you made to the remote SVN repository.  Each commit will be pushed separately with the log message you gave to Git.
  6. Someone committed in the SVN repository?  Fetch the new revisions with git svn fetch
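
If you want to check point 3 for yourself, here is a minimal sketch in a throwaway repository.  It needs no SVN server; the file names and the demo identity are made up for the illustration.  It follows the git-commit manpage quoted in the comments below: -a stages modified and deleted files, but new files are left alone.

```shell
#!/bin/sh
# Scratch-repo demo (hypothetical file names): what "git commit -a" picks up.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
# Identity set locally so the commits work on a fresh machine.
git config user.email "demo@example.com"
git config user.name "Demo"

echo "v1" > tracked.txt
git add tracked.txt                       # new files must be added explicitly
git commit -q -m "Add tracked.txt"

echo "v2" > tracked.txt                   # modify a file Git already knows about
echo "new" > untracked.txt                # create a file Git has never seen

git commit -q -a -m "Update tracked.txt"  # -a only stages already-tracked files

committed=$(git ls-files)                 # files known to Git after the commit
leftover=$(git status --porcelain)        # what is still pending
echo "committed: $committed"
echo "pending:   $leftover"
```

The new file stays out of the commit until you git add it, and both commits live only in the local repository until you run git svn dcommit.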

Now you can read the previous post in order to find some useful references that are worth reading.  You will quickly see that the time you invest learning Git will pay off (for those of you who are very concerned with ROI). git-svn is ironically the best SVN client IMO.

Some questions you might ask:

Where does git store its stuff?
In the single .git folder at the root of the working copy.  Everything is there; you don't have .git folders all over the place like .svn folders.

Where are my branches and stuff?
They are in .git/refs/remotes/.  You can easily switch between branches with git checkout.
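
A minimal sketch of what this looks like, run in a throwaway repository.  There is no SVN server involved: the refs/remotes/trunk ref is created by hand as a stand-in for what git-svn would import, and it assumes the layout described above, with trunk sitting directly under refs/remotes/ (newer Git versions may put an extra prefix such as origin/ in front of it).

```shell
#!/bin/sh
# Scratch-repo demo: a ref under refs/remotes/ simulates an imported SVN trunk.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Stand-in for the imported trunk (git-svn would create this ref itself).
git update-ref refs/remotes/trunk HEAD

# List everything stored under refs/remotes/.
remote_branches=$(git for-each-ref --format='%(refname)' refs/remotes/)
echo "$remote_branches"

# Create a local branch from the remote one and switch to it.
git checkout -q -b work trunk
current=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current"
```

On a real clone, git branch -r gives the same listing in a shorter form.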

What about my svn:ignore?
You can import them into Git with:
(echo; git-svn show-ignore) >> .git/info/exclude
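
To see what ends up happening once the patterns are in .git/info/exclude, here is a small sketch in a throwaway repository.  The two patterns are invented stand-ins for whatever show-ignore prints on your project; the point is only that files matching them disappear from git status.

```shell
#!/bin/sh
# Scratch-repo demo: patterns in .git/info/exclude hide files from "git status".
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir -p .git/info

# Hypothetical patterns standing in for the output of the show-ignore command.
(echo; printf '%s\n' '/build.log' '*.o') >> .git/info/exclude

touch build.log main.o main.c
visible=$(git status --porcelain)   # only files that are not excluded show up
echo "$visible"
```

Unlike svn:ignore, this file is local to your clone and is never sent back to the SVN repository.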

What about my svn:externals?
Sorry, they are not yet supported by git-svn.  But you're not lost!  Create another git repository with the svn:external repository and put that repository where it's meant to be and checkout the revision that was pinned in the SVN.  Look near the end of .git/svn//unhandled.log, you'll see:
rREVISION1
+dir_prop: trunk svn:externals external_name%20-r%20REVISION2%20URL

This tells you that at REVISION1 in the branch , the svn:externals pinned the revision REVISION2 of URL as external_name.
You need to find the sha1 hash of this revision in Git.  Enter the "external" repository and run:
grep -r rREVISION2 .git
.git/logs/refs/remotes/trunk: [...] rREVISION2

There you go, you can simply issue:
git checkout
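
If you have many externals to chase, the log entry shown above can be picked apart with a few lines of shell.  This is only a sketch: parse_external is a made-up helper name, it assumes the entry has exactly the "name -r revision URL" shape from the example, and it only decodes %20 (spaces).

```shell
#!/bin/sh
# Decode one svn:externals entry as it appears in unhandled.log.
# Assumed shape (from the example above): NAME%20-r%20REV%20URL
parse_external() {
    # $1: a "+dir_prop: ... svn:externals ..." line
    value=${1##* svn:externals }                        # keep the encoded value only
    decoded=$(printf '%s' "$value" | sed 's/%20/ /g')   # turn %20 back into spaces
    # Split "NAME -r REV URL" into separate variables.
    read -r ext_name ext_flag ext_rev ext_url <<EOF
$decoded
EOF
}

line='+dir_prop: trunk svn:externals external_name%20-r%20REVISION2%20URL'
parse_external "$line"
echo "name=$ext_name rev=$ext_rev url=$ext_url"
```

With the revision number in hand, the git svn find-rev command suggested in the first comment below can replace the grep step.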

Still not convinced by Git?
  • Git is way faster than SVN
  • Git is distributed (you can work offline which is a great advantage for laptop users)
  • Git makes branching and tagging extremely cheap and convenient.
  • Most important: Git makes merging a trivial yet powerful operation.  Subversion is flawed in this respect: merging is a pain, you have to manually track the last revision that was merged, you lose the history, etc.  Git does not have all these disadvantages.  Merging is done right: fast, easy, reliable.  As a bonus, you even get fewer conflicts.
  • Git has tons of sexy features that SVN will probably never have and SVN users can only dream about.  For instance, Git can instantly tell you where THIS LINE comes from, even if the line moved across different files over the years.
  • Git is safe and reliable.  We also use versioning systems because we want to keep a safe backup of all the history of a project.  But servers crash, filesystems get corrupted, we all have this kind of problem.  Sometimes malicious people try to fiddle with the history on well-known public servers.  Git checks every single thing it controls with sha1 sums, so there is no way you can screw things up without noticing.  This reason alone is good enough that everyone who actually cares about their code should switch to Git right now.  With a single 40-character sha1 hash, you can make sure that not only a single revision is OK, but that the entire history, all the files and stuff straight from the beginning until this revision, is OK.
  • Git makes it a lot easier for everyone to contribute.  No endless delicate political discussions about commit access: people pull changes from each other, usually from the people they trust.  Everyone maintains their own branches and publishes only what they want to publish.
  • Git is highly optimized.  The first thing people usually worry about is "OMG, this is gonna take a hell of a lot of space if I gotta import the entire history on my local hard drive".  First off, you don't have to import everything if you're wrapping SVN repositories with Git (see the links in my previous post).  Second, most of my Git repos are actually smaller than the same working copy in SVN (even though the Git one actually has the entire history in its .git!)  I have an example at hand, a freshly checked-out SVN working copy of a project with 1340 revisions.  It's 7.6MB.  The same working copy in Git, but with the entire history of the 1340 revisions (for all branches and tags), is only 6.6MB.
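
For the curious, here is a small sketch of the sha1 safety net mentioned above.  It shows how Git derives the name of a file's content, and that you can recompute that name yourself.  It assumes the sha1sum tool (GNU coreutils) is installed and that Git uses its default SHA-1 object format; the content string is arbitrary.

```shell
#!/bin/sh
# How Git names a file's content: SHA-1 over a small header plus the bytes.
set -e
content='hello'

# Git's own answer (works outside any repository).
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

# Same value by hand: "blob <size>" + NUL byte, followed by the content.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')
manual_id=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1)

echo "git:    $git_id"
echo "manual: $manual_id"
echo "length: ${#git_id} hex characters"
```

Change a single byte of the content and the hash changes.  Since commits refer to their files and to their parent commits by such hashes, a corrupted or tampered history cannot go unnoticed; git fsck runs these checks over a whole repository.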

24 comments:

Anonymous said...

For the svn:externals stuff, the second step (the grep) can be replaced by: git checkout `git svn find-rev r1000`

Anonymous said...

#To sync with svn the other way round:

#fetch changes
git svn fetch

#put fetched changes in your working branch
git svn rebase

Ian Monroe said...

"Git makes it a lot easier for everyone to contribute. No endless delicate political discussions about commit access, people pull changes from each other, usually from the people they trust. Everyone maintains their own branches and publish only what they want to publish."

A pointless idea on a git-svn tutorial. And Linux is the only project I know of that does this... and it has more politics than most.

tsuna said...

That's true, but that's a real point in favor of Git. I had many of these endless discussions and I'm sure at least some of them could have been avoided with a distributed model.

Hendy said...

Wow... bravo article!!

I'm doing all my Subversion work in git right now :)

Anonymous said...

Point against git: there is no TortoiseGit for it on Windows, and thus, only ultra-nerds use it. This is something only supportable for nerds, and not for users in a commercial environment. Everything can be scripted and be brought into a usable state, but this is something only the nerds want to do. Other users just want to use an SCM system, and not masturbate over the greatness of their version control system of the day.

tsuna said...

That's not true, Git comes with many graphical user interfaces available on Windows. The official distribution is msysGit, it comes with git-gui and gitk. There is also qgit. These are only 3 of them but I know there are others. I admit that most of them are still in "beta" stage, although they are usable, and Git probably doesn't promote them enough, but saying that it's only for ultra-nerds that don't work with Windows or that it needs to be scripted to be brought in usable state is total rubbish. It clearly shows that you did not really bother to try Git.

Anonymous said...

7.6M to 6.6M? Something must be wrong. Did you run git-gc ?

tsuna said...

The project now has 1809 revisions and my working copy weighs roughly 4M, my .git is also about 4M after a 'git-gc --prune --aggressive' and I have an additional 1.3M of meta-data for git-svn under .git/svn. Anyways, it's quite common to have .git repositories with the entire history of a project that weigh less than an actual working copy. The point is, people are usually afraid of the space consumed by the repository whereas they shouldn't be.

Anonymous said...

It is not true that git commit -a will schedule files that it does not know about. You still have to add them first with "git add". Essentially git commit -a is similar to "svn commit" and is actually what I normally use.

Samuel A. Falvo II said...

@setok: I suggest you read the git-commit manpage again. The -a flag tells commit to automatically seek out modified files, new files, etc. and automatically git-add them first. Its behavior is most definitely NOT equivalent to "svn commit".

Just clearing up a common misconception.

pilif said...

Samuel, I think it is you that misread the manpage. Or maybe the behaviour of -a has changed recently.

OPTIONS
-a|--all
Tell the command to automatically stage files that have been modified and deleted, but new files you have not told git about are not affected.

(emphasis is mine)

This clearly states that files not added by "git add" will /not/ be committed. Thus, "git commit -a" is the closest thing to Subversion's commit.

Philip

Tim Harper said...

I always like to add --prefix=svn/ when I'm cloning git repositories - it's nice to have the branch prefixes, and it gets rid of the "ambiguation" errors if you're like me and create local branches for each of your Subversion branches.

Anonymous said...

@Anonymous: wow, thanks for your feedback eighty-percenter. Go back to your desk and struggle for a day to understand some verbose api so you can spend another 3 days getting some glue code to work that a twenty-percenter wouldn't even bother factoring in to the time needed for a larger task.

Anonymous said...

I always like to add --prefix=svn/ when I'm cloning git repositories

Consider adding some color to your branches as well via: git config --global color.branch auto

Fabien Engels said...

Hi,
Nice introduction but i've got a problem,
I execute :
git-svn clone -s --username=fabien.engels http://myserver/svn/myproject destination_directory

During the cloning process, i can see message like this one :

Found possible branch point: http://myserver/svn/myproject/trunk => http://myserver/svn/myprojecy/tags/0.5, 88
Found branch parent: (tags/0.5) e45cd56cca1ee72417af0c3b83244fcfc6ce5978

But when I go inside my destination directory and execute git tag -l, I get nothing, as if Git doesn't see my tags.

Is that OK, or did I mess something up? :)

Anonymous said...

Hi,

I've just started using GIT this week, currently the project i'm working on is held in subversion. I tested git svn clone with a small test project (about 10 files) which worked a treat.

This morning i decided to test the clone with the full project i'm working on (11,000 files) and I get the error message Checksum mismatch: vn2.sln 0f7a82f1d38b819 expected: fde799e5ba0d1d07e6b539016bea3260
got: e71db1010a0da06ea76d4163c452df72

Can someone help with why this error is happening? Is there an issue with the GIT clone and large repositories?

Thanks in advance for your help,
G.

Scott Stout said...

I have the same issue: Checksum Mismatch.

Reid said...

This is a 3-year-old posting, but it comes up in Google searches, so I thought I would point out that there is now a Tortoise-git:

http://code.google.com/p/tortoisegit

I was looking to see if there was a wrapper for git to make it look like svn, perhaps even to the point of working correctly with scripts, etc?

tsuna said...

Reid, a few years ago I wrote a script that behaves like the "SVN" command but can also use Git under the hood: http://repo.or.cz/w/svn-wrapper.git

It's not maintained and hasn't been used in like 3 years but if you really need something that provides a SVN-cli for Git, maybe you could use it as a starting point.

tfnico said...

If anyone is still looking for resources on working with git-svn, I've collected a bunch of how-tos and screencasts here:

http://www.tfnico.com/presentations/git-and-subversion

Hope you find them useful :)

ambar said...

More than four years on, this article is incredibly useful. Thanks!

mike said...

I've used git-svn against two different >1 GiB SVN repositories, and in both cases the git repo was the same size or slightly smaller than a single SVN checkout...

dare to win said...

How can I keep Git from bringing in the entire history, including files that no longer exist in the repository and only come along because they are kept in the history?


http://techidiocy.com/understand-git-clone-command-svn-checkout-vs-git-clone/