Version-Control Systems for Linux

(sometimes called Source-code management = SCM or revision-control systems = RCS)

Last updated: Thu Oct 29 23:14:42 PDT 2015

This listing of Linux VCS tools concentrates on VCS as a category distinct from project management, process management, workflow management, build control, release management, trouble ticketing, CASE, and other useful but adjoining functions. I'll try to mention where such extended features are present.

Pointers ([notable]) highlight entries notable in the opinion of this page's maintainer (Rick Moen, rick@linuxmafia.com), to whom complaints^Wcomments can be sent.

Bias & Other Disclaimers:






AccuRev (AccuRev, Inc.) (link)

AccuRev is a client-server networked, transaction-based system. Automatically versions directories. Provides changeset/transaction-oriented (as opposed to file-based) pre and post triggers that can run on both the client and the server. AccuRev, Inc. (formerly Ede Development Enterprises) is coy about pricing, but in 1999 it was US $750 for a single licence, including a year of support and updates.

Binary-only. Proprietary.



Aegis (link)

Aegis is Peter Miller's transaction-based software configuration management system that enforces a development process requiring that change sets "work" before integration into the project baseline. It calls make and RCS (or similar) for software-building and repository functions. Aegis is local-oriented (non-network-aware).

Very mature. Atomic commits. Supports renames. Poor Win32 support. Heavy security focus and well-developed process integration.

Code is C. Open source (GNU GPL).



AllFusion Harvest Change Manager (Computer Associates, Inc.) (link)

AllFusion Harvest (formerly CCC/Harvest aka Change and Configuration Control/Harvest, which was published by Platinum Technology International, Inc.) is a multi-platform networked VCS that integrates with an optional build system and provides integrated problem tracking.

Binary-only. Proprietary.



Arch (link)

Arch is an advanced VCS specification with multiple, independent but compatible implementations, and several descendants/offshoots. Arch is widely considered more ambitious than Subversion, mainly on account of its support for decentralised repositories: In Arch, any branch or developer's private work area can be treated as a repository of its own, with a global name space for developers, repositories, and branches, which you then periodically merge with a central repository.

Please see also the individual entries for the several implementations and offshoots of this specification: ArX, Barch, Bazaar 1.x, Bazaar, Eva, GNU Arch/tla, Larch.

All the various Arch implementations used to be linked from Mat Kovach's Arch Wiki (formerly Michael Grubb's) — but recently (2005-04) the wiki seems to have dropped all but GNU Arch/tla and Bazaar 1.x.



[notable]ArX (link)

ArX is Walter Landry's (no longer data-compatible) 2003 C++ reimplementation of the decentralised, networked Arch VCS specification, originally inspired in part by dissatisfaction with poor portability and other problems in Tom Lord's original shell-script-based Larch implementation. As an Arch offshoot, ArX is based on tracking deltas (changesets). Archives, patches, and revisions are cryptographically signed using gnupg signatures and SHA-256 hashes. ArX can use ftp, ssh, sftp, http, and http w/WebDAV transport. Internationalised but not localised. Centralised development can be done via a patch queue. Program ports to MacOS X, and in an "embryonic" fashion (2005-01-23) to Win32 (via Cygwin). On Linux, note (large) dependency on gnome-vfs. Beta as of 2005-04. Possibly moribund (2008).

Code is C++. Open Source (GNU GPL).



Archipel (link)

A prototype distributed system coded in Python by "Sébastien". Repository is stored in a queryable RDF database. Extensible/modifiable using plug-in delta modules for different versioning models. Version information is stored as SHA-1 hashes.

It is unclear at this point (2005-04) whether this software is publicly available in any form, let alone open source.



Barch (link)

Barch (for "binary arch" — now defunct) was Robert Collins's (early) attempt at a compatible C++ implementation of the Arch VCS specification — inspired in part by the need to transcend technical and portability limitations in Tom Lord's then-current shell-script-based implementation, Larch. The project appears to have stalled as of Collins's 0.0.6-DEVEL code released on 2002-12-29. However, more recently, Collins created Bazaar 1.x (now defunct).

Code is C++. Open source (GNU GPL).



Bazaar 1.x (baz) (link)

Bazaar 1.x (executable name "baz") was Robert Collins's fork for Canonical, Ltd. of Tom Lord's GNU Arch ("tla") C-language implementation of the Arch VCS specification.

It aimed to combine the essential features of GNU Arch ("tla") with user interface improvements and Win32 support: Subversion-like diff, switch, import, export, and log commands; a single merge command for merging between arbitrary branches; daily builds; internationalisation; Python bindings; Win32 binaries and an MS-Windows GUI; an annotate/blame/praise command; comprehensive documentation; and UI simplification. History was stored separately from the working directory.

As of 2005-08, Canonical decided to de-emphasise work on Bazaar 1.x, put that codebase in maintenance mode, and concentrate on Bazaar ("bzr"), formerly Bazaar 2, as the true long-term successor for GNU Arch/tla. Then, some time during 2006, Bazaar 1.x was discontinued entirely.

Code is C. Open source (GNU GPL).



[notable]Bazaar (bzr) (link)

Bazaar (executable name "bzr"), initially called Bazaar-NG, later called Bazaar 2, is Martin Pool's networked, fully distributed, changeset-oriented variant implementation / offshoot, coded in Python for Canonical, Ltd., based on the Arch VCS specification.

It aims to combine the best features of all the new VCSes (darcs, Subversion, GNU Arch/tla, Quilt, and BitKeeper) "into a single coherent and simple system", and has a simple CVS-like syntax for common operations like add, mv, diff, status, commit, log, merge, etc. Stores/checks SHA-1 hashes of all patches and hashes of the tree state in each revision. Hashes can optionally be signed. Stores changesets (deltas). Control files are stored inside the working directory tree. Changesets can be sent/received over e-mail, optionally gnupg-signed. Renames, deletions, binaries are versioned. Uses hashes for integrity checks, not for identity. Repository data are stored encoded in UTF-8. Partial trees and per-file histories are not supported.
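
To illustrate "hashes for integrity checks, not for identity", here is a minimal Python sketch. It is not bzr's actual code or storage format: the record layout, the revision id string, and the helper names sha1_hex and verify are all hypothetical. The point is only that the revision is named by an arbitrary id, while the SHA-1 digest is kept alongside purely as a checksum.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()

def verify(data: bytes, recorded_digest: str) -> bool:
    """True if the stored bytes still match the digest recorded at commit time."""
    return sha1_hex(data) == recorded_digest

# Hypothetical revision record: the id is an arbitrary name, and the hash
# is stored next to it only so that corruption can be detected later.
patch = b"--- a/hello.txt\n+++ b/hello.txt\n@@ -0,0 +1 @@\n+hello\n"
record = {"revision_id": "example-rev-1", "sha1": sha1_hex(patch)}

print(verify(patch, record["sha1"]))                 # intact data
print(verify(patch + b"tampered", record["sha1"]))   # corrupted data
```

A content-addressed design would instead use the digest itself as the revision's name.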

Though there is no dedicated network server program module, there is built-in support for "pull" synchronisation over a variety of network mechanisms; by default, there is both "push" and "pull" support over sftp transport only, and sftp is supported only if you install the bzrtools extensions (plus Python modules paramiko and pyCrypto).

Bazaar is also the successor to the former Bazaar 1.x ("baz") project, and all baz repositories should by now (2006) have been converted.

Change history is append-only. Tool lets you "cherry-pick" changes from one branch to another. As a further advance on Bazaar 1.x, each checkout is fully usable as a repository. At this date (2005-04), bzr is not yet feature-complete. Modular design: library + command-line client, making feasible use via library call from any other client. All development takes place on branches, which are the highest-level grouping. Forking of branches is by intention frequent and has been made easy. Three-way merge algorithm. Allows merges within an arbitrary graph. History-sensitive merges allow safe repeated merges, merges across renames, and mutual merges between parallel lines. Checkouts default to the last archive you pulled from (as with darcs and BitKeeper).
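
The three-way merge rule mentioned above can be sketched at whole-file granularity. This is a simplified illustration, not bzr's implementation (which merges within files and tracks history); the function name three_way_merge and the dict-of-paths tree representation are assumptions made for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two trees against their common ancestor.

    Each tree is a dict mapping path -> file contents; a missing path
    means the file does not exist in that tree.
    Returns (merged_tree, list_of_conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree (or neither changed)
        elif o == b:
            result = t          # only their side changed
        elif t == b:
            result = o          # only our side changed
        else:
            conflicts.append(path)  # both changed, differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts

base   = {"a": "1", "b": "1", "c": "1"}
ours   = {"a": "2", "b": "1", "c": "3"}
theirs = {"a": "1", "b": "1", "c": "4", "d": "new"}
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # {'a': '2', 'b': '1', 'd': 'new'}
print(conflicts)  # ['c']
```

The common ancestor is what lets the tool tell "we changed it" from "they changed it"; a plain two-way comparison of ours and theirs cannot make that distinction.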

Bazaar (v. 0.7) was one of three VCSes seriously considered for hosting the very large OpenSolaris Project, and was eliminated on grounds of slowness and high memory usage compared to git and (the winner) Mercurial.

Trent Buck published a comparison of darcs and Bazaar (bzr) in 2006.

Code is Python script: Tool should run anywhere that Python 2.4 and above runs (i.e., Unixes including MacOS X, Win32, etc.). Open source (GNU GPL).



[notable]BitKeeper, formerly BK/Pro, formerly BitKeeper (BitMover, Inc.) (link)

Networked, changeset-oriented system (executable name "bk"). Atomic commits. Supports renames. Supports file/directory copies that retain version history. Has some advanced merge methods, based on line identity. Uses time, rather than sequence, in some places. Repository, which uses weave storage, gets fully replicated onto each developer's system (multiple repositories / staging areas). History is stored with the working directory. Checked-out files must be marked before editing. Patches from others can retain their separate identity even after integration (are not collapsed/rolled up). Has default GUI. Emits lots of noise messages, and has lots of weird commands. Requires both per-file and per-changeset commands. Very space-efficient storage. Has fine-grained pre- and post-event triggers. Can remotely find status of a tree, e.g. parent, number of committers, versioned files, extras, modified, etc. Binary-only in recent versions. Was used by the majority of core Linux kernel developers until 2005-04, although some declined for licensing and other reasons. See: analysis of architectural advantages over competitors.

Created by talented software engineer Larry McVoy, based strongly on his earlier experience at Sun Microsystems creating NSElite and then TeamWare.

[Maintainer's personal comment: The obvious comparison to BitKeeper is IBM / Rational's very high-priced ClearCase tool, which I've used in the software industry. Having tried BitKeeper for a bit, I found it technically superior in every way — and it's dramatically cheaper. (On the other hand, 2005 has seen the rise of the remarkable, fast, stable open-source VCS "Mercurial", which should certainly be considered strongly, having much the same strengths except paid technical handholding.)

This document's entry for BitKeeper initially had several mis-statements of fact, unflattering to the company, that I'd picked up by repeating uncritically some Linux developers' on-line assertions. I regret those errors, and caution people to be skeptical of such claims.]



Bitsafe (Bitmanager-Media GmbH) (link)

Bitsafe is a networked VCS coded in Java (JRE/JDK in version 1.4.2 or later required) and back-ended into either Oracle RDBMS or SAP-DB for its repository.

Code is Java bytecode. Proprietary.



/BriefCase 3 Toolkit (Applied Computer Sciences, Inc.) (link)

/BriefCase 3 Toolkit is ACSi's enhanced networked variant on RCS wrapped inside a comprehensive development, release, and lifecycle toolkit. Network access is via rsh transport for coordination of client and server-side scripts, and NFS & pipes for data transport. NFS support has a well-developed locking structure. Administrative and private user tag mechanisms are provided. Client-side work areas can have multiple local replicas, to work on different releases and/or aspects of a project. Import tools are provided for SCCS, CVS, RCS and PVCS data.

Code is Korn shell and awk scripts. Open source (GNU GPL).



CBE (link)

CBE (Code Building Environment) is Thomas Neumann's VCS coded in pure Java with integrated software build functions. Functionality is roughly similar to that of CVS with some new features like renaming files (while still keeping the history) and using a database as backend (optional). Code is said to be alpha-stage (2005-04).

Code is Java. Open source (GNU GPL).



ChangeMan DS (Serena Software, Inc.) (link)

ChangeMan DS is a networked VCS integrated with build control, release management, a programmers' editor, software-distribution tools, and process control. It also integrates with SAP for packaged application management, and with various third-party IDEs. Serena Software is coy about pricing.

The product was formerly known as ChangeMan and before that as Diamond CM — because it was written by Diamond Optimum Systems, which was acquired by Serena Software. (Even earlier, during its origins as an HP-3000 / HP/UX tool, it was called VCS-UX.)

Binary-only. Proprietary.



ClearCase (IBM / Rational Software, Inc.) (link)

ClearCase is a networked VCS whose repository one accesses using a quasi-filesystem of its own design, with integration to external software-build tools and adding some of its own. Supports advanced 3-way merge, versioning of any object (including directories), parallel builds distributed over a network, and triggers for local site customising. Linux usage requires loading a proprietary kernel module, said to freeze up frequently (at least that was true around 2001, not confirmed recently), and compatible only with particular kernels: See the ClearCase Linux Installation Guide Whitepaper for compatibility details.

They're extremely coy about pricing; probably you have to haggle on site-wide terms with their sales team. One claim is that ClearCase licenses tend to cost around US $5000/seat plus 20% per year for support.

ClearCase's history goes back to 1984, when a team at Apollo Computer created DSEE (Domain Software Engineering Environment), a VCS/build system. At the time of HP's purchase of Apollo, the team left and formed Atria Software, which then merged with Pure Software to form PureAtria, which was then bought by Rational Software. In 2002-12, Rational Software was bought by IBM Corp.

Binary-only. Proprietary.



CM+ (Neuma Technology, Inc.) (link)

CM+ integrates networked VCS with process control, build control, configuration management, product management, document management, problem tracking, activity tracking, requirements tracking, and release control. Neuma Technology is coy about pricing.

Binary-only. Proprietary.



Configuration Management Version Control (IBM) (defunct)

Configuration Management Version Control (CMVC) was a network client-server VCS with integrated defect tracking, change management, and configuration management functions. Its actual file-versioning data storage was in Source Code Control System (SCCS) or PVCS. It was discontinued some time after IBM acquired Rational Software, makers of ClearCase.

Binary-only. Proprietary.



CMZ (CodeME S.A.R.L.) (link)

CMZ is a local-oriented (non-network-aware) system, supporting a wide range of VCS, coding, editing, and library-management functions. Linux binary is PPC-only (no x86).

Binary-only. Gratis non-commercial usage. Proprietary.



[notable]Codeville (link)

Codeville (executable name "cdv") is Bram and Ross Cohen's fully decentralised (networked) VCS with some advanced merge methods, based on line identity: two-way merge with history. (It allows you to update from or commit to any repository at any time, with no unnecessary re-merges. Supports offline commits: Supports a workflow model where the repository is centralized, but the working tree is used as a branch.)

Repository is stored in a binary BerkeleyDB database. History is stored in the form of changes from old hashed versions. Support for non-ASCII files and some metadata (e.g., execute bit) are still (2005-08) pending. SRP as authentication protocol, network access via (if I understand correctly) custom, built-in network transport to the "cdvserver" piece on TCP port 6601. File and directory renaming are supported. Code is now (2005-08) solid, with a small to-do list remaining. Still has nearly nil built-in documentation, and only a little more on the project Web site. Very well-designed but very unusual merge algorithm (aforementioned two-way merge with history).

Code is Python script (Python 2.3 and up, BerkeleyDB 4.1 and up). Open source (newer BSD licence).



Control-CS (Network Concepts, Inc.) (link)

Control-CS is a networked version-management system. (Earlier version was called Control.) Linux has server-end tool only; client-end software exists only for Win32.

Binary-only. Proprietary.



cscvs (1, 2) (link)

cscvs (now defunct) was a networked, decentralised VCS coded in Python, using CVS as a back-end storage repository, imposing on top of CVS atomic changeset semantics (and with easy migration to arch).

Code is Python script. Open source (BSD-style licence).



CSSC (Free Software Foundation) (link)

CSSC (Compatibly Stupid Source Control) is a simple local-oriented (non-network-aware) reimplementation of the old SCCS (Source Code Control System) system from early Unix (pre-RCS), mostly used for access to old repositories (being not recommended for new data repositories), and (for a while but reportedly not any more), with minor modification, to BitKeeper repositories. Uses weave storage, which makes it an excellent foundation for building advanced VCSs, despite CSSC/SCCS's antiquity. Fast, small, lightweight. Versions files (no changesets). Doesn't handle binary files. Does locking, uses a centralised (local) repository. No merging. Confusing command-line interface.
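
For readers unfamiliar with weave storage, a toy Python sketch follows. It is a deliberate simplification and not the SCCS file format: real SCCS files interleave control records marking inserted and deleted blocks, whereas here each line simply carries the set of versions that contain it. The names weave and extract are hypothetical.

```python
# Every line that ever existed is stored once, in file order, tagged with
# the set of versions in which it is present.
weave = [
    ("int main(void)",       {1, 2, 3}),
    ("{",                    {1, 2, 3}),
    ('    puts("hi");',      {1}),
    ('    puts("hello");',   {2, 3}),
    ("    return 0;",        {3}),
    ("}",                    {1, 2, 3}),
]

def extract(weave, version):
    """Reconstruct one version with a single pass over the weave."""
    return [text for text, versions in weave if version in versions]

for v in (1, 2, 3):
    print(f"--- version {v} ---")
    print("\n".join(extract(weave, v)))
```

Because every version is recovered in one pass, and each line records which versions it belongs to, a weave supports annotate-style queries and line-identity merges more naturally than a chain of deltas does.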

Code is C. Open source (GNU GPL).



CVS (link)

CVS (Concurrent Versions System) is the ubiquitous old-school default, but has serious flaws: No moves/renames, inefficient with binaries, no versioning of directories, no merge-history tracking (must be done manually through tagging), no atomic commits or retrievals (i.e., no atomic tree-wide operations), interacts badly with backup software, tends to leave stale locks, doesn't deal well with symlinks/special files, often gets into merging snarls, branching is often problematic and requires scrupulous attention to tagging, changes are tracked per-file instead of per-change, development is stagnant. Tagging and branching are expensive operations. No integrity checking: Prone to repository corruption. Derived from script wrappers by Prof. Dick Grune of the Free University of Amsterdam around the local-only RCS system (written in 1984-5 and rewritten in C two years later by Brian Berliner), it added concurrency control, annotations, and other enhancements. History is stored separately from the working directory (in the central repository), and can contain many modules. Working copies are separate and contain one module each. Able to transact data across network connections since 1994, making CVS the first practical network-capable VCS. Existing project history can be revised, but only through special mechanisms. Syncing of repositories is possible using add-on software CVSup.

Code is C. Open source (GNU GPL).

Maintainer's note: If you still use this thing, for heaven's sake upgrade to Subversion, or at least CVSNT.



CVSNT (link)

CVSNT started out to be an NT-only variant of CVS, but is now fully portable. It adds merge tracking, SSPI authentication, and a built-in copy of the PuTTY SSH code. Has per-branch ACLs, remote user administration using "cvs passwd" commands, a separate LockServer instead of filesystem-based locks, Unicode support, more-efficient storage of binary diffs, atomic checkouts, better handling of merges without tagging requirements, additional server triggers, etc. However, it retains many of CVS's disadvantages, including poor handling of renames, etc.

Code is C. Open source (GNU GPL).



[notable]darcs (link)

DARCS (David's Advanced Revision Control System), more often called "darcs", is David Roundy's CVS-replacement VCS, written in Haskell, handling all metadata, supporting fully decentralised repositories and advanced branching / patch-handling: Distributed merge is implemented via patch commutation. History is stored with the working directory. (Control files are stored inside the working directory tree.) Atomic commits. Supports renames. Existing project history can be revised (which is potentially a drawback: Interlopers can change what the history says). Patches from others can retain their separate identity even after integration (are not collapsed/rolled up), and darcs also has the so-far unique advantage of being able to track inter-patch dependencies, and thus is the canonical example (and, really, the pioneer) of the concept of "cherry-picking" of patches and groups thereof. Based on tracking changesets (deltas). Fully functional (but slower) Win32 port; also works on MacOS X. Attempts to do all work entirely in RAM. Simple to use; easy to learn. Mature tool with active user community. No crypto checksum on the tree; no crypto signatures except on the transport. Does not guarantee that any past revision can be reproduced. Can be extremely slow when resolving merge conflicts.
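
Patch commutation can be illustrated with a toy model in Python (used here for readability, although darcs itself is Haskell). This is not darcs's patch theory: the only patch type is "insert one line at an index", every pair of such patches is treated as commutable, and the names Insert, apply_patch, and commute are hypothetical. Commuting means reordering two patches, adjusting their positions, so that the end result is unchanged.

```python
from typing import List, NamedTuple, Tuple

class Insert(NamedTuple):
    pos: int    # index at which the new line is inserted
    text: str   # the line to insert

def apply_patch(patch: Insert, lines: List[str]) -> List[str]:
    """Return a new list with the patch's line inserted."""
    return lines[:patch.pos] + [patch.text] + lines[patch.pos:]

def commute(first: Insert, second: Insert) -> Tuple[Insert, Insert]:
    """Given `first` then `second`, return (second', first') such that
    applying second' and then first' yields the same file."""
    if second.pos <= first.pos:
        # second lands at or before first's line, so first shifts down by one
        return Insert(second.pos, second.text), Insert(first.pos + 1, first.text)
    # second lands after first's line, so without first it sits one line higher
    return Insert(second.pos - 1, second.text), Insert(first.pos, first.text)

base = ["a", "b", "c"]
p1, p2 = Insert(1, "X"), Insert(3, "Y")

original_order = apply_patch(p2, apply_patch(p1, base))
p2c, p1c = commute(p1, p2)
commuted_order = apply_patch(p1c, apply_patch(p2c, base))

print(original_order)  # ['a', 'X', 'b', 'Y', 'c']
print(commuted_order)  # ['a', 'X', 'b', 'Y', 'c']
```

Once the order can be swapped, the second patch can be pulled without the first, which is the essence of cherry-picking; patches that cannot be commuted are the ones with a dependency between them.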

The Haskell language is a rather obscure functional language, known to but few, and additionally is reported to characteristically suffer unpredictable but sometimes severe performance problems. Because of the language's obscurity, there have been relatively few third-party contributions to the codebase.

darcs also suffers from impenetrable, poorly designed error messages and diagnostics. Each repository can only hold a single branch of a single project: To create a new branch, you must create an entirely new repository.

As of 2005-08, darcs's development branch is now alternatively able to work with git repositories.

David A. Wheeler's 2004-03 SCM essay includes a brief description, and some thoughtful analysis, on darcs.

Trent Buck published a comparison of darcs and Bazaar (bzr) and some comments on darcs kludges in 2006.

Code is Haskell. Open source (GNU GPL).



DCVS (Distributed Concurrent Versions System) (elego Software Solutions GmbH) (link)

DCVS extends the CVS model, adding support for distributed repositories and local lines of development, using a variant of John D. Polstra's file distribution and synchronization program CVSup.

Code is C. Open source (GNU GPL).



Dimensions CM (Serena Software, Inc.) (link)

Dimensions CM is a VCS with revision control, change, build, and release management capabilities.

Binary-only. Proprietary.



Discipline 4GL (Saint Mavris Technology) (link)

Discipline 4GL is a networked VCS, software-development, build control, and release-control tool. Saint Mavris Technology is coy about pricing.

Binary-only. Proprietary.



Eva (link)

Eva (now defunct) was Federico Di Gregorio's compatible Python implementation of the decentralised, networked Arch VCS specification — inspired in part by the need to transcend technical and portability limitations in Tom Lord's then-current shell-script-based implementation, Larch. The project appears to have stalled as of Di Gregorio's 2003-04-13 snapshot code.

Code is Python script. Open source (GNU GPL).



FastCST (link)

FastCST (Fast Change Set Tool) is Zed Shaw's experimental, distributed, networked VCS, [re-]coded in Ruby, from Shaw's C original. (The Ruby version still lacks the original's revision control and encryption features, as of 2005-04.) Supports sending and receiving changesets via POP3 + SMTP; also works over http, http+ftp, or a built-in "serve" command (http access on port 3040). Merge command is (a/o 2005-04) only partially working: Basic merge is implemented, but without conflict resolution.

Code is Ruby. Open source (GNU GPL).



Fossil (link)

Fossil is a distributed version control system, bug tracking system, and wiki/blogging software server, created by D. Richard Hipp, author of SQLite. Atomic transactions, content stored in a SQLite database. Doesn't appear to support moving/renaming files. Practical size limit for checked-in files is about 10MB. No real support for patch management. Some functionality is available from the command line, some only via the embedded Web server.

Code is C. Open source (2-clause BSD).



[notable] git and related "porcelains": (link)



GNU Arch 1.x ("tla") (link)

GNU Arch 1.x (executable name "tla", for Tom Lord's Arch) is Tom Lord's second (compatible) implementation, this time in C, of his decentralised, networked Arch VCS specification. It is now (2006) maintained by Andy Tai. Very complex, arguably overfeatured system. Does star-merge. Patches from others can be collapsed/rolled up upon integration so as to lose their separate identities. Atomic commits. Supports renames. Commits are gnupg-signed. Parseable/scriptable shell interface; uses tar, gzip, and patch. Changelogs are autogenerated. Uses weird, often problematic filenames featuring (e.g.) leading "++", ",,", "=", and "{ ... }" sequences. Uses inode signatures to detect file modifications, leading to some false alarms, e.g., when copying a tree. Uses MD5 hashing (insecure) for archive verification; will eventually move to SHA-1: Only patches (not complete revisions) are signed. Uses sequence, rather than time, to determine precedence of changes. Can use ftp, sftp, WebDAV, and plain http transport. Relies on ssh for remote access, authentication, and confidentiality. Changesets can be also conveyed via e-mail. A dedicated arch-server component isn't really needed, but has been prototyped, or alternatively one could use Colin Walters's Python-based archd software and matching protocol (using TCP port 2420).
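
The false alarms from inode signatures can be reproduced with a short Python sketch. It is an assumption-laden illustration, not tla's code: the exact fields tla records are not stated here, so the signature below (inode number, size, modification time) and the helper names are hypothetical. It shows why copying a tree makes unchanged files look modified.

```python
import hashlib
import os
import shutil
import tempfile

def inode_signature(path):
    """Cheap change detector built from stat() metadata only."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, int(st.st_mtime))

def content_digest(path):
    """Expensive but reliable: hash the file's actual bytes."""
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "file.txt")
    copy = os.path.join(tmp, "copy.txt")
    with open(original, "w") as fh:
        fh.write("unchanged contents\n")
    shutil.copy2(original, copy)  # preserves mtime, but the copy gets a new inode

    same_content = content_digest(original) == content_digest(copy)
    same_signature = inode_signature(original) == inode_signature(copy)

print("contents identical:  ", same_content)
print("signatures identical:", same_signature)
```

The trade-off is speed: stat() metadata can be compared without reading any file data, at the cost of mismatches whenever inodes change but contents do not.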

Interested parties will find Nick Moffitt's concise and focussed "Arch for CVS Users" tutorial an excellent place to start.

David A. Wheeler's SCM essay points out some of GNU Arch's current (2004-03) problems: (1) The Win32 port is currently missing quite a few features (symlinks, most file permissions, correct handling of newlines), and in general GNU Arch may never be fully functional on non-POSIX systems. (2) The repository's file-naming conventions use very badly chosen special characters that tend to break vi, more, the C shell, various scripting languages, and other primary tools. (3) Automated cache management is missing (making the program slow by default), and is badly needed. (4) Merging breaks if branches aren't either all commit-based or all tag-based, but the tool doesn't enforce that limitation. (5) "mv" and "move" do very different things, and in general the full command set is needlessly complex.

Wheeler considered almost all of these to be short-term problems.

Lord had released, working entirely by himself, three development betas of a GNU Arch 2.0 redesign (executable name "revc") incorporating ideas from Bazaar ("bzr"), git, and Monotone, including a belated switch from MD5 to SHA-1 checksums. However, my guess is that the 2.0 effort (at minimum) is now (2005-08) defunct.

The GNU Project adopted tla as "GNU Arch" in 2003-07.

Code is C. Open source (GNU GPL).

Maintainer's personal note: In my opinion, GNU Arch's implementation flaws are sufficiently grievous that people considering its use should hasten to substitute Canonical's compatible replacement, Bazaar ("bzr"), in its place.

It appears that I'm not alone in this perception: On 2005-08-15, Tom Lord announced that he was resigning effective immediately as GNU Arch maintainer, endorsed Bazaar 1.x ("baz") as an immediate direct replacement, and expressed hope that Bazaar ("bzr") would eventually take its place, in turn. The main GNU Arch developers other than Lord had already left that project, by that point, and had become Bazaar 1.x coders.

2005-10 update: Developer Andy Tai has taken over the GNU Arch project lead, and it has resumed.



JRMS