Although I like a lot of the features that darcs offers, I can’t see it working out as a tool for me in the long run. This is the first of a pair of entries on the obstacles to my adoption of darcs.
I want to use a version control tool that will work just as well for me at home as it does at work. That means it needs to scale, not just in terms of its technical capabilities but also in terms of the development model that it presupposes.
The distributed development model is substantially different from all the preceding forms of centralized repository use. I think it is quite a bit more sophisticated but not always needed (YAGNI).
Indeed I have the feeling that the “natural evolution” of the usage of source control is that we begin in the first age: no control at all. Every change to a file is blindly saved over the previous version.
The next step is copying. At this stage we make a million copies of files. Do you have folders called Backup or Archives? If you do, you may be in the copying stage. This stage is notorious for requiring large amounts of disk space (“delete it? But I might need it! I’ll just make a copy here in my Saved Backups folder.”).
Moving on, we come to versioned “in-place” systems like RCS or darcs without networking. All content is placed under version control, and the database is either distributed throughout the content (like RCS) or stored centrally but local to the content (like darcs). Many groups of users never reach this stage. Authors, artists, and scientists who have worked with computers for over a decade still do not grasp the most basic versioning concepts (although they are more than happy to create a million copies of each document). In many cases this is because the models provided did not suit their methods of working. The only barrier to using the “in-place” versioning systems is the set of commands required to explicitly manage the versions, since the workflow of the development process remains essentially the same.
In contrast, in the next stage we move to centralized development models, where we work together by synchronizing our work through a central repository. Each of us has our own working copy. This is a very confusing model for a lot of people. It is further complicated by the fact that it usually uses branching and merging as a first-class content management strategy. The earlier “in-place” versioning treated branching and merging more like working on a copy of the content than like actual parallel development. At this stage we also begin to take serious notice of changesets and tags and to manage them as part of our strategy.
The final step (or rather, the step beyond which I do not understand source control) is that we each get our own repository, and all we do is send each other the differences between our repository and those of the other developers. This is the most complicated model of all, since developers need to coordinate the exchange of patches accurately: there is no central repository with which they can all synchronize. An organization similar to the Linux kernel development team would be one way of managing that exchange.
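To make that coordination problem a bit more concrete, here is a toy sketch in Python. It is not how darcs works internally: the `Repo` class, the patch names, and the three developers are all made up for illustration, and a repository is reduced to a bare set of patch names, ignoring patch contents, ordering, and dependencies entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """A toy repository: an owner and the set of patch names it holds.

    Hypothetical model for illustration only; real tools also track
    patch contents, ordering, and dependencies between patches.
    """
    owner: str
    patches: set = field(default_factory=set)

    def record(self, patch: str) -> None:
        """Record a new local patch."""
        self.patches.add(patch)

    def missing_from(self, other: "Repo") -> set:
        """Patches that `other` has and this repository lacks."""
        return other.patches - self.patches

    def pull(self, other: "Repo") -> set:
        """Copy over whatever `other` has that we lack; return what arrived."""
        incoming = self.missing_from(other)
        self.patches |= incoming
        return incoming


# Three developers, each with their own repository and one local patch.
alice = Repo("alice")
bob = Repo("bob")
carol = Repo("carol")

alice.record("fix-typo")
bob.record("add-parser")
carol.record("update-docs")

# Patches travel only when somebody arranges an exchange.
bob_got = bob.pull(alice)      # bob now also has alice's patch
carol_got = carol.pull(bob)    # carol now has all three patches

# Nobody has pulled into alice's repository, so she is still behind.
alice_in_sync = alice.patches == carol.patches
```

After those two pulls carol holds everything, bob is missing carol's patch, and alice still has only her own. With a central repository each of them would simply synchronize against the same place; here somebody has to notice who is behind and arrange yet another exchange, which is exactly the up-front cost I am worried about.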
Each step along this path of evolution provides greater benefits: finer-grained control over content recovery in the event of disaster, faster recovery, simplified backup routines, etc. But each also incurs a greater cost up-front. I think that the benefits of going full-on distributed will remain less than the costs in my day-to-day work. Thus, if I want to have an SCM tool I use both at home and at work, I’ll need to really work hard to figure out whether the use of darcs is worthwhile!
