Guidelines for the use of CVS

$dotat: doc/web/writing/cvs-guidelines.html,v 1.10 2003/03/02 19:33:04 fanf2 Exp $

0. Introduction

This document explains some guidelines for using CVS that I wrote for use within Demon Internet which I believe are also useful in the wider world. It assumes that you already have some basic knowledge of CVS; if you need tutorial material then you should look at Pascal Molli's CVS page and in particular Per Cederqvist's manual.

When it is used effectively, CVS is a valuable tool for developers because it records the history of changes to the files stored in the repository. Therefore, many of these guidelines aim to make the history as useful as possible, both to the original developer and to people who maintain the code later.

CVS is not forgiving of mistakes: once a change has been made to the repository it usually cannot be reversed. Therefore, some of these guidelines explain how to perform various tasks in a way that experienced CVS users consider to be correct.

2. Basic points

This section describes things that are generally applicable when using CVS; guidelines that are more specific to particular tasks or uses are described in the other sections.

2.1. When to check in

Check in early, check in often. When you have made a change that works, check it in. Check in separate changes in separate commits (as much as possible). Don't be shy about checking in work-in-progress, so long as it is minimally functional, or at least compilable without errors.

2.2. Commit messages

Use meaningful commit messages. Explain what bug the commit fixes, or what features it adds. Don't be too concise: "fixed typo" is too short; "fixed typo in error message" or "fixed typo in function name" is OK. The aim is to make it easy to find the desired change from just the commit messages (e.g. as presented by cvsweb).

The converse of this is including too much information. CVS automatically maintains information like the date and time of the commit, who made the commit, what code was changed, etc. You don't need to include it in the commit message yourself.

2.3. Using tags

If in doubt, lay down a tag. Tags are useful for pinning down a particular version of the code, e.g. one that is being run in service, or just before a big change or import. They are also used to identify branches. Tag names should be short and meaningful, like variable names. For example, webmail-19990811, pre-new-resolver, fanf-patches, corresponding to the uses mentioned above. Tags should be commented in the modules file.
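
As a minimal sketch of the dated style of tag name shown above: the helper name dated_tag is hypothetical, and the only assumption is that a date in YYYYMMDD form is wanted.

```shell
# dated_tag PREFIX: print a tag name such as webmail-19990811,
# built from PREFIX and today's date.
dated_tag() {
    printf '%s-%s\n' "$1" "$(date +%Y%m%d)"
}

# Possible use, from the top of a working copy:
#   cvs tag "$(dated_tag webmail)"
```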

2.4. The modules file

Comment the modules file. It defines the modules in the repository, which in the simplest case are just aliases for directories in the repository. CVS can also combine several directories that together form a module. For each module in the file there should be a comment describing the contents of the module, when it was created and who by, and a description of the tags and branches used by the module. (Tags don't get commit messages of their own, hence the latter requirement.)

3. Code

Most of the guidelines in this section are common sense, but it is worthwhile reiterating them in the context of CVS because using CVS has implications that might not be immediately obvious.

3.1. Never reformat code

Never, ever reformat code. This is a really bad thing to do because it makes diffs hard to understand and apply. Upstream authors won't accept patches against reformatted code. Bugfixes and patches against the upstream code won't apply. New versions of the upstream code can't be imported. Real changes get hidden in the mass of reformatting.

No-one's favourite coding style is significantly better or worse than anyone else's, so reformatting code provides no advantage to offset the disadvantages.

3.2. Format code consistently

Use the same coding style as the code you are editing. This is a corollary to the previous subsection. It is easier for people reading the code if it uses consistent layout rules throughout, so when you are editing someone else's code the code you add should be in the same style.

3.3. Tab settings

Tabs are eight characters wide. This is also a corollary to the previous subsections. Although indentation sizes vary greatly, tabs are almost universally eight characters, so using a different setting is liable to cause confusion or even reformatting. A four character tab might suit your indentation style, but the rest of the world will think your code is a mess.

3.4. Comments

Commit messages are not a substitute for comments, or vice versa. Comments should describe data structures and why the code does what it does; commit messages should explain why code was changed.

3.5. CVS ident strings

Include CVS $Header$ strings in your code. This makes it easier for people to know which version of a file they have and where it came from, so that they can usefully refer to the file's CVS history to find out about bugs and fixes, etc. C source code files should each have an external declaration near the top like static const char *const cvsid = "$Header$"; so that the version information is included in the compiled binary, and C header files and scripts should have the $Header$ string in a comment near the top.

If your repository is configured appropriately, use the custom tag instead of $Header$. See section 6.6 on custom tags for more details.
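
The following is a rough sketch of how one might check a set of files for a missing ident string. It shows a script carrying the keyword in a comment near the top, and a hypothetical helper, missing_ident, that lists files containing no $Header$ keyword. It assumes the standard keyword; for a custom tag the pattern would need changing.

```shell
#!/bin/sh
# $Header$

# missing_ident FILE...: print the name of each FILE that contains
# neither the unexpanded keyword nor an expanded one.
missing_ident() {
    for f in "$@"; do
        # Match "$Header" followed by "$" (unexpanded) or ":" (expanded).
        grep -q '\$Header[$:]' "$f" || printf '%s\n' "$f"
    done
}
```

For example, missing_ident *.c *.h prints the C files in the current directory that still need an ident string.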

4. Documentation

This section is similar in intent to the previous one.

4.1. Try not to reformat documentation

The reasons for this guideline are similar to the rule about not reformatting code, but documentation is easier to read if paragraphs are wrapped properly, etc. Therefore, when you edit a paragraph, it's a good idea to re-wrap it, but don't gratuitously change the rest of the document.

4.2. CVS ident strings

Documentation should also include the CVS $Header$ string or appropriate custom tag near the top.

5. Importing code

Importing code is reasonably simple, but care must be taken because a careless import can make a mess of the repository which may be really hard to fix.

5.1. Importing local code

The procedure is as follows:

Choose a location in the repository, $loc. This may be either in your own area under a directory named by your username, or in a directory used to keep software related to a given service or function together. Try to keep the repository tidy.
Choose a vendor tag $v and a release tag $r. The vendor tag can be either your company name or your username; the release tag can be something like "start" or "initial".
If this is a new project without existing files, then create an initial empty directory structure on your workstation. If not, why didn't you import it earlier?
In the top directory of your project type cvs import $loc $v $r (filling in the variables with the appropriate values) and then enter an appropriate commit message, e.g. "initial import of my foo program which bars customers".
Change to the next directory up, move the original project to a place where the checkout won't interfere with it, and "cvs checkout" the CVSed version of the project. If all is well you should now have two identical copies of your project, modulo CVS directories, etc. The old copy can be deleted, and the new version becomes your working copy.
Add an entry for your project to the modules file, unless it's a new part of a bigger project.
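
The import and checkout steps above can be sketched roughly as follows. This is an illustration, not a definitive script: the function names and the CVS_CMD variable are hypothetical, and by default the commands are only printed (a dry run) so that nothing touches the repository by accident.

```shell
# Dry run by default: "echo cvs" prints each command instead of running it.
# Set CVS_CMD=cvs in the environment to run the commands for real.
CVS_CMD="${CVS_CMD:-echo cvs}"

# import_project LOC VENDOR RELEASE
# Run from the top directory of the project being imported.
import_project() {
    $CVS_CMD import "$1" "$2" "$3"
}

# checkout_project LOC
# Run from the parent directory, after moving the original project
# out of the way (for example: mv foo foo.orig).
checkout_project() {
    $CVS_CMD checkout "$1"
}
```

With hypothetical values, import_project fanf/foo fanf start prints "cvs import fanf/foo fanf start", which makes it easy to check the arguments carefully before running the real command.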

5.2. Importing upstream code

The procedure here is basically the same as the one described in the previous section, but you must consider the following points:

Beware upstream code that came from a CVS repository itself. You will probably want to examine any .cvsignore files since they will usually list generated files such as configure scripts which are part of the release tarball but which are not wanted in the upstream CVS repository. You probably want to import everything in the release tarball, so find . -name .cvsignore | xargs rm is usually the thing to do.
The vendor tag should be the vendor's real name, e.g. "ISC" for the distributors of bind and inn, etc.
The release tag should be the name of the software and the version number; note that hyphens and dots should be replaced by underscores. E.g. "bind_8_2_1" or "inn_2_2".
The tags should be documented in the modules file.
The "cvs import" command is performed in the top directory of the unpacked upstream source tarball. Sometimes software comes in separate tarballs (e.g. source and documentation) and these should be unpacked into their own directories under a new top directory.
The commit message should also mention where the software came from, e.g. a URL like <ftp://ftp.isc.org/isc/bind/src/8.2.1>.
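
Two of the points above lend themselves to small helpers. The sketch below is only illustrative, and both function names are hypothetical: release_tag turns a name and version number into a release tag by replacing hyphens and dots with underscores, and remove_cvsignore deletes the .cvsignore files below a directory (examine them first if you are unsure what they list).

```shell
# release_tag NAME-VERSION: replace hyphens and dots with underscores.
release_tag() {
    printf '%s\n' "$1" | sed 's/[-.]/_/g'
}

# remove_cvsignore DIR: delete every .cvsignore file below DIR so that
# the import does not skip the generated files they list.
remove_cvsignore() {
    find "$1" -name .cvsignore -type f -exec rm -f {} +
}
```

For example, release_tag bind-8.2.1 prints bind_8_2_1.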

5.3. Updating upstream code

Again the procedure is similar, but there are a couple of steps that must be added before and after the main procedure:

Before importing the new upstream source, tag the locally modified version: in the top of your working tree for the project type e.g. "cvs tag bind_8_2_1_local" using the previous version number. Alternatively you can use a tag like before_bind_8_2_2. This makes it easy to retrieve this version of the code in the future. Ensure the tag is documented in the modules file.
Import the new upstream version as above. The tarball is unpacked into a new directory tree and imported from there. The vendor tag must be the same as before, the release tag should reflect the new version number, and the commit message needn't mention the distribution site unless it has changed.
After importing you will probably have to resolve conflicts; most of the ones created by the import can be resolved by CVS automatically, but there may be conflicts caused by local modifications that must be resolved manually. CVS will tell you the command to run to resolve the conflicts; as before care should be taken to avoid mixing up the pristine upstream source, your old working directory, and the newly checked out source, by moving directories that may be overwritten out of the way.
After CVS has resolved what conflicts it can, fix any remaining ones. They can be found in the code marked with lines containing "<<<<", "====", ">>>>". Having done this, check in the updated code. A simple commit message like "resolve import conflicts" is fine.
If you used the before_ style of tag in the first step above then you might also want to add a post-import tag at this point, e.g. after_bind_8_2_2.
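
To find conflicts that still need fixing by hand, something like the following sketch may help. The function name find_conflicts is hypothetical, and it assumes the usual seven-character markers at the start of a line. A line of equals signs can also appear legitimately (e.g. as an underline in documentation), so treat the output as a list of candidates.

```shell
# find_conflicts DIR: print file:line:text for every line below DIR that
# starts with a conflict marker, skipping CVS administrative directories.
find_conflicts() {
    # /dev/null is passed as an extra file so that grep always prints
    # file names, even when find hands it a single file.
    find "$1" -name CVS -prune -o -type f -exec \
        grep -n -E '^(<<<<<<<|=======|>>>>>>>)' /dev/null {} +
}
```

Running find_conflicts . in the top of the working tree before checking in shows whether any markers were left behind.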

6. Handling tricky situations

Because of limitations in CVS, certain tasks are inherently difficult, particularly recovering from mistakes. Although changing the repository directly is nearly always a Really Bad Idea, sometimes it cannot be avoided. These guidelines explain what to do in these situations.

6.1. Creating directories

Use `cvs import` to create new top-level directories, i.e. follow the relevant parts of section 5.1 to add a directory to the repository. Subdirectories of existing directories can be added by creating them in your working directory and then using cvs add; the directory will be created immediately, so you don't need to do a cvs commit as well.

6.2. "Whoops! I checked in the wrong thing!"

Once a change has been committed you cannot un-commit it. You have to reverse the change and check in a new revision with the old code.

Sometimes you might have a number of changes in your working copy which should be committed separately but accidentally get committed all at once with a commit message that's only appropriate to one of the changes. The safe thing to do is revert the inadvertent commits then re-commit them with the right message; editing the repository directly is possible but foolishly dangerous.
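
One common way to reverse a committed change is to ask cvs update to merge the difference backwards with two -j options. The sketch below assumes that approach; the function name and the CVS_CMD variable are hypothetical, and the command is only printed by default.

```shell
# Dry run by default: "echo cvs" prints the command instead of running it.
# Set CVS_CMD=cvs in the environment to run it for real.
CVS_CMD="${CVS_CMD:-echo cvs}"

# revert_change FILE BAD GOOD: merge the reverse of the change between
# revision GOOD and revision BAD into the working copy of FILE.
revert_change() {
    $CVS_CMD update -j "$2" -j "$3" "$1"
}
```

For example, revert_change foo.c 1.5 1.4 prints "cvs update -j 1.5 -j 1.4 foo.c". After running the real command, check the result and commit it with a message that says which change was reversed and why.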

6.3. "Whoops! I cocked up a `cvs import`!"

Getting an import right is important because it affects the long-term usefulness of the repository. Check import commands particularly carefully before running them!

If you do make a mistake, the solution depends on exactly what went wrong. You might have run the command in the wrong working directory, or you might have used the wrong repository path, etc. The important point is whether the imported files coincide with files in the repository or not.

If none of the files in the erroneous import have the same name as an existing file in the repository (e.g. they all ended up in a completely new directory) then you can simply remove the files with the appropriate rm command in the repository.
If the import is OK apart from an incorrect tag, the tag can probably be deleted and re-applied correctly without too much pain. (This may not be true for a misspelled vendor branch tag.)
If there is a filename clash with an unrelated file, then there's a fairly serious problem. Find a CVS guru and help him or her to fix the repository manually. You won't be popular.

6.4. Renaming files

There is one situation where the best practice requires changing the repository manually, and that is when moving a file. The aim is to keep the full history with the file in its new location, but still allow old checkouts to work as expected. The procedure is:

Log in to the CVS server and copy the appropriate ",v" file from the old location to the new location.
In your working copy of the code do a cvs update; you will now have two copies of the file in the old and new locations.
Delete the file from its old location, cvs rm it and check in the change. It'll move into the Attic in the repository.
Delete all the tags from the new version of the file with cvs tag -d. This allows checkouts of old tagged versions of the module to work without introducing spurious files. Checkouts based on dates may still not work quite right, but they shouldn't be necessary if the module has been tagged properly.
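
For the last step you need the list of tags on the file. A possible sketch, assuming the header printed by cvs log -h lists tags as indented "name: revision" lines under a "symbolic names:" heading; the function name list_tags is hypothetical.

```shell
# list_tags: read "cvs log -h FILE" output on standard input and print
# the tag names from its "symbolic names:" section, one per line.
list_tags() {
    awk '
        /^symbolic names:/   { in_tags = 1; next }
        in_tags && /^[ \t]+/ { sub(/^[ \t]+/, ""); sub(/:.*/, ""); print; next }
                             { in_tags = 0 }
    '
}

# Possible use on the copy of the file in its new location
# (new/name.c is a placeholder path):
#   cvs log -h new/name.c | list_tags | while read -r tag; do
#       cvs tag -d "$tag" new/name.c
#   done
```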

6.5. Undeleting files

If you have removed a file from recent versions of the source tree but decide that it needs to be restored, then you can use the following procedure. It is just an elaboration on the theme of cvs add $file; cvs ci $file.

Find the penultimate revision of the file by using cvs status $file and subtracting one from the revision number.
Retrieve the last version of the file by using cvs up -p -r $rev $file > $file.
Edit the file, if necessary.
Re-add the file to the repository and check it in, with cvs add $file; cvs ci $file.
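
The "subtract one" step can be sketched as a small helper. The function name previous_revision is hypothetical, and the sketch assumes the last component of the revision number is greater than one.

```shell
# previous_revision REV: print REV with its last component reduced by one,
# e.g. the revision just before the one created by cvs rm.
previous_revision() {
    prefix=${1%.*}     # everything before the last dot
    last=${1##*.}      # the last component
    printf '%s.%s\n' "$prefix" "$((last - 1))"
}

# Possible use, with the revision number reported by cvs status:
#   rev=$(previous_revision 1.5)
#   cvs up -p -r "$rev" "$file" > "$file"
```

For example, previous_revision 1.5 prints 1.4, and previous_revision 1.2.2.3 prints 1.2.2.2.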

6.6. Custom Tags

In a repository that includes third-party software on vendor branches, it is helpful to configure CVS to use a custom tag instead of the standard $Id$ or $Header$ tags. Examples from real projects include $Xorg$, $XFree86$, $FreeBSD$, $NetBSD$, $OpenBSD$, and my own $dotat$. The advantage of this is that you can include your local version information in a file (as a custom tag) without disrupting the upstream version information (which may be a different custom tag or a standard tag). Expansion of all tags except the custom tag is disabled.

In order to configure this, you need a patched version of CVS such as the one that comes with FreeBSD or Debian. (The NetBSD and OpenBSD versions have similar functionality, but this procedure will not work with them.)

Choose a name for your custom tag, e.g. the name of your organization. In this example I'll use "dotat".
Check out your repository's CVSROOT directory.

Create a file in the CVSROOT called options with the following contents:

	# Local CVS options:
	# Add a "dotat" keyword and restrict keyword expansion
	#
	# $dotat: doc/web/writing/cvs-guidelines.html,v 1.10 2003/03/02 19:33:04 fanf2 Exp $
	
	tag=dotat=CVSHeader
	tagexpand=idotat

Check in the options file with cvs add options; cvs commit options as usual.
Now you can use your custom tag, and other tags will not be expanded.

The tag option says that $dotat$ is a synonym for $CVSHeader$ (which is like the standard $Header$ tag but with the CVS root stripped off). The tagexpand option specifies which tags should be expanded. If the argument starts with i then only the keywords that are listed (separated with commas) are expanded. If the argument starts with e then all keywords except those listed are expanded.