CVS (Concurrent Versions System) is the standard source code control system used in unix and other operating systems. This is a quick primer about what source code control is and how CVS works. It's recommended to anyone doing any sort of group project, or even a large individual project.
Have you ever done a programming or writing assignment in a group? Was it really annoying to keep track of who was doing what? I thought so. :-)
Typically, when you try to do a project in a group, you manage your code the ad-hoc way, either by making all the files shared and trying not to edit over each other, or making a lot of copies of the files and then pulling your hair out over who edited what file when. Either way, it's really annoying. There must be a better solution...
That's what source code and revision control is for. The idea is to make it as painless as possible for many people to work on a set of documents (specifically, code) at once, without getting in each other's way, and to keep a record of all the changes that were made so you can easily keep up to date on the code, and revert any bad changes back to the way they were.
What does CVS allow you to do? Well, you set up a shared repository of the source code, which holds your files and every change that was ever made to them. Each person can then edit the files simultaneously, without having to worry about overwriting someone else's precious changes. Merging code from different people is a breeze. Logging changes is automatic. It's great!
The main idea of CVS is that your group has a central repository of source code. The "official" copy of your project is always stored here. It contains all your source code, as well as any support files etc. Plus, it remembers all the changes that are ever made to the files (even if they're deleted). Basically, this is where the real copy of your project lives.
But everyone can't edit the repository, right? Someone would quickly delete all the files in it or do some other careless thing. You know how groups are. The solution: Nobody actually works in the repository. Each person checks out a local copy of the source code for their own use. This is basically a scratch copy - it belongs to you, not to your team-mates. You can delete files in it, write really terrible code, anything you want. It's not the "official" version, just a working copy. It's here that you make all your changes, build the code, etc.
So, you're editing your code in the working copy. What do you do when you have some code that you think looks marginally presentable (meaning, it compiles, or possibly even works!)? Well, you want to commit that code back into the repository. When you commit your code, all the changes you made to it from the "official" version are merged back into the repository's version, and stored. These are the changes that get remembered. All that really terrible code you wrote back when you were trying to get everything to compile was just some scratch junk of your own. CVS isn't evil. It doesn't record every single bug you create. Just the ones you let other people see. It does nice things for you too. For instance, it slaps a new version number on the file for you, and can even update a change log, modification dates, all sorts of things.
So, what if someone else modified the file first? Well, that's what CVS is really for. If someone else modifies the file first, it automatically merges the changes together! As long as they're not in exactly the same place in the file, it'll do a really good job of keeping track of everything. And even if you both edit the file in the SAME place, it will refuse to actually lose information. It'll just put both copies in the file, surrounded by copious syntax errors, and tell you about the problem so you can determine whose code is right.
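To make the merging idea a bit more concrete, here is a small sketch using diff3, the three-way merge tool from GNU diffutils. It is only an illustration of the kind of merge described above, not a look inside CVS itself, and the file names and contents are invented for the demonstration.

```shell
# Three-way merge sketch: two people change different lines of the same file.
# All file names and contents here are made up for the demonstration.
work=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$work/base"    # the version both people started from
printf 'ONE\ntwo\nthree\nfour\nfive\n' > "$work/mine"    # you edited the first line
printf 'one\ntwo\nthree\nfour\nFIVE\n' > "$work/theirs"  # a team-mate edited the last line

# diff3 -m prints the merged file; exit status 0 means there were no conflicts.
merged=$(diff3 -m "$work/mine" "$work/base" "$work/theirs")
status=$?
printf '%s\n' "$merged"
rm -rf "$work"
```

Since the two edits are in different places, the merged result should carry both of them. If both people had changed the same line, diff3 would report a conflict instead of picking a winner.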
Ok, let's get started with CVS for real. I'll assume you're using bash in this tutorial. If you're using tcsh, you should be able to translate the environment variables.
CVS is a command line utility in unix. There are also ports to some other operating systems available somewhere or other. It works most simply on the local machine, but it's also capable of working over a network.
Basically, all cvs commands are just arguments to cvs. Let's do an example:
export CVSROOT=/homes/iws/joe/cvs
cvs checkout project1
The first line there tells CVS where the source code repository is. In this case, it looks like it's in someone named Joe's home directory, in "cvs".
The second line tells cvs that you'd like to check out a copy of the first project, presumably because it's due tomorrow at midnight and you just noticed that it's 10:00pm. What you'll see is something like:
[elladan@tandu school]$ cvs checkout project1
cvs checkout: Updating project1
U project1/Makefile
U project1/bst.c
U project1/bst.h
U project1/main.c
Ok. Now you have your very own copy of the first project. It'll be in a new directory, named, surprisingly enough, project1. Inside this directory, you'll see your files, and another directory, CVS. In fact, there will be a CVS directory in every directory of your project. CVS uses it to store information about the files that were checked out. For instance, what version they are, where they came from, etc.
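If you're curious what's in that CVS directory, here is a small sketch that reads two of its bookkeeping files: CVS/Root (where the repository is) and CVS/Repository (which directory in the repository this one came from). The function name show_cvs_origin is made up for this example.

```shell
# Print where a checked-out directory came from, using CVS's bookkeeping files.
# show_cvs_origin is a hypothetical helper name, not a cvs command.
show_cvs_origin() {
    dir=${1:-.}
    if [ ! -d "$dir/CVS" ]; then
        echo "$dir is not a CVS working directory" >&2
        return 1
    fi
    printf 'root: %s\n' "$(cat "$dir/CVS/Root")"
    printf 'path: %s\n' "$(cat "$dir/CVS/Repository")"
}
```

You'd run it as show_cvs_origin project1. These files are for looking at, not for editing by hand.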
So, you quickly hack off some important feature, and you're ready to commit it back to the archive. Not only that, but it's only 6am now, still time to get 1 hour of sleep. What now?
The first thing you do before a commit is, strangely enough, update your working space. Updating means that any changes made by the other people in your group since you checked out the project will now be merged into your code. So, your code will be the very latest copy now, since it'll have everyone else's changes, plus the changes you just made yourself.
In case you're wondering why you have to do this, it's because you're responsible for the code you just wrote working with everyone else's when you check it in. (But... Why are YOU responsible when they broke your code? It's no fair! - Well, actually it is fair. They made sure their code worked [ahem] and checked it in before you did. Thus, it's all up to you now!) So, CVS will merge everything in, and then you get to make sure it all works. Let's see the update:
[elladan@tandu project1]$ cvs update
cvs update: Updating .
RCS file: /homes/iws/joe/cvs/project1/bst.h,v
retrieving revision 1.1
retrieving revision 1.2
Merging differences between 1.1 and 1.2 into bst.h
M bst.h
U main.c
U README
Whew! It updated without any merge conflicts! That means that nobody else messed with your code, so in theory, it should work. Of course, there were those merging lines... And some of those U's (well, in this case, one - the README) don't look familiar. New files? So, you build the code, and of course there are 741 compile errors. You deftly fix them all by 7am (argh! No sleep now!) and you're ready to commit. No, not to a long-term relationship! CVS won't help you with that. The source code! How might you do it, though?
cvs commit
That was really hard. You forgot that "commit" has 2 m's. You must really be getting sleepy! Anyway, an editor window popped up, asking for a log message. You quickly type in some half-delirious comment about the code being great, and pink bunnies hopping on your keyboard, and quit out of it.
[elladan@tandu project1]$ cvs commit
cvs commit: Examining .
Checking in main.c;
/homes/iws/joe/cvs/project1/main.c,v <-- main.c
new revision: 1.2; previous revision: 1.1
done
You're done! You're done! Well, you hope...
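If you'd like to peek at what an update is going to do before you let it loose on your files, here is a minimal sketch. It uses the global -n option, which asks cvs not to change anything on disk, and -q to keep the output short. The function name cvs_preview is invented for this example, and the "C means trouble" check is just a convenience of this sketch.

```shell
# Preview what "cvs update" would do, without touching any files.
# Prints the report and returns non-zero if any line is flagged with C.
# cvs_preview is a hypothetical helper name.
cvs_preview() {
    report=$(cvs -nq update 2>/dev/null)
    printf '%s\n' "$report"
    if printf '%s\n' "$report" | grep -q '^C '; then
        echo "some files are flagged C: look at them before committing" >&2
        return 1
    fi
    return 0
}
```

Treat a C in the preview as a warning that the file needs attention. The real update, the build, and the commit are still up to you.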
Well, those were the basics. Now for a bit of the more complicated stuff. For instance...
What if, when you did your update before checking in some code (or some other time) you got one of these messages?
[elladan@tandu project1]$ cvs update
cvs update: Updating .
RCS file: /homes/iws/joe/cvs/project1/main.c,v
retrieving revision 1.2
retrieving revision 1.3
Merging differences between 1.2 and 1.3 into main.c
rcsmerge: warning: conflicts during merge
cvs update: conflicts found in main.c
C main.c
... and now your code doesn't work (or even compile) at all!
Well, what happened was that someone edited the file in the same place as you did, and committed their changes before you did. When CVS tried to merge their changes in, it could tell that there was no way it was going to work automatically, so it gave you that conflict warning. If you examine the file, it'll quickly become clear why it doesn't compile!
[elladan@tandu project1]$ head -20 main.c
<<<<<<< main.c
/* Sample program for intermediate unix tutorial
 * note that it's pretty silly
 */
=======
/* Sample program for intermediate unix tutorial
 * just inserts and removes some numbers
 */
>>>>>>> 1.3
#include "bst.h"
#include
Clearly, what CVS did was take your version, and take the one from the repository, and put them both in your file. Your version is the one on top, between the <<<<<<< and the =======, while the one from the repository (version 1.3 - you were editing version 1.2!) is between the ======= and the >>>>>>>. So, what do you do now? Well, it's up to you! You need to decide how the code (or in this case, the comment) should look, fix up your copy so it's like that, and get rid of the <<<<<<<, =======, >>>>>>> business so it's all clean now. Then, you'd update again, and commit!
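It's easy to miss a marker when you're cleaning up at 7am, so here is a small sketch that lists files which still have conflict markers in them. The function name find_conflict_markers is made up. It only looks for the <<<<<<< and >>>>>>> lines, since a row of equals signs can show up in perfectly innocent text.

```shell
# List files that still contain CVS conflict markers.
# Prints the names of matching files; returns non-zero if none are found.
# find_conflict_markers is a hypothetical helper name.
find_conflict_markers() {
    grep -lE '^(<<<<<<<|>>>>>>>) ' "$@"
}
```

For example, find_conflict_markers *.c *.h before you commit. No output means you got them all.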
Note that there's one rather nasty case: What if someone re-indented or otherwise reformatted an entire file, or just generally made humongous changes? What would tend to happen is that your entire file is a merge conflict, if you've made any changes to it, and you'll have a big pain fixing everything. The best thing to do in that case is not to reformat files! In general, if you want to completely change some files like that, just ask everyone to commit all their stuff, then have them stay out of the repository for a few minutes while you make and commit your wholesale changes, then have everyone update and continue. That's a bit disruptive, but a whole lot easier than merging an entire file by hand!
Ok. I lied. Setting up a repository of your own is not very complicated. First, set your CVSROOT variable to point to the directory where you want the repository to be. For example:
export CVSROOT=/projects/cse326/00wi/group-k/cvs
Now, you have to initialize the concurrent versioning system's repository database and blah blah blah blah blah ... ok, basically, you type:
cvs init
That was easy. So now you have a repository. What do you do with it? Well, the first thing would probably be to go and set all the directories group-writable and such, actually. (Hint: chmod). But that's not really part of CVS. What you really need to do is create a module for your source code. Think of this as a project. CVS can hold any number of different projects in one archive, so you need to create one, for instance for your great project1. This part is a little bit more involved, but not much.
First, you go BACK to your home directory and check out part of the repository. Specifically, the module called CVSROOT. This is not to be confused with the CVSROOT environment variable, which tells cvs where to find its repository. This is a module in the repository which contains administrative files. You'll need to edit one of these to create a module.
cvs checkout CVSROOT
Now, you need to go edit the modules file in there. This file describes the projects in your repository. There will be some text in it describing how it works. Basically, just put the following line at the end of the file (without a # before it!)
project1 project1/
What was the point of that? Well, it just means that the project1 module is in the project1 directory. What a surprise! Now, you need to actually go and make that directory in the repository. This is about the only time you ever actually touch that repository directly! Just go back to the repository, and make the directory. (And set the permissions).
cd $CVSROOT
mkdir project1
chgrp group-k project1
chmod 770 project1
Ok, now you can check it out! Go back to your home dir and do a cvs checkout project1. There should be a blank directory there, with a CVS in it of course. Go at it!
Contrary to what you might think at first, if you want to add a file to the CVS archive, you can't just commit it to the archive. You have to add it first:
cvs add file
cvs commit file
If you just try to commit it, CVS doesn't know anything about it. And if you just try to add it, CVS will only reserve a name for it, but won't store any data. I'm sure what you're thinking now is: why?
Well, if you think about it, it kind of makes sense. When you, say, build a project, a bunch of extraneous files (object files, executables, editor backups etc) tend to accumulate all over the place. What would happen then if someone typed something like cvs commit * and you didn't need to add first? Well, the repository would get a lot of junk in it, which would be bad. So CVS requires that you say what you want explicitly. (Note that CVS actually does that "commit *" business by default - if you just type cvs commit it commits all the modified files)
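To see which files in your working copy CVS doesn't know about yet (that is, the ones you'd have to add), you can lean on the update report: it marks such files with a leading question mark. Here is a minimal sketch. The function name list_unadded is invented for this example, and -n again means nothing on disk is changed.

```shell
# List files in the working copy that CVS does not know about yet.
# "cvs -nq update" marks such files with a leading "?".
# list_unadded is a hypothetical helper name.
list_unadded() {
    cvs -nq update 2>/dev/null | sed -n 's/^? //p'
}
```

Anything it prints is either a file you forgot to cvs add, or junk you probably don't want in the repository anyway.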
But, what if you have a whole bunch of files in 12 subdirectories? It would be kind of annoying to have to go through all of them and add them one by one, or come up with some sort of search function etc. which finds the files you want. To solve this, CVS includes an import facility which lets you suck in a whole tree.
To use it, you first put the files you want to import in some directory somewhere, which should not be the directory in CVS where you're placing them. Then, cd into that directory. Now, decide what place you want to place them in the CVS repository - for example, we'll use new-project. This is the directory name in the repository. Execute a command like the following:
cvs import -m"Initial version" new-project initial start
You need to put in a vendor tag and a release tag. Since you almost certainly don't care, I've just put in "initial" and "start". The -m is just a way of setting the log message on each file added automatically.
What will this command do? Well, it takes the files in the current directory (where you are) and adds them into the CVS archive as the first revision of each file, tagged with the release tag "start", with the comment "Initial version" on each file. All the files will be under the directory new-project in the repository. Note that you may now need to go add a module mapping to the modules file as before, for new-project.
Let's go through that again, for example for something like the evil Nachos program:
export CVSROOT=/path/to/archive
cd
mkdir tmp
cd tmp
tar -zxf ../nachos-3.4.tar.gz
cd nachos-3.4
cvs import -m"Initial version" nachos initial start
cd ~/tmp
cvs checkout CVSROOT
cd CVSROOT
[edit the modules file and add the line: nachos nachos/]
cvs commit modules
Done!
Here are some other things you might want to do with CVS:
... becomes ...
/**************************************************************
 * This file is really important.
 * Last modified by: $Author: husted $
 * On: $Date: 2000/05/19 02:08:06 $
 *
 */
static char const rcsid[] = "$Id: cvs.html,v 1.10 2000/05/19 02:08:06 husted Exp $";
... code goes here
/***************************************************************
 * Log entries:
 *
 * $Log: cvs.html,v $
 * Revision 1.10 2000/05/19 02:08:06 husted
 * *** empty log message ***
 *
 * Revision 1.9 2000/05/19 02:02:36 husted
 * *** empty log message ***
 *
 * Revision 1.8 2000/05/14 03:53:27 husted
 * *** empty log message ***
 *
 * Revision 1.7 2000/05/14 03:52:34 husted
 * *** empty log message ***
 *
 * Revision 1.6 2000/05/14 03:42:43 husted
 * *** empty log message ***
 *
 * Revision 1.5 2000/05/14 03:23:25 husted
 * *** empty log message ***
 *
 * Revision 1.4 2000/05/14 00:40:49 elladan
 * * Finished CVS tutorial
 * * added example repository
 * * -J
 *
 *
 */
The bits that are surrounded by the dollar signs, such as $Date: 2000/05/19 02:08:06 $, will be replaced by cvs every time you commit with information relevant to the file. You only type the bare keyword yourself (for example $Date$), and cvs fills in the rest. Putting the date, author, etc. in comments probably makes sense, but what about that constant rcsid variable? Well, that one is a little weird. Basically, it'll get substituted by a fairly verbose identifier telling you about the current file... And then it gets compiled into the program. The point is that it actually gets built into your program. You'd put one in each file, hence the local (static) scope. Then, say you had a binary sitting around. You could immediately find out what version each of the source files was by running the "ident" command on the program, e.g. ident file. ident just looks through the file for strings that look like that id message, and prints them out. Cool, huh? Well, you probably won't use that on a departmental project, but still... :-)
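If ident isn't installed on the machine you're on, you can get most of the way there with grep. This is a rough stand-in, not a replacement: it only looks for the Id keyword, and it assumes your grep supports the -a (treat binary as text) and -o (print only the match) options, which GNU and BSD grep do. The function name show_ids is made up.

```shell
# Rough stand-in for "ident": print the $Id: ... $ strings embedded in a file.
# -a treats binary files as text, -o prints only the matching part.
# show_ids is a hypothetical helper name.
show_ids() {
    grep -ao '\$Id: [^$]*\$' "$1"
}
```

Running show_ids on your compiled program should print one line for each source file that had an rcsid string in it.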
There are a few problems that everyone using CVS will probably see at one time or another:
CVS is supposed to manage concurrent access by many people at once. So, obviously, it does a lot of locking on its files, to make sure everything is consistent. The problem is, everyone is running a copy of cvs themselves, and each of those copies is given the task of removing its own locks. So, say you start to commit a file, and then you kill cvs. It won't remove its lock, and so anyone else trying to commit to that file will be locked out. This doesn't just go away - someone needs to actually go and nuke a lock to fix the problem.
Locks in CVS are just files in the cvs repository that look like the files starting with a # in this directory listing:
[elladan@tandu project1]$ ls -l
total 20
drwxrwxr-x 2 elladan elladan 4096 May 13 17:11 #cvs.lock
-rw-rw-r-- 1 elladan elladan 0 May 13 17:11 #cvs.wfl.tandu.imlandris.org.6309
-r--r--r-- 1 elladan elladan 2720 May 13 16:24 Makefile,v
-r--r--r-- 1 elladan elladan 1339 May 13 16:36 bst.c,v
-r--r--r-- 1 elladan elladan 684 May 13 16:54 bst.h,v
-r--r--r-- 1 elladan elladan 1178 May 13 16:54 main.c,v
Basically, you just go and delete the lock files (not the actual repository files ending in a ,v!) and everyone's happy.
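Before you delete anything, it helps to see what's actually there. Here is a small sketch that lists everything in a repository whose name starts with #cvs. (like the entries in the listing above). The function name list_cvs_locks is invented for this example, and it deliberately only lists: checking that nobody is in the middle of a commit is still a job for a human.

```shell
# List leftover CVS lock files and lock directories under a repository.
# Only lists them; it never deletes anything.
# list_cvs_locks is a hypothetical helper name.
list_cvs_locks() {
    find "$1" -name '#cvs.*' -print
}
```

You'd run it as list_cvs_locks $CVSROOT, ask around to make sure the locks really are stale, and then remove them by hand.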
The next problem that comes up is really weird if you don't know what's going on. Basically, it usually happens if you try to check out an old version of a file. So, you do a cvs checkout -rversion file and it looks ok. Then, you edit and commit, and somehow, it doesn't show up for everyone else! And nobody else's changes show up for you, either.
The basic problem is that you checked out an old version of the file, and then you edited the old version of it. So, CVS thought you wanted to update the old version, not the latest version. This brings up a point about what CVS is actually capable of: I haven't mentioned it, but the archive isn't necessarily a single chain of updates. CVS allows you to have multiple branches of development on the same files at once, where for instance each person or small group might be implementing one feature, at the same time another group is doing something else. So, to keep things separate, CVS allows you to create branches in the code, each based on a certain point in the code and built on top of it in different directions. Anyway, I won't go into it. Suffice it to say, CVS thought you were creating a branch, and you probably didn't want to do that.
To fix it, you want to tell CVS that the file is no longer on a branch. Technically, this means you need to clear the sticky tag from the file. Just back up your copy first (it's probably going to get deleted!) and do this:
cvs update -A file
To get an old version of a file without setting the sticky tag, ask CVS to just "view" the file, and then save that to disk:
cvs update -rversion -p file > file.old
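If you're not sure whether a file has picked up a sticky tag, cvs status will tell you: its report has a "Sticky Tag:" line, which reads (none) when all is well. Here is a minimal sketch that checks that line. The function name has_sticky_tag is made up for this example.

```shell
# Report whether a file in the working copy carries a sticky tag.
# Parses the "Sticky Tag:" line of "cvs status"; "(none)" means no sticky tag.
# has_sticky_tag is a hypothetical helper name.
has_sticky_tag() {
    tag=$(cvs status "$1" 2>/dev/null | sed -n 's/^[[:space:]]*Sticky Tag:[[:space:]]*//p')
    [ -n "$tag" ] && [ "$tag" != "(none)" ]
}
```

For example: has_sticky_tag main.c && echo "main.c is stuck on an old version".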
Another thing that tends to show up is permissions problems. This is pretty easy to understand if you think about it a bit, but can be annoying if you don't know what's going on.
CVS is letting a lot of people write somewhere, right? So, they need permission to do that. Because of the way unix permissions work, they actually just need to be able to read the files and write to the directories. So, all the directories need to be writable by everyone who should be able to use the repository. I won't go into a big discussion of permissions here, and will just give a few examples instead:
Say you're in a group, "project-k" with some people, and all should be able to use the repository. You want to go to the place you stored the repository, and (assuming your repository is in a directory named cvs) do this:
chgrp -R project-k cvs
chmod -R 770 cvs
That will put all the files in the repository in the project-k group, and set their permissions to "readable, writable, and executable by the owner and the group" (note that you must have executable! On a directory, executable means you can open a file in the directory whose name you know.)
A slightly more geeky version is:
chgrp -R project-k cvs
find cvs -type d -exec chmod 770 {} \;
find cvs -type f -exec chmod 440 {} \;
That just restricts the permissions a bit more, based on whether something is a file or a directory. You do indeed need that \; at the end, too.
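When someone in the group can't commit, a quick way to check the permissions is to look for directories in the repository that the group can't write to. Here is a small sketch. The function name dirs_not_group_writable is invented, and it only reports, it doesn't fix anything.

```shell
# List repository directories that do not have the group-write bit set.
# An empty result means every directory is group-writable.
# dirs_not_group_writable is a hypothetical helper name.
dirs_not_group_writable() {
    find "$1" -type d ! -perm -020 -print
}
```

You'd run it as dirs_not_group_writable cvs from the place where the repository lives, and chmod whatever it prints.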
Now, this is all nice, but there are a few problems with using CVS here. Basically, as you may have noticed, you need some shared file space where everyone can commit their stuff. You can't just use your home directory and set files world-writable, because you'll run out of quota space on foreign filesystems as soon as your project is 100k or larger (basically, everyone in the group is probably on a different drive. Because of the way support has disk space quotas set up, if you write to a drive other than the one where your home directory is, you're limited to 100k of disk space). So, you have one of two choices:
Accessing cvs remotely is basically just like using rsh. (It IS using rsh). I won't go into too many details, but basically you might access files on one of the department machines from your home computer by using this as your CVSROOT at home:
export CVSROOT=joe@fiji.cs.washington.edu:/projects/stuff/cvs
It's also possible to use ssh for this purpose. Just do this:
export CVS_RSH=ssh1
That will tell cvs to talk through ssh instead of rsh. (Use whatever name the ssh client has on your machine. On most systems it's just ssh.)
There might be some path problem when you do this, in which case you need to tell CVS the whole path to itself (it runs "cvs" on the remote machine):
export CVS_SERVER=/uns/bin/cvs
You shouldn't have to do this, but on the off chance you get error messages like cvs: command not found when you try to use CVS remotely, that's probably the problem.
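If you switch between repositories a lot, you might wrap the remote settings in a little function. This is only a sketch: the function name use_remote_repo is made up, it spells out the :ext: access method explicitly (the shorter user@host:/path form shown earlier means the same thing), and it assumes your ssh client is called ssh.

```shell
# Point cvs at a remote repository reached over ssh.
# Arguments: user, host, path to the repository on that host.
# use_remote_repo is a hypothetical helper name.
use_remote_repo() {
    CVSROOT=":ext:$1@$2:$3"
    CVS_RSH=ssh
    export CVSROOT CVS_RSH
}
```

For example: use_remote_repo joe fiji.cs.washington.edu /projects/stuff/cvs, and then cvs checkout project1 as usual.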
Well, that's all for now. I hope this was useful.