Subversion and EncFS
 
 
HowTo
Requirements
  • EncFS
  • Permission to create FUSE mountpoints
  • Subversion
  • Python
Encrypting a Repository
  • Create a dump file:
    $ svnadmin dump <repository> > my_repos.dump
  • Download the conversion script and convert the given dump file:
    $ ./encrypt.py -d my_repos.dump -o my_encrypted_repos.dump
  • Create a new repository and load the encrypted dump file:
    $ svnadmin create --fs-type=fsfs /my/encrypted/repos
    $ svnadmin load /my/encrypted/repos < my_encrypted_repos.dump
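The three steps above can be chained in a small wrapper. This is only a sketch: the function name encrypt_repository and the ENCRYPT variable are hypothetical, and the only things assumed about encrypt.py are the -d and -o options shown above.

```shell
# Sketch: dump, convert, and load in one go, then remove the clear-text dump.
# ENCRYPT is a hypothetical variable pointing at the conversion script.
ENCRYPT=${ENCRYPT:-./encrypt.py}

encrypt_repository() {
    src_repo=$1      # existing clear-text repository
    dst_repo=$2      # path of the new encrypted repository
    workdir=$(mktemp -d) || return 1

    svnadmin dump "$src_repo" > "$workdir/clear.dump" &&
    "$ENCRYPT" -d "$workdir/clear.dump" -o "$workdir/encrypted.dump" &&
    svnadmin create --fs-type=fsfs "$dst_repo" &&
    svnadmin load "$dst_repo" < "$workdir/encrypted.dump"
    status=$?

    # The clear-text dump is sensitive, so delete it whether or not the
    # load succeeded. Note that rm does not securely erase the data.
    rm -rf "$workdir"
    return $status
}
```

Usage would look like: encrypt_repository /my/clear/repos /my/encrypted/repos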
Creating a New Repository
  • Generate a suitable random key:
    $ dd if=/dev/random bs=32 count=1 | uuencode --base64 whatever | \
      head -n 2 | tail -n 1 > encoded.pwd
  • Create a new repository to house the data.
  • Check out a copy of the repository.
  • Set up EncFS:
    $ encfs /path/to/checkout /mount/point
    When prompted, select expert mode and configure EncFS manually. The one essential setting is null file-name encryption (i.e. file names on the backend match file names on the frontend). Encrypting the names is safe enough if you prefer, but it makes Subversion management painful because you cannot tell which files you are adding, removing, etc. When prompted for a password, give it the contents of encoded.pwd.
  • Encrypt the key file:
    $ cat encoded.pwd | gzip | gpg -e -r <Your Name Here> > encoded.pwd.gpg
    == OR ==
    $ bcrypt encoded.pwd
  • Copy the encrypted key file to a few USB keys and distribute them to various relatives, etc. Never keep the encrypted key file in the same place as your GPG private key. That way, an attacker needs your GPG private key, the USB key, your GPG passphrase, and the encrypted data to breach security. Your GPG private key will typically sit on your desktop machine alongside a checkout of your secure repository, so plug in the USB key only when necessary and store it separately from the private key.
  • Add the .encfs file to the repository:
    $ svn add /path/to/checkout/.encfs*
    $ svn ci /path/to/checkout -m "Add the .encfs file"
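The key-generation step at the top of this list can also be sketched without uuencode. This variant is an assumption, not the author's method: it reads from /dev/urandom (which does not block) and uses the base64 utility. The function names generate_key and write_key_file are hypothetical.

```shell
# Sketch: print 32 random bytes (256 bits) as a single base64 line.
generate_key() {
    head -c 32 /dev/urandom | base64
}

# Sketch: write the key to the given file, readable by the owner only.
# The subshell keeps the umask change from leaking into the caller.
write_key_file() {
    ( umask 077 && generate_key > "$1" )
}
```

Calling write_key_file encoded.pwd would produce a file that can be used in the remaining steps in place of the one generated with dd and uuencode.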
Using your Encrypted Repository
  • Check out the source code as you would from any other Subversion repository.
  • If, for example, your Subversion tree includes a trunk directory and a tags directory, you'll need to copy the .encfs* file from the root of the repository to the root of your checkout, e.g.:
    $ svn co svn+ssh://myserver.com/svn/taxes/2008 2008_taxes_encrypted
    $ svn cp svn+ssh://myserver.com/svn/taxes/.encfs5 2008_taxes_encrypted/
    Please note that the number after '.encfs' will depend upon the version of encfs being used.
  • Mount the encrypted data to access the clear-text versions:
    $ mkdir 2008_taxes_clear
    $ cat /media/JOES_KEY/encoded.pwd.gpg | gpg -d | gunzip | \
      encfs -S $(pwd)/2008_taxes_encrypted $(pwd)/2008_taxes_clear
  • Perform all your Subversion commands from the encrypted checkout. The .svn directories (including the text-base) are written there directly by Subversion rather than through EncFS, while everything else is encrypted. Attempting to access the .svn directories through the EncFS mount will result in gibberish.
    $ pushd 2008_taxes_encrypted
    $ svn status
    ...
    $ svn ci -m "hi mom!"
    ...
    $ popd
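The mount step above can be wrapped so that it refuses to run when the .encfs* file is missing from the checkout root, and so that the clear-text view is unmounted when you are done. This is a sketch under assumptions: the function names are hypothetical, the key file is assumed to have been gzipped and then GPG-encrypted as in the key-encryption step, and fusermount is assumed to be available (Linux).

```shell
# Sketch: print the EncFS config file in the root of a checkout, or fail.
# The suffix after ".encfs" depends on the EncFS version, hence the glob.
find_encfs_config() {
    dir=$1
    for f in "$dir"/.encfs*; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no .encfs* file in $dir" >&2
    return 1
}

# Sketch: mount the clear-text view. Both directories must be absolute paths.
mount_clear() {
    keyfile=$1 encrypted=$2 clear=$3
    find_encfs_config "$encrypted" > /dev/null || return 1
    mkdir -p "$clear"
    # gunzip undoes the gzip applied before the key was encrypted.
    gpg -d "$keyfile" | gunzip | encfs -S "$encrypted" "$clear"
}

# Sketch: unmount the clear-text view once you are finished with it.
unmount_clear() {
    fusermount -u "$1"
}
```

Unmounting as soon as you finish working keeps the window in which clear-text data is reachable as short as possible.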
What?
The goal of the encryption system is simple: only encrypted data may be stored on the server (in RAM or on disk) at any given moment. The data must therefore be encrypted before transmission to the server (by Subversion or otherwise) and must be secure against offline attack. Subversion does not support data encryption (outside of wire transfer), which is where EncFS comes in.
EncFS is a FUSE application that layers encryption on top of an existing filesystem. The basic model is relatively simple: given a directory full of encrypted files (possibly with encrypted file names as well), it provides a VFS interface via the FUSE module that gives the user standard POSIX semantics, making the encryption transparent to everyone but the admin who issues the mount command. To satisfy the requirement of security against offline attack, I recommend storing a random key on a USB key and encrypting that key with something like bcrypt or, preferably, your GPG public key.
Why?
Several companies offer Subversion hosting on their servers, giving users access to their repositories from anywhere. One downside of this approach is that the users' data resides on the server in clear-text. No matter what precautions the user takes (short of encrypting the data), a missed security patch or a nosy admin can result in compromised data. Data encryption addresses this.
How?
System Diagram: the diagram shows the basic structure, with blue components indicating encrypted data and red components indicating clear-text data. As with all systems of this type, there remains the risk of swap space containing intermediate states or clear-text data, or of malicious users accessing the data while it is decrypted; such obstacles are left to the reader to overcome. Owing to the author's lack of expertise in the area, the system was designed to be as simple as possible, requiring the fewest changes. Specific steps for creating and managing an EncFS-encrypted Subversion repository are listed above in the HowTo sections.
The Good, The Bad, and The Ugly
  • I would like to make it very clear that I am NOT a security expert, and anything you read on this website should be taken with a grain of salt and is without warranty or guarantee of any kind (implied or otherwise).
  • This system has now seen some testing by the author and has proven reasonable from a procedural perspective.
  • The encryption system has not been tested against attack, and is likely to be weakened by factors such as multiple revisions. For example, a one-character change to a file will likely make it easier for a clever person to attack the encryption system.
  • Subversion commands that modify the repository (such as add, copy, remove, etc.) are only effective from the actual Subversion checkout (which is the EncFS source data).
  • The svn diff command is rendered useless, as the repository consists entirely of encrypted data.
  • The repository will be enormous. Because the data is all encrypted with a block cipher, Subversion cannot store diffs efficiently, and compression programs (such as bzip2) will also have little effect. A few basic tests have shown a 300% increase in repository size, so it is wise to restrict the amount of data stored therein.
  • Conflict resolution via Subversion will also be useless. If a conflict occurs, you'll have to manually merge the two versions on the mountpoint.
  • Small files are more efficient, because, in effect, a new copy of the file is added to the repository every time you commit a change.
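The point about compression in the list above can be illustrated with a rough experiment. This is a sketch only: random bytes from /dev/urandom stand in for ciphertext (an assumption, since no EncFS output is involved), and compressed_size is a hypothetical helper.

```shell
# Sketch: read stdin and print its gzip-compressed size in bytes.
compressed_size() {
    gzip -c | wc -c | tr -d ' '
}

# 100000 bytes of highly repetitive text, as a best case for compression.
plain=$(yes 'line of ordinary text' | head -c 100000 | compressed_size)

# 100000 random bytes, standing in for encrypted file contents.
random=$(head -c 100000 /dev/urandom | compressed_size)

echo "repetitive text after gzip: $plain bytes"
echo "random stand-in for ciphertext after gzip: $random bytes"
```

The random input should barely shrink, if at all, which is the behavior to expect from the encrypted files stored in the repository.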