HOWTO purge undesired files from your SVN repository

Several months ago, I wrote a post about how to regenerate an SVN repository.

In this post I will show how to purge undesired files from it (for example, compiler intermediate files) on a Windows system. This may be needed to keep the repository size under control.

Keep in mind that once a file is committed, it is stored permanently in the repository. Marking it for deletion only affects future revisions; the earlier revisions still contain it.

  • The first step is to dump the repository.
  • The second step is to filter out undesired content using the svndumpfilter command (one or more times).
  • The third step is to load the filtered dump into a new repository.

The 1st and the 3rd steps are the same as shown in my previous post.
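
The three steps can be sketched as a small Python helper. This is only an illustration, assuming the usual svnadmin dump, svnadmin create and svnadmin load commands for steps 1 and 3 (the previous post may do it differently). The function names and the dump file names (full.dmp, filtered.dmp) are hypothetical.

```python
import subprocess
from pathlib import Path


def build_steps(old_repo, new_repo, blacklist, work_dir):
    """Return (command, stdin_file, stdout_file) tuples for the three steps."""
    work = Path(work_dir)
    full_dump = work / "full.dmp"          # hypothetical file names
    filtered_dump = work / "filtered.dmp"
    return [
        # Step 1: dump the whole repository to a file.
        (["svnadmin", "dump", str(old_repo)], None, full_dump),
        # Step 2: filter out the paths matching the patterns in the blacklist.
        (["svndumpfilter", "exclude", "--drop-empty-revs", "--pattern",
          "--targets", str(blacklist)], full_dump, filtered_dump),
        # Step 3: create an empty repository and load the filtered dump into it.
        (["svnadmin", "create", str(new_repo)], None, None),
        (["svnadmin", "load", str(new_repo)], filtered_dump, None),
    ]


def run_steps(steps):
    """Run each command, wiring files to stdin/stdout like '<' and '>' would."""
    for cmd, stdin_path, stdout_path in steps:
        stdin = open(stdin_path, "rb") if stdin_path else None
        stdout = open(stdout_path, "wb") if stdout_path else None
        try:
            subprocess.run(cmd, stdin=stdin, stdout=stdout, check=True)
        finally:
            for handle in (stdin, stdout):
                if handle:
                    handle.close()
```

Calling run_steps(build_steps(...)) with your own paths would execute the whole chain; build_steps alone only prepares the commands, so they can be inspected before anything touches the repository.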

In step 2, we will have to use svndumpfilter, a utility included in the standard SVN package.

At the time of writing (v1.10), it can perform only one operation per run.

Depending on how these files are distributed across your repository folders, you can plan a strategy to filter them out.

In my case, they may be stored anywhere in the folder tree, so I chose to filter them out by extension.

You can also filter by folder instead: svndumpfilter accepts one or more path prefixes on the command line.

Filtering out the undesired files

The first thing is to create a text file (for example blacklist.txt) containing one pattern per line, like the following one:

*.7z
*.bpl
*.db
*.dll
*.exp
*.idb
*.ilk
*.lng
*.local
*.log
*.nativecodeanalysis.xml
*.ncb
*.obj
*.opt
*.pch
*.pdb
*.pdi
*.prm
*.rar
*.res
*.sbr
*.sdf
*.tds
*.tlh
*.tli
*.xml.bck
*.zip
*.~*

Then we can write a small script to execute the task:

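rem Replace the <...> placeholders with real paths before running.
set list="<fullpath>\blacklist.txt"
set in_dmp="<incoming dump full filename>"
set out_dmp="<outgoing dump full filename>"
rem Exclude every path matching a pattern in the list and drop revisions left empty.
svndumpfilter exclude --drop-empty-revs --pattern --targets %list% < %in_dmp% > %out_dmp%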

In the outgoing dump file, all matching files are excluded and all revisions left ‘empty’ by the filtering are dropped.

We can add ‘--renumber-revs’ to renumber the remaining revisions and close the holes left by the dropped ones.
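
Before running the filter on a large dump, it may help to preview which paths the blacklist would catch. The following Python sketch approximates the matching with the standard fnmatch module; svndumpfilter uses its own glob implementation, so corner cases may differ. The function names and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase


def load_patterns(text):
    """Parse blacklist text: one glob pattern per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def would_exclude(path, patterns):
    """Return True if the repository path matches any blacklist pattern."""
    # fnmatchcase is case-sensitive and lets '*' match '/' as well,
    # so '*.obj' catches files at any depth of the folder tree.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = load_patterns("*.obj\n*.pdb\n*.~*\n")
sample_paths = [          # hypothetical repository paths
    "trunk/app/main.cpp",
    "trunk/app/Debug/main.obj",
    "trunk/app/main.~cpp",
]
for path in sample_paths:
    print(path, "->", "exclude" if would_exclude(path, patterns) else "keep")
```

With these three sample patterns, only main.cpp is kept. In practice you would read the real blacklist.txt and feed in the paths of your own working copy.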