I've reorganized my music a few times, and it took me a while to get organized enough to want a single directory tree that held everything I owned. Every once in a while I go through the junk drawer and move a few more songs into the "official" tree, checking their metadata and whatnot. I was getting annoyed at finding duplicate songs in several junk drawers, so I wrote up a little script, find_duplicates.py, which can compare SHA1s for files in two trees and remove any duplicates from the lesser tree. Now there is a lot less junk to merge into my official tree.
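The core idea (hash every file in both trees, then look for matching digests) can be sketched roughly as follows. This is an illustrative sketch only, not the actual find_duplicates.py: the function names sha1_of_file, hash_tree, and find_duplicates are made up for this example, and it only reports duplicates rather than removing them.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each SHA1 digest to the list of files under root with that content."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            hashes.setdefault(sha1_of_file(path), []).append(path)
    return hashes


def find_duplicates(keep_root, lesser_root):
    """Yield (kept_path, duplicate_path) pairs.

    A pair is produced for every file under lesser_root whose content
    also appears somewhere under keep_root (hypothetical helper).
    """
    keep = hash_tree(keep_root)
    for digest, paths in sorted(hash_tree(lesser_root).items()):
        if digest in keep:
            for path in sorted(paths):
                yield keep[digest][0], path
```

Removing duplicates from the lesser tree would then amount to calling os.remove() on the second element of each pair. The sketch assumes the two trees do not overlap; if one directory sat inside the other, files would be reported as duplicates of themselves.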
Sometimes you want to do something more useful than printing
duplicates, but more subtle than removing them. To give you this
flexibility, I've added the --one-line option, which you can use
along these lines:
$ find_duplicates.py dir_a dir_b --one-line | while read -r -a DUPS; do mv "${DUPS[1]}" "${DUPS[0]}".dup; done