Not Rocket Science

Finding duplicate file names in a directory tree

Just a little one-liner. I have a deeply nested directory structure with lots of subfolders, and I wanted to know whether any file names appear more than once.

This does the trick for me:

find . -iname "*.txt" | xargs -I {} basename {} | sort -f | uniq -d -i
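
For example, in a (hypothetical) tree containing ./a/todo.txt, ./b/todo.txt and ./c/unique.txt, the pipeline prints just

todo.txt

since that is the only name occurring more than once.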

So what this does:

  1. Call find to list all files matching *.txt (case-insensitively) in the current folder and all subfolders
  2. Call basename to turn /path/to/file.txt into file.txt. We have to go through xargs because basename does not read file names from stdin (a variant that avoids this is sketched after this list)
  3. Call sort -f to sort the names, folding case, so that duplicate names end up on adjacent lines. This step matters: uniq only compares adjacent lines, so without the sort most duplicates would slip through
  4. Call uniq to report the duplicates. -d shows only repeated lines (by default, uniq prints every line once, collapsing adjacent repeats) and -i makes the comparison case insensitive
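
If GNU find is available, the xargs/basename round trip can be skipped entirely, since find's -printf action can print just the file name part of each match ('%f' is the name with leading directories removed). A sketch, assuming GNU find:

find . -iname "*.txt" -printf '%f\n' | sort -f | uniq -d -i

On systems without GNU find (e.g. stock macOS), the portable pipeline above does the same job.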

Sidenote: As much as I love working with Windows, it simply does not stand a chance against a long chain of small, simple, well-defined UNIX tools piped together. PowerShell is a step in the right direction, but it's not the same: the philosophy of having lots of very limited, specialized tools never really took hold in the Windows universe, and most command line tools do too much. Also, UNIX has a 30-year head start on PowerShell.