
Back in 2004 I wrote up a blog entry showing how to get a count of files by a specific extension. For example, if you want to know how many js files are in a directory, you can run this:
find /some/dir | fgrep -c '.js'
The -c in grep tells it to count the matches. I'm using fgrep here because I'm not using a regex (to avoid escaping the dot).
The above would also match a file or a directory that had .js anywhere in its path, so we could improve that script by using the regular expression $ character, for example:
find /some/dir | grep -c '\.js$'
Now we are limiting the .js to show up only at the end of the path, which means at the end of the file name.
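To make the difference concrete, here is a small sketch that runs both commands against a throwaway tree. The file and directory names are made up for the demo, and it uses grep -F, which behaves the same as fgrep.

```shell
# Build a small throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/vendor.js.d"
touch "$demo/app.js" "$demo/app.json" "$demo/vendor.js.d/readme.txt"

# Fixed-string match: counts every path that contains ".js" anywhere.
# Running find from inside the tree keeps the temp dir name out of the paths.
loose=$(cd "$demo" && find . | grep -F -c '.js')

# Anchored regex: counts only paths that end in ".js".
strict=$(cd "$demo" && find . | grep -c '\.js$')

echo "loose=$loose strict=$strict"   # prints: loose=4 strict=1
rm -rf "$demo"
```

The loose count picks up app.json, the vendor.js.d directory and the file inside it, while the anchored version only counts app.js.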
Listing all file extensions and the count of files in a directory
Here's one way to print out a list of extensions along with the number of files of each type in the directory (the dot in the pattern is escaped so that it matches a literal dot rather than any character):
find /some/dir -type f | grep -o '\.[^.]\+$' | sort | uniq -c
This will print out the number of files in the directory by extension, one extension per line and in sorted order:
1 .css
3 .html
5 .js
How it works
First we have find /some/dir -type f, which tells find to output all the files in the directory recursively. The -type f omits directories from the list.
Next we have the grep -o step. The -o tells grep to print only the part of each line that matches the pattern, instead of the whole line. The pattern is just a regex that says look for a dot followed by one or more chars that are not a dot [^.]\+, at the end of a line $.
Next we pipe into the sort command, which puts everything in order. This matters because uniq only collapses duplicate lines that sit next to each other.
Finally we pipe into uniq -c, which counts each unique line (the file extensions) and prints out the results. Cool!
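To see why the sort step is needed before uniq -c, here is a tiny sketch with three hard-coded extensions standing in for the grep output. uniq only merges duplicate lines that are adjacent, so unsorted input produces too many groups.

```shell
# Three extensions as grep -o might emit them; the two .js lines are not adjacent.
exts=$(printf '%s\n' .js .css .js)

# Unsorted: uniq -c sees three separate runs.
# tr strips the padding that some wc implementations add.
unsorted=$(printf '%s\n' "$exts" | uniq -c | wc -l | tr -d ' ')

# Sorted: the two .js lines become adjacent and collapse into one group.
sorted=$(printf '%s\n' "$exts" | sort | uniq -c | wc -l | tr -d ' ')

echo "groups without sort: $unsorted, with sort: $sorted"
# prints: groups without sort: 3, with sort: 2
```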
The one drawback to this approach is that it ignores any files that do not have a file extension.
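If you do want those files counted, one possible workaround is to swap the grep step for sed. This is only a sketch: the function name count_by_ext and the "(none)" label are my own choices for illustration, and it assumes file names do not contain newlines.

```shell
# Hypothetical helper: count files by extension, grouping files
# that have no dot in their name under "(none)".
# Usage: count_by_ext /some/dir
count_by_ext() {
    find "$1" -type f |
        sed -e 's|.*/||' \
            -e '/\./!s|.*|(none)|' \
            -e 's|.*\(\.[^.]*\)$|\1|' |
        sort |
        uniq -c
}
# The three sed expressions, in order:
#   1. strip the directory part, leaving only the file name
#   2. if the name contains no dot, replace it with the label "(none)"
#   3. otherwise keep only the last dot and whatever follows it
```

Because the directory part is stripped first, a dot in a directory name cannot be mistaken for an extension. A dotfile such as .bashrc has nothing before its dot, so it shows up as its own "extension".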
Counting the number of files in a directory on Linux or Mac
If we just want to know how many files are in the directory, we can use the find command and the wc (word count) command together, like this:
find /some/dir | wc -l
The above command will print out the count of files in a directory on Linux, a Mac or any Unix-based operating system. Note that without -type f the count also includes directories, including /some/dir itself.
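Keep in mind that find lists directories as well as files, so the two counts can differ. A quick sketch on a throwaway tree (the names are made up for the demo):

```shell
# Throwaway tree: one subdirectory and two files (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.js" "$demo/sub/b.css"

# Everything find prints: the start directory, sub, and the two files.
# tr strips the padding that some wc implementations add.
entries=$(find "$demo" | wc -l | tr -d ' ')

# Regular files only.
files=$(find "$demo" -type f | wc -l | tr -d ' ')

echo "entries=$entries files=$files"   # prints: entries=4 files=2
rm -rf "$demo"
```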
Comments
find . -type f | grep -o '\.[^./]\+$' | sort | uniq -c | sort -n
Include / in [^./] to exclude results from files that have no extension but sit under a directory with a . in its name.
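Here is a rough sketch of what that extra / changes, using a made-up tree with a dotted directory name. Both patterns escape the leading dot so that it only matches a literal dot.

```shell
# Throwaway tree: a dotted directory holding a file with no extension.
demo=$(mktemp -d)
mkdir -p "$demo/conf.d"
touch "$demo/conf.d/README" "$demo/main.c"

# "/" allowed after the dot: README is reported as the bogus extension ".d/README".
loose=$(cd "$demo" && find . -type f | grep -o '\.[^.]\+$' | sort | tr '\n' ' ')

# "/" excluded: the match has to stay inside the file name, so README drops out.
strict=$(cd "$demo" && find . -type f | grep -o '\.[^./]\+$' | sort | tr '\n' ' ')

echo "loose:  $loose"    # prints: loose:  .c .d/README
echo "strict: $strict"   # prints: strict: .c
rm -rf "$demo"
```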
What if you want a count of the extensions by folder, so that you can run it recursively from a top-level folder, for example, and get a count of extensions for each subfolder?