File Extensions on the Internet

I had a simple question to which I couldn’t find an answer.

Which file extensions are used on the internet?

So I wrote a little program, and half a million calls to Google later, we have some interesting data.

First, the raw data:

Top 10 (page counts):

  1. 6 700 000 000
  2. 5 980 000 000
  3. 1 690 000 000
  4. 1 510 000 000
  5. 1 380 000 000
  6. 565 000 000
  7. 385 000 000
  8. 298 000 000
  9. 242 000 000
  10. 199 000 000

Some interesting facts I saw:

  • There are 1,305 unused 3-letter extensions out of the possible 17,576 (26 × 26 × 26). That is, roughly 92.6% are already used for something. (There IS a lot of junk though, so don’t be TOO alarmed.)
  • There are a lot of common extensions that I have NO idea what they are for. .e? .nhn?
  • About 4x as many pages use .html as use .htm.
  • PHP is beating ASP by about 2x.
  • Many servers serve HTML from image extensions, and jpg > png == gif > svg > jpeg > bmp > tiff
  • Naming is mostly not biased by first letter. The one sparse region is 3-letter extensions starting with y.
  • Only the top 5,000 extensions have more than 1,000 pages.
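The 3-letter numbers above are easy to sanity-check. Here is a minimal sketch, assuming extensions are case-insensitive and drawn only from a-z; the 1,305 figure is taken straight from the list above, everything else is plain arithmetic. The used share works out to roughly 92.6% when rounded (92.5% if you truncate instead).

```python
import string

# Assumption: extensions are case-insensitive and use only the letters a-z.
LETTERS = string.ascii_lowercase

possible = len(LETTERS) ** 3   # every 3-letter combination: 26 * 26 * 26
unused = 1305                  # figure quoted in the post
used = possible - unused
used_pct = 100 * used / possible

print(possible, used, round(used_pct, 1))  # 17576 16271 92.6
```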

Some caveats

  • This was done in October 2009, so things might change. I’ll rerun it if people leave comments.
  • I only looked at extensions of up to 4 letters. No numbers or funky symbols.
  • I am assuming the counts in Google’s search results are ACTUALLY correct.
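To give a feel for the search space, here is a rough sketch of how the candidate extensions could be enumerated. It is not the original program: it assumes lowercase a-z only and lengths 1 through 4, and `candidate_extensions` is a made-up helper name. It does not call Google, it only counts candidates, and the total lands close to the "half a million" lookups mentioned above.

```python
import itertools
import string


def candidate_extensions(max_len=4, alphabet=string.ascii_lowercase):
    """Yield every letters-only extension of length 1..max_len (hypothetical helper)."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)


# 26 + 26**2 + 26**3 + 26**4 candidates in total.
total = sum(1 for _ in candidate_extensions())
print(total)  # 475254
```

Each candidate would then need one search query, which is why the run takes so many calls.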

If anyone makes any interesting observations with this data, please let me know and I’ll post it here. Pretty graphs are welcome as well 🙂



Based on an idea from progrium, I put together a fun little hack last night.

(mimetype or file extension) -> icon ==

In the vein of gravatar’s simple URLs, just add the file extension or mimetype onto the path, and you will get a good icon representing it. There are more options for choosing your icon set, size, and default; head over to the root page to find out more.
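As a rough illustration of the "append it to the path" idea, here is a small sketch. The base URL is a placeholder I made up for the example, not the real service address, and `icon_url` is a hypothetical helper; the options for icon set, size, and default are left out because their exact form is described on the root page, not here.

```python
from urllib.parse import quote

# Hypothetical placeholder, NOT the real service address.
BASE_URL = "https://icons.example.invalid"


def icon_url(kind, base_url=BASE_URL):
    """Build an icon URL from a file extension ("pdf") or a mimetype ("text/html")."""
    # Normalize: trim whitespace, drop a leading dot, lowercase.
    kind = kind.strip().lstrip(".").lower()
    # Keep "/" and "+" unescaped so mimetypes such as image/svg+xml stay readable.
    return f"{base_url}/{quote(kind, safe='/+')}"


print(icon_url(".PDF"))       # https://icons.example.invalid/pdf
print(icon_url("text/html"))  # https://icons.example.invalid/text/html
```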

It’s open source, so fork and fix. And if you have any ideas for good icon sets, please post a comment or email me and I’ll get them into the system.