Area codes in the US

related: snippet, python

Wikipedia has a list of the area codes used in the US.  How many are used vs unused?

List of area codes in the United States

So you copy and paste that crap into a text file, let’s say areacodes.txt, and make sure it’s really plain text and not full of HTML badness.

It’s not so nice looking, though. If you copied it into TextEdit and then converted it to plain text, it has bullet points, region names, and area codes separated by slashes, and it just isn’t that machine-readable.

Clean it up with some sed:

# cat areacodes.txt | sed 's/[^0-9]*\([0-9/]*\).*/\1/' | tr "/" "\n" | sort | uniq > sortedareacodes.txt

# cat sortedareacodes.txt |wc -l
     288

The sed expression keeps only the first run of digits and '/' separators on each line and drops everything else.  Then tr converts each '/' into a newline, and sort and uniq order the codes and remove duplicates.
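If sed isn't handy, the same cleanup can be sketched in Python. This is only a rough equivalent, not the pipeline above: the function name extract_codes and the sample lines are made up for illustration, and unlike the shell version it throws away empty results instead of leaving a blank line in the output.

```python
import re


def extract_codes(lines):
    """Collect the unique area codes found in pasted list lines.

    Mirrors the sed expression: skip leading non-digits, then capture
    the first run of digits and '/' characters on each line.
    """
    codes = set()
    for line in lines:
        # Both parts of the pattern may match nothing, so match() never returns None.
        match = re.match(r"[^0-9]*([0-9/]*)", line)
        for piece in match.group(1).split("/"):
            if piece:  # drop empty strings from lines without digits
                codes.add(piece)
    return sorted(codes)


# Made-up sample lines in roughly the pasted format
sample = [
    "• Alabama: 205/251/256",
    "• Alaska: 907",
    "no digits on this line",
    "• Repeat: 205",
]
print(extract_codes(sample))  # ['205', '251', '256', '907']
```

To run it on the real file, something like extract_codes(open("areacodes.txt")) should do, assuming the file looks like the sample.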

That’s great.  Now Python it up.

# python
>>> f = open("sortedareacodes.txt")
>>> codes = f.read().split("\n")
>>> all = range(200,999)
>>> unused = [x for x in all if x and str(x) not in codes]
>>> len(unused)
512

That reads in the file and makes a list with one string per area code.  'all' is meant to be a list of all the valid area codes (they are 3 digits and can’t start with 0 or 1), though range(200,999) stops at 998, so 999 is left out.  The list comprehension makes a list of every number that is in 'all' and not in 'codes'.
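The interactive session can also be written as a small script. This is a sketch with a couple of deliberate differences, so its count may not match the session exactly: it skips blank lines, looks codes up in a set, and uses an inclusive upper bound so that 999 is considered (range(200,999) stops at 998). The names load_codes and unused_codes are made up here.

```python
def load_codes(path):
    """Read one area code per line, skipping blank lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}


def unused_codes(used, low=200, high=999):
    """Return every number in [low, high] whose string form is not in `used`."""
    used = set(used)
    return [n for n in range(low, high + 1) if str(n) not in used]


# Tiny in-memory example instead of the real file
sample_used = {"205", "907"}
print(unused_codes(sample_used, low=200, high=210))
# [200, 201, 202, 203, 204, 206, 207, 208, 209, 210]
```

With the real data it would be len(unused_codes(load_codes("sortedareacodes.txt"))).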

Wikipedia knows of 288 area codes.  512 valid ones remain, minus any reserved combos this does not consider, like 555.
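To knock reserved combos out of the count, a filter like the one below could be a starting point. It is only an illustration: the rule covers 555 (mentioned above) and the N11 service codes such as 411 and 911, and it is not a complete list of reserved numbers. The name is_reserved is made up.

```python
def is_reserved(code):
    """Illustrative rule only: 555 plus the N11 service codes (211 ... 911)."""
    return code == 555 or code % 100 == 11


# Every 3-digit number that doesn't start with 0 or 1, minus the reserved ones
candidates = [n for n in range(200, 1000) if not is_reserved(n)]
print(len(candidates))  # 791
```

Subtracting the codes Wikipedia lists from candidates, instead of from the plain 200 to 999 range, would give a slightly smaller unused count.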