How many distinct last names exist in the United States of America?

How does one estimate that number? The first source that comes to mind is the latest available census. The 2000 census gives us a little over 150,000 unique surnames. The least popular names it lists have no fewer than 100 living bearers each; all other names are omitted. In fact, it is almost certain that we will not know all the exact names until about 2070, when the complete 2000 census data becomes publicly available. By then the data will be of little use for this exercise, since it will be some 70 years out of date.

What else can we look at? I suspect the best option at this point is Facebook. One just needs to scan all of its American accounts and count the distinct last names, then do some simple math. Say we counted X unique names among Y American accounts. The approximate number of unique names in the United States is then 309,000,000*(X/Y).
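The proportional scaling described above can be sketched in a few lines of Python. This is only an illustration: the function name, the normalization step (trimming whitespace and ignoring case), and the tiny sample list are assumptions made for the example, not real Facebook data or any Facebook API.

```python
US_POPULATION = 309_000_000  # population figure used in the text


def extrapolate_unique_surnames(surnames, population=US_POPULATION):
    """Scale the share of distinct surnames in a sample up to the population.

    Implements the formula from the text: population * (X / Y), where
    X is the number of distinct surnames and Y is the number of accounts.
    """
    if not surnames:
        raise ValueError("need at least one surname in the sample")
    # Assumption: names that differ only by case or stray spaces are the same.
    distinct = {name.strip().casefold() for name in surnames}
    x = len(distinct)   # X: unique names in the sample
    y = len(surnames)   # Y: accounts in the sample
    return population * (x / y)


# Hypothetical sample of four accounts with three distinct surnames.
sample = ["Smith", "smith ", "Jones", "Nguyen"]
print(f"{extrapolate_unique_surnames(sample):,.0f}")
```

With a real data set, `sample` would simply be the list of last names taken from the enumerated accounts, one entry per account.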

Unfortunately, this method is not easy to use, since it requires knowing how to enumerate all those Facebook accounts. It surely can be done, but I will hold off on that, since I have another, less precise, way to get an estimate.

We know that the population of the USA in the year 2000 was close to 300,000,000. The 150,000+ unique names available through the census cover 240,000,000 people. The rest of the population, 60,000,000 people, have last names with fewer than 100 bearers each. We do not know how the number of bearers per name is distributed. The only thing we can do is give a range for the number of last names out there.

It is possible, though of course not likely, that all the other names are also somewhat popular and each has 100 bearers, just like the least popular names in the census. If that were the case, the number of unique names in the United States would be 150,000 + 60,000,000/100 = 750,000. Another possibility is that each name not counted in the census has only 1 bearer. In that case the number of unique names in the United States would be 150,000 + 60,000,000 = 60,150,000. Neither 750,000 nor 60,150,000 looks like a viable estimate, but together they provide a range: from 750,000 to 60,150,000 unique surnames.

Any attempt to narrow the range without looking at the real data will be just that: an attempt that might or might not give you a better number. Let's assume that each surname not published by the census has 50 bearers, right in the middle between 100 and 1, and that everything is distributed very uniformly. Then the number of unique last names in the USA is 150,000 + 60,000,000/50 = 1,350,000. Please note that these data and assumptions hold for the year 2000, for which we have reasonably detailed data. More modern estimates can be obtained once the 2010 data becomes available.
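The range arithmetic above is easy to reproduce and to rerun with a different guess for the number of bearers per unlisted name. The following is a minimal sketch using the rounded year-2000 figures from the text; the constant and function names are made up for this example.

```python
PUBLISHED_SURNAMES = 150_000     # surnames listed by the census (100+ bearers each)
UNCOVERED_PEOPLE = 60_000_000    # people whose surnames are not listed


def estimate_unique_surnames(bearers_per_unlisted_name):
    """Published surnames plus unlisted ones, assuming every unlisted
    surname has the same number of bearers."""
    if bearers_per_unlisted_name < 1:
        raise ValueError("each surname needs at least one bearer")
    unlisted = UNCOVERED_PEOPLE // bearers_per_unlisted_name
    return PUBLISHED_SURNAMES + unlisted


lower = estimate_unique_surnames(100)  # every unlisted name has 100 bearers
upper = estimate_unique_surnames(1)    # every unlisted name has 1 bearer
middle = estimate_unique_surnames(50)  # the "right in the middle" assumption

print(f"lower bound: {lower:,}")
print(f"upper bound: {upper:,}")
print(f"50 bearers per name: {middle:,}")
```

Changing the argument shows how sensitive the result is to the assumed number of bearers, which is the reason the range is so wide.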


©2014 ANC Labs Inc