The New gTLDs are Here! Woohoo!

Go here and here to see the CartoDB visualizations. Below is my story behind them.

The new gTLDs are here! Thanks IANA! Are you as excited as I am??

What?


Okay, so it’s not a very sexy topic (and it seems like they go out of their way to remind you of that) but the Internet Assigned Numbers Authority (IANA) — which manages the global coordination of internet protocols — has added hundreds of new generic top-level domains (gTLDs) to the existing pool, with promises for many more.

What’s a top-level domain? In short, it’s the .com, .edu, .io, or .ly at the end of an internet address. The URL is often the first impression we get of any internet resource—sometimes I’ll hover over a link just to see what it says, and in that case it’s my only impression. While a URL is only a pointer, it does provide a window into what might be behind it, and I know I trust .com or .org over .biz.
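
To make that concrete, here is a minimal Python sketch of pulling the suffix out of a URL. The helper name `top_level_domain` is made up for illustration, and it assumes the URL includes a scheme such as `https://`. It only takes the last dot-separated label of the hostname, so it does not handle IP addresses or multi-part suffixes.

```python
from urllib.parse import urlsplit


def top_level_domain(url: str) -> str:
    """Return the last dot-separated label of the URL's hostname, e.g. 'com'.

    Hypothetical helper: returns '' when there is no hostname or no dot in it.
    """
    host = urlsplit(url).hostname or ""
    # Drop a trailing dot (fully qualified form) before splitting into labels.
    labels = host.rstrip(".").split(".")
    return labels[-1].lower() if len(labels) > 1 else ""


if __name__ == "__main__":
    for link in ("https://bit.ly/abc", "http://example.horse/", "http://localhost/"):
        print(link, "->", repr(top_level_domain(link)))
```

This is roughly what your eye does when you hover over a link: skip the path, find the host, and read the last piece.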

Now the IANA has made it much easier to add new TLDs, and soon you might see more and more suffixes like .nyc, .audio, or even the surprisingly popular .realtor. (My personal favorite: .horse). Some of these are open, but others will come with their own regulations on usage or access (for instance, .nyc only allows proven New York City residents or institutions to register, while .xxx and .adult are reserved for obvious reasons).

URLs are far less concrete and fixed than physical addresses, but they do still represent local ties. URL suffixes that you might have seen like .ly, .tv, and .io actually belong to Libya, Tuvalu, and the British Indian Ocean Territory, respectively, and are usually leased out to companies in the US or Europe for a fee (using one of these URLs is rather annoyingly called a “domain hack”). Such exchanges reflect the dynamics of communication and transaction on the web, where borders are more permeable than in the physical world; they are also a reminder of the economic realities behind the curtain, with these often small and extremely poor countries selling virtual land to tech startups.

The new “generic” TLDs are distinct from these reserved “country code” TLDs, and won’t be affiliated specifically with physical borders. But many of the more popular new TLDs are location-based (.nyc, .berlin, .paris), or reserved for specific companies or practices (like .wme, specific to William Morris). So these domains reflect something about who’s going to use them. Just today, NPR’s Morning Edition considered whether some of these suffixes are akin to extortion; domains like .wtf, .fail, or .sucks are quickly snatched up by companies who are worried about what could turn up there, while free speech advocates see them as a positive force. So this new industry is already brewing controversy.

I was curious about two main questions, which led me to two data sets:

  1. At this early stage, which of the new TLDs are gaining popularity?
  2. Who are the entrepreneurs and technologists starting these domains, and why? Who keeps up the registries, both new and old? I can understand the commercial promise of a domain like .news or .travel, but it still seems like an unknown quantity. Who is behind this next generation of domains?

The Project

A few months ago I stumbled across a registry of all the domain name registrars. I liked the idea of making a register of registrars, so I started with this, which led me to two interesting datasets. The idea of an old-fashioned mashup came to mind:

  1. the ICANN’s monthly registry reports, which had detailed data on the volume of domain registrations (which I took as a proxy for the number of sites) for generic TLDs, and
  2. the IANA’s root zone database, which had the addresses for both generic and country-level TLDs.

Gathering insights from d3plus was difficult.

First I had to scrape the data from both sites. The ICANN’s data was hidden behind CSVs, which I had to programmatically extract, but it did give me historical data. I got a total of 912 registries, with about 1000 institutions sponsoring and administering them; in December 2014, the generic TLDs alone housed 158 million domains (75% of these, however, were .com).
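
The tallying step can be sketched in a few lines of Python. This is not the script used for this post: the column names `tld` and `total_domains` are assumptions about what a registry-report CSV might contain, and the sample rows are invented purely to show the arithmetic.

```python
import csv
import io
from collections import Counter


def domains_per_tld(csv_text: str) -> Counter:
    """Sum the domain counts for each TLD across all rows of a CSV.

    Assumes (hypothetically) a header row with 'tld' and 'total_domains'.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["tld"].strip().lower()] += int(row["total_domains"])
    return totals


def share(totals: Counter, tld: str) -> float:
    """Fraction of all counted domains that sit under the given TLD."""
    grand_total = sum(totals.values())
    return totals[tld] / grand_total if grand_total else 0.0


# Invented sample data, not real registry figures.
SAMPLE = """tld,total_domains
com,750
net,150
realtor,100
"""

totals = domains_per_tld(SAMPLE)

if __name__ == "__main__":
    print(totals.most_common())
    print(f".com share: {share(totals, 'com'):.0%}")
```

The same `share` calculation on the real December 2014 totals is where a figure like the .com percentage above would come from.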

I started by trying some graphs with d3plus, but the differences in scale made it very hard to graph; the 119 million domains on .com (over half of the web), or even the 15 million .net domains, simply can’t be graphed next to many of these that still number in the hundreds or thousands. The sheer number also made it a little messy. But it started to give me an idea of which new domains are gaining popularity.
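
One common way around that scale problem is a logarithmic axis, where each step is a factor of ten rather than a fixed amount. Here is a small text-only sketch of the idea in Python, using rounded counts quoted in this post. The `log_bar` helper and the bar width per decade are arbitrary choices for illustration, not part of d3plus.

```python
import math


def log_bar(count: int, width_per_decade: int = 5) -> str:
    """Draw a bar whose length grows with log10(count), not with count itself."""
    if count <= 0:
        return ""
    return "#" * round(math.log10(count) * width_per_decade)


# Rounded domain counts quoted in this post.
counts = {"com": 119_000_000, "net": 15_000_000, "realtor": 89_000, "top": 36_000}

if __name__ == "__main__":
    for tld, n in counts.items():
        print(f".{tld:<8} {log_bar(n)}")
```

On a linear scale the .top bar would be invisible next to .com; on a log scale it is a bit more than half as long, so both stay readable.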

The IANA site contained full addresses for the sponsor, administrative contact, and technical contact of each domain, so of course I thought of a map; and I wanted to see what was behind these addresses, so I added street view screenshots where available. Unfortunately the address data was not very clean, and the process of geocoding the data was a challenge. As a result, don’t take these maps as a complete picture of all registries.

Map of some of the country code TLD registrars. Click on the map to see the full interactive viz on CartoDB

Google Maps’ geocoding API was giving me trouble, both with quota limits and bizarre results (I admit my data wasn’t the cleanest, but at one point it claimed to not find 1600 Amphitheatre Parkway!). Also, naturally the geo data is biased towards better-mapped parts of the world. Given more time I would make sure the data was comprehensive, and I would clean and structure the address data so it could make best-guess assumptions. However, these views allow you to get a glimpse and explore certain pieces.
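
For anyone attempting something similar, here is a hedged sketch of the two things that would have helped most: tidying each address before sending it, and backing off when the quota is hit. It deliberately does not call any real geocoding API. The `geocode` argument stands in for whatever client you use, and `QuotaExceeded` is a hypothetical exception that your wrapper would raise when rate-limited.

```python
import re
import time
from typing import Callable, Optional, Tuple

LatLng = Tuple[float, float]


class QuotaExceeded(Exception):
    """Hypothetical error a geocoder wrapper raises when it is rate-limited."""


def clean_address(raw: str) -> str:
    """Collapse stray whitespace and drop empty comma-separated parts."""
    parts = [re.sub(r"\s+", " ", part).strip() for part in raw.split(",")]
    return ", ".join(part for part in parts if part)


def geocode_with_retry(
    address: str,
    geocode: Callable[[str], Optional[LatLng]],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[LatLng]:
    """Try the supplied geocoder, doubling the wait after each quota error.

    Returns None if every attempt is rate-limited.
    """
    query = clean_address(address)
    for attempt in range(retries):
        try:
            return geocode(query)
        except QuotaExceeded:
            sleep(base_delay * 2 ** attempt)
    return None
```

Passing the geocoder in as a function keeps the retry logic independent of any one provider, which also makes it easy to swap services when one of them returns bizarre results.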

Weighted map of some generic TLD registrars. Click on the map to see the full interactive viz on CartoDB.

I ended up with two visualizations, screenshotted and linked above:

  1. A “heatmap” of some of the sponsors, administrators, and technicians, weighted by the combined traffic from each domain it owned.
  2. A map of all country-code TLD sponsors, administrators, and technicians for which geo data was available.
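
The weighting behind the first map can be sketched as a simple roll-up: add together the counts of every TLD an organization is attached to, then use that sum as its weight. The sketch below assumes the weight is the sum of per-TLD domain counts. The function name and the sample organizations are invented for illustration and are not taken from the real data.

```python
from collections import defaultdict
from typing import Dict


def sponsor_weights(tld_sponsor: Dict[str, str], tld_counts: Dict[str, int]) -> Dict[str, int]:
    """Sum the domain counts of every TLD attached to each sponsor."""
    weights: Dict[str, int] = defaultdict(int)
    for tld, sponsor in tld_sponsor.items():
        # TLDs with no known count contribute zero rather than being dropped.
        weights[sponsor] += tld_counts.get(tld, 0)
    return dict(weights)


if __name__ == "__main__":
    # Invented example: two organizations, three TLDs.
    tld_sponsor = {"aaa": "Org One", "bbb": "Org One", "ccc": "Org Two"}
    tld_counts = {"aaa": 10, "bbb": 5}
    print(sponsor_weights(tld_sponsor, tld_counts))
```

An organization running one huge TLD and one running many small ones can end up with similar weights, which is worth keeping in mind when reading the heatmap.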

Again, the data is not complete, but I was able to get some insights and find some interesting outliers.

Findings

Especially considering what the map leaves out, the country-level domains are widely spread out around the globe. But when only viewing the generic TLDs, or weighting them by traffic, the US and Europe dominate the map. One that stands out is .top, which comes from China and seems to have gained popularity recently.

Isolating the data from September-December 2014, some of the more popular and fastest-growing new domain names include .realtor (89,000 domains), .nyc (62,000), .london (54,000), .link (53,000), and .top (36,000, based in China). Some of the other popular ones include .website, .rocks, and of course, .cat, which is for the purposes of Catalonia, but I’m sure some have “hacked” it for internet memes. The ones gaining the most traction range from novelties to geographies and professional credentials.

Top 20 generic TLDs with the most domains from December 2014 (log scale).

My favorite moment when exploring the data was when I noticed a hotspot around D.C. and clicked around. I saw one for the suffix .desi— it stood out because it was a nice house rather than the usual generic office park. It turns out that the founders started the domain to serve as a hub for the desi (i.e. South Asian) community and diaspora worldwide.

Desi also happens to be the name of my CMS colleague Desi Gonzalez (weirdly, Desi Networks LLC is also based a 15-minute drive from her home), so I immediately alerted her. She has already snatched up at least 2 domains, making my data obsolete, and she has threatened to buy 20 more (my personal favorite: whois.desi).

While you may not find your name among the new TLDs, another one might serve well for a new project or business, or make you curious about its origin. It will be interesting to follow the future of these virtual speculations, and see if startups will ever move past the usual “hacks” once the existing domains run out of space.

1 thought on “The New gTLDs are Here! Woohoo!”

  1. Great piece of explanatory journalism on the TLDs; this piece has a strong narrative, compelling insights and strong visual semiotics.
