An idea for URL shorteners: content type hinting
Posted: April 21, 2009
Regardless of how you [feel about URL shortening services](http://kottke.org/09/04/url-shorteners-suck), they have become an essential component of the Internet today. Services like Twitter make them all but a necessity so that character-starved bloggers can squeeze as much copy into a tweet as possible. After all, [every character counts](http://www.copyblogger.com/twitter-writing/).
I recently watched an [interview with Ev Williams](http://revision3.com/tekzilla/web20/) in which he talked briefly about making features for a product simply by blessing and codifying, in some meaningful way, what your users are already doing. His example was observing people use `@twittername` to signify a reply, and the subsequent decision to auto-link that identifier to the person’s Twitter account. That interview, plus a later conversation, gave me an idea for a pattern I would love to see adopted by URL shortening services, in the hope of laying the groundwork for similar serendipitous features on Twitter.
The problem is this: URL shortening services make the content being linked to even more opaque to a client than it already is. When you think about what can be inferred from a complete URL on the Internet, you realize just how much information is lost in the shortening process. Consider these URLs:
From this URL you can infer that what is being linked to is a video. You also know the video’s ID and could conceivably embed your own video player to display it on your web site.
From this URL you can infer that what is being linked to is a photo. You also have a handle on the identity of the person who uploaded the photo, and you have the photo’s ID.
And the list goes on. Even relatively random URLs from stupid blogs can tell you a lot:
This tells you at the very least that what is being linked to is a web page, most likely in HTML. You could infer more, I suppose, but it would be difficult for a machine to do so without actually following the link.
And that is the crux: you don’t want to force clients to follow the link in order to make these simple inferences. Ideally, a shortened URL could provide some kind of hint as to the nature of the content on the other end, so that Twitterrific, for example, could embed a “play” button to play a video, or even embed a photo posted via TwitPic directly in your feed.
So what if a convention were defined and adopted that:
* didn’t increase the length of a shortened URL by much,
* was completely optional, and
* was used in the same manner across multiple shortening services?
Then clients would be free to do whatever they wanted with the extra metadata: ignore it, or use it to bring some added value to their users. So here is my proposal:
* URL shorteners should append to any shortened URL a simple one-character code identifying the type of content being linked to, delimited by a hash or pound sign, i.e. “#”. For example, both of the following URLs remain resolvable despite being augmented with two additional characters:
* URL shorteners should ideally agree on a standard, or at least publish a lookup table for their coding conventions. Such a lookup table might be as simple as:
| Code | Content type |
|------|--------------|
| o | a page with an oembed tag |
| b | a blog post with comments |
The specifics don’t really matter provided that the codes tell clients something that would be useful to them when processing a link to a resource.
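To make the proposal a little more concrete, here is a minimal sketch in Python of how a shortener might append the hint and how a client might read it back without following the link. The codes `o` and `b` come from the table above; the function names `add_hint` and `read_hint`, and the `example.com` short URL in the usage note, are my own placeholders and not part of any real service. The approach leans on the fact that the part of a URL after “#” is a fragment, which clients do not send to the server, so the shortened link resolves exactly as before.

```python
from urllib.parse import urldefrag

# Codes taken from the lookup table above; a real service would
# publish its own (presumably longer) table.
CONTENT_HINTS = {
    "o": "a page with an oembed tag",
    "b": "a blog post with comments",
}


def add_hint(short_url: str, code: str) -> str:
    """Shortener side: append a one-character hint as the URL fragment."""
    if code not in CONTENT_HINTS:
        raise ValueError(f"unknown hint code: {code!r}")
    base, _ = urldefrag(short_url)  # drop any existing fragment first
    return f"{base}#{code}"


def read_hint(short_url: str):
    """Client side: return (url_without_hint, description or None).

    A missing or unrecognized hint yields None, so the convention
    stays completely optional for both shorteners and clients.
    """
    base, fragment = urldefrag(short_url)
    return base, CONTENT_HINTS.get(fragment)
```

A client could call `read_hint("http://example.com/abc123#o")` to learn that the target is a page with an oembed tag and decide how to render the link; a client that knows nothing about the convention simply follows the URL as usual.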
It has also made me ponder how tweets themselves could be similarly augmented with additional metadata, as hash tags do, and what clients could then do with that extra little bit (no pun intended) of information.