Thursday, December 10, 2009

Microformats

Jeff Atwood, who writes the well-known Coding Horror blog, took on the topic of Microformats recently. His misguided comments about the presumed hackiness of overloading CSS classes with semantic meaning (actually their intended purpose) had people quoting the HTML spec:

The class attribute, on the other hand, assigns one or more class names to an element; the element may be said to belong to these classes. A class name may be shared by several element instances. The class attribute has several roles in HTML:
  • As a style sheet selector (when an author wishes to assign style information to a set of elements).
  • For general purpose processing by user agents.

Browsers work great for navigation and presentation, but we can only really compute with structured data. Microformats combine the virtues of both.

There are at least a couple of ways in which the ability to script interaction with web applications comes in handy. For starters, microformats are a huge advance compared to screen-scraping. The fact that so many people suffered through the hideous ugliness of screen-scraping proves that there must be some utility to be had there.
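To make the contrast with screen-scraping concrete, here is a minimal sketch in Python that reads hCard data out of a page by looking only at class names. It uses the standard library's html.parser. The sample markup, the `HCardParser` and `extract_hcards` names, and the small subset of hCard properties it looks for are all assumptions made for illustration, not a complete microformats parser.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# A small, illustrative subset of hCard property class names.
HCARD_PROPERTIES = {"fn", "org", "tel", "email"}


class HCardParser(HTMLParser):
    """Collect the text of hCard properties found inside class="vcard" elements."""

    def __init__(self):
        super().__init__()
        self.cards = []   # one dict per vcard found
        self._open = []   # per open element: "vcard", a property name, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})
            marker = "vcard"
        else:
            marker = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._open.append(marker)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Walk outward from the innermost element: text counts only if a
        # property element is reached before the enclosing vcard.
        for marker in reversed(self._open):
            if marker == "vcard":
                return
            if marker is not None and "vcard" in self._open:
                card = self.cards[-1]
                card[marker] = card.get(marker, "") + data
                return


def extract_hcards(html):
    """Return a list of dicts, one per hCard, with whitespace trimmed."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in card.items()} for card in parser.cards]


SAMPLE = """
<div class="vcard">
  <span class="fn">Ada Lovelace</span>,
  <span class="org">Analytical Engines Ltd</span>
  <a class="email" href="mailto:ada@example.org">ada@example.org</a>
</div>
"""

if __name__ == "__main__":
    print(extract_hcards(SAMPLE))
```

The extraction depends only on the agreed class names, not on the page's layout, tag choices, or surrounding text. That is what separates this from a scraper that breaks whenever the site is redesigned.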

Also, web-based data sources typically have a browser-based front end and often expose a web service as well. Microformats link the two together: a user can find records of interest by searching in the browser, and the embedded microformats then allow the automated construction of a web service call to retrieve the same data in structured form.
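The workflow described above might be sketched roughly as follows, again in Python. The code collects the text of elements carrying the microformat `uid` class from a results page, then builds one web service URL per record. The endpoint `https://example.org/api/records`, its `uid` query parameter, and the `UidCollector` and `build_service_urls` names are hypothetical; a real data source would define its own service interface.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Elements without closing tags; ignored so nesting depth stays correct.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class UidCollector(HTMLParser):
    """Collect the text content of every element whose class list includes "uid"."""

    def __init__(self):
        super().__init__()
        self.uids = []
        self._depth = 0   # > 0 while inside a "uid" element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1          # nested element inside a uid element
        elif "uid" in classes:
            self._depth = 1
            self.uids.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.uids[-1] += data


def build_service_urls(html, endpoint):
    """Return one URL per record identifier found in the page.

    `endpoint` and the `uid` parameter name are assumptions for this sketch.
    """
    collector = UidCollector()
    collector.feed(html)
    collector.close()
    return [f"{endpoint}?{urlencode({'uid': uid.strip()})}"
            for uid in collector.uids]


RESULTS_PAGE = """
<ul>
  <li class="vevent"><span class="summary">First record</span>
      <span class="uid">rec-17</span></li>
  <li class="vevent"><span class="summary">Second record</span>
      <span class="uid">rec 42</span></li>
</ul>
"""

if __name__ == "__main__":
    for url in build_service_urls(RESULTS_PAGE, "https://example.org/api/records"):
        print(url)
```

The user still does the searching in the browser. The script only turns what is already on the page into requests for the structured version of the same records.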

Microformats aren't anywhere near the whole answer. The real question is how to do data integration at web scale, using the web as a channel for structured data.
