Yesterday, Matthew Hurst (of Microsoft Live Labs, writing at the Data Mining blog) posted about five projects that apply spidering, crawling, and text-mining techniques to track political buzz in the United States.
None are anywhere near as ambitious as VOSON (ANU, Canberra), their level of automated processing appears to vary widely, and they could all benefit from seriously (re)considering their information design. But they are all fairly serious attempts to distill relevant statistical and semantic information from the steadily expanding ocean of online political discourse.
These may seem like novelties now, but how long before (semi)automated tools become essential for engaging seriously with the sheer volume and variety of public discourse*?
* A 1999 paper used hand-coding to summarize 4,832 comments submitted for the Hoosier National Forest (IN). According to a 2007 report on management plans for the Bitterroot, Lolo, and Flathead National Forests (MT), hand-coding is still in use. The number of respondents there was still relatively low (~2,800), but other issues (e.g., the proposed elimination of the 2001 Roadless Area Conservation Rule) have garnered millions of comments.
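To make the contrast concrete: even the crudest automated pass over a comment corpus scales in a way hand-coding cannot. Below is a minimal sketch of such a pass, a simple content-word tally over a few invented example comments (the comment texts, stopword list, and `keyword_tally` function are all hypothetical illustrations, not part of any of the projects or reports discussed above).

```python
import re
from collections import Counter

# Hypothetical sample comments; a real corpus would come from an
# agency's public-comment database and number in the thousands or millions.
comments = [
    "Please protect roadless areas from new logging roads.",
    "Logging supports local jobs; do not expand roadless restrictions.",
    "Roadless areas are vital wildlife habitat and should stay protected.",
]

# A toy stopword list -- real systems use larger lists or weighting schemes.
STOPWORDS = {"the", "a", "and", "do", "not", "are", "is", "from", "new",
             "should", "stay", "please", "of", "to", "for"}

def keyword_tally(texts):
    """Count content words across all comments -- a crude stand-in for
    the thematic coding that agencies currently do by hand."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts

tally = keyword_tally(comments)
print(tally.most_common(3))
```

This only surfaces raw term frequencies; real semi-automated coding would layer stemming, topic clustering, or supervised classification on top. But even this toy version processes every additional comment at essentially zero marginal cost, which is the point.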