8.18.2005

Open source unstructured text analysis

Computerworld is reporting that IBM has made its Unstructured Information Management Architecture (UIMA) fully open to the public. Essentially a text mining framework, UIMA was designed for very large data sets where metadata and formatting are non-existent.

Why would this be relevant for political scientists? Because this sort of tool could be used to identify and track patterns in other large, unstructured (or semi-structured) collections of text, such as online news, blogs, web pages, and other heterogeneous electronic sources.
