augmentation: Language processing

6.18.2007

Language processing

Is social science on the verge of a paradigm shift? Research into large-scale natural language processing has been going on for some time (especially in national security -- see here and here), but an understanding of how such tools might be applied to social science also seems to be emerging.

In "Data Catalysis: Facilitating Large-Scale Natural Language Data Processing" (presented at ISUC 2007), Patrick Pantel presents a USC project to extend such expertise to social scientists.

While there may still be a gap between such tools and the needs and understanding of most of us, Daniel Hopkins and Gary King recently demonstrated the feasibility of "Extracting Systematic Social Science Meaning from Text," using machine learning to categorize millions of political texts (websites, blogs) with accuracy rates rivaling human coders.

H/T to Mark Liberman at Language Log for the link to Pantel's paper.
