[gov3009-l] Grimmer on "Quantitative Discovery from Qualitative Information"

Matt Blackwell mblackwell at iq.harvard.edu
Tue Sep 8 11:51:27 EDT 2009


Hello Applied Statistics Community,

Please join us tomorrow, September 9th, for our first workshop of the
year, when we are happy to have Justin Grimmer presenting joint work
with Gary King entitled "Quantitative Discovery from Qualitative
Information: A General-Purpose Document Clustering Methodology." The
workshop will start with a light lunch at 12 noon and the presentation
will start at 12:15.

Justin and Gary have provided the following abstract for their paper:

Many people attempt to discover useful information by reading large
quantities of unstructured text, but because of known human
limitations even experts are ill-suited to succeed at this task. This
difficulty has inspired the creation of numerous automated cluster
analysis methods to aid discovery. We address two problems that plague
this literature. First, the optimal use of any one of these methods
requires that it be applied only to a specific substantive area, but
the best area for each method is rarely discussed and usually
unknowable ex ante. We tackle this problem with mathematical,
statistical, and visualization tools that define a search space built
from the solutions to all previously proposed cluster analysis methods
(and any qualitative approaches one has time to include) and enable a
user to explore it and quickly identify useful information. Second, in
part because of the nature of unsupervised learning problems, cluster
analysis methods are not routinely evaluated in ways that make them
vulnerable to being proven suboptimal or less than useful in specific
data types. We therefore propose new experimental designs for
evaluating these methods. With such evaluation designs, we demonstrate
that our computer-assisted approach facilitates more efficient and
insightful discovery of useful information than either expert human
coders using qualitative or quantitative approaches or existing
automated methods. We (will) make available an easy-to-use software
package that implements all our suggestions.

You can find a copy of the paper here:
http://gking.harvard.edu/files/discov.pdf

We hope you can make it.

Best regards,
matt.


~~~~~~~~~~~
Matthew Blackwell
PhD Candidate
Institute for Quantitative Social Science
Department of Government
Harvard University
email: mblackwell at iq.harvard.edu
url: http://people.fas.harvard.edu/~blackwel/

