Dear all,

Please join us for the Applied Statistics Workshop (Gov 3009) this Wednesday, March 21, from 12:00 to 1:30 pm in CGIS Knafel Room 354. David Reshef, an MD/PhD student at the Harvard-MIT Division of Health Sciences and Technology (HST), will give a presentation entitled "Detecting Novel Bivariate Associations in Large Data Sets". As always, a light lunch will be provided.

For those interested, here are the project website and the accompanying Science article.

Abstract:  
Identifying interesting relationships between pairs of variables in large data sets is increasingly important. One way of doing so is to search such data sets for pairs of variables that are closely associated. This can be done by calculating some measure of dependence for each pair, ranking the pairs by their scores, and examining the top-scoring pairs. We outline two heuristic properties--generality and equitability--that the statistic we use to measure dependence should have in order for such a strategy to be effective. 
We present a measure of dependence for two-variable relationships, the maximal information coefficient (MIC), that has these properties. MIC captures a wide range of associations, both functional and not (generality), and assigns similar scores to relationships with similar noise levels, regardless of relationship type (equitability). Finally, we show that MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships.
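
For anyone who wants a concrete picture of the score-and-rank strategy the abstract describes, here is a minimal Python sketch. It is not the speaker's method and does not compute MIC: the dependence score is absolute Pearson correlation, used purely as a stand-in, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def abs_pearson(x, y):
    """Stand-in dependence score: absolute Pearson correlation (NOT MIC)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable shows no linear association
    return abs(cov) / sqrt(var_x * var_y)


def rank_pairs(columns, score=abs_pearson):
    """Score every pair of variables; return (score, name_a, name_b), best first."""
    scored = [
        (score(columns[a], columns[b]), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    return sorted(scored, key=lambda t: t[0], reverse=True)


# Hypothetical toy data: y is linear in x, z is quadratic in x.
columns = {
    "x": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "y": [-4.0, -2.0, 0.0, 2.0, 4.0],
    "z": [4.0, 1.0, 0.0, 1.0, 4.0],
}
ranking = rank_pairs(columns)
for s, a, b in ranking:
    print(f"{a}-{b}: {s:.3f}")
```

On this toy data the linear pair (x, y) ranks first, while the noiseless quadratic pair (x, z) scores zero under the stand-in. That is the kind of gap the generality property is meant to close: a score that misses non-linear relationships will never surface them in the top-ranked pairs.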

An up-to-date schedule for the workshop is available at http://www.iq.harvard.edu/events/node/1208.

Best,
Konstantin


--
Konstantin Kashin
Ph.D. Student in Government
Harvard University

Mobile: 978-844-0538
E-mail: kkashin@fas.harvard.edu
Site: http://www.konstantinkashin.com/