Hi, folks. I've had several conversations (both electronic and in person)
with folks about Problem 2. Having discussed #2 with Kevin, I need to
issue a general correction. In my prior email, I described the process of
"breaking" the MM-estimator as creating contaminated coefficients outside
the interval [true.coef +- 2*SE(true.coef)], where SE(true.coef) came from
a regression of the uncontaminated data.
However, since we're interested in the performance of the
MM-estimator and the inferences we draw from it, you should compare the
true, uncontaminated coefficients (beta.true) you used to generate the
data to an interval constructed around the *contaminated* coefficients
using the *contaminated* SE's. That is, the question to ask is "are the
beta.true values in the interval [beta.MM +- 2*SE(beta.MM)]?" If they
are not, you have "broken" the estimator.
In a real, applied data situation, you'll only have the beta.MM and
SE(beta.MM) values from which to draw inferences. Thus, we want to use
these quantities and see how difficult it is to draw incorrect inferences
from them.
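A minimal sketch of this check in R, for illustration only. The sample
size, the beta.true values, and the contamination scheme (shifting x1 and
y for the first 200 rows) are all made-up assumptions, not the homework's
setup. It uses rlm() from the MASS package with method = "MM".

```r
library(MASS)  # rlm() with method = "MM" fits an MM-estimator

set.seed(1000)
n <- 500                          # hypothetical sample size
beta.true <- c(1, 2, -1)          # hypothetical intercept, x1, x2 coefficients
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- beta.true[1] + beta.true[2] * x1 + beta.true[3] * x2 + rnorm(n)

# Contaminate a copy of the data (illustrative scheme only):
# shift x1 and y for the first 200 observations.
bad  <- 1:200
x1.c <- x1
y.c  <- y
x1.c[bad] <- x1.c[bad] + 10
y.c[bad]  <- y.c[bad] - 20

# MM-estimate on the contaminated data
fit.MM  <- rlm(y.c ~ x1.c + x2, method = "MM", maxit = 100)
est     <- summary(fit.MM)$coefficients
beta.MM <- est[, "Value"]
se.MM   <- est[, "Std. Error"]

# Is each beta.true inside [beta.MM +- 2*SE(beta.MM)]?
lower   <- beta.MM - 2 * se.MM
upper   <- beta.MM + 2 * se.MM
covered <- beta.true >= lower & beta.true <= upper
print(cbind(beta.true, beta.MM, se.MM, lower, upper, covered))
```

Any FALSE (printed as 0) in the covered column means the interval you
would report from the contaminated fit misses the value that generated
the data.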
Apologies for getting this wrong in several conversations. I hope this
description is clear. Please don't hesitate to ask if it's not.
Sorry for the confusion,
Ryan
See Q&A below:
****
I still don't understand what "fails" means. I have original values and
contaminated values. I know I need to look at the coefficients, but what does
it mean when the coefficient estimates are not within +- 2 standard errors of
the true values? I see that my standard error terms (in the contaminated
group) are more than twice the original standard errors. Is this what the
question is referring to? If not, what key am I missing?
****
Not exactly. Start with the original data generating process you use.
Regress to show that you recover the coefficients you used to generate the
data. Then contaminate the data. Regress again. Be sure that your data
are sufficiently contaminated. By "sufficiently", I mean that the
coefficients from the second regression (on the contaminated data) are
outside of the interval defined by [the value you used to generate the
data] +- 2SD's. The SD's will come from the first estimation you
performed. The SD's from the contaminated regression aren't the
quantities of interest in this question.
Let me know if that's not clear!
Ryan
Hi guys-
I hope that the homeworks and final papers are going well!
Just wanted to let you know that I will be having sporadic email access
over the next two weeks (as will, I imagine, Kevin and Ryan). For this
reason, I would encourage you to send questions to the list rather than to
us individually.
Ryan and I will be holding an extended office hour session on January 3rd
before the final is released. (If there is material you feel it would be
helpful to work into a more formal review session, please let me know. My
sense, though, was that one-on-one sessions would be a more effective use
of everyone's time. Please let me know if I am mistaken.) Ryan will be
sending out more details about times closer to the third.
For those who don't yet have copies of Venables and Ripley or Stock and
Watson, Ryan has some extra copies. Let us know if you're still missing a
copy.
Happy holidays!
Alison
********************
Alison Post
Department of Government
Harvard University
www.people.fas.harvard.edu/~apost
Peeps,
Quick question on #2: how much freedom do we have to "contaminate" the
data set? Can we change the first 2000 values of all the xi's as well as
y? Or just the xi's?
Thanks,
Vip
On plots of DFBETAs: How do we decide what deserves a plot, and what
does not? Seems like we could just do a DFBETA plot matrix and plot the
effect of each independent variable on the other. Is this shotgun
approach correct, or is there a more systematic way of approaching the
problem?
Wearily,
Andy
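One possible systematic approach, sketched under assumptions: compute the
standardized DFBETAS for every coefficient, flag observations beyond the
common 2/sqrt(n) rule of thumb, and draw one index plot per coefficient.
The data and model below are simulated placeholders, and the cutoff is a
convention rather than anything specified for this homework.

```r
set.seed(1000)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))      # hypothetical data
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)
dat$y[1] <- dat$y[1] + 15          # plant one unusual observation

fit <- lm(y ~ x1 + x2, data = dat)
dfb <- dfbetas(fit)                # n x p matrix, one column per coefficient
cutoff <- 2 / sqrt(n)              # rule-of-thumb threshold (an assumption)

# Observations exceeding the cutoff for at least one coefficient
flagged <- which(apply(abs(dfb) > cutoff, 1, any))
print(flagged)

# One index plot per coefficient, with the cutoff marked
op <- par(mfrow = c(1, ncol(dfb)))
for (j in seq_len(ncol(dfb))) {
  plot(dfb[, j], type = "h", xlab = "Observation", ylab = "DFBETAS",
       main = colnames(dfb)[j])
  abline(h = c(-cutoff, cutoff), lty = 2)
}
par(op)
```

Looking only at the flagged observations keeps the number of plots worth
studying small, rather than plotting every pair of variables.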
Another elementary question.
I have a 'y' variable but no x1 !!
Code:
hw10dat <- read.table("hw10data.dat", header=TRUE, row.names=2)
attach(hw10dat)
help?
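One possible cause, assuming x1 is the second column of hw10data.dat (the
file's layout isn't shown here): row.names=2 tells read.table() to use
column 2 as the row labels, which removes it from the variables. A small
sketch with a made-up file:

```r
# Write a small hypothetical file with columns y, x1, x2
tmp <- tempfile(fileext = ".dat")
toy <- data.frame(y  = c(1.2, 3.4, 5.6),
                  x1 = c(10, 20, 30),
                  x2 = c(0.1, 0.2, 0.3))
write.table(toy, tmp, row.names = FALSE, quote = FALSE)

# row.names = 2 uses column 2 (here x1) as row labels and drops it
with.rn <- read.table(tmp, header = TRUE, row.names = 2)
print(names(with.rn))      # x1 is no longer among the columns
print(rownames(with.rn))   # its values are now the row names

# Without row.names, every column is kept as a variable
no.rn <- read.table(tmp, header = TRUE)
print(names(no.rn))
```

If that is what happened, dropping the row.names argument (or pointing it
at a column that really holds labels) should bring x1 back.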
Hi, folks. As you've noticed, FAS is having some trouble with the ICE
cluster. Right now, the server that's been up the longest is ice1, and
it's been up for less than two consecutive days. It seems that this
problem is related to the operation of Java on the ICE cluster, which some
class needs at this point. FAS is working to correct the problem.
In the meantime, be sure to save your .tex, .R, and other files
frequently (C-x C-s in Emacs). Feel free to continue to leave them open
when you log off, as usual, but be sure they are saved. If you have any
questions about restarting your VNC session, or getting your XDesktop
looking the way you want it to, please don't hesitate to email. Email to
the list is most efficient, since many people may have the same questions.
I also encourage using the script to connect, as it automatically creates
a session if you need it to. Of course, using SecureCRT and "vncserver"
is perfectly fine as well, but remember to log off and log back on after
you establish the appropriate Port Forwarding.
Best,
Ryan
------------------------------------------
Ryan T. Moore ~ Government & Social Policy
Ph.D. Candidate ~ Harvard University
Homepage: http://www.people.fas.harvard.edu/~rtmoore/
Gov1000: http://www.courses.fas.harvard.edu/~gov1000/