Actually, please send this to the entire list.
Homeworks are collaborative both within and across study groups. I think that
Nirmala has created the right dataset, and the more she can help others
make their own datasets, the better off the class will be.
"From each according to her abilities, to each according to her need."
-- Ayn Rand
;-)
Dave
Olivia Lau writes:
> Nirmala,
>
> Can you please send me summary(W)? It seems that I have no idea what the
> dataset is supposed to look like. Thanks,
>
> Olivia.
>
> On Tue, 3 Dec 2002 dkane(a)latte.harvard.edu wrote:
>
> > ravishan(a)fas.harvard.edu writes:
> > > So, I tried what you suggested and got some results, but I am not sure
> > > how to interpret them. The coefficient for the interaction term
> > > (incumb*dwin) was 0.017. How does this one number tell us whether the
> > > incumbency effect differs for the two parties?
> > >
> > > Thanks,
> > > Nirmala
> > >
> > >
> > >
> > >
> > > > reg2 <- lm(dpct ~ dpct.old + dwin + incumb + incumb*dwin, data = W)
> > > > summary(reg2)
> > >
> > > Call:
> > > lm(formula = dpct ~ dpct.old + dwin + incumb + incumb * dwin,
> > > data = W)
> > >
> > > Residuals:
> > > Min 1Q Median 3Q Max
> > > -0.3749963 -0.0481869 -0.0006044 0.0510622 0.3483169
> > >
> > > Coefficients:
> > > Estimate Std. Error t value Pr(>|t|)
> > > (Intercept) 0.143481 0.008223 17.449 < 2e-16 ***
> > > dpct.old 0.731561 0.016639 43.967 < 2e-16 ***
> > > dwin -0.037324 0.008766 -4.258 2.13e-05 ***
> > > incumb 0.039148 0.005379 7.277 4.40e-13 ***
> > > dwin:incumb 0.017972 0.008352 2.152 0.0315 *
> > > ---
> > > Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> > >
> > > Residual standard error: 0.0797 on 2810 degrees of freedom
> > > Multiple R-Squared: 0.7338, Adjusted R-squared: 0.7334
> > > F-statistic: 1937 on 4 and 2810 DF, p-value: < 2.2e-16
> >
> > Excellent question! In many ways, interpreting this output is at the very heart
> > of the course. Here are some thoughts. I would be eager to hear how other
> > people look at this. Reasonable people (read: Gary and I) sometimes disagree
> > about the best way to look at stuff like this.
> >
> > 1) Note that the number of observations looks more or less correct. (I can tell
> > that by looking at the degrees of freedom.)
> >
> > 2) Note how the residual standard error matches Gary's claim that it is
> > always around 8%. In other words, for a given set of right hand side
> > variables, your prediction of the vote will be within plus/minus 16% of the
> > actual results 95% of the time. Whether you consider this to be good or bad
> > depends on your purpose.
> >
> > 3) Note that the coefficient of dpct.old, a "control" variable (a variable
> > for which you are *not* trying to estimate a causal effect), is
> > plausible. I am not surprised that districts (regardless of whether the
> > election occurred in 1920 or 1990) in which the Democrats did well last time
> > are also districts in which the Democrats did well this time.
> >
> > 4) As an exercise (please send your answers to the list!), you should think
> > about what 0.73 means as the coefficient of dpct.old. Would you be
> > surprised if the coefficient were 1.2?
> >
> > 5) The coefficient of dwin seems less sensible. If we were playing Gary's
> > betting game and the goal was for you to guess the democratic percentage in
> > this election (left hand side variable) after Gary told you the values for
> > the right hand side variables, I would certainly have thought that you should
> > guess *higher* if the Democrats won the last election, not lower. One way to
> > explore this is to see how the regression results look (please show the list
> > in a separate e-mail) without the interaction term of dwin:incumb. A lot of
> > times, including an interaction term can make the "main" effect swap signs.
> >
> > 6) The interpretation of the key causal effect (incumbency) is now tricky
> > because incumb appears in two terms, alone and with dwin. Again, Neter
> > chapter 11 does a great job of walking through this slowly. To do it
> > ourselves, we need to know how you coded incumb and dwin. (I assume you did
> > it the standard way, but this e-mail is long enough already.) You (or anyone
> > else in the class) should send this info, along with a repeat of the summary
> > output, to the list for further discussion.
> >
> > 7) I am also eager to read and provide improvements to others' answers for
> > the various subparts of the questions. Again, for the homeworks, you are
> > encouraged to send your output and your proposed response to a given
> > question to the list.
> >
> > Dave
> >
> >
> > --
> > David Kane
> > Lecturer In Government
> > 617-563-0122
> > dkane(a)latte.harvard.edu
> > Please avoid sending me Word or PowerPoint attachments.
> > See http://www.fsf.org/philosophy/no-word-attachments.html
> > _______________________________________________
> > gov1000-list mailing list
> > gov1000-list(a)fas.harvard.edu
> > http://www.fas.harvard.edu/mailman/listinfo/gov1000-list
> >
>
>
--
David Kane
Lecturer In Government
617-563-0122
dkane(a)latte.harvard.edu
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
Hello,
In footnote 10 of the GK article with the model we're supposed to use for
1E...
What does "significant difference" mean? I mean, clearly, they must be
different by some amount, because one adds the gammas and the other
subtracts gamma1 from gamma0. My difference is about 1.5%. Is this
significant? What do I do to find out whether it's significant?
Thanks,
Olivia
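For what it's worth, the two quantities in question are (gamma0 + gamma1) and (gamma0 - gamma1), so their difference is 2*gamma1, and testing whether that difference is significant reduces to testing gamma1 itself. Here is a minimal sketch in R using a toy regression, since the GK model is not reproduced here (all names below are illustrative, not from the article):

```r
# Toy illustration: testing whether (b0 + b1) differs from (b0 - b1)
# is the same as testing 2*b1 against zero; the factor of 2 cancels
# out of the t statistic.
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- 0.5 + 0.3 * d$x + rnorm(100)
fit <- lm(y ~ x, data = d)

b1  <- coef(fit)["x"]
se1 <- sqrt(vcov(fit)["x", "x"])
t.stat  <- (2 * b1) / (2 * se1)                 # identical to b1 / se1
p.value <- 2 * pt(-abs(t.stat), df = df.residual(fit))
```

So if gamma1's own t statistic is significant in the summary() output, the two quantities differ significantly.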
This is fine for the list.
Hmmmmm.
Nothing jumps out at me. Show us the key sections of the transcript: the call
to lm, the creation of the new data, the call to predict, and the first few
rows of what predict returns.
That is, don't give us lots of output, just the key stuff.
Make sure to provide a print of q1bkm.
I think that there is some naming issue between the call and the new data.
Dave
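To illustrate the kind of naming issue Dave suspects: when a model is fit with clean8$-prefixed terms, the coefficient names do not match the column names in newdata, so predict() quietly ignores newdata and returns the in-sample fitted values. A sketch of the fix, assuming a data frame clean8 with the columns Dan describes:

```r
# Fit with data = clean8 so the model terms are plain column names:
q1bkm <- lm(dempct.10 ~ dempct.08 + demwin.08 + incum.10, data = clean8)

# Now newdata's names match the model's terms, and predict() returns
# exactly one row for the one hypothetical district:
ndt5 <- data.frame(dempct.08 = 0.45, demwin.08 = 0, incum.10 = 1)
predict(q1bkm, newdata = ndt5, interval = "prediction", level = 0.9)
```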
dhopkins(a)fas.harvard.edu writes:
> Dear Dave,
>
> I am sending this to you individually, just on the off-chance that you wouldn't
> want the summarized data below out to the list. I am happy to resend this to
> the list assuming that is OK.
>
> The code that I used for the troublesome lm object is:
>
> q1bkm <- lm(clean8$dempct.10 ~ clean8$dempct.08 + clean8$demwin.08 +
>     clean8$incum.10)
>
> "Clean 8" is my cleaned data set, summarized as follows:
>
> > summary(clean8)
> state dist incum.10 dem.10
> Min. : 1.00 Min. : 1.00 Min. :-1.00000 Min. : 5131
> 1st Qu.:14.00 1st Qu.: 3.00 1st Qu.:-1.00000 1st Qu.: 32642
> Median :24.00 Median : 7.00 Median : 0.00000 Median : 56394
> Mean :31.25 Mean :11.46 Mean :-0.04879 Mean : 61361
> 3rd Qu.:47.00 3rd Qu.:14.00 3rd Qu.: 1.00000 3rd Qu.: 81474
> Max. :82.00 Max. :98.00 Max. : 1.00000 Max. :1237409
>
> rep.10 incum.08 dem.08 rep.08
> Min. : 2589 Min. :-1.00000 Min. : 0 Min. : 0
> 1st Qu.: 35543 1st Qu.:-1.00000 1st Qu.: 29171 1st Qu.: 29063
> Median : 57605 Median : 0.00000 Median : 53721 Median : 55055
> Mean : 63009 Mean :-0.05473 Mean : 58678 Mean : 59813
> 3rd Qu.: 82328 3rd Qu.: 1.00000 3rd Qu.: 77871 3rd Qu.: 78590
> Max. :1447154 Max. : 1.00000 Max. :1455972 Max. :1342409
> NA's : 8 NA's : 6
> dempct.10 dempct.08 demwin.10 demwin.80
> Min. :0.3003 Min. :0.0000 Min. :0.0000 Min. :-1.00000
> 1st Qu.:0.4108 1st Qu.:0.4037 1st Qu.:0.0000 1st Qu.:-1.00000
> Median :0.4886 Median :0.4890 Median :1.0000 Median :-1.00000
> Mean :0.4931 Mean :0.4993 Mean :0.5049 Mean :-0.07666
> 3rd Qu.:0.5736 3rd Qu.:0.5825 3rd Qu.:1.0000 3rd Qu.: 1.00000
> Max. :0.6998 Max. :1.0000 Max. :1.0000 Max. : 1.00000
> NA's :9.0000 NA's :6.0000 NA's : 9.00000
> demwin.08
> Min. :0.0000
> 1st Qu.:0.0000
> Median :0.0000
> Mean :0.4617
> 3rd Qu.:1.0000
> Max. :1.0000
> NA's :9.0000
>
> Many thanks for your help in figuring this out.
>
> Best,
> Dan
>
>
> Quoting dkane(a)latte.harvard.edu:
>
> > dhopkins(a)fas.harvard.edu writes:
> > > Dear Dave and classmates,
> > >
> > > Many thanks for the tip. I checked the email archive, and have copied
> > > the old dialogue below. Unfortunately, even removing the "c" does not
> > > produce a single predicted value. Here's my new code:
> > >
> > > > ndt5 <- data.frame(dempct.08 = .45, demwin.08 = 0, incum.10 = 1)
> > > > ndt5
> > > dempct.08 demwin.08 incum.10
> > > 1 0.45 0 1
> > > > predict(q1bkm, newdata = ndt5, int = "p", level = .9)
> > > fit lwr upr
> > > 1 0.4093867 0.3006464 0.5181271
> > > 2 0.4378359 0.3290813 0.5465904
> > > 3 0.4662118 0.3573642 0.5750593
> > > 4 0.4153382 0.3065995 0.5240770
> > > 5 0.4330430 0.3242947 0.5417913
> > > (again, it goes on for the thousands of available values)
> > >
> > > I will continue to experiment, but certainly welcome advice as to how to
> > > proceed, or any thoughts on what I am doing wrong. Again, many thanks!
> >
> > Show us the code that made q1bkm. There is something fishy here . . .
> >
> > Dave
> >
> > --
> > David Kane
> > Lecturer In Government
> > 617-563-0122
> > dkane(a)latte.harvard.edu
> > Please avoid sending me Word or PowerPoint attachments.
> > See http://www.fsf.org/philosophy/no-word-attachments.html
> >
>
>
>
>
>
--
David Kane
Lecturer In Government
617-563-0122
dkane(a)latte.harvard.edu
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
Dear Colleagues,
I am having trouble doing the matrix manipulation, perhaps someone can spot my
error. I load the MASS library, but I still can't manage to use the "ginv"
command.
> library(MASS)
> ginv(clean8$demwin.08)
Error in svd(X) : NA/NaN/Inf in foreign function call (arg 1)
However, the transpose function DOES work, indicating that the problem is with
ginv/MASS and not with my underlying data.
> betatest <- t(clean8$demwin.08)
> dim(betatest)
[1] 1 2357
As always, helpful hints are helpful indeed. Many thanks.
Best,
Dan
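A possible culprit, for what it's worth: svd(), which ginv() calls internally, cannot handle missing values, and the summary in the earlier message shows demwin.08 has NA's, while t() never inspects the values and so succeeds regardless. A sketch of a check and workaround, assuming the same clean8 data frame:

```r
library(MASS)

# How many missing values does the column have?
sum(is.na(clean8$demwin.08))

# Drop the NAs (and make it an explicit matrix) before calling ginv:
x <- as.matrix(na.omit(clean8$demwin.08))
ginv(x)
```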
When you say calculate the quantities by hand, how do you want us to
present that? I.e., AB_{1,1} = 3*1 = 3, etc., in LaTeX? That seems really
tedious and time-consuming. Could we submit a handwritten appendix
instead?
My impression is that what you're asking is to make sure we understand how
to add, subtract, and multiply matrices, their transposes, and their
inverses, and not just rely on R or a TI-83; while it may not look as
polished, I'd rather write it out by hand and spend the saved time on
trying to get my R code for Q1 right than typing up every little
multiplied value.
Thanks,
~T
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tiffany C. Nagano
Harvard College '05
nagano(a)fas.harvard.edu
313 Mather House Mail Center
(617) 493-7370
Dear Dave,
Question on #2: P_8 and v_8 are highly correlated. Specifically, P_8 is
always 1 when v_8 is high (i.e., if the Democratic % of the vote is high, the
Democrat wins!), and vice versa. So it is impossible to match on extreme
values like P_8 = 0 & v_8 = (high bin).
How can we match on these variables?
I can think of writing 2 separate loops for the two values of P_8 and
constraining v_8 to the corresponding range, but does this method satisfy the
philosophy of matching??
Thanks,
Phillip.
-------------------------------------------------
Phillip Y. Lipscy
Perkins Hall Room #129
35 Oxford Street
Cambridge, MA 02138
(617)493-4893
lipscy(a)fas.harvard.edu
Ph.D. Candidate
Harvard University, FAS, Department of Government
-------------------------------------------------
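One way to sketch the two-loop idea Phillip describes without literally writing two loops (the variable and data frame names below are hypothetical): split the data on P_8 and bin v_8 only within each stratum. That is just stratified matching, and it stays within the philosophy of matching, since units are only ever compared inside strata where both values of P_8 actually occur.

```r
# Hypothetical sketch: stratify on P.8, then bin v.8 within each stratum.
strata <- split(dat, dat$P.8)
binned <- lapply(strata, function(s) {
  s$v.bin <- cut(s$v.8, breaks = 10)   # bins defined within the stratum
  s
})
```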
Hi all,
When our model includes the incumbency variable and an interaction variable
(incumbency*year): should we calculate the total effect of incumbency in, say,
1920 by adding the (general) incumbency coefficient to that of the interaction
variable for that specific year?
Thanks,
Asif
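Assuming the standard factor coding, that is the usual interpretation: the effect of incumbency in a given year is the main incumbency coefficient plus that year's interaction coefficient. A sketch (the coefficient names below are hypothetical and should be matched against your own summary() output):

```r
# Hypothetical names; check names(coef(fit)) for your actual labels.
b <- coef(fit)
effect.1920 <- b["incumb"] + b["incumb:year1920"]
```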
Gary, Dave and Tao,
For Q4(b), are we supposed to make up some diagonal matrix for A and any
matrix for B and then calculate AB and AA? Or is there some formula for a
diagonal matrix, like the one we have for the identity matrix?
yongwook
-----------------------------
Yongwook Ryu
PhD Candidate
Department of Government
Harvard University
Tel:617-493-3397
Email: yryu(a)fas.harvard.edu
-----------------------------
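Either reading can be checked numerically in a couple of lines; there is no special formula beyond the fact that multiplying by a diagonal matrix rescales rows (or, for AA, squares the diagonal entries). A sketch with made-up matrices:

```r
A <- diag(c(2, 3, 5))         # a made-up 3x3 diagonal matrix
B <- matrix(1:9, nrow = 3)    # any conformable matrix

A %*% B   # row i of B scaled by the i-th diagonal entry of A
A %*% A   # diagonal matrix whose entries are the squares: 4, 9, 25
```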
1d:
When you say "something like 'democratic' and 'republican'", I take this to
mean changing 0 to -1? Or is there a different interpretation?
1f:
When we set up the factor variable, should we just use the year (i.e., 1910,
1920, etc.), or should we use numbers like t = 1, 2, 3, 4...? 1d implies there
won't be a difference, but just to make sure. I recall trying this for the
midterm and getting really weird values when I used the year numbers, esp. for
the interaction effect. Is the ordinal ranking all that matters?
2b:
What exactly are you looking for here? The question seems to answer itself in
the parentheses. I'm not sure I understand the wording.
Thanks,
Phillip.
-------------------------------------------------
Phillip Y. Lipscy
Perkins Hall Room #129
35 Oxford Street
Cambridge, MA 02138
(617)493-4893
lipscy(a)fas.harvard.edu
Ph.D. Candidate
Harvard University, FAS, Department of Government
-------------------------------------------------
Dear Colleagues,
When I use the function "predict", I expect (following the answer key from
homework 4, pg. 2) to get a single set of three values: "fit", "lower",
and "upper". But instead, I get a value for every data point:
> predict(q1bkm, newdata = data.frame(c(dempct.08 = .45, demwin.08 = 0,
incum.10 = 1)), int = "p", level = .9)
fit lwr upr
1 0.4093867 0.3006464 0.5181271
2 0.4378359 0.3290813 0.5465904
3 0.4662118 0.3573642 0.5750593
4 0.4153382 0.3065995 0.5240770
(it runs through a few more thousand...)
Any thoughts on how to get it to predict a single value once I have specified
values of each dep. variable?
Best,
Dan