Hi Everyone,
The Ornstein data problem on problem set 8 requires you to deal with
right-hand-side (rhs) variables that are categorical, what are called
"factors" in R. The way to deal with such variables is to create a
series of dummy variables that allow the expected value of y to shift
across the different categories of the categorical variable.
Suppose you have a categorical variable X that has 3 categories (A, B,
and C). Assuming there is an intercept in the model, the idea is to
use 2 dummy variables: one that is equal to 1 whenever X = B (0
otherwise) and another that is equal to 1 whenever X = C (0
otherwise). The coefficient on the dummy for B tells you how much
higher the mean of y is in category B than in the baseline category
(here X = A), holding the other independent variables in the model
constant. The coefficient on the C dummy can be interpreted similarly.
In general, if your X has k categories you will use k-1 dummy variables.
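As a rough sketch of the idea, the two dummies for a 3-category X
could be built by hand as follows. The data and the names dummy_B and
dummy_C are made up purely for illustration; they are not part of the
problem set.

```r
# Hypothetical categorical variable with k = 3 categories.
# A is treated as the baseline category.
X <- c("A", "B", "C", "A", "B", "C")

# k - 1 = 2 dummy variables: 1 when X is in the category, 0 otherwise.
dummy_B <- as.numeric(X == "B")
dummy_C <- as.numeric(X == "C")

# Rows with X = A have both dummies equal to 0.
print(data.frame(X, dummy_B, dummy_C))
```

Observations in the baseline category are identified by having a 0 on
every dummy, which is why only k-1 dummies are needed when the model
has an intercept.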
One of the *really* nice things about R is that if a categorical
variable is coded as a factor, R will automatically create the
appropriate dummy variables for you. Take a look at the following
highly contrived example:
> x <- factor(c("Pat", "Pat", "Pat", "Pat", "Sam", "Sam", "Sam",
+               "Sam", "Chris", "Chris", "Chris", "Chris"))
> y <- c(5, 5, 5, 5, 12, 12, 12, 12, 30, 30, 30, 30)
> data.frame(y,x)
    y     x
1   5   Pat
2   5   Pat
3   5   Pat
4   5   Pat
5  12   Sam
6  12   Sam
7  12   Sam
8  12   Sam
9  30 Chris
10 30 Chris
11 30 Chris
12 30 Chris
> lm.out <- lm(y~x)
> summary(lm.out)
Call:
lm(formula = y ~ x)
Residuals:
       Min         1Q     Median         3Q        Max
-1.634e-14  7.456e-31  1.140e-30  1.534e-30  1.717e-14
Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  3.000e+01  4.163e-15  7.206e+15   <2e-16 ***
xPat        -2.500e+01  5.888e-15 -4.246e+15   <2e-16 ***
xSam        -1.800e+01  5.888e-15 -3.057e+15   <2e-16 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 8.327e-15 on 9 degrees of freedom
Multiple R-Squared: 1,  Adjusted R-squared: 1
F-statistic: 9.596e+30 on 2 and 9 DF,  p-value: < 2.2e-16
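R orders factor levels alphabetically by default, so Chris is the
baseline category here and the intercept (30) is the mean of y for
Chris. Our results tell us that the average level of y is 25 units
lower for Pat than for Chris, and 18 units lower for Sam than for
Chris. We can verify this by looking at the original data.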
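One way to do that check, as a minimal sketch, is to compute the
group means directly and compare their differences with the fitted
coefficients. The names group.means, diff.pat, diff.sam, and fit are
introduced here only for illustration.

```r
# Same contrived data as in the example above.
x <- factor(c(rep("Pat", 4), rep("Sam", 4), rep("Chris", 4)))
y <- c(rep(5, 4), rep(12, 4), rep(30, 4))

# Mean of y within each category of x.
group.means <- tapply(y, x, mean)
print(group.means)

# Differences from the baseline category (Chris).
diff.pat <- unname(group.means["Pat"] - group.means["Chris"])  # -25
diff.sam <- unname(group.means["Sam"] - group.means["Chris"])  # -18

# The dummy coefficients should match these differences
# (up to floating-point error).
fit <- lm(y ~ x)
print(all.equal(unname(coef(fit)["xPat"]), diff.pat))
print(all.equal(unname(coef(fit)["xSam"]), diff.sam))
```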
Alison will also talk about dummy variables a bit today in section.
Hope this helps.
Best,
Kevin
------------------------------------------------------
Kevin Quinn
Assistant Professor
Department of Government and
Center for Basic Research in the Social Sciences
34 Kirkland Street
Harvard University
Cambridge, MA 02138
Hi Everyone,
Problem set 8 is now available on the course website.
Best,
Kevin
Hi Everyone,
I have substantially revised the slides for today's lecture. They are
now available on the course website.
Best,
Kevin
How can I change it? What are my options? I have a table that does not
fit, so I wanted to make the font smaller....
I consulted Not-so-short and tried the:
\small
etc commands and they didn't work.
Any ideas?
Thanks! Becky :)
From the homework so far, it seems that we haven't HAD to do anything in
linear.hypothesis that we couldn't have done in anova. is this right? if
so, what is it that linear.hypothesis CAN do that anova can't, and if
not, where have I gone drastically wrong on the problem set?
thanks
Lucy
a question -
are we testing whether Black and Hisp are simultaneously equal to zero, or
whether each one might be equal to zero? does the second half of 7d refer
to the two tests made in question 7 all together (7b/c and one from 7d) or
the two tests from 7d (assuming the second interpretation of my first
question?). Clearly the answer to my second question relies on the answer
to my first.
Confusingly, and syntactically quite shockingly poor as to the way this
was written,
matt
I'm trying to format a document where I'd like the page number to be in
the upper right hand corner, prefaced by my name. Does anyone know an
easy way to do this?
andy