Dear teachers,
Can you please contrast how different measures describe our level of confidence
in an estimated coefficient (say, psi), and when each should be used? The
measures I am thinking of include:
1. standard error of psi in a single year
2. standard error of mean psi (averaged over many years)
3. 95% confidence interval for psi in a single year
4. 95% confidence interval for mean psi (averaged over many years)
5. variance of psi in a single year
6. variance of mean psi (aggregated over all years)
7. any other important measures for confidence that I'm forgetting...
For example, on the second midterm I calculated and plotted a separate
confidence interval for psi in each year, but I gather that the top 7 exams did
not all do this. Why not? Now that I'm starting to get the hang of running a
regression, I'm trying to get a better handle on describing and interpreting
the reliability of my results.
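To make the question concrete, here is the kind of single-year calculation I have been doing (the data and the coefficient name below are just placeholders, not anything from the course data):

```r
## Placeholder example: "fit" stands in for a single year's regression,
## and "psi" for the coefficient of interest.
set.seed(1)
dat <- data.frame(y = rnorm(50), psi = rnorm(50))
fit <- lm(y ~ psi, data = dat)
tab <- coef(summary(fit))
est <- tab["psi", "Estimate"]      # point estimate of psi
se  <- tab["psi", "Std. Error"]    # measure 1: single-year standard error
ci  <- est + c(-1, 1) * qt(0.975, df = fit$df.residual) * se  # measure 3
```

Is this the right building block, with measures 2, 4, and 6 then coming from treating the per-year estimates as data in their own right?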
These questions about interpreting confidence in results touch on the
substantive interpretation asked for in HW 7, part 1(b). Since there was no
answer provided for 1(b), explanation of this interpretative stuff would be
particularly helpful.
Thanks,
Anna
--
Anna Lorien Nelson
Department of Government,
Harvard University
alnelson(a)fas.harvard.edu
Is there a way to set na.strings within a dataframe I've already loaded?
I'm dealing with some data that has different NA values for each variable
so I need to set it individually.
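Something like the following per-variable recoding is what I am after (the data frame and missing-value codes below are made up):

```r
## Made-up example: each variable uses a different missing-value code.
dat <- data.frame(income = c(40000, -99, 52000),
                  age    = c(34, 999, 51))
dat$income[dat$income == -99] <- NA   # -99 marks missing income
dat$age[dat$age == 999]       <- NA   # 999 marks missing age
```

Is there a cleaner way than doing this one variable at a time?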
Any help would be appreciated.
Thanks,
Andrew
The following question got "lost" in problem set 8.
\item One of the purposes of GOV 1000 is to teach you how to express
your quantitative ideas, both in words and in pictures. For the
latter, \texttt{R}'s ``lattice'' graphics are a very powerful tool.
You can find a very nice introduction to lattice graphics in the June
2002 edition of ``R News,'' available at:
http://cran.r-project.org/doc/Rnews. Using lattice, create a figure
that shows the histogram of the percentage Democratic vote across
districts for each ``0'' year from 1900 through 1990. In other
words, you will be showing 10 individual histograms, one for each
decade-ending year. Make the figure legible and visually appealing.
Of course, this is not a particularly interesting figure, but we
wanted to introduce you to lattice in the simplest way possible. If
you are feeling adventurous, you are free to provide something more
interesting. One ambitious choice would be to show ten scatter plots
(one for each decade year) and fitted regression lines of Democratic
percentage of the vote versus lagged percentage, as GK do in their
Figure 1.
This matters because we expect you to be able to use lattice graphics
on the final. Tao will be updating the answers for problem set 8 in due
time. In the meantime, if anyone wanted to give this a shot for the
class list, I know that Tao would appreciate it.
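To fix ideas, a call of roughly this shape is all that is involved (the data here are fake placeholders, not the course election data):

```r
library(lattice)
## Fake placeholder data: ten "0" years, 50 fake districts each.
fake <- data.frame(year  = rep(seq(1900, 1990, by = 10), each = 50),
                   dperc = runif(500))
## One histogram panel per decade-ending year.
histogram(~ dperc | factor(year), data = fake,
          layout = c(5, 2), type = "percent",
          xlab = "Democratic percentage of the vote")
```

The real figure should of course use the actual Democratic-percentage variable, with labels and layout chosen to make it legible.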
Dave
--
David Kane
Lecturer in Government
617-563-0122
dkane(a)latte.harvard.edu
Does anyone happen to know a good definition of the "simultaneity issue"? It
comes up in a few of the readings. And how is it related to the endogeneity
problem?
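To make the question concrete, the kind of system I have in mind is the textbook supply-and-demand pair (my own notation, not any one reading's):

```latex
q_i = \alpha + \beta p_i + u_i   % demand
p_i = \gamma + \delta q_i + v_i  % supply
```

Since p and q are determined jointly, solving the system makes p_i a function of u_i, so the regressor in the first equation is correlated with its own error term. Is that correlation what makes simultaneity a special case of the endogeneity problem?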
Many thanks.
Best,
Dan
A couple of people have asked for overviews of the material in the QR
33 handouts. Here are three good articles, all on JSTOR, that provide
an overview of the "right" way --- or at least the way that was
emphasized in this class --- for thinking about causal effects.
Although there are sections of each article that are somewhat
advanced, all are worth reading. The tough sections (e.g., Bayesian
stuff) involve material that many of you will see in GOV 2001.
Statistics and Causal Inference (in Theory and Methods)
Paul W. Holland
Journal of the American Statistical Association, Vol. 81, No. 396. (Dec., 1986), pp. 945-960.
Stable URL: http://links.jstor.org/sici?sici=0162-1459%28198612%2981%3A396%3C945%3ASACI…
Practical Implications of Modes of Statistical Inference for Causal Effects and the Critical Role of the Assignment Mechanism
Donald B. Rubin
Biometrics, Vol. 47, No. 4. (Dec., 1991), pp. 1213-1234.
Stable URL: http://links.jstor.org/sici?sici=0006-341X%28199112%2947%3A4%3C1213%3APIOMO…
Bayesian Inference for Causal Effects: The Role of Randomization
Donald B. Rubin
Annals of Statistics, Vol. 6, No. 1. (Jan., 1978), pp. 34-58.
Stable URL: http://links.jstor.org/sici?sici=0090-5364%28197801%296%3A1%3C34%3ABIFCET%3…
Apologies for not coming up with these earlier in the semester, but I
have only recently gained access to the wonder of JSTOR.
Dave
--
David Kane
Lecturer in Government
617-563-0122
dkane(a)latte.harvard.edu
Ryan Thomas Moore writes:
> Dave:
>
> I removed the few Rdata files (Palm and GG related stuff), restarted R,
> and I still don't get the ctest package when I start R in my
> ~/fall02/gov1000 directory. I still do get the ctest package when I start
> R in my ~ directory. Any other ideas on how I can use R in my gov1000
> directory, and still have ctest accessible?
Good question for the list.
cd to the ~/fall02/gov1000 directory
Type pwd
Type ls -al
Start R (from the command prompt)
Type ls() from the R prompt.
Type search() from the R prompt.
Copy and paste the output from all the above as an e-mail to the list.
Dave
> Thanks in advance,
> Ryan
>
> ------------------------------------------
> Ryan T. Moore ~ Government & Social Policy
> Ph.D. Candidate ~ Harvard University
>
> On Sun, 24 Nov 2002, Dave Kane wrote:
>
> > Good question for the list.
> >
> > Hmmm.
> >
> > By deduction, there must be something different about the two
> > directories. You can read about the messy details about how R starts
> > up here:
> >
> > > help(.Rprofile)
> >
> >
> > My best guess is that you have a messed up .Rdata file (or .Rprofile)
> > in your ~/fall02/gov1000 directory. Look for them (or anything else
> > weird) with ls -a. Delete them if you find them. Then try
> > restarting. That should work.
> >
> > Dave
> >
> > Ryan Thomas Moore writes:
> > > Dave:
> > >
> > > I restarted R, and within my ~/fall02/gov1000 directory, I still don't
> > > have the ctest package. But, if I start R in my home (/rmoore) directory,
> > > the package:ctest does appear. I'd rather use R in the ~/fall02/gov1000
> > > directory if I can, but if not, I'm ok with using it in the home
> > > directory. In short, I have a viable solution, but is there any way I can
> > > get the package:ctest to appear in the /gov1000 directory I've created?
> > >
> > > Thanks!
> > > Ryan
> > >
> > > ------------------------------------------
> > > Ryan T. Moore ~ Government & Social Policy
> > > Ph.D. Candidate ~ Harvard University
> > >
> > >
> >
> > --
> > David Kane
> > Lecturer in Government
> > 617-563-0122
> > dkane(a)latte.harvard.edu
> >
>
>
--
David Kane
Lecturer In Government
617-563-0122
dkane(a)latte.harvard.edu
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
Dear Dave & Gary,
I would like to see at least one example from you that shows us, in the kind of
language you expect us to use on the exam and in scholarly articles (and no
hand-waving!): 1. how to interpret coefficients (i.e., such that reasonable
people can agree the interpretation is good or at least OK), and 2. how to
convey to the reader why we chose a particular model. Otherwise, I think it is
unreasonable to expect us to use such language on the exam.
So, for 1., suppose through some legitimate process we obtained the following
model as summarized below. Could you give us an example of how a *good,
reasonable scholar* could interpret the coefficients in language suitable for
publication in a major academic journal? (If you think this is a really bad
model, feel free to use another one instead).
Thanks,
Phillip.
> summary(arf5)
Call:
lm(formula = dperc ~ dwin.lag2 + dwin.lag6 + dperc.lag4 + dperc.lag2 +
incumb, data = dog)
Residuals:
Min 1Q Median 3Q Max
-0.351964 -0.044175 0.003146 0.048644 0.361068
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.118519 0.008934 13.266 < 2e-16 ***
dwin.lag2 -0.039145 0.009203 -4.254 2.18e-05 ***
dwin.lag6 -0.025457 0.004857 -5.241 1.74e-07 ***
dperc.lag4 0.320400 0.022001 14.563 < 2e-16 ***
dperc.lag2 0.505499 0.023522 21.490 < 2e-16 ***
incumb 0.049977 0.004320 11.569 < 2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 0.07459 on 2341 degrees of freedom
Multiple R-Squared: 0.7453, Adjusted R-squared: 0.7448
F-statistic: 1370 on 5 and 2341 DF, p-value: < 2.2e-16
-------------------------------------------------
Phillip Y. Lipscy
Perkins Hall Room #129
35 Oxford Street
Cambridge, MA 02138
(617)493-4893
lipscy(a)fas.harvard.edu
Ph.D. Candidate
Harvard University, FAS, Department of Government
-------------------------------------------------
Here are some brief comments on the readings.
1) There are 2 methodology readings (Leamer and McCloskey). The Leamer article
is the best methodology article I have ever read --- and I read a lot of
methodology. Both are valuable, both for framing the discussion on Monday
and for use in your final exams.
2) The other three articles are somewhat verbose. In an ideal world, you would
all have time to read all three very closely. On the off chance that we are
not living in the best of all possible worlds, despite what Dr. Pangloss
tells me, you should (obviously) focus closely on the article that you are
expected to critique. Reading the abstract, introduction, conclusion and
regression result sections of the other two articles will give you enough
background to follow the discussion.
3) The articles occasionally mention things (3 stage least squares, probit, and
so on) that we have not mentioned in class. That's OK. Feel free to skip
those parts.
4) If you have any questions or comments on the papers over the weekend,
please send them to the list. Example: Why do Alt et al. misuse
"multicollinearity"? I think it would be useful to get the discussion going
before Monday.
Back to tending my garden,
Dave
--
David Kane
Lecturer In Government
617-563-0122
dkane(a)latte.harvard.edu
Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html
Dear All,
About 1b: I'm not sure what to make of the correlation between P_8 and v_8.
Following your example on the listserv, we've run several regressions with
different combinations of the variables, and we've found that the coefficients
vary considerably. But given that these are both controls that we don't really
care about, are we mainly concerned with what happens to incumb when we do
this? In that case, though, taking out either of the variables merely gives us
biased results (G/K), right? So what are we looking for?
-Phillip.
> arf <- lm(dperc ~ dperc.lag2 + dwin.lag2 + incumb, data = dog)
> summary(arf)
Call:
lm(formula = dperc ~ dperc.lag2 + dwin.lag2 + incumb, data = dog)
Residuals:
Min 1Q Median 3Q Max
-3.809e-01 -4.764e-02 -3.825e-06 4.994e-02 2.996e-01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.158014 0.008646 18.275 < 2e-16 ***
dperc.lag2 0.719269 0.018685 38.495 < 2e-16 ***
dwin.lag2 -0.038345 0.009417 -4.072 4.82e-05 ***
incumb 0.049108 0.004507 10.897 < 2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 0.07789 on 2343 degrees of freedom
Multiple R-Squared: 0.722, Adjusted R-squared: 0.7216
F-statistic: 2028 on 3 and 2343 DF, p-value: < 2.2e-16
> cor(dog$dwin.lag2, dog$dperc.lag2)
[1] 0.8051607
> summary(lm(dperc ~ dperc.lag2 + incumb, data = dog))
Call:
lm(formula = dperc ~ dperc.lag2 + incumb, data = dog)
Residuals:
Min 1Q Median 3Q Max
-3.969e-01 -4.835e-02 -2.247e-05 5.131e-02 3.145e-01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.156361 0.008666 18.04 <2e-16 ***
dperc.lag2 0.685660 0.016819 40.77 <2e-16 ***
incumb 0.034132 0.002613 13.06 <2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 0.07815 on 2344 degrees of freedom
Multiple R-Squared: 0.72, Adjusted R-squared: 0.7198
F-statistic: 3014 on 2 and 2344 DF, p-value: < 2.2e-16
> summary(lm(dperc ~ dwin.lag2 + incumb, data = dog))
Call:
lm(formula = dperc ~ dwin.lag2 + incumb, data = dog)
Residuals:
Min 1Q Median 3Q Max
-0.3875738 -0.0686709 0.0004885 0.0651227 0.3589614
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.444149 0.005642 78.720 <2e-16 ***
dwin.lag2 0.121790 0.010792 11.285 <2e-16 ***
incumb 0.054079 0.005755 9.398 <2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 0.0995 on 2344 degrees of freedom
Multiple R-Squared: 0.5462, Adjusted R-squared: 0.5458
F-statistic: 1410 on 2 and 2344 DF, p-value: < 2.2e-16
Dear Colleagues,
One quick notational question about 3c. We are asked to derive beta_1IV and
beta_1IISLS. On page 2 of the handout from section, though, we learn that "the
IISLS estimator coincides with the IV estimator," which seems to tell us that
the two are in fact the same. Or do I misread? It seems clear that we are
going to do two regressions, producing a few different betas. But which beta
is the IISLS, and is the IISLS estimate of beta always equal to the IV
estimate?
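If I write out the just-identified case for myself (standard notation, which may not match the handout's), the two do seem to coincide:

```latex
% One endogenous regressor x, one instrument z (just-identified case).
\hat\beta_{IV}   = (z'x)^{-1} z'y
% 2SLS: first stage \hat{x} = z(z'z)^{-1}z'x; second stage regresses y on \hat{x}:
\hat\beta_{2SLS} = (\hat{x}'\hat{x})^{-1}\hat{x}'y
                 = \left[ x'z(z'z)^{-1}z'x \right]^{-1} x'z(z'z)^{-1}z'y
                 = (z'x)^{-1} z'y
```

So is the right reading that the equality holds exactly when the model is just identified, and that with more instruments than endogenous regressors the two can differ?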
Many thanks.
Best,
Dan