Hi guys!
I want to invite you to the wine and cheese tasting at CBRSS from 4-6. It will
be set up in the main lobby. Also, if you prefer beer and cheese, that could
be arranged :) Just ask.
Hope to see you there!
Marie
Hi, everyone. Having heard from several folks, I'm going to split the
difference between people's schedules and say that we'll do calculus
from 10.30am to 11.30am next Monday. I think it will be most helpful for folks
if we do the session before lecture Monday.
Sound ok?
Ryan
------------------------------------------
Ryan T. Moore ~ Government & Social Policy
Ph.D. Candidate ~ Harvard University
Homepage: http://www.people.fas.harvard.edu/~rtmoore/
Gov1000: http://www.courses.fas.harvard.edu/~gov1000/
Hi, all. I'm planning a vector calculus session of about one hour for
those who haven't done any calculus, other than maybe the Prefresher.
Would 10am Monday work for those who are interested?
Cheers,
Ryan
Hi guys!
Just a reminder that there will be a section today focusing on matrix
algebra and vector geometry. It is optional, though I recommend that
anyone who hasn't taken a course in linear algebra come. (Attendance was
low on Tuesday.)
See you soon!
Alison
> I am having a LaTeX problem. I can't get decent-looking paragraphs
> in my LaTeX document. Is there a proper syntax for indentation of
> paragraphs?
If you leave a blank line between paragraphs, then the first paragraph of a
section will not be indented, but the rest will, I believe. Another
command you might investigate in LaTeX is \indent. Let me know if neither
of these suffices!
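For instance, a minimal sketch (the \parindent length here is an arbitrary choice you can adjust):

```latex
\documentclass{article}
% \setlength{\parindent}{2em} % optional: change the indent width
\begin{document}
First paragraph; directly after a section heading this is not indented.

A blank line starts a new paragraph, which gets the usual indent.

\indent This forces an indent; \noindent suppresses one.
\end{document}
```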
Ryan
> > S_E is just a descriptive
> > measure that tells you how much variability there is around the
> > regression line.
>
> Ok, your email definitely helped clarify what S_E is trying to describe:
> variability around the regression line.
>
Cool.
> But I still don't understand what values tell us the variability is
> low or variability is high. Doesn't the S_E value depend on the
> y-values for our particular experiment? ie, the variability values
> do not mean the same thing for different data sets, unlike r^2,
> where the variability is always between 0 and 1.
S_E does depend on the sample variance of y in a particular dataset.
Because of this, you do not want to compare S_E across models that are
fit to different y variables. Note that, for the same reason, one
should also *NOT* compare R^2 across regression models that are fit to
different y variables. R^2 is simply the following:
R^2 = RegSS / TSS (p. 91 Fox97)
which is the same as
R^2 = Var(\hat{y}) / Var(y) (p. 58 of Achen)
Despite the standardized [0,1] scale, R^2 depends on the observed
variance of y.
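To make this concrete, here is a small sketch in Python/NumPy (the course itself uses R; the data below are simulated purely for illustration) showing that the two formulas agree, and that both are ratios involving the observed variance of y:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 1.5 * x + rng.normal(scale=2.0, size=100)

# OLS fit of y on x (slope = Cov(x, y) / Var(x))
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

# R^2 = RegSS / TSS
tss = np.sum((y - y.mean()) ** 2)
rss = np.sum((y - yhat) ** 2)
regss = np.sum((yhat - y.mean()) ** 2)
r2_a = regss / tss

# R^2 = Var(yhat) / Var(y) -- the same quantity, since both numerator
# and denominator are just the sums of squares divided by n
r2_b = np.var(yhat) / np.var(y)
```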
As a side note, the discussion on pp. 58-61 of Achen regarding R^2 is
very good.
> So how do we know what range of values for our experiment denote low
> vs high levels of variability? I recall on the section handout
> sheet, Alison said to compare it to the standard deviation of the y
> variable, range, min, max, etc???
Yep, exactly. Note the identity
RegSS = TSS - RSS
on p. 91 of Fox97.
Another way to write this is:
Var(\hat{y}) = Var(y) - Var(\hat{\epsilon})
or with some simple algebra
Var(y) = Var(\hat{y}) + Var(\hat{\epsilon})
in words, the sample variance of y is equal to the sample variance of
the fitted values plus the sample variance of the residuals.
Recall that S_E^2 is the sample variance of the residuals. Since
variances have to be nonnegative it has to be the case that S_E^2
falls somewhere in the range [0, Var(y)] and S_E has to be in the
range [0, SD(y)]. At one extreme, if S_E = 0, the regression
line fits the observed data perfectly (all residuals are 0); at the
other extreme, if S_E = SD(y), the slope coefficient is exactly 0,
so that all the fitted values are equal to the mean of y.
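A quick numerical check of this decomposition, sketched in Python/NumPy (the course uses R; the simulated data here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.0 + 0.8 * x + rng.normal(scale=1.5, size=200)

# OLS fit of y on x
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
resid = y - yhat

# Var(y) = Var(yhat) + Var(resid): the cross term vanishes because OLS
# residuals are uncorrelated with the fitted values
lhs = np.var(y)
rhs = np.var(yhat) + np.var(resid)

# S_E, defined here (as in the email) via the sample variance of the
# residuals, so it must land in [0, SD(y)]
s_e = np.std(resid)
```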
As a side note, recall the RFS plot from Cleveland. This plot is a
graphical depiction of the variance decomposition
Var(y) = Var(\hat{y}) + Var(\hat{\epsilon})
Hope this helps.
Best,
Kevin
Hi, everyone. Problem Set 5 has been posted to the course website. As
Kevin mentioned in lecture yesterday, PS5 will be due in lecture on 15
November 2004.
Happy Voting,
Ryan
> in any case, I am still a little confused about how to evaluate the
> standard error of regression. What, exactly, are "good", "ok", and
> "bad" values?
Right. I wouldn't think about it as a question of "good" vs. "bad"
values of S_E. S_E is an estimator for the standard deviation of the
disturbances in the regression model. S_E is just a descriptive
measure that tells you how much variability there is around the
regression line. It might help to think of it like this-- suppose you
have:
y_i = \beta_0 + \beta_1 * x_i + \epsilon_i, i=1,...,n
and you get estimates \hat{\beta}_0 and \hat{\beta}_1 of the intercept
and slope parameter. These two things determine the regression line
through the data which is (oftentimes) a reasonable and concise way to
summarize the information in a scatterplot of y on x.
Being a summary, the regression line alone doesn't capture some of the
important aspects of the relationship between y and x. In particular,
it doesn't allow us to say anything about the variability around the
regression line. Looking at S_E provides exactly this information--
how much variability there is around the regression line. Larger
values of S_E imply that the distribution of residuals has greater
variance.
Suppose you've never seen a scatterplot of x and y. If I tell you the
values of \hat{\beta}_0, \hat{\beta}_1 and S_E from a regression of y
on x you will be able to more accurately reproduce the scatterplot of
y on x than if I just tell you \hat{\beta}_0, \hat{\beta}_1.
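As an illustration of that last point, here is a sketch in Python/NumPy: given only hypothetical values of \hat{\beta}_0, \hat{\beta}_1, and S_E (all made up here), one can simulate a plausible version of the scatterplot:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical regression summaries (all made up for illustration):
b0_hat = 2.0   # estimated intercept
b1_hat = 1.5   # estimated slope
s_e = 0.5      # standard error of the regression

# Given the x values, points on the fitted line plus noise with
# standard deviation S_E give a plausible reconstruction of the scatter.
x = np.linspace(0, 10, 50)
y_sim = b0_hat + b1_hat * x + rng.normal(scale=s_e, size=x.size)
```

With only the intercept and slope, the best one could do is draw the line itself; adding S_E pins down how tightly the points cluster around it.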
It seems that a common theme running through many of the questions to
the list and some of the questions in class yesterday is that there
should be objectively determined cutoff values of S_E, R^2, etc. that
allow one to make sharp decisions that guide one's data analysis-- "if
R^2 is above some number one should believe the results", etc. I can't
emphasize enough that this sort of thing is *extremely bad practice*.
This is what Cleveland discussed as "rote data analysis". Data
analysis really does involve a great deal of artistry. The two things
we can be assured of are that inferences depend upon assumptions and
that the necessary assumptions almost never hold exactly. The
real question is whether the assumptions are close enough to being
true that our inferences are not wildly misleading.
We haven't yet talked much about how one goes about examining
assumptions (although Cleveland deals with this a fair amount). Once we
get multiple regression under our belts then the bulk of the remainder
of the course will deal with diagnostics and model checking. All of
this will come together quite a bit more over the next few weeks.
Hope this helps.
Best,
Kevin
Hi guys-
We will be having optional section this week for folks coming in without a
strong matrix algebra background. What I plan to cover:
-rank
-orthogonality
-linear independence
-the calculation of an inverse
-working with matrices in R
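As a small preview, a few of these operations sketched in Python/NumPy (in section we'll use R; the matrix here is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Rank: the number of linearly independent rows (or columns).
rank = np.linalg.matrix_rank(A)   # 2 here, so A's columns are independent

# Inverse: A times its inverse equals the identity matrix.
A_inv = np.linalg.inv(A)
identity_check = A @ A_inv

# Orthogonality: two vectors are orthogonal when their dot product is 0.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
dot = u @ v
```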
If you feel confident that you understand these concepts well enough to do
the problem set, you're free to spend your time on other things.
Note that those who skip will miss the "matrix algebra quiz bowl" (they
had Halloween candy at half off tonight at Walgreen's...). Practicing is
by far the best way for matrix algebra to become intuitive!
For those worried about the 6pm start for Gary's party, I plan for section
to run approx. 1 hour this week. So you should be out with plenty of time
to reach Gary's place.
Office hours as usual, after section until 6pm on Tuesdays and Thursdays.
Alison
Hi,
I'm sure I'm missing something basic here, but how do I get the Anscombe
data into R (for problem 3)? If I just do data(anscombe), I get data that
looks *nothing* like Table 5.1. It doesn't say which library to load, and I
don't see the data ready for download on the website. Any ideas?
Thanks,
Becky