Dave:
In 2e, we estimate the regression model with 10 variables and then test it on our holdout sample that has only 10 rows or observations. R gave me the smack down and actually used exclamation points and capital letters to tell me that there are no residual degrees of freedom. What should we do?
T
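
One possible reading of the error above is that the model was re-estimated on the 10 holdout rows: 10 slopes plus an intercept is 11 coefficients, which leaves no residual degrees of freedom with only 10 observations. The sketch below uses simulated stand-in data (the names `train`, `holdout`, and the sizes are assumptions, not the assignment's data) to show the difference between refitting on the holdout rows and only predicting on them.

```r
# Simulated stand-in data; names and sizes are assumptions for illustration.
set.seed(1)
n_train <- 200
p <- 10
train <- as.data.frame(matrix(rnorm(n_train * p), n_train, p))
train$Y <- rnorm(n_train)
holdout <- as.data.frame(matrix(rnorm(10 * p), 10, p))
holdout$Y <- rnorm(10)

# Refitting on the 10 holdout rows: 11 coefficients, only 10 observations.
refit <- lm(Y ~ ., data = holdout)
df.residual(refit)   # no residual degrees of freedom are left

# Fit on the training rows, then only predict on the holdout rows.
fit <- lm(Y ~ ., data = train)
pred <- predict(fit, newdata = holdout)
rmse <- sqrt(mean((holdout$Y - pred)^2))   # out-of-sample error on the holdout
```

If the question really does ask for a refit on the holdout rows, this sketch does not apply and the model would need fewer predictors than observations.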
Hello all,
My group members and I have each written our own code, and we are getting different
dimensions for the loaded and the cleaned data frames in 1(a). We've compared the code,
and it seems identical, yet the dimensions are very different:
Dimensions from code #1:
3958 18 (uncleaned)
2347 28 (cleaned)
Dimensions from code #2:
3959 (uncleaned)
2815 (cleaned)
What dimensions are other people getting for 1(a)? We're trying to figure out
who's on the right track.
Thanks,
Anna
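
A rough way to track down this kind of mismatch is to compare `dim()` on both results and then list the rows that appear in only one of them. The sketch below uses two toy data frames and a hypothetical key column `id`; the real data would need whatever column (or combination of columns) uniquely identifies a row.

```r
# Toy stand-ins for the two group members' cleaned data frames.
# The column names id and value are hypothetical.
clean1 <- data.frame(id = 1:5, value = c(10, 20, 30, 40, 50))
clean2 <- data.frame(id = c(1, 2, 3, 5, 6, 7), value = c(10, 20, 30, 50, 60, 70))

dim(clean1)   # rows first, then columns
dim(clean2)

# Rows kept by one cleaning script but dropped by the other
only_in_1 <- setdiff(clean1$id, clean2$id)
only_in_2 <- setdiff(clean2$id, clean1$id)
```

Looking at the rows in `only_in_1` and `only_in_2` usually shows which cleaning rule the two scripts apply differently.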
This seems to work.
The highest value I'm getting is in the 0.34 range though, so I'm not sure if
this is the best method after all. For cor4, it seems to work better if you use
something like cor^10 rather than abs(cor) or cor^2. This is an empirical issue
though.
Anybody get something higher??
Warning: this function can run for a long time once it hits a really high value.
Use C-c C-c to interrupt it.
-Phillip.
---
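superfindfunc <- function(){
  # Random search: keep drawing sets of 10 columns until the adjusted R^2
  # beats the best value printed so far. Assumes x and parta exist globally.
  compare <- 0   # best adjusted R^2 found so far
  k <- 0         # total number of models fitted
  while(k < 100000){
    r <- 0
    # k is only checked between improvements, so this inner loop can run
    # well past 100000 fits once compare becomes hard to beat.
    while(r <= compare){
      # draw 10 of columns 2:100, weighted by parta$cor4
      a <- sample(2:100, 10, prob = parta$cor4)
      sumry <- summary(lm(x$Y ~ x[,a[1]] + x[,a[2]] + x[,a[3]] + x[,a[4]] + x[,a[5]] +
                          x[,a[6]] + x[,a[7]] + x[,a[8]] + x[,a[9]] + x[,a[10]]))
      r <- sumry$adj.r.squared
      k <- 1 + k
    }
    cat(paste(a)," R^2: ", r, "\n")   # the printed value is the adjusted R^2
    compare <- r
  }
}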
----
> superfindfunc()
82 25 79 43 73 36 50 55 72 84 R^2: 0.3127964
25 82 50 79 72 43 73 36 55 9 R^2: 0.3171445
25 82 79 9 73 36 84 50 72 55 R^2: 0.3209903
25 82 9 72 79 50 73 55 84 85 R^2: 0.3289005
25 82 72 50 79 36 55 9 57 73 R^2: 0.3297745
82 25 84 72 55 36 79 50 73 57 R^2: 0.3313713
25 82 79 50 73 72 33 55 57 36 R^2: 0.3351744
25 72 79 82 50 73 55 36 45 57 R^2: 0.337514
25 79 82 57 36 72 55 50 73 51 R^2: 0.3438499
57 25 50 79 72 55 82 73 63 36 R^2: 0.3475270
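
Since the search above can run indefinitely once a high value is found, here is a minimal sketch of a bounded variant. It is not the function posted above: it runs on simulated data, and the names `random_search`, `max_iter`, and the weight vector `w` are assumptions made for illustration. It caps the number of fits and returns the best set instead of printing each improvement.

```r
# Simulated stand-in data: a response Y and 29 candidate predictors.
set.seed(42)
n <- 200
x <- as.data.frame(matrix(rnorm(n * 30), n, 30))
names(x) <- c("Y", paste0("X", 1:29))

# Hypothetical sampling weights: absolute correlation of each candidate with Y.
w <- abs(cor(x[, -1], x$Y))[, 1]

random_search <- function(data, weights, n_pick = 10, max_iter = 500) {
  best <- -Inf
  best_cols <- NULL
  for (i in seq_len(max_iter)) {
    # draw n_pick distinct column names, weighted by the supplied weights
    cols <- sample(names(weights), n_pick, prob = weights)
    fit <- lm(reformulate(cols, response = "Y"), data = data)
    r <- summary(fit)$adj.r.squared
    if (r > best) {
      best <- r
      best_cols <- cols
    }
  }
  list(adj_r2 = best, cols = best_cols)
}

res <- random_search(x, w)
```

Raising the weights to a power (for example `w^10`, as suggested above) concentrates the draws on the most correlated columns; whether that helps is an empirical question.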
-------------------------------------------------
Phillip Y. Lipscy
Perkins Hall Room #129
35 Oxford Street
Cambridge, MA 02138
(617)493-4893
lipscy(a)fas.harvard.edu
Ph.D. Candidate
Harvard University, FAS, Department of Government
-------------------------------------------------
Actually, while I was sending that e-mail, I found one that is 0.22. So I'm
raising it a little bit for the all-nighter version. :)
-Phillip.
Dear all,
sort() works nicely for vectors but not for data frames and matrices (for
matrices it doesn't sort by columns properly). Since even MS Excel can sort
data frames by columns, I'm assuming R can do that too, but so far I haven't
found anything that seems to let me do that. Any ideas?
Thanks,
Phillip.
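
One common approach is to index the rows with `order()`, which returns the permutation that sorts a vector and so can reorder a whole data frame or matrix. A minimal sketch on a toy data frame (the column names and values here are made up for illustration):

```r
# Toy data frame; the columns name, year, and pct are hypothetical.
df <- data.frame(name = c("a", "b", "c", "d"),
                 year = c(1990, 1988, 1990, 1986),
                 pct  = c(55, 48, 61, 50))

# Sort all rows by one column
by_year <- df[order(df$year), ]

# Sort by year ascending, breaking ties by pct descending
by_year_then_pct <- df[order(df$year, -df$pct), ]

# The same idea works for a matrix: reorder rows by one of its columns
m <- as.matrix(df[, c("year", "pct")])
m_sorted <- m[order(m[, "year"]), ]
```

The trailing comma in `df[order(...), ]` matters: it selects rows and keeps every column.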
Dear All
Cleaning away all the extreme years (dpct) and the questionable dwin/incumb
combos in all the lagged years is rather tedious. My gut feeling is that
it won't make much of a difference. Should we spend a lot of time making sure?
-Phillip.