Data Manipulation
Ionut Bebu
R is case sensitive.
> getwd()
[1] "C:/Program Files/R/R-2.5.0"
> ? setwd
Define vectors:
> x = c(2,5,1,7)
> x = rep(10,5)
> x = seq(from=2,to=10,length.out=4)
Operations
> length(x)
> sort(x)
> log(x)
> y = 1:4
> x%*%y
> crossprod(x,y)
> outer(x,y)
> sum(x); prod(x); cumsum(x)
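The three products above are easy to confuse. A minimal sketch of what each one returns, using small hypothetical vectors `a` and `b` rather than the `x` and `y` defined above:

```r
# Hypothetical example vectors (not the x and y from the slides)
a <- c(1, 2, 3)
b <- c(4, 5, 6)

inner <- a %*% b          # 1 x 1 matrix holding the inner product
cp    <- crossprod(a, b)  # same value, computed as t(a) %*% b
out   <- outer(a, b)      # 3 x 3 matrix with entries a[i] * b[j]

print(inner)
print(dim(out))
print(cumsum(a))          # running totals of a
```

Note that `%*%` and `crossprod` return a matrix even when the result is a single number; wrap the result in `as.numeric()` or use `sum(a * b)` if a plain scalar is wanted.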
Try
rownames(w_dataframe), colnames(w_dataframe), edit(w_dataframe)
Define a matrix:
> w_matrix = matrix(c(1,2,3,6,5,4),nrow=2,byrow=TRUE)
> rownames(w_matrix) = 1:2
> colnames(w_matrix) = c("A","B","C")
> dim(w_matrix)
[1] 2 3
> A = matrix(0,nrow=2,ncol=4) # try it!
Define an array:
> w_array = array(1:12,c(2,2,3))
> dimnames(w_array) = list(letters[1:2],c("A","B"),
c("I","II","III"))
> w_array # try it!
transpose, operations
> t(w_matrix); 3*w_matrix;
> u_matrix = matrix(1:4,ncol=2)
> u_matrix%*%w_matrix # multiplication
> det(u_matrix)
eigenvalues, inverse, trace
> A = matrix(1:9,ncol=3) + diag(rep(1,3))
> eigen(A)
> solve(A)
> sum(diag(A)) # trace
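The results of `eigen`, `solve` and the trace can be checked against each other. A short sketch, reusing the matrix `A` defined above; the variable names `ev`, `A_inv` and `trace_A` are introduced here only for illustration:

```r
A <- matrix(1:9, ncol = 3) + diag(rep(1, 3))

ev      <- eigen(A)$values   # eigenvalues of A
A_inv   <- solve(A)          # inverse of A
trace_A <- sum(diag(A))      # trace of A

# The inverse times A should be the identity, up to rounding error
print(all.equal(A_inv %*% A, diag(3)))

# The eigenvalues sum to the trace and multiply to the determinant
print(all.equal(sum(ev), trace_A))
print(all.equal(prod(ev), det(A)))
```

Comparisons use `all.equal` rather than `==` because the results are floating-point numbers and agree only up to rounding.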
Index with a logical vector:
> x[sqrt(x)==floor(sqrt(x))]
[1] 1 4 9
> w_dataframe[w_dataframe[,2]=="cured",]
> w_dataframe[w_dataframe[,c("Age")]>60,]
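The definition of `w_dataframe` does not appear in these slides. The sketch below rebuilds it from the values in the printed table (the `data.frame` call itself is an assumption) and shows which rows the two logical subsets select; `cured` and `over60` are names introduced here for illustration:

```r
# Rebuilt from the table printed in the slides; the construction is an assumption
w_dataframe <- data.frame(
  Age    = c(65, 58, 73, 59, 68, 70),
  status = c("cured", "no improvement", "cured",
             "some improvement", "marked improvement", "cured")
)

# Rows whose second column equals "cured"
cured <- w_dataframe[w_dataframe[, 2] == "cured", ]

# Rows with Age above 60
over60 <- w_dataframe[w_dataframe[, "Age"] > 60, ]

print(cured)
print(over60)
```

The trailing comma inside the brackets matters: the logical vector selects rows, and the empty position after the comma keeps all columns.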
> print(xtable(w_dataframe),type="latex")
    Age  status
1 65.00  cured
2 58.00  no improvement
3 73.00  cured
4 59.00  some improvement
5 68.00  marked improvement
6 70.00  cured
> write(print(xtable(w_dataframe),type="html"),"C://TEACHING/R
file:///C|/TEACHING/R/sterge.html
    Age  status
1 65.00  cured
2 58.00  no improvement
3 73.00  cured
4 59.00  some improvement
5 68.00  marked improvement
6 70.00  cured
It is easy to write down the log-likelihood, which is helpful for finding the
MLE. If x_1, ..., x_20 ∼ N(1, 1), then the log-likelihood is obtained as
> x = rnorm(20,1,1)
> log_likelihood = sum(dnorm(x,1,1,log=TRUE))
> log_likelihood
[1] -30.52741
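To show how the log-likelihood leads to the MLE, here is a minimal sketch that treats the mean as unknown, keeps the standard deviation fixed at 1, and maximises numerically with `optimize`. The seed, the function name `log_lik` and the search interval are assumptions made for illustration, not part of the slides:

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
x <- rnorm(20, mean = 1, sd = 1)

# Log-likelihood as a function of the mean mu, with sd fixed at 1
log_lik <- function(mu) sum(dnorm(x, mean = mu, sd = 1, log = TRUE))

# Maximise over a hypothetical interval that comfortably contains the sample
fit <- optimize(log_lik, interval = c(-5, 5), maximum = TRUE)

print(fit$maximum)   # numerical MLE of mu
print(mean(x))       # closed-form MLE of mu, for comparison
```

For the normal mean the MLE is the sample mean, so the two printed values should agree up to the tolerance of `optimize`. For models without a closed form the same pattern applies, with `optim` handling more than one parameter.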
> set.seed(23)
[Figure: regression diagnostic plots showing residuals, standardized residuals and Cook's distance; observations 14, 19 and 28 are labelled.]
> library(car)
> linearHypothesis(rubber_lm,c(0,1,-1)) # named linear.hypothesis() in older versions of car
Linear hypothesis test
Hypothesis: hard - tens = 0
Model 1: loss ~ hard + tens
Model 2: restricted model
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1     27  35950
2     28 151916 -1   -115966 87.096 6.118e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
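The same hypothesis can be tested without `car` by fitting the restricted model explicitly and comparing the two fits. This is a sketch under an assumption: the slides do not show how `rubber_lm` was created, so it is taken here to be a fit to the `Rubber` data frame from `MASS`, which has the columns `loss`, `hard` and `tens`. The name `restricted_lm` is introduced for illustration:

```r
library(MASS)   # assumed source of the data: the Rubber data frame

# Assumption: rubber_lm in the slides was fitted like this
rubber_lm <- lm(loss ~ hard + tens, data = Rubber)

# Under H0 (equal coefficients for hard and tens) the model
# collapses to a single slope on the sum hard + tens
restricted_lm <- lm(loss ~ I(hard + tens), data = Rubber)

# F test comparing the restricted model with the full one
cmp <- anova(restricted_lm, rubber_lm)
print(cmp)
```

If the assumption about the data holds, the F statistic and p-value in this table should match those of the linear hypothesis test above, since both compare the same pair of nested models.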
> library('MASS')
> data('UScereal')
E(y_i) = \beta_0 + \beta_1 x_{1i}
E(y_i) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}
E(y_i) = \beta_0 + \beta_1 x_{2i}
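The three mean structures can be fitted with `lm` and compared. The slides do not say which `UScereal` columns play the roles of y, x1 and x2, so the sketch below makes a purely hypothetical choice (`calories` as the response, `fat` as x1, `sugars` as x2) and takes the second model to contain both predictors:

```r
library(MASS)
data("UScereal")

# Hypothetical choice of variables; the slides do not say which were used
m1  <- lm(calories ~ fat,          data = UScereal)  # x1 only
m12 <- lm(calories ~ fat + sugars, data = UScereal)  # x1 and x2
m2  <- lm(calories ~ sugars,       data = UScereal)  # x2 only

# Each single-predictor model is nested in the two-predictor model
print(anova(m1, m12))
print(anova(m2, m12))
```

The two single-predictor models are not nested in each other, so `anova` compares each of them only with the larger model.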