For integration, use the n-point trapezoidal rule
$$\int_a^b f(x)\,dx \approx h\left[\tfrac{1}{2} f(x_1) + f(x_2) + \dots + f(x_{n-1}) + \tfrac{1}{2} f(x_n)\right],$$
Print numbers in the second and third column in the %24.17e format.
Sample output:
Problem 2. Consider a square dartboard with a circle inscribed inside; see figures below. In
these figures, the square dartboard is marked with dashed lines.
We wish to simulate the experiment of throwing N random darts at this dartboard. Using this,
we wish to estimate the value of π as explained below.
1. A random dart hit (x, y) inside the square can be simulated by generating 2 uniform
random numbers over [−1, +1]. Write R code to pick N such random dart hits with
uniform density. One realization of 100 random darts is depicted in the right-hand figure
above.
2. The ratio $\hat{\pi}_N = 4N_\circ/N$ is an estimate of π. Here, $N_\circ$ is the number of dart hits inside the
circle. You will need to write R code to find this $N_\circ$.
3. Run this simulation for N = 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000. Tabulate
your results neatly.
4. Plot $\hat{\pi}_N$ as a function of N. Also create (through R) a pdf figure of the same.
[Figure: $\hat{\pi}_N$ plotted against N; the vertical axis runs from 3.0 to 3.6.]
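A minimal sketch of steps 1 to 3, assuming darts are drawn with runif() and counted as inside when x^2 + y^2 <= 1. The helper name estimate.pi and the seed are illustrative choices, not part of the problem statement.

```r
# Estimate pi by throwing N uniform random darts at the square [-1, 1] x [-1, 1].
# estimate.pi is a hypothetical helper name.
estimate.pi <- function(N) {
  x <- runif(N, min = -1, max = 1)
  y <- runif(N, min = -1, max = 1)
  n.inside <- sum(x^2 + y^2 <= 1)   # number of darts inside the unit circle
  4 * n.inside / N
}

set.seed(1)                          # arbitrary seed, for reproducibility only
N.values <- c(10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000)
pi.hat <- sapply(N.values, estimate.pi)
print(data.frame(N = N.values, pi.hat = pi.hat))
```

For step 4, one option is to wrap plot(N.values, pi.hat, log = "x", type = "b") between pdf("pi.pdf") and dev.off(); the file name here is hypothetical.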
Problem 3. You have a sample of data values xi , i = 1, 2, . . . , n. Write R code to produce the
following 2 × 2 graphical summary of this sample. The four panels of this graphical summary
consist of
1. a probability density histogram of the sample, overlaid with the empirical density (density()
in R), and a rug representation of the sample;
2. the empirical cdf (ecdf() in R) of the sample;
3. the autocorrelation function of the sample; and
4. a quantile-quantile plot against standard normal quantiles.
See an example of such a 2 × 2 summary below. Test your code using a randomly-generated
N(0, 1) sample of size 100. Also create a pdf image of the graphical summary using an
appropriate R graphics device (screenshots are not allowed).
[Figure: 2 × 2 graphical summary. Top row: "Histogram of x" (Density against x) and "ecdf(x)" (Fn(x) against x). Bottom row: autocorrelation function (ACF) and normal quantile-quantile plot (Sample Quantiles).]
Problem 4. The following figure was made using only straight lines and standard R graphics
functions. Construct R code that will produce this plot exactly. Create a pdf file of this plot.
Screen shots not allowed.
[Figure: plot made of straight lines only; both axes run from −1.0 to 1.0.]
Problem 5. Solutions to the equation f(x) = 0, i.e., points x where the function f takes
a zero value, are called zeros or roots of the function f(x). Newton’s method is a one-step
iterative method to locate a zero of f. Starting from an initial point $x_0$, it produces successive
approximations $x_1, x_2, \dots$ to a zero of f, where
Problem 6. Which of the in-built functions mean() and median() is faster? To find out,
repeat the following two steps M times:
[Figure: Compute Time (s) on a logarithmic axis, with series labeled Median and Mean.]
Problem 7. Which of the in-built functions mean and median is faster? To find out, repeat
the following steps M times:
Make a plot in which the boxplots of δ for different N are plotted side-by-side.
[Figure: Time (sec) on a logarithmic axis, with series labeled Median and Mean.]
Problem 8. Below are approximate pairwise distances (in km) between five cities in Maharashtra:
A salesperson wishes to visit each of these cities on a route that begins and ends in Nasik.
Moreover, (s)he does not wish to visit the same city more than once.
Write R code to enumerate all such circular routes available to the salesperson, together with
the total distance traveled along each route. Which are the shortest and longest routes?
You may use this algorithm for generating permutations one-by-one in lexicographic order.
Produce output as below.
Nasik -> Kopargaon -> Shrirampur -> Ahmednagar -> Sangamner -> Nasik 307
Nasik -> Sangamner -> Ahmednagar -> Shrirampur -> Kopargaon -> Nasik 307
Nasik -> Kopargaon -> Ahmednagar -> Shrirampur -> Sangamner -> Nasik 336
Nasik -> Sangamner -> Shrirampur -> Ahmednagar -> Kopargaon -> Nasik 336
Nasik -> Sangamner -> Kopargaon -> Shrirampur -> Ahmednagar -> Nasik 342
Nasik -> Ahmednagar -> Shrirampur -> Kopargaon -> Sangamner -> Nasik 342
Nasik -> Kopargaon -> Sangamner -> Ahmednagar -> Shrirampur -> Nasik 353
Nasik -> Shrirampur -> Ahmednagar -> Sangamner -> Kopargaon -> Nasik 353
Nasik -> Shrirampur -> Ahmednagar -> Kopargaon -> Sangamner -> Nasik 360
Nasik -> Sangamner -> Kopargaon -> Ahmednagar -> Shrirampur -> Nasik 360
Nasik -> Shrirampur -> Kopargaon -> Ahmednagar -> Sangamner -> Nasik 363
Nasik -> Sangamner -> Ahmednagar -> Kopargaon -> Shrirampur -> Nasik 363
Nasik -> Kopargaon -> Sangamner -> Shrirampur -> Ahmednagar -> Nasik 364
Nasik -> Ahmednagar -> Shrirampur -> Sangamner -> Kopargaon -> Nasik 364
Nasik -> Kopargaon -> Shrirampur -> Sangamner -> Ahmednagar -> Nasik 367
Nasik -> Ahmednagar -> Sangamner -> Shrirampur -> Kopargaon -> Nasik 367
Nasik -> Sangamner -> Shrirampur -> Kopargaon -> Ahmednagar -> Nasik 374
Nasik -> Ahmednagar -> Kopargaon -> Shrirampur -> Sangamner -> Nasik 374
Nasik -> Kopargaon -> Ahmednagar -> Sangamner -> Shrirampur -> Nasik 385
Nasik -> Shrirampur -> Sangamner -> Ahmednagar -> Kopargaon -> Nasik 385
Nasik -> Shrirampur -> Kopargaon -> Sangamner -> Ahmednagar -> Nasik 391
Nasik -> Ahmednagar -> Sangamner -> Kopargaon -> Shrirampur -> Nasik 391
Nasik -> Shrirampur -> Sangamner -> Kopargaon -> Ahmednagar -> Nasik 420
Nasik -> Ahmednagar -> Kopargaon -> Sangamner -> Shrirampur -> Nasik 420
Nasik -> Kopargaon -> Shrirampur -> Ahmednagar -> Sangamner -> Nasik 307
Nasik -> Sangamner -> Ahmednagar -> Shrirampur -> Kopargaon -> Nasik 307
Nasik -> Shrirampur -> Sangamner -> Kopargaon -> Ahmednagar -> Nasik 420
Nasik -> Ahmednagar -> Kopargaon -> Sangamner -> Shrirampur -> Nasik 420
A visual twist: Visualize these paths in the form of 2D plots; see figure on the next page.
[Figure: the routes drawn as 2D paths, each beginning and ending at Nasik.]
Problem 9. Write R code to integrate a function f of one variable between a and b. For
integration, use the n-point Simpson-1/3 formula (odd n)
$$\int_a^b f(x)\,dx \approx h\left[\tfrac{1}{3} f(x_1) + \tfrac{4}{3} f(x_2) + \tfrac{2}{3} f(x_3) + \tfrac{4}{3} f(x_4) + \tfrac{2}{3} f(x_5) + \dots + \tfrac{2}{3} f(x_{n-2}) + \tfrac{4}{3} f(x_{n-1}) + \tfrac{1}{3} f(x_n)\right],$$
Print numbers in the second and third column in the %24.17e format.
Sample output:
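A minimal sketch of the n-point Simpson-1/3 formula, assuming equally spaced nodes $x_1 = a, \dots, x_n = b$ with spacing h = (b − a)/(n − 1) and a vectorized integrand f. The function name simpson and the three printed columns (n, estimate, absolute error) are assumptions, since the sample output is not shown here.

```r
# Composite Simpson-1/3 rule on n equally spaced points (n must be odd).
# simpson is a hypothetical helper name; h = (b - a) / (n - 1) is assumed.
simpson <- function(f, a, b, n = 101) {
  stopifnot(n >= 3, n %% 2 == 1)
  x <- seq(a, b, length.out = n)
  h <- (b - a) / (n - 1)
  w <- c(1, rep(c(4, 2), length.out = n - 2), 1)   # weights 1, 4, 2, 4, ..., 4, 1
  h / 3 * sum(w * f(x))
}

# Example: integral of sin over [0, pi], whose exact value is 2.
for (n in c(5, 11, 101)) {
  est <- simpson(sin, 0, pi, n)
  cat(sprintf("%5d %24.17e %24.17e\n", n, est, abs(est - 2)))
}
```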
Problem 10. Implement the following R functions:
1. Add circle $(x - x_0)^2 + (y - y_0)^2 = r^2$ to an existing plot:
circle <- function( radius = 1, center = c(0,0) ) { your implementation }
Arguments: center ≡ $(x_0, y_0)$, radius ≡ r.
2. Add ellipse $((x - x_0)/a)^2 + ((y - y_0)/b)^2 = 1$ to an existing plot:
ellipse <- function( a = 1, b = 1, center = c(0,0) ) { your implementation }
Arguments: center ≡ $(x_0, y_0)$, a ≡ a, b ≡ b.
3. Add triangle defined by $(x_1, y_1), (x_2, y_2), (x_3, y_3)$ to an existing plot:
triangle <- function( x1, y1, x2, y2, x3, y3 ) { your implementation }
Arguments: x1 ≡ $x_1$, y1 ≡ $y_1$, x2 ≡ $x_2$, y2 ≡ $y_2$, x3 ≡ $x_3$, y3 ≡ $y_3$.
Using these functions, recreate the figure below. Produce a pdf file for your plot. Screenshot
not allowed. You may create an empty plot window as follows.
plot.new( ); plot.window( xlim = c( -2, 2 ), ylim = c( -2, 2 ), asp = 1 )
axis( 1 ); axis( 2 ); abline( v = 0, h = 0, lty = 2 )
[Figure: target plot to be recreated with the functions above; both axes run from −2 to 2.]
Problem 11. Solutions to the equation f(x) = 0, i.e., points x where the function f takes
a zero value, are called zeros or roots of the function f(x). Halley’s method is a one-step
iterative method to locate a zero of f. Starting from an initial point $x_0$, it produces successive
approximations $x_1, x_2, \dots$ to a zero of f, where
$$x_i = x_{i-1} - \frac{f(x_{i-1})\, f'(x_{i-1})}{[f'(x_{i-1})]^2 - \tfrac{1}{2} f(x_{i-1})\, f''(x_{i-1})} \quad \text{for } i = 1, 2, \dots$$
and $f'$, $f''$ are the first two derivatives of f. The iteration is terminated when
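A minimal sketch of Halley's iteration as given by the update formula above. The stopping rule used here, |x_i − x_{i−1}| < tol, is an assumption, because the termination criterion is not stated in this text; the names halley, tol, and max.iter are likewise illustrative.

```r
# Halley's method: f, df, d2f are the function and its first two derivatives.
# Stopping rule (assumed): successive iterates closer than tol.
halley <- function(f, df, d2f, x0, tol = 1e-12, max.iter = 100) {
  x <- x0
  for (i in seq_len(max.iter)) {
    fx  <- f(x)
    dfx <- df(x)
    x.new <- x - fx * dfx / (dfx^2 - 0.5 * fx * d2f(x))
    if (abs(x.new - x) < tol) return(x.new)
    x <- x.new
  }
  warning("no convergence within max.iter iterations")
  x
}

# Example: the positive zero of f(x) = x^2 - 2.
root <- halley(function(x) x^2 - 2, function(x) 2 * x, function(x) 2, x0 = 1)
print(root, digits = 16)
```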
Problem 12. The roots of a degree-m polynomial $a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m$ happen to be
the eigenvalues of the m × m matrix
$$\begin{pmatrix}
-\frac{a_{m-1}}{a_m} & -\frac{a_{m-2}}{a_m} & \cdots & -\frac{a_1}{a_m} & -\frac{a_0}{a_m} \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \cdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}.$$
Write an R function of the form polyroot.h(a) that takes the coefficient vector a as the
argument and returns a vector containing all the roots. The elements of a are the coefficients
$a_0, a_1, \dots, a_m$. Use this function to find all roots of $8x^4 - 8x^2 + 1$. Report the roots you find to
15 digits. Compare your roots with roots returned by the in-built function polyroot().
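A minimal sketch of polyroot.h(), assuming a holds $a_0, \dots, a_m$ in increasing order of power with a nonzero leading coefficient. It builds the matrix shown above and passes it to eigen(); this is one possible implementation, not necessarily the intended one.

```r
# Roots of a_0 + a_1 x + ... + a_m x^m as eigenvalues of the companion matrix.
polyroot.h <- function(a) {
  m <- length(a) - 1                         # degree of the polynomial
  stopifnot(m >= 1, a[m + 1] != 0)
  C <- matrix(0, nrow = m, ncol = m)
  C[1, ] <- -rev(a[1:m]) / a[m + 1]          # -a_{m-1}/a_m, ..., -a_0/a_m
  if (m > 1) C[cbind(2:m, 1:(m - 1))] <- 1   # ones on the subdiagonal
  eigen(C, only.values = TRUE)$values
}

a <- c(1, 0, -8, 0, 8)                       # 8x^4 - 8x^2 + 1
roots <- polyroot.h(a)
print(roots, digits = 15)
print(polyroot(a), digits = 15)              # in-built function, for comparison
```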
Problem 13. Consider integrals of the form $I = \int_0^1 f(x)\,dx$. For example, I = 1/2 for f(x) = 1 − x
and I = π/4 for $f(x) = \sqrt{1 - x^2}$, etc.
Just as we estimated the value of π, we can estimate any integral of this form using randomness.
Here is a recipe:
2. Generate uniform random numbers $x_1, \dots, x_n$ from the interval [0, 1]. One estimate of
the above integral is $\hat{I}_n = (1/n) \sum_{i=1}^n f(x_i)$.
Apply this procedure to the two functions defined above. Make side-by-side boxplots of $\hat{I}_n$ as a
function of n on the same plot. Mark the true value of I on each plot.
[Figure: boxplots of the estimated integral for $I_1 = \int_0^1 (1 - x)\,dx$ (true value 1/2 marked) and $I_2 = \int_0^1 \sqrt{1 - x^2}\,dx$ (true value π/4 marked).]
Problem 14. Write R code that computes the first n + 1 Fibonacci numbers $f_0, \dots, f_n$.
The limit of the ratio sequence $\varphi_n = f_{n+1}/f_n$ as n → ∞ is called the golden ratio φ.
Report your best approximation for the golden ratio to full 16-digit accuracy. Compare your
values $\varphi_n$ with the exact value $\varphi = (1 + \sqrt{5})/2$.
[Figure: two panels plotted against n (5 to 20): the ratios $\varphi_n$ approaching φ, and the difference $(\varphi_n - \varphi)$ on a logarithmic axis.]
Problem 15. Write R code to add a horizontal boxplot on top of a histogram, as in the figure
below. The width of the boxplot should look the same irrespective of the range of histogram
counts. Test your code using a randomly-generated N (0, 1) sample of size 100. Create a pdf
file of this plot (screenshot not allowed).
[Figure: "Histogram of x" (Frequency against x) with a horizontal boxplot drawn on top.]
Problem 16. Suppose a data vector $y_1, \dots, y_n$ is given. The running mean of this data at
index i, over a window of width w, is defined as $\bar{y}_i$ := average of $y_{\max(1, i-w)}, \dots, y_{\min(i+w, n)}$.
The index i takes values 1, ..., n, and the window width w can take values between 0 and n.
Generate the vector y as follows:
n <- 500
signal <- sin( 4 * pi * ( 1:n ) / n )
noise <- rnorm( n, sd = 1 )
y <- signal + noise
w <- 30
Compute the running-mean sequence $\bar{y}_i$ for 1 ≤ i ≤ n for the given window size w.
Run your running-mean code on this y with window size w = 30. Create a plot with y represented
by points, signal represented by a red line, and $\bar{y}$ represented with a blue line. See the
example plot below.
[Figure: y plotted against Index, with the signal and the running mean overlaid.]
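A minimal sketch of the running mean exactly as defined above, using a direct window average at each index. The function name running.mean and the seed are illustrative; faster cumulative-sum variants exist but are not shown.

```r
# Running mean of y over the window max(1, i - w) .. min(i + w, n).
running.mean <- function(y, w) {
  n <- length(y)
  sapply(seq_len(n), function(i) mean(y[max(1, i - w):min(i + w, n)]))
}

set.seed(1)                                  # arbitrary seed, for reproducibility only
n <- 500
signal <- sin(4 * pi * (1:n) / n)
y <- signal + rnorm(n, sd = 1)
y.bar <- running.mean(y, w = 30)
```

The plot can then be drawn with plot(y), lines(signal, col = "red"), and lines(y.bar, col = "blue").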
Problem 17. We are given data of the form $(x_1, y_1), \dots, (x_n, y_n)$. Possibly, there is some
relationship between x and y. A regressogram r(x) is a way of guessing this relationship.
A regressogram is computed as follows: Divide the range of x values into m bins of equal width
h = (max(x) − min(x))/m. The bins are $[b_0, b_1), [b_1, b_2), \dots, [b_{m-1}, b_m]$, where $b_j = \min(x) + jh$,
so that $b_0 = \min(x)$, $b_1 = \min(x) + h$, ..., $b_{m-1} = \max(x) - h$, $b_m = \max(x)$. The regressogram r(x) at x is the
mean of all y values belonging to the bin that x belongs to. When plotted, it is a piecewise flat
function (see example below).
Write R code to compute regressogram, given vectors x, y and bin count m. Generate vectors
x, y as follows:
n <- 500
x <- ( 0:( n - 1 ) ) / n
signal <- sin( 4 * pi * x )
noise <- rnorm( n, sd = 1 )
y <- signal + noise
m <- 20
Compute the regressogram of this data (x and y) using your code, for a bin count m = 20. Create
a plot with y represented by points, signal represented by a red line, and r(x) represented with
flat blue lines, as in the example plot below.
[Figure: the data y with the signal and the regressogram overlaid.]
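A minimal sketch of the regressogram, assuming m equal-width bins between min(x) and max(x) with the last bin closed on the right. The function name regressogram and the returned list fields are hypothetical; empty bins yield NA.

```r
# Regressogram: mean of y within each of m equal-width bins of x.
regressogram <- function(x, y, m) {
  breaks <- seq(min(x), max(x), length.out = m + 1)        # b_0, ..., b_m
  bin <- findInterval(x, breaks, rightmost.closed = TRUE)  # bin index per point
  bin <- pmin(bin, m)                                      # guard against round-off at max(x)
  bin.mean <- as.vector(tapply(y, factor(bin, levels = seq_len(m)), mean))
  list(breaks = breaks, bin.mean = bin.mean, fitted = bin.mean[bin])
}

set.seed(1)                                  # arbitrary seed, for reproducibility only
n <- 500
x <- (0:(n - 1)) / n
signal <- sin(4 * pi * x)
y <- signal + rnorm(n, sd = 1)
r <- regressogram(x, y, m = 20)
```

The flat blue lines can be drawn with segments(r$breaks[-21], r$bin.mean, r$breaks[-1], r$bin.mean, col = "blue") after plot(x, y).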
Problem 18. The Legendre polynomials are a family of polynomials defined over the interval
[−1, +1]. $P_0(x)$, which is the Legendre polynomial of order 0, is 1 over this entire interval (i.e.,
$P_0(x) = 1$). $P_1(x)$, which is the Legendre polynomial of order 1, equals x over this entire interval
(i.e., $P_1(x) = x$). The figure below plots $P_0$ through $P_5$.
Problem 19. The Chebyshev polynomials are a family of polynomials defined over the interval
[−1, +1]. $T_0(x)$, which is the Chebyshev polynomial of order 0, is 1 over this entire interval
(i.e., $T_0(x) = 1$). $T_1(x)$, which is the Chebyshev polynomial of order 1, equals x over this entire
interval (i.e., $T_1(x) = x$). The figure below plots $T_0$ through $T_5$.
Write R code to compute $T_n(x)$ given a vector of x values and order n. Using your code, create
a plot similar to the one above.
Problem 20. A 1-dimensional symmetric random walk is a series of random steps of length
1, either to the right or to the left of the current position of the random walker, with equal
probability. In other words, starting at $x_0$ at time t = 0, the position $x_1$ of the random walker
at time t = 1 is either $x_1 = x_0 + 1$ or $x_1 = x_0 - 1$ with probability 0.5 each. See the example
below.
Write R code to simulate such a random walk of n steps. At each time, you will have to
choose a displacement of +1 or −1 randomly with probability 0.5 each. This can be done in
a number of ways in R. Starting from $x_0 = 0$, generate a realization of this random walk for
n = 500 steps. Make a plot of this random walk, and save it as pdf (screenshot not allowed).
Your particular plot will in general look very different from the one shown here.
[Figure: one realization of the random walk; the vertical axis is x.]
Problem 21. We are given data of the form $(x_1, y_1), \dots, (x_n, y_n)$. The relationship between y
and x is known to be linear; i.e., of the form y = mx + c, where m and c are the unknowns.
A way of estimating the parameters m and c is the least-squares way, where one minimizes
$\sum_{i=1}^n (y_i - (m x_i + c))^2$ with respect to m and c. The resulting estimates $\hat{m}$ and $\hat{c}$ of m and c
have the form
$$\hat{m} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{c} = \bar{y} - \hat{m}\bar{x},$$
where $\bar{x}$ ≡ average value of $x_1, \dots, x_n$, and $\bar{y}$ ≡ average value of $y_1, \dots, y_n$. See the figure below,
where the red line is the true relationship, and the blue line is the least-squares fit.
Write R code to compute these estimates given vectors x and y of length n. Generate vectors
x, y as follows:
n <- 100
x <- ( 0:( n - 1 ) ) / n
m <- -1
c <- 2
signal <- m * x + c
noise <- rnorm( n, sd = 1 )
y <- signal + noise
Estimate the parameters m and c from this data using your code. Create a plot with y represented
by points, signal represented by a red line, and the fit $\hat{y} = \hat{m}x + \hat{c}$ represented with a
blue line.
[Figure: the data y with the true line (red) and the least-squares fit (blue).]
Problem 22. Given data of the form $(x_1, y_1), \dots, (x_n, y_n)$, and a relationship between y and
x of the form $y = a_0 + a_1 x + \dots + a_k x^k$, a way of estimating the parameters $a_0, \dots, a_k$ is the
least-squares way, where one minimizes $\sum_{i=1}^n (y_i - (a_0 + a_1 x_i + \dots + a_k x_i^k))^2$ with respect to
$a_0, \dots, a_k$. This leads to the solution
$$\hat{a} \equiv \begin{pmatrix} \hat{a}_0 \\ \vdots \\ \hat{a}_k \end{pmatrix} = (X^T X)^{-1} X^T \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \text{where } X = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^k \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^k \end{pmatrix}.$$
Notice that X is an n × (k + 1) matrix. This solution involves forming the matrix X, taking its
transpose $X^T$, inverting a matrix, several matrix multiplications, and a matrix-vector product.
In the figure below, the red line is the true relationship (which is a cubic polynomial), and the
blue line is the least-squares fit $\hat{y} = \hat{a}_0 + \hat{a}_1 x + \dots + \hat{a}_k x^k$ using the estimates $\hat{a}$ obtained as
above.
Write R code to obtain estimates $\hat{a}$ as described above, given vectors x and y of length n, and
the degree k of the fitting polynomial. Generate vectors x, y as follows:
n <- 100
x <- ( 0:( n - 1 ) ) / n
a <- c( 2, -2, 2, -2 )
k <- length( a )
signal <- a[1] + sapply( x, function( t ) { sum( a[-1] * t^( 1:(k-1) ) ) } )
noise <- rnorm( n, sd = 1 )
y <- signal + noise
Estimate the parameters a from this data using your code. Create a plot with y represented by
points, signal represented by a red line, and the fit $\hat{y}$ represented with a blue line.
[Figure: the data y with the true cubic (red) and the least-squares fit (blue).]
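A minimal sketch of the polynomial least-squares estimate, assuming k denotes the degree of the fitting polynomial (3 for the cubic above). The function name polyfit.ls is hypothetical. Instead of forming the inverse explicitly, this sketch solves the normal equations $(X^T X)\hat{a} = X^T y$ with solve(), which gives the same estimate.

```r
# Least-squares polynomial fit of degree k via the normal equations.
polyfit.ls <- function(x, y, k) {
  X <- outer(x, 0:k, `^`)                    # n x (k + 1) matrix: columns x^0, ..., x^k
  a.hat <- solve(t(X) %*% X, t(X) %*% y)     # solves (X^T X) a = X^T y
  drop(a.hat)
}

set.seed(1)                                  # arbitrary seed, for reproducibility only
n <- 100
x <- (0:(n - 1)) / n
a <- c(2, -2, 2, -2)
signal <- drop(outer(x, 0:3, `^`) %*% a)     # a[1] + a[2] x + a[3] x^2 + a[4] x^3
y <- signal + rnorm(n, sd = 1)
a.hat <- polyfit.ls(x, y, k = 3)
y.hat <- drop(outer(x, 0:3, `^`) %*% a.hat)  # fitted values for the blue line
```

The result can be cross-checked against coef(lm(y ~ poly(x, 3, raw = TRUE))).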
Problem 23. Implement the following R functions:
1. Add circle $(x - x_0)^2 + (y - y_0)^2 = r^2$ to an existing plot:
circle <- function( radius = 1, center = c(0,0) ) { your implementation }
Arguments: center ≡ $(x_0, y_0)$, radius ≡ r.
2. Add ellipse $((x - x_0)/a)^2 + ((y - y_0)/b)^2 = 1$ to an existing plot:
ellipse <- function( a = 1, b = 1, center = c(0,0) ) { your implementation }
Arguments: center ≡ $(x_0, y_0)$, a ≡ a, b ≡ b.
3. Add triangle defined by $(x_1, y_1), (x_2, y_2), (x_3, y_3)$ to an existing plot:
triangle <- function( x1, y1, x2, y2, x3, y3 ) { your implementation }
Arguments: x1 ≡ $x_1$, y1 ≡ $y_1$, x2 ≡ $x_2$, y2 ≡ $y_2$, x3 ≡ $x_3$, y3 ≡ $y_3$.
Using these functions, recreate the figure below. Produce a pdf file for your plot. Screenshot
not allowed. You may create an empty plot window as follows.
plot.new( ); plot.window( xlim = c( -2, 2 ), ylim = c( -2, 2 ), asp = 1 )
axis( 1 ); axis( 2 ); abline( v = 0, h = 0, lty = 2 )
[Figure: target plot to be recreated with the functions above; both axes run from −2 to 2.]
Problem 24. Consider a collection of n points in the x–y plane, such as that in the left-hand
figure below. In the R code below, this collection is represented as a n × 2 matrix representing
the x (column 1) and y (column 2) coordinates.
k <- 25
x <- cbind( rnorm( k, mean = 0, sd = 0.75 ), rnorm( k, mean = 0, sd = 0.75 ) )
x <- rbind( x, cbind( rnorm( k, mean = 2, sd = 0.25 ), rnorm( k, mean = 2, sd = 0.25 ) ) )
n <- 2 * k
Write R code to compute the pairwise distances for this collection of n points, and store them
as an n × n matrix. Further, create a color image of this matrix using your favorite color scheme;
see the right-hand figure below for an example. For your collection of n points, produce
plots similar to the ones below.
[Figure: left, scatter plot of x[,2] against x[,1]; right, color image of the pairwise-distance matrix.]
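A minimal sketch of the pairwise-distance computation, assuming Euclidean distance. It uses the identity |p − q|^2 = |p|^2 + |q|^2 − 2 p·q so that no explicit loops are needed; the function name pairwise.dist is hypothetical, and the in-built dist() is used only as a cross-check.

```r
# Euclidean pairwise distances between the rows of an n x 2 (or n x d) matrix.
pairwise.dist <- function(x) {
  g  <- x %*% t(x)                           # matrix of inner products
  sq <- diag(g)                              # squared norms of the points
  d2 <- outer(sq, sq, `+`) - 2 * g           # squared distances
  sqrt(pmax(d2, 0))                          # clamp tiny negative round-off to zero
}

set.seed(1)                                  # arbitrary seed, for reproducibility only
k <- 25
x <- cbind(rnorm(k, mean = 0, sd = 0.75), rnorm(k, mean = 0, sd = 0.75))
x <- rbind(x, cbind(rnorm(k, mean = 2, sd = 0.25), rnorm(k, mean = 2, sd = 0.25)))
n <- 2 * k
D <- pairwise.dist(x)
```

A color image can then be produced with, for example, image(1:n, 1:n, D, col = heat.colors(64)).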
Problem 25. Write R code to produce a “scrambled eggs” plot similar to the one below. This
plot consists of letters ‘e’ and ‘g’ placed with random orientation at random locations in the
unit square, and the proportion of ‘g’ is twice that of ‘e’.
Problem 26. Consider a text passage such as this one below:
Long years ago, we made a tryst with destiny, and now the time comes when we shall redeem our
pledge, not wholly or in full measure, but very substantially. At the stroke of the midnight hour,
when the world sleeps, India will awake to life and freedom. A moment comes, which comes but
rarely in history, when we step out from the old to the new, when an age ends, and when the soul
of a nation, long suppressed, finds utterance. It is fitting that at this solemn moment we take the
pledge of dedication to the service of India and her people and to the still larger cause of humanity.
Suppose this or a similar piece of text is saved in a file without any punctuation marks. Read
this file in R using scan() with what=character(0). Write R code that will find the set of
unique words in this array, ignoring case. Further, find the count of each unique word. Report
the unique words (in alphabetical order) and their counts.
For example, the unique words (in alphabetical order) and their counts in the passage above
are
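A minimal sketch of the word count, assuming the words are already in a character vector such as the one returned by scan(). The helper name word.counts is hypothetical, and the short vector below stands in for the file contents so that the example is self-contained.

```r
# Count unique words, ignoring case; table() returns the counts with the
# words in sorted order.
word.counts <- function(words) {
  table(tolower(words))
}

# In practice: words <- scan("passage.txt", what = character(0))
# ("passage.txt" is a hypothetical file name.)
words <- c("Long", "years", "ago", "we", "made", "a", "tryst", "with",
           "destiny", "and", "now", "the", "time", "comes", "when", "we",
           "shall", "redeem", "our", "pledge")
counts <- word.counts(words)
print(counts)
```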
Problem 27. How can we illustrate the central limit theorem? Here is a recipe.
Fix sample size to some integer n > 0. Generate a sample $x_1, \dots, x_n$ from any distribution, say
an exponential distribution. Compute the sample average
$$\bar{x}_n = \frac{1}{n} \sum_{i=1}^n x_i$$
for this sample. Repeat this a large number of times (say, m = 1000). You now have a
collection of m sample averages. Make a histogram of this collection of sample averages.
Repeat the above for a number of increasing n values, say n = 2, 5, 10, 50, 100, 500.
With increasing n, the histograms should start looking more and more like the normal
distribution. This should be even more apparent if you can overlay an appropriate normal density
curve on top of each histogram.
Try this simulation for the following distributions: normal, exponential, Cauchy.
[Figure: density histograms (Density on the vertical axis) of the sample averages, one panel for each value of n.]
Problem 28. Consider the iris data set available in R.
1. Make a pair plot of the four numeric variables of this data set.
2. For the four numeric variables of this data set, make boxplots stacked side by side on a
common vertical scale.
3. Take any one pair of numeric variables in this data set. Make a scatterplot such that the
color of a point represents the species. See the example below.
[Figure: scatterplot of Sepal.Width against Sepal.Length, with point color indicating species.]
Problem 29. Consider the iterative process
$$x_{k+1} = f(x_k),$$
where f(x) = λx(1 − x), and the iteration starts at some $x_0$ in the interval [0, 1]. Write R code
to iterate this process n times.
For λ = 1, 2, 3, 4, iterate this process n = 99 times. For each λ, plot $x_i$ against i (0 ≤ i ≤ n),
and make a histogram of $x_0, \dots, x_n$. Stack these plots in the form of a 2 × 4 array.
[Figure: 2 × 4 array. Top row: $x_t$ against t for each λ. Bottom row: histograms (Frequency) of $x_t$ for each λ.]
Problem 30. We are given data of the form $(x_1, y_1), \dots, (x_n, y_n)$, where no two $x_i$s are identical.
For these data, we wish to compute the Lagrange interpolating polynomial, which has the form
$$L(x) = \sum_{i=1}^n y_i\, l_i(x),$$
where
$$l_i(x) = \frac{(x - x_1)}{(x_i - x_1)} \cdots \frac{(x - x_{i-1})}{(x_i - x_{i-1})} \, \frac{(x - x_{i+1})}{(x_i - x_{i+1})} \cdots \frac{(x - x_n)}{(x_i - x_n)}.$$
Write R code to compute the Lagrange polynomial L(x) given $(x_1, y_1), \dots, (x_n, y_n)$, where no
two $x_i$s are identical.
Plot the data together with L(x). Produce a pdf for the plot. Screen shots not allowed.
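A minimal sketch that evaluates L(x) directly from the definition of $l_i(x)$. The function name lagrange and its argument names are hypothetical; xi and yi hold the data, and x holds the points at which L is evaluated.

```r
# Evaluate the Lagrange interpolating polynomial through (xi, yi) at points x.
lagrange <- function(x, xi, yi) {
  stopifnot(length(xi) == length(yi), !anyDuplicated(xi))
  n <- length(xi)
  sapply(x, function(t) {
    # l[i] is the product over j != i of (t - x_j) / (x_i - x_j)
    l <- sapply(seq_len(n), function(i) prod((t - xi[-i]) / (xi[i] - xi[-i])))
    sum(yi * l)
  })
}

# Example: three points on y = x^2 are interpolated exactly by a quadratic.
xi <- c(0, 1, 2)
yi <- xi^2
print(lagrange(c(0.5, 1.5), xi, yi))
```

For the plot, one option is to evaluate lagrange() on a fine grid such as seq(min(xi), max(xi), length.out = 200) and draw it with lines() over points(xi, yi).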
Problem 31. Implement the following functions. ’{...}’ stands for the function body that
you need to implement.
1. Write an R function of the form fact <- function( n ) {...} that calculates n! without
explicit loops and without using the in-built factorial() function.
2. Write an R function of the form is.square <- function( A ) {...} that returns TRUE
or FALSE depending on whether A is a square matrix. You also need to check whether A
is a matrix in the first place.
3. Write an R function of the form trace <- function( A ) {...} that calculates the trace
of A without using loops. Your implementation must check whether A is a square matrix.
4. Write an R function of the form is.symmetric <- function( A ) {...} that checks
whether the given matrix A is a symmetric matrix. Your implementation must check
whether A is a square matrix.
5. Write an R function of the form eigenvalues.tb <- function( n ) {...} that returns
the eigenvalues of an n × n square symmetric matrix A defined as
$A_{ij} = 1$ if i = j + 1 or j = i + 1, and 0 otherwise.
7. Write an R function of the form fibonacci <- function( n ) {...} that returns the
first n Fibonacci numbers. Fibonacci numbers are defined through the recursion $f_i =
f_{i-1} + f_{i-2}$ (i = 3, 4, ...) with $f_1 = 0$, $f_2 = 1$.
8. Rewrite the following R code to achieve the same result without explicit loops:
x <- matrix( c( 1, 2, 3, 4 ), nrow = 2 ); y <- matrix( c( 4, 3, 2, 1 ), nrow = 2 );
z <- x;
for ( i in 1:2 )
for ( j in 1:2 )
{
z[i,j] <- 0
for ( k in 1:2 ) z[i,j] <- z[i,j] + x[i,k] * y[k,j]
}
Problem 32. Fit a linear regression model to the standard data set called Orange, with age
as the covariate and circumference as the dependent variable. In a 2 × 2 layout, plot
1. the data together with the fitted line, with axes appropriately labeled,
[Figure: 2 × 2 layout. Top row: circumference against age, and Residuals against Fitted. Bottom row: histogram (Frequency) of Residuals, and "Autocorrelation: Residuals" (ACF against Lag).]
Problem 33. Make a color-coded contour plot (R function filled.contour) of the function
$f(x, y) = x^4 + xy + (1 + y)^2$ for −2 ≤ x, y ≤ +1 using terrain colors, 31 levels, and aspect ratio
= 1.
[Figure: filled contour plot over −2 ≤ x, y ≤ +1 with a color key running from 0 to 20.]
Problem 34.
1. Find out how to optimize functions (of more than one variable) in R using the conjugate
gradient (CG) or any other method.
2. Find the minimizer (x∗ , y∗ ) of the Rosenbrock function
analytically. Minimizer (x∗ , y∗ ) is the point in the (x, y) plane where f (x, y) attains its
minimum value. Write the expression of the gradient of this function.
3. Make a contour plot of the Rosenbrock function and mark this (true) minimum on this
plot.
4. Find the minimizer of f (x, y) numerically using the Polak-Ribiere conjugate gradient
method in R. Do not let R use finite-difference approximations for the gradient (i.e., supply
a function to compute the gradient). Mark this numerical minimum on the contour plot
above. Report the numerical minimum to full 16-digit precision on the same plot.
5. Repeat the above exercise for the following functions. (Unlike with the function above, it
may not be possible to find the analytical minimizer of the functions below.)
Problem 35. Write R code that will print the binary representation of a non-negative integer n.
Following the standard convention, the least significant bit should be the rightmost. Using your
code, print binary representations of integers n = 0, ..., 8, and 47.
5 4 3 2 1 0
0 0 0 0 0 0 0
1 0 0 0 0 0 1
2 0 0 0 0 1 0
3 0 0 0 0 1 1
4 0 0 0 1 0 0
5 0 0 0 1 0 1
6 0 0 0 1 1 0
7 0 0 0 1 1 1
8 0 0 1 0 0 0
47 1 0 1 1 1 1
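A minimal sketch of the conversion by repeated division by 2, with each remainder prepended so that the least significant bit ends up rightmost. The function name to.binary and the optional width argument (left padding with zeros, as in the table above) are assumptions.

```r
# Binary digits of a non-negative integer n, most significant bit first.
to.binary <- function(n, width = NULL) {
  stopifnot(n >= 0, n == floor(n))
  bits <- integer(0)
  repeat {
    bits <- c(n %% 2, bits)                  # prepend the current lowest bit
    n <- n %/% 2
    if (n == 0) break
  }
  if (!is.null(width) && length(bits) < width) {
    bits <- c(rep(0, width - length(bits)), bits)   # pad on the left with zeros
  }
  bits
}

for (n in c(0:8, 47)) {
  cat(sprintf("%2d", n), to.binary(n, width = 6), "\n")
}
```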
Problem 36. Design an R function to make a polar plot. The user supplies equal-length
vectors with radius (r) and angle (θ) values. Your function should overlay the polar plot with
appropriate constant-θ lines and constant-r circles. The values for these “grid” curves should
be intelligently derived from the (r, θ) data supplied. Your function should also make sure that
circles look like circles. See example of a polar plot below: The function plotted is $\sin^2(\theta)$.
[Figure: polar plot titled "sin(t)^2", overlaid with labeled constant-r circles and constant-θ lines.]
Problem 37. Design an R function of the form
runif.sphere <- function( n, d = 3, r = 1 )
that generates a random sample of n points distributed uniformly on the surface of a
d-dimensional hypersphere with radius r centred at the origin in d dimensions. The algorithm
for this is operationally straightforward, and is explained in detail in Donald Knuth, The Art
of Computer Programming, Volume 2: Seminumerical Algorithms, 2/3 edition, Sec. 3.4.1.E.6, p.130. Your
function should return a matrix, with each row representing the d-dimensional coordinates of one
random point from the hypersphere. This matrix is thus an n × d matrix. To demonstrate that
your function works as intended,
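A minimal sketch of runif.sphere(), assuming the standard approach of drawing d independent standard normal deviates per point and rescaling each point to length r. This is believed to match the method in the cited section, but it is offered here as an assumption rather than a transcription of it.

```r
# n points uniformly distributed on the surface of a d-dimensional
# hypersphere of radius r centred at the origin (one point per row).
runif.sphere <- function(n, d = 3, r = 1) {
  z <- matrix(rnorm(n * d), nrow = n, ncol = d)   # independent N(0, 1) deviates
  r * z / sqrt(rowSums(z^2))                      # rescale each row to length r
}

set.seed(1)                                  # arbitrary seed, for reproducibility only
p <- runif.sphere(1000, d = 3, r = 1)
print(range(sqrt(rowSums(p^2))))             # every point should lie at distance r
```

The division works row by row because the vector of row norms has length n and is recycled down the columns of the n × d matrix.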
Problem 38. R provides recursion in the same way that C does. The purpose of this exercise
is to compare the performance of recursive and non-recursive implementations of the same
computation. As a concrete example, we will analyze the performance of (at least) three distinct
methods of computing the first n Fibonacci numbers:
1. using recursion;
Implement these in R. For each method, explore how the compute time changes with n, and
compare performances across methods.
Problem 39. Consider the problem of generating all the permutations of the first n integers.
The total number of such permutations is n!, which can be too large a number to store all
the permutations. The alternative approach is to generate these permutations one by one in
a certain order such as lexicographic order. The algorithm at
http://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order takes in one permutation of numbers
{1, ..., n}, and generates the next permutation from the lexicographically ordered set of all n!
permutations. The algorithm also identifies when the very last permutation in the set has been
reached. Implement this algorithm as an R function of the form next.perm <- function( p ) {...},
where p is an R vector containing a permutation of integers {1, ..., n}, where n ==
length( p ). You will need to devise an error check to make sure that the vector p contains
all integer values from {1, ..., n} represented exactly once each. To demonstrate the correctness
of the above function, implement another R function all.perm <- function( n ) {...} that
returns all permutations of {1, ..., n} using next.perm(). The returned value of all.perm()
should be a matrix of size n! × n. Permutations should not be printed on the screen. For
sufficiently small values of n, demonstrate that all the n! permutations are indeed generated in
lexicographic order.
> all.perm( 4 )
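A minimal sketch of next.perm() and all.perm() following the standard lexicographic-successor steps (find the rightmost ascent, swap with the rightmost larger element, reverse the tail). Returning NULL to signal that the last permutation has been reached is a design choice made here, not something the problem prescribes.

```r
# Next permutation in lexicographic order, or NULL if p is the last one.
next.perm <- function(p) {
  n <- length(p)
  if (!identical(sort(as.numeric(p)), as.numeric(seq_len(n)))) {
    stop("p must contain each of 1, ..., n exactly once")
  }
  ascents <- which(p[-n] < p[-1])            # positions k with p[k] < p[k + 1]
  if (length(ascents) == 0) return(NULL)     # p is in descending order: last permutation
  k <- max(ascents)
  l <- max(which(p > p[k] & seq_len(n) > k)) # rightmost element after k larger than p[k]
  p[c(k, l)] <- p[c(l, k)]                   # swap
  p[(k + 1):n] <- rev(p[(k + 1):n])          # reverse the tail
  p
}

# All permutations of 1, ..., n as an n! x n matrix, in lexicographic order.
all.perm <- function(n) {
  out <- matrix(0L, nrow = factorial(n), ncol = n)
  p <- seq_len(n)
  i <- 1
  while (!is.null(p)) {
    out[i, ] <- p
    p <- next.perm(p)
    i <- i + 1
  }
  out
}

P <- all.perm(4)
print(dim(P))
```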
Problem 40. Implement Conway’s Game of Life in R. Explore the behaviour of this game
graphically. Have fun! Life in Conway’s world follows the rules at
http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules
Problem 41.
1. What is the result of the operation x <- 1:2*1:10? Explain with reasoning.
2. Report all eigenvalues of the matrix
0 1 0 0 0 1
1 0 1 0 0 0
0 1 0 1 0 0
0 0 1 0 1 0
0 0 0 1 0 1
1 0 0 0 1 0
3. Draw 1000 random samples of size 5 from the Exp(2) distribution. Compute the sample
average $\hat{\mu}$ for each of these samples. Plot a histogram of $\hat{\mu}$ using the 'FD' option.
5. Which of the following two (equivalent) codes is more efficient, and why?
(a) S <- 0; for ( i in 1:length( x ) ) { S <- S + x[i] }
(b) S <- sum( x )
6. How will you add the x = 0 and y = 0 lines to the plot produced by
plot( rnorm( 1000 ), rnorm( 1000 ), asp = 1, pch = 19, col = 'gray', axes = F, xlab = '', ylab = '' )?
7. Given a numeric vector x, how will you find the 0.025 and 0.975 quantiles of x?
Problem 42. Consider a sample $X_1, \dots, X_N$ from the Gamma(shape,scale) distribution. The
method-of-moments estimators (MoME) for the shape and scale parameters are
$$\text{shape} = \frac{m_1^2}{m_2 - m_1^2}, \qquad \text{scale} = \frac{m_2 - m_1^2}{m_1},$$
where $m_1 = N^{-1} \sum_{i=1}^N X_i$ and $m_2 = N^{-1} \sum_{i=1}^N X_i^2$.
Do the following for N = 5, 10, 15, 30, 100, 1000 (arranging your scatterplots in a 2 × 3 array):
Generate 1000 random samples of size N (see below) from the Gamma(shape=2,scale=2)
distribution. Estimate shape and scale from each of these samples. Make a scatterplot of these
(shape,scale) estimates using pch='.'. Mark the (shape=2,scale=2) point with a red dot
on each scatterplot. Label the axes appropriately. Indicate the sample size N on each plot.
[Figure: 2 × 3 array of scatterplots of the (shape, scale) estimates, one panel for each N; the vertical axis of each panel is scale.]
Problem 43. Solutions to the equation f(x) = 0, i.e., points x where the function f takes
a zero value, are called zeros or roots of the function f(x). The secant method is a two-step
iterative method to locate a zero of f. Given two initial points $x_0, x_1$, it produces successive
approximations $x_2, x_3, \dots$ to a zero of f, where
Problem 44. Solutions to the equation f(x) = 0, i.e., points x where the function f takes a
zero value, are called zeros or roots of the function f(x). The bisection method is an iterative method
to locate a zero of f. Given a starting interval $(a_0, b_0)$ that contains a zero of f, this method
produces successively smaller intervals $(a_1, b_1), (a_2, b_2), \dots$ containing the zero of f. Successive
intervals are obtained as follows:
Problem 45. Solutions to the equation f(x) = 0, i.e., points x where the function f takes a
zero value, are called zeros or roots of the function f(x). The false position method is an iterative
method to locate a zero of f. Given a starting interval $(a_0, b_0)$ that contains a zero of f,
this method produces successively smaller intervals $(a_1, b_1), (a_2, b_2), \dots$ containing the zero of f.
Successive intervals are obtained as follows:
$$c_{i-1} = b_{i-1} - \frac{f(b_{i-1})\,(b_{i-1} - a_{i-1})}{f(b_{i-1}) - f(a_{i-1})}$$
$$(a_i, b_i) = \begin{cases} (a_{i-1}, c_{i-1}) & \text{if } f(a_{i-1})\, f(c_{i-1}) < 0 \\ (c_{i-1}, b_{i-1}) & \text{otherwise} \end{cases} \quad \text{for } i = 1, 2, \dots$$
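A minimal sketch of the false position update above. The stopping rule, |f(c)| < tol, is an assumption, since no termination criterion appears in this text; the names false.position, tol, and max.iter are illustrative.

```r
# False position method on a starting interval (a, b) with f(a) f(b) < 0.
# Stopping rule (assumed): |f(c)| < tol.
false.position <- function(f, a, b, tol = 1e-12, max.iter = 200) {
  stopifnot(f(a) * f(b) < 0)
  for (i in seq_len(max.iter)) {
    cc <- b - f(b) * (b - a) / (f(b) - f(a))   # c_{i-1} in the formula above
    if (abs(f(cc)) < tol) return(cc)
    if (f(a) * f(cc) < 0) b <- cc else a <- cc # keep the sub-interval that brackets the zero
  }
  warning("no convergence within max.iter iterations")
  cc
}

# Example: the zero of f(x) = x^2 - 2 in (1, 2).
root.fp <- false.position(function(x) x^2 - 2, a = 1, b = 2)
print(root.fp, digits = 16)
```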