Presentation Layout
[Figure: histograms of loggnp and Life expectancy at birth; y-axis: Fraction]
. graph twoway scatter mpg weight //Note that you don't need to type graph or twoway
[Figure: scatter plot of Mileage (mpg) against Price]
[Figure: scatter plots of Mileage (mpg) against Weight (lbs.), Graphs by Car type (Domestic, Foreign)]
[Figure: Mileage (mpg) against Weight (lbs.) with Fitted values overlaid]
[Figure: Mileage (mpg) against Weight (lbs.) with Fitted values and 95% CI, by car type (Domestic, Foreign)]
by(foreign) is an option of
twoway.
[Figure: Mileage (mpg) against Weight (lbs.) with Fitted values and 95% CI, by car type]
So:
. twoway (qfitci mpg weight, stdf) (scatter mpg weight), by(foreign)
. twoway qfitci mpg weight, stdf || scatter mpg weight ||, by(foreign)
[Figure: ABC.com Inc., Closing Share Price vs. Nasdaq Composite Index; x-axis: date, 01oct2009 to 01jul2010]
Graph editor settings used:
- Graph Subtitle: enter subtitle
- Graph Region: color = Bluish-gray
- Y-Axis: range = 0 to 16 by 2, axis line = medium thick, add title, label angle = horizontal, grid lines = off
- X-Axis: title = off, minor ticks = off, suggest # of ticks = 8, alternate spacing of adjacent labels = on, change label format, label size = small, axis line = medium thick
- Plot 1 line
- Plot 2 line
- Caption: add caption
[Figure: ABC.com Inc., Closing Share Price vs. Nasdaq Composite Index, after editing; y-axis: Share Price (USD), 0 to 14; x-axis labels Oct 1, 2009 through Jun 1, 2010]
Unfortunately, if you are not faculty, you are probably using lab
computers to use Stata, and when they are re-imaged, you will
lose the files in your grec folder. So you can store the recordings
on your flash drive by clicking the Browse button when you save
your recording. Now, when you are in the graph editor and click
the play button, your recording will not appear in the list because
it is not stored where Stata knows to look for it. Never fear, just
click Browse, and navigate to where your .grec file is. If you want
your recording to be available right from code, as in
play(advanced_workshop_1), you will need to move it (at least
[Figure: two versions of a plot of life expectancy against Year, 1900 to 1940]
More on Schemes
. graph query, schemes
Available schemes are
    s2color
    s2mono
    s2manual
    s2gmanual
    s2gcolor
    s1color      see help scheme_s1color
    s1mono       see help scheme_s1mono
    s1rcolor     see help scheme_s1rcolor
    s1manual     see help scheme_s1manual
    sj           see help scheme_sj
    economist    see help scheme_economist
Schemes are very powerful, because they let you implement a certain
look without specifying a long series of options in every graph, or
running every graph through the graph editor. However, creating
schemes is fairly time-consuming.
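As a brief illustration of the point above, a scheme can be applied to a single graph with the scheme() option or made the default for the session with set scheme. The sketch below assumes the auto dataset that ships with Stata and two of the schemes listed above; the choice of variables is only for illustration.

```stata
sysuse auto, clear

* Apply a scheme to a single graph
scatter mpg weight, scheme(economist)

* Make a scheme the default for the rest of the session
set scheme s1mono
scatter mpg weight

* Check which scheme is currently the default
display "`c(scheme)'"
```

Adding the permanently option to set scheme keeps the setting across sessions.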
For more on creating your own schemes, see:
http://www3.eeg.uminho.pt/economia/nipe/2010_Stata_UGM/papers/Rising.pdf
And
If you draw another graph, it replaces the previous one in memory, and is
now called Graph.
. scatter price length
If you want to have multiple graphs up at the same time, you can use the
name option.
. scatter price mpg, name(scatter1)
. cd C:\Users\nickj22\Downloads\
. graph save scatter1 mygraph1.gph
graph save writes a graph from memory to disk, saving it as a .gph
file. The graph also stays in memory until it is dropped.
. graph dir
    Graph    scatter1    mygraph1.gph
graph dir lists all graphs in memory and on disk (in the current directory)
. graph drop scatter1
graph drop drops a graph from memory. Graphs contain the data files
they represent, so if the dataset is large, they can actually take up quite a
[Figure: bar charts of Percent by Census region (NE, N Cntrl, South, West)]
[Figure: histograms of average education level (Percent) with a normal density overlay (normal educ), Graphs by Census region. Source: US Census, 1980 and 1990]
[Figure: bar chart of temperature in Degrees Fahrenheit for January and July by region (N.E., N. Central, South, West)]
[Figure: histograms of loggnp and Life expectancy at birth; y-axis: Fraction]
[Figure: scatter plot matrix of Avg. annual % growth, Life expectancy at birth, Log GNP per capita, and safewater. Source: The World Bank Group]
[Figure: scatter plot with points labeled by country; x-axis: GNP per capita (thousands of dollars)]
[Figure: line plot of Calories consumed against Date, 01jan2002 to 01jan2003, for Tess, Arnold, and Sam]
Macros
Macros come in two general types:
1. Globals
2. Locals
. di "`names2'"
Jake Steven Jose Tyrell Martin
. di "$names"
Ballav Nick ChongMing Joe David
.
end of do-file
. di "`names2'"
. di "$names"
Ballav Nick ChongMing Joe David
Notes on the example:
- The do-file creates the global and creates the local.
- References to locals have to be enclosed in single quotes.
- References to globals have to begin with a $.
- After the end of the do-file, the local no longer exists. Conversely, the global still exists.
- Callouts on the macro listing: the local we created; general macros automatically created by Stata; the global we created.
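The behavior described above can be reproduced with a few lines. This is a minimal sketch in which the macro names and contents mirror the output shown earlier; it assumes the lines are run together in one do-file, since the local disappears once that do-file ends.

```stata
* Define a global and a local (names and values mirror the output above)
global names "Ballav Nick ChongMing Joe David"
local names2 "Jake Steven Jose Tyrell Martin"

* Locals are referenced as `name', globals as $name
display "`names2'"
display "$names"

* List every macro currently defined, including those Stata creates itself
macro list
```

Typing the two display commands again after the do-file has finished shows an empty line for the local and the original contents for the global.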
Foreach
Syntax of foreach command
foreach lname {in | of varlist} list {
commands referring to `lname'
}
The open brace must appear on the same line
as the foreach;
Nothing may follow the open brace except, of
course, comments; the first command to be
executed must appear on a new line;
The close brace must appear on a line by itself
foreach command with "in" option
foreach x in v1 v2 v3 v4 {
    recode `x' (99 = .)
}
This is equivalent to:
recode v1 (99 = .)
recode v2 (99 = .)
recode v3 (99 = .)
recode v4 (99 = .)
foreach command with "of varlist" option
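For comparison, the same loop can be written with of varlist, which lets Stata expand a variable range or wildcard instead of requiring every name to be typed. The sketch below builds a small hypothetical dataset (v1 to v4, with 99 as a missing-value code) so that it runs on its own.

```stata
* Hypothetical data: four variables in which 99 is a missing-value code
clear
set obs 5
forvalues i = 1/4 {
    generate v`i' = cond(_n == `i', 99, _n)
}

* "of varlist" expands the range v1-v4 into the list of variables
foreach x of varlist v1-v4 {
    recode `x' (99 = .)
}

list
```

With of varlist, Stata also checks that every element is an existing variable, which the in form does not do.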
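      Source |       SS         df       MS           Number of obs =      91
-------------+---------------------------------       F(5, 85)      =   10.64
       Model |  14.1382227       5   2.82764454       Prob > F      =  0.0000
    Residual |   22.587052      85   .265730024       R-squared     =  0.3850
-------------+---------------------------------       Adj R-squared =  0.3488
       Total |  36.7252747      90   .408058608       Root MSE      =  .51549

     depress |      Coef.   Std. Err.       t    P>|t|   95% CI upper
-------------+-------------------------------------------------------
         age |  -.0193698    .0137039    -1.41   0.161      .0078771
          iq |  -.0093197    .0130535    -0.71   0.477      .0166341
      gender |   -.240945    .1438568    -1.67   0.098      .0450809
      weight |  -.0188543    .0229981    -0.82   0.415      .0268721
     anxiety |   .5563893    .0831225     6.69   0.000      .7216592
       _cons |   2.332141    1.466546     1.59   0.115      5.248027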
      Source |       SS         df       MS           Number of obs =      89
-------------+---------------------------------       F(7, 81)      =   19.34
       Model |  22.2722042       7   3.18174346       Prob > F      =  0.0000
    Residual |  13.3233014      81   .164485202       R-squared     =  0.6257
-------------+---------------------------------       Adj R-squared =  0.5934
       Total |  35.5955056      88   .404494382       Root MSE      =  .40557

     depress |      Coef.   Std. Err.       t    P>|t|   95% CI upper
-------------+-------------------------------------------------------
         age |  -.0252532    .0109484    -2.31   0.024     -.0034692
          iq |  -.0212878    .0103962    -2.05   0.044     -.0006025
      gender |  -.0288896    .1233419    -0.23   0.815       .216522
      weight |   -.017562    .0181686    -0.97   0.337      .0185878
     anxiety |   .3652071     .074345     4.91   0.000      .5131304
       sleep |  -.6100973    .1435988    -4.25   0.000     -.3243807
     satlife |  -.4784158    .1009435    -4.74   0.000     -.2775698
       _cons |   4.336996    1.192267     3.64   0.000      6.709233
Results
. program printit
1. display "Listing the values of four variables"
2. list make price mpg foreign
3. end
.
. printit
Listing the values of four variables

      make                  price   mpg   foreign
  1.  AMC Concord           4,099    22   Domestic
  2.  AMC Pacer             4,749    17   Domestic
  3.  AMC Spirit            3,799    22   Domestic
  4.  Buick Century         4,816    20   Domestic
  5.  Buick Electra         7,827    15   Domestic
  6.  Buick LeSabre         5,788    18   Domestic
  7.  Buick Opel            4,453    26   Domestic
  8.  Buick Regal           5,189    20   Domestic
  9.  Buick Riviera        10,372    16   Domestic
 10.  Buick Skylark         4,082    19   Domestic
 11.  Cad. Deville         11,385    14   Domestic
 12.  Cad. Eldorado        14,500    14   Domestic
 13.  Cad. Seville         15,906    21   Domestic
 14.  Chev. Chevette        3,299    29   Domestic
 15.  Chev. Impala          5,705    16   Domestic
 16.  Chev. Malibu          4,504    22   Domestic
 17.  Chev. Monte Carlo     5,104    22   Domestic
 18.  Chev. Monza           3,667    24   Domestic
 19.  Chev. Nova            3,955    19   Domestic
 20.  Dodge Colt            3,984    30   Domestic
 21.  Dodge Diplomat        4,010    18   Domestic
 22.  Dodge Magnum          5,886    16   Domestic
 23.  Dodge St. Regis       6,342    17   Domestic
 24.  Ford Fiesta           4,389    28   Domestic
 25.  Ford Mustang          4,187    21   Domestic
 26.  Linc. Continental    11,497    12   Domestic
 27.  Linc. Mark V         13,594    12   Domestic
 28.  Linc. Versailles     13,466    14   Domestic
 29.  Merc. Bobcat          3,829    22   Domestic
 30.  Merc. Cougar          5,379    14   Domestic
 31.  Merc. Marquis         6,165    15   Domestic
 32.  Merc. Monarch         4,516    18   Domestic
 33.  Merc. XR-7            6,303    14   Domestic
 34.  Merc. Zephyr          3,291    20   Domestic
 35.  Olds 98               8,814    21   Domestic
 36.  Olds Cutl Supr        5,172    19   Domestic
 37.  Olds Cutlass          4,733    19   Domestic
 38.  Olds Delta 88         4,890    18   Domestic
 39.  Olds Omega            4,181    19   Domestic
 40.  Olds Starfire         4,195    24   Domestic
 41.  Olds Toronado        10,371    16   Domestic
 42.  Plym. Arrow           4,647    28   Domestic
 43.  Plym. Champ           4,425    34   Domestic
 44.  Plym. Horizon         4,482    25   Domestic
 45.  Plym. Sapporo         6,486    26   Domestic
 46.  Plym. Volare          4,060    18   Domestic
 47.  Pont. Catalina        5,798    18   Domestic
 48.  Pont. Firebird        4,934    18   Domestic
 49.  Pont. Grand Prix      5,222    19   Domestic
Invoke the program by simply typing the program name and then
running it in Stata.
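When a program like this lives in a do-file, running the do-file a second time stops with an error because printit is already defined. One common pattern, sketched here under the assumption that the auto dataset is the one in use, is to drop any earlier definition first.

```stata
* Drop any earlier definition so the do-file can be rerun without an error
capture program drop printit

program define printit
    display "Listing the values of four variables"
    list make price mpg foreign
end

sysuse auto, clear
printit
```

The capture prefix suppresses the error that program drop would otherwise raise the first time, when no such program exists yet.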
Results
Complex Survey
Sampling weights
- inverse of the probability of being sampled
- the number of population elements each sampled element represents
Clustering
- groups sampled together
- primary sampling units (PSU): first-level clusters
Stratification
- strata: groups of clusters
- strata are sampled separately
Example
States, Counties, Schools, Students
- sample states in different regions
- sample counties within each state
- sample schools within each county
- sample students from schools
svyset
svyset psu? [pweight=?], strata(?) fpc(?) || ssu?, fpc(?)
psu = primary sampling unit
ssu = sampling unit at the next stage
pweight = probability weight
fpc = finite population correction (total # of sampling units in the stratum or cluster that the units at that stage are sampled from)
|| = next stage
SVYSET Examples
use http://www.stata-press.com/data/r12/multistage
svyset county [pw=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools)
save highschool
use highschool
svyset
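Once the design is declared, estimation commands pick it up through the svy: prefix. The sketch below does not use the multistage file; it simulates a small hypothetical sample (the variable names stratum, psu, pw, and y are invented for illustration) so that it runs without a network connection.

```stata
* Simulated sample: 4 strata, 5 PSUs per stratum, 10 units per PSU
clear
set seed 12345
set obs 200
generate stratum = ceil(_n / 50)
generate psu     = ceil(_n / 10)
generate pw      = 20 + 5 * stratum      // hypothetical sampling weights
generate y       = rnormal(50, 10) + stratum

* Declare the design, then prefix estimation commands with svy:
svyset psu [pweight = pw], strata(stratum)
svy: mean y

* Unweighted mean that ignores the design, for comparison
mean y
```

Comparing the two outputs shows how the weights, clustering, and stratification change both the point estimate and its standard error.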
Take-home Message
Ask what sampling design was used for your data before running any analysis.
If the data come from a complex survey, consider svyset or multilevel modeling.
xtset
xtset              display current xt settings
xtset, clear       clear xt settings
Menu
Statistics > Longitudinal/panel data > Setup and utilities > Declare dataset to be panel data
Time-Unit Options
[unitoptions]      specify units of time
[deltaoption]      specify the period between observations
    delta(#)           e.g. delta(2)
    delta(exp)         e.g. delta(7*24)
    delta(# units)     e.g. delta(10 min) or delta(7 days)
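To make the delta option concrete, here is a minimal sketch on a hypothetical panel (the variable names id, year, and y are invented) in which each unit is observed every second year.

```stata
* Hypothetical panel: 3 units observed every second year from 2000 to 2008
clear
set seed 2024
set obs 15
generate id   = ceil(_n / 5)
generate year = 2000 + 2 * mod(_n - 1, 5)
generate y    = rnormal()

* Declare the panel; delta(2) says consecutive observations are 2 years apart
xtset id year, yearly delta(2)

* With delta(2), the lag operator L. refers to the value two years earlier
generate y_lag = L.y

* Display the current settings, then clear them
xtset
xtset, clear
```

Without delta(2), Stata would treat the odd years as gaps and L.y would be missing for every observation.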
xtdescribe: describe pattern of xt data
xtdescribe [if] [in] [, options]
options:
patterns(#)    e.g. p(10) -- display max. 10 patterns
width(#)       e.g. w(80) -- display 80 columns
Menu
Statistics > Longitudinal/panel data > Setup and utilities > Describe pattern of xt data
Examples
use http://www.stata-press.com/data/r12/nlswork
xtset
browse
xtdes, p(20)
xtsum hours
xttab race
xtreg ln_w grade age ttl_exp tenure south, mle
After a regression, use the predict newvar syntax to create a new variable
that contains the fitted values for each observation.
If the model is fitted only for a limited sample, add if e(sample) to predict to
get the predicted values for that sample only.
After a regression, use the predict newvar, r syntax to create a new variable
that contains the residuals for each observation.
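The three uses of predict described above can be sketched as follows. The model and the variable names yhat_all, yhat, and resid are illustrative choices, and the example assumes the auto dataset.

```stata
sysuse auto, clear

* Fit the model on a subsample only (domestic cars)
regress price mpg weight if foreign == 0

* Fitted values for every observation with nonmissing regressors
predict yhat_all

* Fitted values restricted to the estimation sample
predict yhat if e(sample)

* Residuals, again for the estimation sample only
predict resid if e(sample), residuals
```

Here yhat_all is filled in for foreign cars as well, even though they were not used to fit the model, which is why the e(sample) restriction matters.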
The outreg command can be used to reformat regression tables and write them to document files.
Example
Outreg has lots of options that let us customize the look of the output table.
Margins
In the above regression, the coefficient on weight is misleading as an increase in weight affects both
weight and weight squared. So, the total effect depends on the starting value of weight.
The following command will set the variables to their means and find the derivative of expected price with
respect to weight at that point.
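The regression referred to above is not reproduced here, so the following is only a sketch under assumptions: a quadratic in weight on the auto dataset, written with factor-variable notation so that margins knows weight and weight squared move together.

```stata
sysuse auto, clear

* c.weight##c.weight enters weight and weight squared as linked terms
regress price c.weight##c.weight mpg

* Derivative of expected price with respect to weight, other variables at means
margins, dydx(weight) atmeans

* The same derivative at chosen starting values of weight
margins, dydx(weight) at(weight = (2000 3000 4000))
```

If weight squared were generated by hand as a separate variable, margins would treat it as unrelated to weight and the derivative would be wrong.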
Marginsplot
Often, the results from margins can be hard to read as in the following example.
The command marginsplot can be used to visualize the results and understand them better.
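A minimal sketch of the margins-then-marginsplot workflow, assuming the auto dataset and an illustrative model with a car-type by mileage interaction rather than the model behind the slide's figure:

```stata
sysuse auto, clear

* Price model with a car-type by mileage interaction (an illustrative choice)
regress price i.foreign##c.mpg

* Predicted price for each car type over a grid of mpg values
margins foreign, at(mpg = (15(5)35))

* Plot the margins just computed, with confidence intervals
marginsplot
```

marginsplot always graphs the results of the most recent margins command, so the two are run back to back.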
Example
[Figure: marginsplot output; x-axis: age in years, 20 to 70; plotted series: 1.black, 1.female]
Stata stores results from a command in various forms: scalars, strings, matrices, etc. Such results are called
returned results.
Returned results can be used to make other computations in Stata.
We can type return list after we run a command to see the returned results.
Example
Results are stored mainly as r() class or e() class, depending on the command used.
Access r() class results with return list; access e() class results with ereturn list.
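A short sketch of both result classes, assuming the auto dataset; the variable name mpg_centered is invented for the illustration.

```stata
sysuse auto, clear

* summarize is an r-class command
summarize price
return list
display "Mean price = " r(mean)

* regress is an e-class command
regress price mpg weight
ereturn list
display "R-squared = " e(r2)

* Returned results can feed later computations
summarize mpg
generate mpg_centered = mpg - r(mean)
```

r() results are overwritten by the next r-class command, so they should be used, or copied into a local or scalar, right away.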
Matrices in returned results can be used as regular matrices.
Example :
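As a hedged illustration, the coefficient vector e(b) and covariance matrix e(V) left behind by regress can be copied into ordinary Stata matrices and indexed like any other matrix. The model below is an arbitrary choice on the auto dataset.

```stata
sysuse auto, clear
regress price mpg weight

* Copy the coefficient vector and covariance matrix out of e()
matrix b = e(b)
matrix V = e(V)
matrix list b

* Use them like any other matrix: coefficient and standard error for mpg
display "Coefficient on mpg = " b[1,1]
display "SE of mpg = " sqrt(V[1,1])
```

The columns of b follow the order of the regressors in the command, with the constant last.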
More advanced computations with matrices can be done in Mata, which is a matrix language built into
Stata.
estat ic
Available only after commands that report log likelihood
Given two models, the one with the smaller AIC and BIC values fits the data better
estat vce
- displays the covariance matrix estimates
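A sketch of comparing two models with estat ic and then inspecting the covariance matrix, assuming the auto dataset; the two specifications and the names m1 and m2 are illustrative.

```stata
sysuse auto, clear

* Model 1
quietly regress price mpg
estat ic
estimates store m1

* Model 2 adds weight
quietly regress price mpg weight
estat ic
estimates store m2

* Side-by-side AIC and BIC for the stored models
estimates stats m1 m2

* Covariance matrix of the coefficient estimates for the current model
estat vce
```

The comparison is only meaningful when both models are fitted to the same observations.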
Postfile
Results can be stored in a Stata dataset using the postfile command.
This can be useful when we have to run a lot of regressions, for example in Monte Carlo simulations.
Let's consider an example from the Stata manual.
Suppose we want the means and variances from 10,000 randomly constructed 100-observation samples
of data, stored in results.dta.
We could do that as follows (refer to the do file).
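The do-file itself is not reproduced here. The following sketch follows the general shape of the manual's postfile example as far as it can be recalled; the seed and the lognormal draw are assumptions, and the loop clears whatever data are in memory.

```stata
* Sketch only: the seed is chosen here for reproducibility
set seed 12345
tempname sim
postfile `sim' mean var using results, replace

quietly {
    forvalues i = 1/10000 {
        drop _all                       // clears the data in memory
        set obs 100
        generate z = exp(rnormal())
        summarize z
        post `sim' (r(mean)) (r(Var))   // one row per simulated sample
    }
}

postclose `sim'

* Load and inspect the stored results
use results, clear
summarize mean var
```

Each call to post appends one observation to results.dta, and postclose must run before the file can be opened.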