
How Reliable Is Duality Theory in Empirical Work?

Downloaded from https://academic.oup.com/ajae/article-abstract/101/3/825/5251971 by UNIVERSITAT DE BARCELONA. Biblioteca user on 20 July 2019


Francisco Rosas and Sergio H. Lence

The Neoclassical theory of production establishes a dual relationship between the profit value function of a competitive firm and its underlying production technology. This relationship, commonly referred to as duality theory, has been widely used in empirical work to estimate production parameters such as elasticities and returns to scale. We generate a pseudo-dataset by Monte Carlo simulations, which, starting from known production parameters, yield a dataset with the main characteristics of U.S. agriculture in terms of unobserved firm heterogeneity, decisions under uncertainty, unexpected production and price shocks, endogenous prices, output and input aggregation, measurement error in variables, and omitted variables. Production parameters are not precisely recovered when performing econometric estimation based on the duality approach, and the elasticity estimates are inaccurate. Deviations of own- and cross-price elasticities from initial median values, given our parameter calibration, range between 6% and 690%, with an average of 90%. Also, own-price elasticities are as imprecisely recovered as cross-price elasticities. Sensitivity analysis shows that results still hold for different sources and levels of noise, and sample size used in estimation.

Key words: Data aggregation, duality theory, endogeneity, firm heterogeneity, measurement error,
Monte Carlo simulations, omitted variables, uncertainty.

JEL codes: C18, D22, D81, Q12.

Francisco Rosas is a professor at Universidad ORT Uruguay and a senior research associate at Centro de Investigaciones Economicas (CINVE), Uruguay. Sergio H. Lence is a professor and Marlin Cole Chair of International Agricultural Economics at Iowa State University. The authors thank three anonymous referees for their very useful comments and suggestions. Correspondence may be sent to: frosas@ort.edu.uy.

Amer. J. Agr. Econ. 101(3): 825–848; doi: 10.1093/ajae/aay071
Published online December 18, 2018
© The Author(s) 2018. Published by Oxford University Press on behalf of the Agricultural and Applied Economics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com

The duality theorem applied to the Neoclassical theory of production has provided practitioners with a useful method to obtain quantitative answers to important economic questions. Provided certain regularity conditions hold, such as perfect competition, profit-maximizing behavior, and certainty, the solution of the primal problem (i.e., the optimal input demands and output supplies arising from the maximization of profits given prices and the production function) is the same as that arising from the dual problem (i.e., the application of Hotelling's [Shephard's] Lemma to the profit [cost] function to derive the optimal input demands and output supplies). In other words, the duality theorem implies an explicit algebraic relationship between the value (profit or cost) function of the firm's optimization problem and its underlying production function. Therefore, both could be used to empirically estimate price or substitution elasticities, returns to scale, and welfare impacts.

A typical application of the dual problem begins by approximating the profit (cost) function with a parametric functional form, and applying Hotelling's (Shephard's) Lemma to obtain a parametric form for the optimal input demands and output supplies. Then, parameters are econometrically estimated using market data (prices and quantities), and finally used to recover the technology features of interest (elasticities, returns to scale, etc.). According to Shumway (1995), attractive features of the dual approach include the facts that (a) no system of first-order equations has to be solved to obtain input (output) demands (supplies); (b) more functional forms can be used; (c) it is less prone to computational errors; (d) it requires data that are usually easier to obtain; and (e) it is more accurate and tractable for multi-output technologies. However, Shumway argues that curvature properties should be pre-tested, and that collinearity of prices and allocatable inputs induces estimation inefficiency.

The reliance of the approach on some restrictive assumptions prompted a body of literature seeking to evaluate its performance in empirical applications. Burgess (1975) and Appelbaum (1978) are among the earliest. These authors failed to identify the source of the discrepancy between conclusions from the primal and dual approaches because they used a functional form that is not self-dual (translog), and used real-world data, for which the aforementioned assumptions do not necessarily hold and which do not allow an understanding of the true data generating process (DGP). As a result, when the primal and dual approaches led to conflicting results, the authors could not establish which approach was preferable, or what portion of the whole divergence in the estimated parameters was attributable to a failure of duality (violation of the perfect competition, profit maximization, or certainty assumptions) versus being due to the functional specification.

An exception is the study by Lusk et al. (2002), who analyzed the empirical properties of duality theory by simulating various datasets representing scenarios of price variability, length of time series, and measurement error. These authors found that small sources of measurement error translate into large errors in estimated parameters, emphasizing the need for high-quality data for empirical estimation purposes.

Considerable effort has been put into testing the most appropriate flexible functional form (FFF) for a given dataset (Guilkey, Lovell, and Sickles 1983; Dixon, Garcia, and Anderson 1987; Thompson and Langworthy 1989), because the utilized FFF drives the results. Analyses of this type usually consist of the following steps. First, a parametric functional form is selected to approximate the production technology. Several parameter scenarios are chosen, and observations are simulated corresponding to the "true" production DGP for each scenario. Second, a set of input and output prices is computed under the assumption of profit maximization. Third, depending on the objective, the profit or cost function is approximated by an FFF, and the resulting system of input demands and output supplies is derived. Fourth, econometric methods are applied to estimate the parameters of the resulting system, which are finally compared with the true, known production parameters. However, as these authors assume perfect competition, profit maximization, certainty, and lack of measurement errors, deviations from duality theory only come from the FFF choice. As a result, the performance of duality theory in empirical applications cannot be judged, because data used by practitioners are usually not free from at least some of these problems.

In this paper, we propose to analyze the ability of the dual approach to recover underlying production parameters from data with commonly observed problems.[1] Among other realistic properties, the simulated data include (a) optimization under uncertainty; (b) prediction errors in prices and quantities of variable netputs; (c) omitted variable netputs; (d) output and input aggregation; (e) measurement errors in the observed variables; (f) unobserved heterogeneity across firms; and (g) endogenous output and input prices. For meaningful analysis, we calibrate the simulated data to capture realistic magnitudes of the noise arising from each source. Starting from known initial technology parameters, we use Monte Carlo simulations to compute the necessary price and quantity variables. While calibrated to be consistent with typical datasets encountered in practice, the levels of noise embedded in the simulated variables affect the data used in estimation, preventing duality theory from holding exactly. Hence, the initial production parameters may not be recovered with enough precision, and the estimated elasticities may be less accurate than expected.

[1] The ability of the dual theorem to recover the parameters of the dual function (the primal-dual direction) is an analysis as important as the one performed in this study, but it is not addressed here due to space limitations and because the dual-primal direction is the preferred one in empirical applications. Furthermore, the assessment of the primal approach to recover the underlying production parameters is another important analysis, but for similar reasons it is left for future inquiry.

We first generate a panel of input and output prices and quantities for successive periods of time, coming from a set of firms with heterogeneous technology. As this DGP does not bear the problems described in the previous paragraph (i.e., the basic duality assumptions are met), we employ it to confirm that the dual approach is able to recover the production parameters with sufficient accuracy.

Second, we add noise to the generated panel of price and quantity variables to replicate the aforementioned features that

characterize the real-world data used by practitioners. We aim at generating noise comparable to that encountered in widely used datasets, such as the one constructed and maintained by Eldon Ball for U.S. input/output prices and quantities (USDA-ERS), the United States Department of Agriculture Agricultural Resource Management Survey (USDA-ARMS) database, the U.S. Agricultural Census database (USDA-NASS), and the Chicago Mercantile Exchange (CME) futures prices database. We chose the first dataset because it is publicly available and has been used for applications of duality theory in several widely cited papers (Ball 1985, 1988; Baffes and Vasavada 1989; Shumway and Lim 1993; Chambers and Pope 1994). The remaining data sources yield useful information to calibrate cross-sectional parameters. We seek to calibrate both directly observed parameters and noise levels (e.g., price variability and length of time series) and unobserved ones (e.g., measurement error, endogeneity of output prices, and production and price shocks). Moreover, we adopt the criterion of calibrating parameter values to favor the recovery of known production parameters, especially for those that are unobservable.[2]

[2] In this study, we focus on the properties of duality theory applications using time series data by generating a panel of observations across firms and over time. The analysis of applications with cross-sectional data is as relevant as the one pursued here, but is left for future research. The properties of duality theory using panel data can be studied as well, but they are less frequent in the literature because these datasets are not as readily available.

We set up the expected profit function and derive the system of input demands and output supplies to then econometrically estimate its parameters for comparison with the known production parameters. Comparisons are performed using Lau's (1976) Hessian identities between production and restricted profit functions.[3]

[3] This issue can also be interpreted in the framework of the existence of a representative technology arising from the aggregation of heterogeneous firms. It can be argued that this is not a problem exclusive to duality theory, and that the primal problem also bears it. We decided to work on the dual problem because it is an application extensively used to recover production parameters and estimate elasticities, but the other option is appropriate as well.

The Model of Individual Firms

Each firm underlying the simulated dataset is assumed to consist of a producer who chooses netputs so as to maximize the expected utility of uncertain terminal wealth ($\tilde W_1$):[4]

$$ \max_{y,\,y_0} E[U(\tilde W_1)] = \max_{y,\,y_0} E[U(W_0 + \tilde\pi)] = \max_{y,\,y_0} E[U(W_0 + \tilde p^{T}\tilde y + \tilde y_0)] \tag{1} $$

In the above expression, $E(\cdot)$ denotes the expectation operator, $U(\cdot)$ is a strictly increasing and twice-continuously differentiable concave utility function, $W_0$ is initial wealth, $\tilde\pi$ are uncertain end-of-period profits, vector $y \equiv [y_1, \ldots, y_N]^{T}$ comprises $N$ variable netput quantities, and vector $\tilde p$ contains the corresponding variable netput prices normalized by $p_0$, which is the price of the numeraire netput $y_0$.[5] The tilde (~) indicates a random variable.

[4] The model setup closely follows the one used in Rosas and Lence (2017), which in turn is based on Lau (1976). The proposed method could also be employed to conduct the estimation using the state-contingent approach to production uncertainty proposed by Quiggin and Chambers (2006). We do not pursue the state-contingent approach here because the limitations of the available data for estimation have severely hindered its use in empirical work. However, our method could be used to explore the empirical performance of the approach by simulating data rich enough to allow for its estimation.

[5] According to netput notation, a positive value is a net output and a negative value is a net input.

Defining $K$ as the vector of $M$ quasi-fixed netputs, a production plan consists of the vector $[y_0, y, K]$ belonging to the production possibilities set $S \subset \mathbb{R}^{1+n+m}$.[6] As shown by Jorgenson and Lau (1974), there exists a one-to-one correspondence between the set $S$ and a production function $G(\cdot)$ (also constrained by the quasi-fixed netputs $K$), such that:[7]

$$ G(y, K; \alpha) = -\max\{y_0 \mid [y_0, y, K] \in S\} \tag{2} $$

where $\alpha$ denotes a set of production function parameters.

[6] The properties of the set $S$ include: (a) the origin belongs to $S$; (b) $S$ is closed; (c) $S$ is convex; (d) $S$ is monotonic with respect to $y_0$; and (e) non-producibility with respect to at least one variable input, which implies at least one commodity is freely disposable and can only be a net input in the production process (a primary factor of production).

[7] The properties of the production function $G(\cdot)$ are: (a) the domain is a convex set of $\mathbb{R}^{n+m}$ that contains the origin; (b) the value of $G$ at the origin, say $G(0)$, is non-positive; (c) $G$ is bounded; (d) $G$ is closed; and (e) $G$ is convex in $\{y, K\}$. Convexity is required because of the convention used in Lau (1976) that $y_0 = -G(y, K)$. We follow the convention that the value of the production function is positive infinity if a production plan is not feasible, that is, $-\max\{\emptyset\} = \infty$, where $\{\emptyset\}$ is the empty set.

Hence, problem (1) can be rewritten as

$$ \max_{y} \{E[U(\tilde W_1)]\} = \max_{y} \{E[U(W_0 + \tilde p^{T}\tilde y - G(\tilde y, K; \alpha))]\} \tag{3} $$

The solution to this problem is a set of expected netput demand equations $y^{*}(p, K; \beta)$ and a restricted profit function $\pi^{R}(p, K; \beta)$, which are dependent on the vector of normalized expected netput prices $p$, the vector of quasi-fixed netputs $K$, and a set of profit function parameters $\beta$.

Duality theory establishes a relationship between the production function $G(y, K; \alpha)$ and the restricted profit function $\pi^{R}(p, K; \beta)$, which Lau (1976) proved in terms of their Hessian matrices under the assumption of convexity and twice continuous differentiability of both functions. These Hessian relationships are critical for our analysis because they allow us not only to express the restricted profit function parameters ($\beta$) in terms of the underlying production function parameters ($\alpha$), but also to compare the recovered parameters with the simulated ones (Rosas and Lence 2017).

To operationalize problem (3), we proceed by assigning functional forms. In particular, firm $f$ in period $t$ is assumed to choose the level of expected output at the end of the growing season so as to maximize the expected value of a constant absolute risk aversion (CARA) utility function $U(\tilde W_{ft,1}) = -\exp(-\lambda_{ft}\tilde W_{ft,1})$, with parameter $\lambda_{ft}$ representing the coefficient of absolute risk aversion. The treatment of risk and uncertainty in the duality theory framework with profit functions includes the work by Pope (1982), Coyle (1992, 1999), and Pope and Just (2002). In the case of cost functions, developments are due to Pope and Chavas (1994), Pope and Just (1996, 1998), Chambers and Quiggin (1998), Moschini (2001), and Chavas (2008), among others. Then, we assume a quadratic FFF for the production function $G(y_{ft}, K_{ft}; \alpha_f)$:[8]

$$ G(y_{ft}, K_{ft}; \alpha_f) = y_{ft}^{T}A_{1f} + K_{ft}^{T}A_{2f} + \tfrac{1}{2}\,y_{ft}^{T}A_{11f}\,y_{ft} + y_{ft}^{T}A_{12f}\,K_{ft} + \tfrac{1}{2}\,K_{ft}^{T}A_{22f}\,K_{ft} - w_{ft} \tag{4} $$

where $f$ and $t$, respectively, index firms and time, $A_{1f}$ and $A_{2f}$ are $(N \times 1)$ and $(M \times 1)$ vectors of $\alpha_{i,f}$ coefficients, $A_{11f}$ and $A_{22f}$ are, respectively, $(N \times N)$ and $(M \times M)$ symmetric and nonsingular matrices, $A_{12f}$ is an $(N \times M)$ matrix, and $w_{ft}$ is a mean-zero heteroskedastic production shock given by expression (5) below. Submatrices $A_{11f}$, $A_{12f}$, and $A_{22f}$ form a symmetric and positive semi-definite $((N+M) \times (N+M))$ matrix $A_f$ of $\alpha_{ij,f}$ coefficients.[9] All $\alpha_{i,f}$ and $\alpha_{ij,f}$ coefficients are collectively denoted as $\alpha_f$.

[8] We restrict our analysis to the differentiable case because it is the standard in empirical applications of duality theory that we intend to evaluate.

[9] Positive semi-definiteness is required because of the convention used in Lau (1976) that $y_0 = -G(y, K)$.

There are two main reasons for adopting the normalized quadratic functional form (4) in the present study. First, this function has been broadly employed in applications of the duality approach (Huffman and Evenson 1989; Thompson and Langworthy 1989; Shumway and Lim 1993; Lusk et al. 2002; Arnade and Kelch 2007; Schuring, Huffman, and Fan 2011). Second and most important, the normalized quadratic function is self-dual, with the Hessian matrices of the profit and production functions depending on parameters only. This property greatly facilitates setting up the simulations by making it easier to calibrate unobservable parameters to match characteristics of observable real-world data (see, e.g., the simulation of production function parameters described in appendix A). In addition, it favors identification, as it avoids a source of estimation imprecision arising from the data point at which parameters are evaluated.[10]

[10] The performance of alternative flexible functional forms commonly used in applied work, as well as the impact of misspecifying them, has been the focus of several studies (e.g., Guilkey, Lovell, and Sickles 1983; Dixon, Garcia, and Anderson 1987; Thompson and Langworthy 1989). The proposed method can be applied to study such interesting issues, but space limitations prevent us from addressing them here.

The zero-mean heteroskedastic production shock $w_{ft}$ in equation (4) is specified as

$$ w_{ft} = \left[(y_{ft}^{2})^{1/2}\right]^{T} D_f\, v_{ft} \tag{5} $$

where $D_f$ is an $(N \times N)$ diagonal matrix of parameters, and $v_{ft}$ is an $(N \times 1)$ vector.[11],[12]

[11] As formulated in equation (5), production shocks do not contemplate the existence of risk-reducing inputs. Alternative shock specifications would be required to accommodate such inputs.

[12] Production frontiers are often estimated allowing for an additive error structure, with an error component being non-negative and another error component characterized by a zero mean. The non-negative error component allows researchers to accommodate production inefficiency. Here we adopt a zero-mean error structure to favor identification, which implies that all firms are assumed to be efficient, both when we generate the data and when we use such data to conduct the estimation.

Entries of $v_{ft}$ corresponding to variable inputs are zero ($v_{nft} = 0$), whereas those associated with variable outputs consist of the product of a systematic shock ($v_{1nt}$) and an idiosyncratic shock ($v_{2nft}$): $v_{nft} = v_{1nt} \times v_{2nft}$. For the present purposes, the main feature of specification (5) is that it yields heteroskedastic production shocks, with a standard deviation increasing but at a decreasing rate. In other words, larger firms face larger shocks in absolute terms, but are proportionally less affected because of the decreasing rate. This property is consistent with larger firms, ceteris paribus, being less exposed to uncertainty (e.g., because a bad weather draw is more likely to be offset by a good draw within the same firm). The parameters involved in (5) are estimated based on the USDA-ARMS database.

[12] (continued) Investigating the effects of incorporating a non-negative error component is an interesting topic worthy of further efforts.

The main diagonal of $D_f$ is set equal to the inverse of the main diagonal of $(A_{11f})^{-1}$, so as to achieve the desired level of variability in each netput quantity and reduce variability induced by other netputs (especially in the case of inputs).[13] Note that while this setup is consistent with firms facing output quantity uncertainty, the jointly specified technology induces uncertainty in all variable netputs.

[13] Since $w_{ft}$ is a function of $y_{ft}$, the shock enters the solution of variable netput quantities in its first derivative and premultiplied by $(A_{11f})^{-1}$.

The systematic shocks are $v_{1nt} \sim [-1 + 2\,\mathrm{Beta}(2,2)]$, that is, symmetric zero-mean shocks independent and identically distributed (iid) over the interval $[-1, 1]$. The idiosyncratic shocks are modeled as $v_{2nft} \sim \mathrm{Uniform}(0.87, 1.13)$, which allows weather variables to not only affect production quantities over time but also have different local effects in a given year.[14] The available data allowed us to calibrate the variance of the shocks, but were not sufficiently informative to allow us to calibrate higher moments with precision, or to indicate that a specific distribution was strongly preferable. Thus, we adopted (well-known) symmetric distributions to adhere to the general principle of favoring identification, and calibrated them to match the variance of the observed data. While the effects of asymmetric shocks or alternative distributions are straightforward to analyze with the method being proposed, they are beyond the scope of the present study.

[14] Data from USDA-ARMS and PRISM (2011) were used to calibrate the width of the idiosyncratic shock interval. We ran a fixed-effects model of firm-level yields at various locations (counties) and time periods (years) on a location-specific effect, weather variables (temperature and cumulative precipitation), and time dummies. After estimation, we measured the contribution of weather variables to yield variation by fitting a "restricted" model with only the weather and time-dummy variables using the estimated parameters. The coefficient of variation of the fitted yields provides the dispersion of the idiosyncratic shocks. Details of this estimation are provided in the appendix of Rosas and Lence (2017).

Figure 1, depicting simulated production shocks for netput 1 (output) and netput 6 (input) for all firms at time $t_0 = 1$, helps illustrate the two main reasons for adopting function (5).[15] First, shocks are heteroskedastic with a standard deviation that increases at a decreasing rate. The top panels, plotting the distribution of each firm's netput quantity $\tilde y_{ft_0}$ against the firm's average netput quantity $\bar y_{ft_0}$, show that the dispersion increases as the latter increases. However, as demonstrated by the bottom panels, it does so at a decreasing rate (because the coefficient of variation decreases with the firm's average quantity produced). Second, consistent with the observed data, firm-specific shocks range from minus 10% to plus 10% (minus 60% to plus 60%) of the average quantity produced in the case of the less (more) disperse distribution.[16] This point is shown in the middle panels, which plot the minimum, mean, and maximum production shocks ($\tilde w$) as percentages of firms' average netput quantities against the respective average quantities produced.

[15] As explained later, the own-price elasticities corresponding to netputs 1 and 6 used for the simulations are reported in appendix B (see the paragraph discussing equation B.2).

[16] For comparison, a pooled panel of firm-specific corn yields over a five-year period shows that the 2.5th and 97.5th percentiles are, respectively, 60% lower and 40% higher than the average yields in the Corn Belt region, 60% lower and 42% higher in the Lake States region, and 80% lower and 70% higher in the Northern Plains region.

Firms also face end-of-period output price uncertainty. Prices are modelled as lognormally distributed (see appendix B), and the methods used to incorporate price uncertainty are explained in appendix C.2.

Simulation of Panel Data

To analyze the empirical performance of the dual approach, two datasets are generated; namely, one noiseless and the other one incorporating noise. The noiseless data are used to confirm the ability of the dual approach to recover the original production


Figure 1. Production shock as a function of firm's average variable netput quantity ($\bar y_{ft_0}$) at time $t = t_0$, for selected netputs.
Note: Top panels: distribution of netput quantities ($\tilde y_{ft_0}$) faced by each firm. Middle panels: minimum, mean, and maximum shock as percentage of firm's average quantity $\bar y_{ft_0}$. Bottom panels: coefficient of variation of the distribution of quantities CV($\tilde y_{ft_0}$) by firm. In all cases, horizontal axes are mean quantity of netput 1 and mean of absolute quantity of netput 6. Netput 1 is an output and netput 6 is an input; the graphs depict absolute values for netput 6 to make them easier to interpret.
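The two shock components behind figure 1 can be drawn with a few lines of code. The sketch below is only an illustration under stated assumptions: it takes the systematic shock as -1 + 2 Beta(2, 2), which is what a symmetric zero-mean shock on [-1, 1] requires, and the idiosyncratic shock as Uniform(0.87, 1.13). The function names and the seed are hypothetical, and the scaling by firm size in expression (5) is deliberately left out.

```python
import random
from typing import List


def systematic_shock(rng: random.Random) -> float:
    """Systematic shock: -1 + 2 * Beta(2, 2), symmetric around zero on [-1, 1]."""
    return -1.0 + 2.0 * rng.betavariate(2.0, 2.0)


def idiosyncratic_shock(rng: random.Random) -> float:
    """Firm-specific shock: Uniform(0.87, 1.13)."""
    return rng.uniform(0.87, 1.13)


def draw_output_shocks(n_firms: int, rng: random.Random) -> List[float]:
    """Shocks for one output in one period.

    A single systematic draw is shared by all firms and multiplied by a
    separate idiosyncratic draw for each firm.
    """
    v1 = systematic_shock(rng)
    return [v1 * idiosyncratic_shock(rng) for _ in range(n_firms)]


# Hypothetical usage: 10,000 firms in one period, fixed seed for repeatability.
rng = random.Random(12345)
shocks = draw_output_shocks(10_000, rng)
```

Since the idiosyncratic factor is always positive, every firm's shock shares the sign of the systematic draw in a given period, while its magnitude varies by up to 13% around it.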

parameters when data are problem-free. In this study, we focus on the noisy dataset because it allows us to document the effects on the estimated production parameters when the data exhibit more realistic features.

Figure 2 sketches the steps involved in the simulations, and the sections where they are explained. The first step is the same for the noiseless and noisy datasets, and consists of generating the starting production parameters $\alpha_f$ and quasi-fixed netputs $\bar K_{ft}$ by Monte Carlo simulations. To this end, we used the procedures described in Rosas and Lence (2017), which are summarized in appendix A. To favor parameter identification, production parameters are allowed to vary across firms but not over time.[17]

[17] The ability of the dual approach to recover parameters associated with technological change is a very important topic, but it is not pursued here for reasons of space.

In the case of the noiseless simulations, the second step consists of drawing exogenous expected variable netput prices $\bar p_{ft}$ as described in appendix B. This is followed by the numerical solution to the firm's optimization problem (3) which, due to the absence of noise and the normalized quadratic production FFF (4), collapses to a standard profit maximization with first-order conditions given by $\bar p_{ft} - A_{1f} - A_{11f}\,\bar y_{ft} + A_{12f}\,\bar K_{ft} = 0$. Hence, the optimal variable netput quantities for each firm and time period are computed as

$$ \bar y_{ft} = (A_{11f})^{-1}\,(\bar p_{ft} - A_{1f} + A_{12f}\,\bar K_{ft}) \tag{6} $$

The final step to create the noiseless dataset involves aggregating across heterogeneous firms the simulated individual observations:

$$ \bar y_{t} = \sum_{f} \bar y_{ft} \tag{7} $$

$$ \bar K_{t} = \sum_{f} \bar K_{ft} \tag{8} $$

and


Figure 2. DGP of noiseless and noisy datasets used for estimation
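The noiseless branch of the DGP in figure 2 reduces to two computations: the firm-level solution in equation (6) and the aggregation in expressions (7) to (9). The sketch below illustrates both with plain Python lists. It keeps the signs exactly as printed in equation (6), including the plus sign on the quasi-fixed term, and the function names and example numbers are hypothetical rather than taken from the paper's calibration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def solve(A: Matrix, b: Vector) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def optimal_netputs(p: Vector, K: Vector, A1: Vector,
                    A11: Matrix, A12: Matrix) -> List[float]:
    """Equation (6) with signs as printed: y = A11^{-1} (p - A1 + A12 K)."""
    a12k = [sum(a * k for a, k in zip(row, K)) for row in A12]
    rhs = [p[i] - A1[i] + a12k[i] for i in range(len(p))]
    return solve(A11, rhs)


def aggregate(y_firms: Sequence[Vector],
              p_firms: Sequence[Vector]) -> Tuple[List[float], List[float]]:
    """Expressions (7) and (9): summed quantities, share-weighted prices."""
    n = len(y_firms[0])
    y_total = [sum(y[i] for y in y_firms) for i in range(n)]
    p_agg = [sum((y[i] / y_total[i]) * p[i] for y, p in zip(y_firms, p_firms))
             for i in range(n)]
    return y_total, p_agg


# Hypothetical example: one firm with N = 2 netputs and M = 1 quasi-fixed netput.
y_star = optimal_netputs(p=[2.5, 1.0], K=[5.0], A1=[0.5, 0.2],
                         A11=[[2.0, 1.0], [1.0, 3.0]], A12=[[0.1], [0.2]])

# Hypothetical two-firm aggregation (second netput is an input, hence negative).
totals, prices = aggregate([[1.0, -2.0], [3.0, -6.0]],
                           [[2.0, 1.0], [4.0, 3.0]])
```

Quasi-fixed netputs in expression (8) are aggregated by the same plain sum used for quantities. Solving the linear system directly, rather than forming the inverse of $A_{11f}$, is a design choice of this sketch and not something the paper specifies.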

$$ \bar p_{nt} = \sum_{f} w_{nft}\,\bar p_{nft}, \quad n = 1, \ldots, N \tag{9} $$

where $w_{nft} \equiv \bar y_{nft}/\bar y_{nt}$ is firm $f$'s share of the aggregate $n$th netput quantity at time $t$. That is, netput quantities are aggregated by adding across firms because they are homogeneous commodities, whereas aggregate netput prices are weighted averages of firm-level prices. The resulting time series dataset $[\bar y_t, \bar p_t, \bar K_t]$ is the one used to estimate the production function parameters ($\hat\alpha_f$).

The procedure for constructing the noisy dataset is more involved because it requires the numerical maximization of expected utility (rather than just profits), and the incorporation of various sources of noise. In the second step, explained in appendix B, endogenous expected variable netput prices $\bar p_{ft}$ are drawn conditioning on the values of $\alpha_f$ and $\bar K_{ft}$. An additional step, described in appendix D, is necessary to obtain calibrated values of initial wealth $W_{ft,0}$ and the risk-preference parameter $\lambda_{ft}$, which are used at the next step to compute the expected netput quantities $\bar y_{ft}$ that maximize the expected utility of end-of-period terminal wealth (see appendix D for details). Before aggregating across firms, the following sources of noise are incorporated into the simulated data: shocks to expected price and quantity variables (outlined in appendix C.2); omission of variables (by eliminating some of the netput series); aggregation across netputs (discussed in appendix C.3); and measurement errors in price and quantity variables (addressed in appendix C.4). Finally, the noisy dataset $[y_t, p_t, K_t]$ used to conduct time-series estimation is obtained by aggregating across firms, in a manner analogous to expressions (7) to (9). The set of production function parameter estimates corresponding to the noisy data is denoted by $\hat\alpha_f$.

The present simulations are parameterized so as to obtain panel data for $N = 8$ variable netputs and $M = 1$ quasi-fixed netput over a period of $T = 50$ years from $R = 3$ regions, each composed of $F = 10{,}000$ heterogeneous firms, such that firm heterogeneity is higher across regions than within them.[18] Therefore, conditional on the set of parameters $\alpha_f$, there are $R \times F \times T = 1.5$ million observations for each variable in the noisy data panel $[y_{ft}, p_{ft}, K_{ft}; \alpha_f]$. Upon aggregation over the 10,000 heterogeneous firms at each time $t$, we obtain a dataset of 50 observations for each variable per region that we use to estimate netput demands and supplies as shown in system (11) below.

[18] This figure represents roughly one-fifth of the number of farms in a given state of the Corn Belt (Iowa, Illinois, Indiana, Missouri, and Ohio), Lake States (Michigan, Minnesota, and Wisconsin), and Northern Plains (Kansas, North Dakota, Nebraska, and South Dakota) regions. Available U.S. state-level time-series datasets with information on prices and quantities of agricultural outputs and inputs comprise no more than 50 years of observations.

Data Used for Estimation

The noiseless data $[\bar y_t, \bar p_t, \bar K_t]$ include all $N = 8$ netput quantities and prices, and $M = 1$ quasi-fixed netput. Variable netput prices are exogenous from quantities but have serial correlation (see appendix B). To avoid the

addition of another source of noise coming from heterogeneous technology across regions, we select region 1 to conduct the baseline estimation, and compare results with the starting parameters for that same region. The sensitivity analysis addresses the effect of adding data from more regions to the estimation.

In the case of the noisy data $[y_t, p_t, K_t]$, we explore the effects of data omission and aggregation by using only $N' = 4$ netputs for estimation. The reduced number of netputs results from omitting one input and one output, pooling two variable outputs into one, and pooling two variable inputs into one.[19] Panel A in figure 3 represents the structure of the noisy data for the baseline analysis.

[19] The producer optimally chooses a set of $N$ variable netputs to maximize profits, but the econometrician rarely observes them all. This situation can arise due to a misreporting of data from a surveyed producer in which one or more netputs are omitted, or when some inputs are not part of the surveyed set.

To perform the estimation using a sample instead of the entire population (to conform with reality), and to avoid final results being dependent on a single sample, we proceed as follows. We take the region's population of $F = 10{,}000$ heterogeneous firms and draw 100 samples of 6,000 observations each; for each sample, we aggregate over the heterogeneous firms, resulting in a time-series dataset of 50 observations for each variable, and conduct econometric estimation of system (11) below.[20] For the same reasons stated above, we select region 1 to conduct the estimation. The effects of pursuing estimation with data from more heterogeneous regions (e.g., to capture a broader area and/or increase the sample size, which are common in these applications) are shown as a sensitivity analysis.

[20] Given that the population size in each region is relatively large, we do not require too many samples to achieve robust results. Also, the sample size within a region (6,000 observations) is sufficiently large compared to real-world datasets used to construct state-level aggregates. For example, the 2004 ARMS dataset consists of samples that average 428 firms per state, ranging between 48 and 1,600 firms depending on the state. For comparison, estimation was also conducted using the entire population in the region and aggregating across all firms, which implies only one time-series dataset to be estimated. Results were very similar to the case of 100 samples from the population.

Estimation

For estimation purposes, the restricted profit function $\pi_R(p, K; b)$ that results from solving problem (3) is approximated by the following normalized quadratic FFF:

$$\pi_R(p, K; b) = p^T B_1 + K^T B_2 + \tfrac{1}{2}\, p^T B_{11}\, p + p^T B_{12} K + \tfrac{1}{2}\, K^T B_{22} K + p^T \varepsilon \qquad (10)$$

where $B_1$ and $B_2$ are $(N \times 1)$ and $(M \times 1)$ vectors of $b_i$ coefficients, $B_{11}$ and $B_{22}$ are symmetric $(N \times N)$ and $(M \times M)$ matrices, respectively, and $B_{12}$ is an $(N \times M)$ matrix. Matrices $B_{11}$, $B_{12}$, and $B_{22}$ form a symmetric $((N + M) \times (N + M))$ matrix $B$ of $b_{ij}$ coefficients, which in the case of the normalized quadratic profit function is exactly the Hessian matrix with respect to $(p, K)$. All $b_i$ and $b_{ij}$ coefficients collectively form the set $b$. The error structure ($p^T \varepsilon$) is consistent with McElroy's (1987) additive general error model (AGEM) applied to the case of profit functions. The $(N \times 1)$ vector of random variables $\varepsilon$ is jointly normally distributed with mean zero and an $(N \times N)$ covariance matrix $\Sigma$. This covariance matrix induces contemporaneous correlation between equations. Also, the DGPs of netput prices (both exogenous and endogenous) involve AR(1) processes (see appendix B), implying serial correlation in the independent variables that needs to be addressed in the estimation.

Application of Hotelling's lemma yields the following set of input demands and output supplies:

$$y = B_1 + B_{11}\, p + B_{12} K + \varepsilon \qquad (11)$$

System (11) is estimated by means of iterated seemingly unrelated regressions (SUR), which converges to maximum likelihood, and is the most common method employed in empirical work based on duality theory. Symmetry cross-equation restrictions ($b_{ij} = b_{ji}$, $i \neq j$) in matrix $B_{11}$ are imposed for the estimation.

Addressing Mean-Independence Violations in Estimation

Each source of noise implying a mean-independence assumption violation other than measurement errors is explicitly treated in estimation using standard econometric
[Figure: Panel A. Baseline Scenario (variable netputs 1 to 8). Panel B. Sensitivity Analysis for Omission and Aggregation of Variable Netputs, Cases 1 and 2 (variable netputs 1 to 8 in each case).]

Figure 3. Structure of the noisy dataset used for estimation
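
The reduction from eight to four netputs can be mimicked on a small array. The sketch below is only an illustration, assuming the baseline grouping stated in the note to table 3 (netputs 3 and 8 omitted; netputs 1 and 2, and 6 and 7, pooled). The aggregation rule used here (summing quantities and taking the implicit price as pooled value divided by pooled quantity) is an assumption made for the example; the authors' rule is the one in appendix C.3, which is not reproduced here. The function name `reduce_netputs` and the numbers are hypothetical.

```python
import numpy as np

# Baseline grouping (1-based netput labels): 3 and 8 are omitted,
# {1, 2} and {6, 7} are pooled, 4 and 5 are kept as they are.
BASELINE_GROUPS = [(1, 2), (4,), (5,), (6, 7)]

def reduce_netputs(y, p, groups=BASELINE_GROUPS):
    """Collapse T x 8 quantity and price arrays into T x len(groups) arrays.

    Assumed rule (illustrative only): the pooled quantity is the sum of the
    member quantities, and the pooled price is the implicit price, i.e. the
    pooled value divided by the pooled quantity.
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    y_out, p_out = [], []
    for group in groups:
        cols = [g - 1 for g in group]              # 0-based column indices
        qty = y[:, cols].sum(axis=1)
        value = (y[:, cols] * p[:, cols]).sum(axis=1)
        y_out.append(qty)
        p_out.append(value / qty)
    return np.column_stack(y_out), np.column_stack(p_out)

# Tiny example: T = 2 periods, 8 netputs (positive numbers for simplicity).
y = np.array([[1.0, 3.0, 9.0, 2.0, 5.0, 4.0, 4.0, 7.0],
              [2.0, 2.0, 9.0, 3.0, 6.0, 1.0, 3.0, 7.0]])
p = np.array([[2.0, 4.0, 1.0, 1.5, 2.5, 3.0, 5.0, 1.0],
              [1.0, 3.0, 1.0, 2.0, 2.0, 2.0, 6.0, 1.0]])
y4, p4 = reduce_netputs(y, p)
print(y4.shape)   # (2, 4)
```

Under this grouping, the second and third columns of the reduced arrays are the unpooled netputs 4 and 5, which is why only the corresponding entries of the 4 × 4 elasticity matrix map to a single initial elasticity.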

techniques.[21] First, inspection of the autocorrelation and partial autocorrelation functions of the noiseless and noisy time series suggests that the time series present serial correlation. Therefore, we estimate system (11) with variables in pseudo first differences (Greene 2003).[22]

[21] The DGP embeds a measurement error at the farm level, as if each farm misreports the value actually observed. This error is simulated for each farm as an independently distributed mean-preserving spread from the regional prices and optimal quantities (see appendix C.4). Hence, being independent, measurement errors should have little impact in the present estimation because they should vanish when aggregating across a large number of farms in each period of time. However, measurement errors should have an effect when conducting estimation with cross-sectional data or panel data.

[22] Briefly, pseudo differences start by estimating system (11) by SUR using the data in levels. Second, the estimated residuals $\hat{\varepsilon}$ of each equation are stacked in a sole vector of dimension $4T$, and used to estimate the autocorrelation coefficient $\rho$. Third, each explained and explanatory variable of the system is transformed into the pseudo-differenced variable. For example, in the case of the price of netput 1, which is an explanatory variable, the pseudo-differenced price is $p_{t,1} - \hat{\rho}\, p_{t-1,1}$. Finally, we proceed to estimation of system (11) using the pseudo-differenced variables.

Second, omitted variables in the noisy dataset would generate biased and inconsistent estimates. Hence, we use instrumental variables (IV) for each omitted netput price in system (11). We do so even though it is not a common practice, so as to favor the recovery of the parameters of interest. In particular, we use the omitted prices themselves as instruments because they are the best possible instruments.

Third, an IV approach is used to control for the endogeneity of the explanatory price variables, which arises because they are correlated with the error terms due to the supply and demand shocks. To instrument for the endogenous prices, we make use of the fact that we know the underlying source of endogeneity, that is, prices in system (11) are correlated with the error term $\varepsilon$ as a consequence of the systematic market shocks $\phi_{nt}$ (see appendix B.2). Since instruments have to be correlated with prices but uncorrelated with the error term, and we know the shocks $\phi_{nt}$ used to construct the price series, we construct instruments by regressing each netput price on its own systematic shock: $p_{nt} = t_0 + t_1 \phi_{nt} + iv_{nt}$. The residual ($iv_{nt}$) is an ideal instrument because it is correlated with $p_{nt}$ and orthogonal to the systematic shock ($\phi_{nt}$) by construction; hence, it represents the variation in prices not explained by the systematic shocks. There is one instrument for each netput price. We use three-stage least squares to account for the instruments in estimation. Furthermore, we perform a Hausman test (Hausman 1978) for assessing the effect of the IV on the estimated parameters relative to not instrumenting for these mean-independence violations.

The parameters comprising matrix $B_{11}$ and vector $B_{12}$ are the focus of our attention; they are, respectively, the marginal effects of prices and quasi-fixed netputs on netput quantities. Hence, they are the foundation for the estimated profit function Hessian matrix and the elasticities of netput quantities with respect to own price, cross prices, and quasi-fixed netputs. As depicted in figure 4, we estimate the Hessian matrices $[\hat{B}^*]$ and $[\hat{B}]$ from the noiseless and noisy datasets, respectively. The Hessians are then transformed into the corresponding elasticity matrices $[\hat{E}^*]$ and $[\hat{E}]$ in a straightforward manner.

To compare estimated elasticities with initial values, we begin from the known firm-specific production function Hessian matrix $[A]_f$ and convert it into the corresponding
Figure 4. Comparison of initial and estimated elasticities for noiseless and noisy datasets
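
The last step in figure 4, turning a Hessian into elasticities, can be sketched as follows. The sketch assumes the textbook definition of a price elasticity, $E_{ij} = (\partial y_i / \partial p_j)(p_j / y_i)$, with $\partial y_i / \partial p_j$ given by the price block of the profit function Hessian. The evaluation point, the numbers, and the function name `hessian_to_elasticities` are hypothetical and are not taken from the paper.

```python
import numpy as np

def hessian_to_elasticities(B11, p, y):
    """Return E with E[i, j] = B11[i, j] * p[j] / y[i] at the point (p, y)."""
    B11 = np.asarray(B11, dtype=float)
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    # Scale column j by price j and row i by 1 / quantity i.
    return B11 * p[np.newaxis, :] / y[:, np.newaxis]

# Hypothetical 2-netput example: one output (y > 0) and one input (y < 0).
B11 = np.array([[ 2.0, -0.5],
                [-0.5,  1.0]])
p = np.array([1.0, 2.0])
y = np.array([4.0, -2.0])
E = hessian_to_elasticities(B11, p, y)
```

Note that the Hessian block is symmetric while the resulting elasticity matrix generally is not, because each row is scaled by a different quantity and each column by a different price.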

profit function Hessian $[B]_f$ by resorting to Lau's Hessian identities. We then transform the Hessian $[B]_f$ into the matrix of own- and cross-price initial elasticities and quasi-fixed initial elasticities of netput quantities $[E]_f$. Finally, as indicated in figure 4, we compare the initial $[E]_f$ versus the estimated values ($[\hat{E}^*]$ and $[\hat{E}]$) to evaluate how precisely we recover the starting elasticities under duality theory, for both the noiseless and the noisy data. Note that this comparison implies that each initial parameter is represented by a distribution (across firms), whereas the estimation yields a corresponding point estimate and its confidence interval.

Results

Estimation results for the noiseless and noisy data are discussed separately.

Estimation with Noiseless Data

Econometric estimates from the data obtained by aggregating across heterogeneous firms but without any other source of noise (i.e., $[y_t^*, p_t^*, K_t^*]$) are omitted to save space, as they can be found in Rosas and Lence (2017). The estimates from the dual approach are able to recover the medians of the distribution of the corresponding initial firm-specific production parameters fairly accurately. More specifically, estimated elasticities with respect to prices (quasi-fixed netputs) deviate on average by 12.4% (7.5%) from the median of the starting elasticities according to the computed root mean squared error (RMSE).[23] The RMSE summarizes the average difference between each entry of the estimated elasticity matrix and the median of the corresponding initial elasticity distribution. The RMSE accounts for two sources of error, namely, one due to the SUR estimation error for each of the 64 parameters, and the other one associated with the difference between the estimated and the initial value of the elasticity across the 64 parameters. Given that the SUR estimation provides only a minor source of error because point estimates are all highly significant, we argue that most of the RMSE can be attributed to the deviations between the estimated and the initial values across elasticities.

[23] When compared to the median of the distribution, RMSE is computed as $\left[(64 \times 10{,}000)^{-1} \sum_i \sum_j \sum_s (\hat{E}_{ij,s} - \tilde{E}_{ij,s})^2\right]^{1/2}$, where 64 is the number of parameters, 10,000 is the number of draws from the limiting distribution of the SUR parameter estimates, and subscript $s$ indicates the $s$th draw of the $ij$th parameter. Comparison with the mean can be performed by substituting $\tilde{E}_{ij}$ by $\bar{E}_{ij}$. The RMSE averages over all the $64 \times 10{,}000$ squared differences. A measure of its dispersion is achieved by computing the standard deviation of these $64 \times 10{,}000$ values before averaging over them.

Estimation with Noisy Data

Estimation with noisy data (i.e., $[y_t, p_t, K_t]$) yields 16 own- and cross-price elasticities of variable netput quantities, and 4 elasticities with respect to quasi-fixed netputs. A Hausman test rejects the null that the three-stage least squares IV estimates are equal to the ordinary least squares estimates, indicating that the IV approach is appropriate.[24]

[24] Variance inflation factors computed on the estimated model over the 100 samples have an average (standard deviation) of 1.35 (0.0014), indicating that multicollinearity is not an issue despite the high correlation among explanatory variables.

Figure 5 shows the distribution of the firm-specific (initial) price elasticities, and their corresponding SUR point estimates indicated with a circle (and the bounds of its 95% confidence interval with a "+" sign). After estimation, we take 10,000 draws from the
Figure 5. Own- and cross-price elasticities of variable netput quantities: Initial firm-level distributions versus estimated values obtained from noisy data
Note. Each $ij$ panel is the $ij$ entry of the $4 \times 4$ own- and cross-price elasticity matrix $E$ estimated from noisy data. The elasticity value is in the horizontal axis and histogram frequency in the vertical. Each histogram depicts the distribution across firms of the initial elasticity ($E_{ij}$). The circle is the SUR estimated elasticity ($\hat{E}_{ij}$) and the "+" signs denote the bounds of the 95% confidence interval.
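
The point estimates and interval bounds plotted in figure 5 come from simulating the sampling distribution of each elasticity. A minimal sketch of that idea for a single elasticity and a single sample is given below. It assumes normally distributed coefficient estimates, the elasticity formula $E_{ij} = b_{ij}\, p_j / y_i$ at a fixed evaluation point, and percentile-based interval bounds; all three are assumptions of this illustration, and the inputs and the function name `simulate_elasticity` are hypothetical. For simplicity the draws do not impose symmetry on the coefficient matrix.

```python
import numpy as np

def simulate_elasticity(b_hat, b_cov, p, y, i, j, n_draws=10_000, seed=0):
    """Draw coefficients, map each draw to elasticity (i, j), and summarize.

    b_hat: (n, n) estimated price-coefficient matrix (hypothetical input).
    b_cov: (n*n, n*n) covariance of b_hat flattened row by row (hypothetical).
    """
    rng = np.random.default_rng(seed)
    n = b_hat.shape[0]
    draws = rng.multivariate_normal(b_hat.ravel(), b_cov, size=n_draws)
    b_ij = draws.reshape(n_draws, n, n)[:, i, j]
    e_ij = b_ij * p[j] / y[i]                       # one elasticity per draw
    lower, upper = np.percentile(e_ij, [2.5, 97.5])
    return e_ij.mean(), e_ij.std(ddof=1), (lower, upper)

b_hat = np.array([[2.0, -0.5], [-0.5, 1.0]])
b_cov = 0.01 * np.eye(4)
p = np.array([1.0, 2.0])
y = np.array([4.0, -2.0])
mean, sd, (lower, upper) = simulate_elasticity(b_hat, b_cov, p, y, i=0, j=0)
```

In the paper the same summary is taken over the draws from all 100 samples (1,000,000 values per elasticity); pooling the per-sample draws before computing the mean, standard deviation, and percentiles would extend this sketch in that direction.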

parameters' asymptotic distribution of each of the 100 samples, transform them into elasticities, and calculate their mean, standard deviation, and confidence interval over the 1,000,000 values. Except for entries (2, 2), (2, 3), (3, 2), and (3, 3) of the own- and cross-price elasticity matrix, the distributions involve more than one initial elasticity due to the aggregation of netputs. In these cases, to be able to compare with the elasticities estimated by means of SUR, we construct "new initial" elasticity distributions as the revenue-weighted averages of the distributions of the corresponding original initial elasticities.

In light of the conclusions from the previous sub-section, we measure the accuracy in recovering initial elasticities by comparing the estimated values to the medians of the respective distributions.[25] Visual inspection of figure 5 suggests that, when comparing where the estimated values fall relative to where the distributions accumulate more mass, the dual approach provides a good approximation of the initial distribution in some instances, but a poor one in others. However, table 1 shows that the percentage difference between the median of the initial distribution ($\tilde{E}_{ij}$) and the estimated value ($\hat{E}_{ij}$) is high for the majority of the entries in the elasticity matrix. The difference ranges between 6% and 690%, and is less than 18% in only one entry. The own-price elasticities, reported along the main diagonal, are not recovered with much precision, given that the differences range between 6% and 150%. Importantly, the own-price elasticities for the netputs that are not aggregated with other netputs (e.g., netput 5, corresponding to entry (3, 3)) are not necessarily more precisely estimated than the main diagonal elements that do arise as aggregated netputs (entries (1, 1) and (4, 4)). As expected, the off-diagonal elements (i.e., the cross-price elasticities) are less accurately estimated than the main diagonal entries, as they require more information to be recovered.

[25] Comparisons using the means of the distributions provided less accurate results.

As a summary measure of the dispersion in recovering the initial elasticities, we calculate the RMSE of the difference between the median of the initial distribution and the SUR
Table 1. Comparison of Price Elasticities Estimated from Noisy Data ($\hat{E}_{ij}$) with Medians of Distributions of Initial Price Elasticities ($\tilde{E}_{ij}$)

| | | j = 1 | j = 2 | j = 3 | j = 4 |
|---|---|---|---|---|---|
| i = 1 | $\tilde{E}_{ij}$ | 0.489 | -0.051 | -0.021 | -0.100 |
| | $\hat{E}_{ij}$ | 0.576 | -0.067 | -0.063 | -0.049 |
| | Std. Dev. | (0.087) | (0.017) | (0.009) | (0.032) |
| | Interval | [0.405, 0.747] | [-0.099, -0.034] | [-0.081, -0.045] | [-0.111, 0.013] |
| | % diff. | 18% | 30% | 197% | 52% |
| i = 2 | $\tilde{E}_{ij}$ | 0.234 | -0.371 | 0.033 | 0.148 |
| | $\hat{E}_{ij}$ | 0.276 | -0.395 | 0.124 | 0.521 |
| | Std. Dev. | (0.07) | (0.042) | (0.024) | (0.06) |
| | Interval | [0.140, 0.413] | [-0.477, -0.312] | [0.078, 0.171] | [0.404, 0.638] |
| | % diff. | 18% | 6% | 277% | 252% |
| i = 3 | $\tilde{E}_{ij}$ | 0.213 | 0.075 | -0.373 | -0.043 |
| | $\hat{E}_{ij}$ | 0.459 | 0.217 | -0.533 | -0.132 |
| | Std. Dev. | (0.067) | (0.041) | (0.057) | (0.072) |
| | Interval | [0.328, 0.589] | [0.136, 0.298] | [-0.644, -0.422] | [-0.272, 0.008] |
| | % diff. | 115% | 191% | 43% | 208% |
| i = 4 | $\tilde{E}_{ij}$ | 0.397 | 0.095 | -0.010 | -0.583 |
| | $\hat{E}_{ij}$ | 0.203 | 0.524 | -0.076 | -1.456 |
| | Std. Dev. | (0.132) | (0.06) | (0.041) | (0.146) |
| | Interval | [-0.055, 0.462] | [0.407, 0.643] | [-0.157, 0.005] | [-1.743, -1.170] |
| | % diff. | 49% | 449% | 690% | 150% |

Note: Interval is the 95% confidence interval of the point estimate $\hat{E}_{ij}$.
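
As a quick arithmetic check on table 1, the "% diff." row appears to be the absolute gap between the estimate and the median, expressed as a share of the absolute median. That formula is an assumption inferred from the reported numbers, not a definition quoted from the paper. The sketch below applies it to the four own-price (main diagonal) entries as printed in the table.

```python
import numpy as np

# Main-diagonal entries of table 1 (own-price elasticities), as printed.
median_initial = np.array([0.489, -0.371, -0.373, -0.583])
estimated = np.array([0.576, -0.395, -0.533, -1.456])

# Assumed definition: |estimate - median| as a percentage of |median|.
pct_diff = 100 * np.abs(estimated - median_initial) / np.abs(median_initial)
print(np.round(pct_diff).astype(int).tolist())   # [18, 6, 43, 150]
```

The diagonal reproduces the printed 18%, 6%, 43%, and 150%. Some off-diagonal entries recomputed this way from the rounded table values differ from the printed percentages by a few points, presumably because the published figures were computed from unrounded elasticities.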

estimated values for all 16 estimated price elasticities. Table 2 shows that the RMSE for the baseline scenario equals 0.18 in elasticity units. The average value of all initial elasticities (calculated as the mean absolute value of all the medians of the initial distributions) is 0.20. Therefore, by comparing both values we conclude that duality theory recovers elasticities which are, on average, off by 90% of the initial elasticities. These results provide evidence that the dual approach is unable to deliver precise estimates of underlying production parameters when employing data featuring real-world characteristics.

The estimation of variable netput elasticities with respect to quasi-fixed netputs is not accurate either. Results are represented graphically in figure 6. Each panel titled "$E_{ik}$" is the elasticity of netput $i$ with respect to the quasi-fixed netput. The SUR point estimates of the elasticities are within the support of the respective initial distributions except for $E_{1k}$ and $E_{3k}$; however, in those cases the 95% confidence interval does contain the support of the initial elasticities. The baseline scenario in table 2 shows that the RMSE relative to the median of the initial distribution is 0.30 expressed in elasticity units, and the average value of the elasticities is calculated at 0.43. These results imply that the inaccuracy in recovering the starting elasticities, averaged over the 4 netputs, amounts to 70% of the original elasticities.

While in real-world datasets it is expected to observe all the simulated sources of noise simultaneously affecting the data, and this is what the previous analysis intends to show, it is informative to document how much each source contributes to the bias. To this end, three additional scenarios are investigated. In the first scenario, the only source of noise is the prediction errors in prices and quantities of netputs. The second scenario analyzes the effect of price endogeneity and maximization under uncertainty. Therefore, both scenarios consider the 8 netputs of the DGP. Finally, the third scenario incorporates both omission and aggregation of netputs. In each of these scenarios, we follow the same parameterization to represent the source of noise, and the same approach to deal with that problem in estimation, as in the baseline scenario. Note, however, that all of the scenarios assume aggregation across technologically heterogeneous firms, which is required to obtain the variables in time series.

Table 2 shows the results from the additional scenarios. It can be observed from the elasticities with respect to variable netput prices that, in each scenario, the bias in the
Table 2. Comparison of Elasticities Estimated from Noisy Data ($\hat{E}_{ij}$) with Medians of Distributions of Initial Elasticities ($\tilde{E}_{ij}$), Baseline Scenario, and Contribution of Each Source of Noise

| Elasticities with Respect to | Measure | Baseline Scenario | Scenario 1: Prediction Errors in Prices & Quantities | Scenario 2: Endogeneity & Expected Utility | Scenario 3: Omitted & Aggregated Netputs |
|---|---|---|---|---|---|
| Variable Netput Prices | RMSE | 0.18 | 0.10 | 0.13 | 0.09 |
| | Average of Absolute Medians | 0.20 | 0.32 | 0.32 | 0.20 |
| | % deviation | 90% | 32% | 39% | 44% |
| Quasi-Fixed Netput Quantity | RMSE | 0.30 | 0.41 | 0.15 | 0.18 |
| | Average of Absolute Medians | 0.43 | 0.39 | 0.39 | 0.43 |
| | % deviation | 70% | 104% | 37% | 42% |

Note: Scenarios 1 to 3 show the contribution of each source of noise. Average of Absolute Medians is the mean absolute value of all the medians of the initial distributions.
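
The two summary measures in table 2 can be sketched in a few lines. The sketch assumes the RMSE is the square root of the mean squared gap between simulated elasticity draws and the median of the matching initial distribution (as described in the RMSE footnote), and that "% deviation" is the RMSE divided by the average of absolute medians. The function names and the toy inputs are hypothetical.

```python
import numpy as np

def rmse_vs_median(draws, median_initial):
    """RMSE over all parameters and draws.

    draws: (n_param, n_draws) array of simulated elasticities.
    median_initial: (n_param,) medians of the initial distributions.
    """
    diff = draws - median_initial[:, np.newaxis]
    return np.sqrt(np.mean(diff ** 2))

def percent_deviation(rmse, average_abs_median):
    """RMSE expressed as a percentage of the average absolute median."""
    return 100 * rmse / average_abs_median

# Toy input: 2 parameters, 2 draws each.
toy_draws = np.array([[1.0, 3.0],
                      [2.0, 2.0]])
toy_medians = np.array([2.0, 2.0])
toy_rmse = rmse_vs_median(toy_draws, toy_medians)

# Baseline columns of table 2, using the printed (rounded) values.
print(round(percent_deviation(0.18, 0.20)))   # 90
print(round(percent_deviation(0.30, 0.43)))   # 70
```

Recomputing the other columns from the rounded table entries gives values within a point or two of the printed percentages, which is consistent with the published figures having been computed before rounding.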

Figure 6. Elasticities of variable netput quantities with respect to quasi-fixed netputs: Initial firm-level distributions versus estimated values obtained from noisy data.
Note: Each $E_{ik}$ panel is the elasticity of netput $i$ with respect to the quasi-fixed input in the case of noisy data. The elasticity value is on the horizontal axis and histogram frequency on the vertical. Each histogram depicts the distribution across firms of the initial elasticity ($E_{ik}$). The circle is the SUR estimated elasticity ($\hat{E}_{ik}$) and the "+" signs denote the bounds of the 95% confidence interval.

recovery of the initial elasticities is smaller than in the baseline, but it is still sizable. In the case of elasticities with respect to quantity of quasi-fixed netputs, prediction errors also generate a high percentage deviation from the initial parameters, while the endogeneity and expected utility on the one hand, and omission and aggregation of netputs on the
Table 3. Comparison of Elasticities Estimated from Noisy Data ($\hat{E}_{ij}$) with Medians of Distributions of Initial Elasticities ($\tilde{E}_{ij}$), Baseline Scenario, and Sensitivity Analysis

| Elasticities with Respect to | Measure | Baseline Scenario | Omission and Aggregation of Variable Netputs: Case 1 | Omission and Aggregation of Variable Netputs: Case 2 | Regional Pooling |
|---|---|---|---|---|---|
| Variable Netput Prices | RMSE | 0.18 | 0.31 | 0.13 | 0.28 |
| | Average of Absolute Medians | 0.20 | 0.35 | 0.25 | 0.27 |
| | % deviation | 90% | 88% | 52% | 104% |
| Quasi-Fixed Netput Quantity | RMSE | 0.30 | 0.59 | 0.08 | 0.69 |
| | Average of Absolute Medians | 0.43 | 0.50 | 0.33 | 0.34 |
| | % deviation | 70% | 120% | 25% | 205% |

Note: Case 1, case 2, and regional pooling form the sensitivity analysis. Each scenario consists of a different set of omitted netputs and a different set of netputs aggregated together. In the baseline scenario, netputs 3 and 8 (1 and 2, and 6 and 7) are omitted (aggregated). In case 1 of the sensitivity analysis, netputs 1 and 4 (2 and 3, and 7 and 8) are omitted (aggregated). In case 2, netputs 3 and 7 (1 and 2, and 5 and 6) are omitted (aggregated). The regional pooling consists of data coming from three regions, each with 5 states, and each state with 2,000 firms, over 50 years. Aggregation across heterogeneous firms results in 750 observations (= 50 years × 5 states/region × 3 regions), instead of only 50. Regional dummies are put in regions 1 and 2, leaving region 3 as the base. Average of Absolute Medians is the mean absolute value of all the medians of the initial distributions.
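
The size of the pooled dataset and the regional dummies described in the note can be laid out explicitly. The sketch below only builds the observation index and the two dummy columns; how the dummies enter system (11) (for example, as intercept shifters in each equation) is not specified in the note and is left out. The variable names are hypothetical.

```python
import numpy as np

YEARS, STATES_PER_REGION, REGIONS = 50, 5, 3

# One row per (region, state, year) observation, ordered by region.
region_id = np.repeat(np.arange(1, REGIONS + 1), STATES_PER_REGION * YEARS)
n_obs = region_id.size
print(n_obs)   # 750

# Dummies for regions 1 and 2; region 3 is the base category.
dummies = np.column_stack([(region_id == r).astype(float) for r in (1, 2)])
```

Each dummy column flags 250 observations (5 states × 50 years), and the 250 rows belonging to region 3 have zeros in both columns.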

other, generate high but smaller deviations. Table 2 also shows that the sum of the percentage deviations across the three additional scenarios is substantially higher than the percentage deviation in the baseline scenario. This result might be due to the fact that individual sources of noise contribute to the overall deviation in different directions. More research is needed to further assess their contribution under different configurations of parameters calibrated to the reality of other regions and countries, implying sources with different levels of noise or inducing changes in the level of one source as the others remain fixed.

Sensitivity Analysis

We explore the robustness of noisy data estimation results to changes in the sources and levels of noise. According to this analysis, the specific combinations of the two or four netputs being omitted or aggregated, respectively, do not seem to affect the extent of the percentage deviation found. For example, table 3 shows that estimation with the noisy data structures shown in panel B of figure 3 yields price elasticity estimates with respect to variable netputs that are off by 88% (case 1) and 52% (case 2) relative to the starting price elasticities. Similarly, the corresponding elasticities with respect to quasi-fixed netputs differ by 120% (case 1) and 25% (case 2) from their starting values. Therefore, these values are quantitatively similar to the ones for the baseline scenario.[26]

[26] We present only a few alternative scenarios due to the computational burden of such analysis.

We regularly encounter empirical applications of duality theory with time-series data where observations from different regions or states are pooled together for estimation (e.g., Schuring, Huffman, and Fan (2011), and O'Donnell, Shumway, and Ball (1999)). By expanding the sample size, pooling has the advantage of increasing the degrees of freedom, which is especially helpful in the presence of several explanatory variables. However, pooling also has the downside of adding observations from states that are likely to have different technology. We explore the consequences of such practice by conducting a sensitivity analysis.

Pooling implies seeking to recover production parameters from firms that are more heterogeneous than in the case of a single state, usually by adding regional- or state-level dummy variables. Hence, for this sensitivity analysis, we exploit the noisy simulated data for all regions (1, 2, and 3). For each region and in each of the 50 time periods, we take five samples of 2,000 observations representing samples of firms from five states within the region, and aggregate across its heterogeneous firms to obtain the corresponding state-level time series. The resulting dataset,
consisting of 750 observations (= 50 years × 5 states/region × 3 regions), is then used to estimate system (11). Following the aforementioned studies, dummy variables are added for observations corresponding to regions 1 and 2, leaving region 3 as the base. The estimated parameters are transformed into netput elasticities with respect to variable netput prices and with respect to quasi-fixed netputs, and compared with the starting elasticities. As in the previous analysis, the latter are represented by the respective distributions of starting firm-specific elasticities (which now involve firms in the three regions).

The RMSE relative to the median of such distribution and averaged over the 16 price elasticities being calculated equals 104%, which is similar to the baseline scenario (see fourth column in table 3). This figure is computed by dividing the RMSE relative to the median of the starting distribution (0.28), by the median of the starting distribution of elasticities (0.27). Standard errors of the estimated elasticities are lower than in the previous analysis; in other words, there is a gain in efficiency because of the increased number of observations. However, that does not contribute to reducing the deviation of the estimated parameters and elasticities relative to the starting values. Similarly, the RMSE for the netput quantity elasticities with respect to quasi-fixed netputs also indicates that production parameters are not recovered accurately.

Therefore, the practice of incorporating data from other regions, characterized by a more heterogeneous technology than within a region, reduces the standard error of the point estimates and enhances statistical significance, but is of little help in reducing the bias relative to the starting elasticity values.

Overall, the present sensitivity analysis suggests that the biases found are driven by all of the sources of noise that have been incorporated.

Conclusions

The dual relationship between the production function and the profit or cost function established by the neoclassical theory of the firm has been widely applied in empirical work with the objective of obtaining price elasticities, substitution elasticities, and return-to-scale estimates. This empirical method, usually referred to as "duality theory approach," has the advantage of providing the mentioned features of the production function using market data on input and output prices and quantities, without the requirement of explicitly specifying the parametric form of the production function. However, the duality theorem requires assumptions that are unlikely to hold in practice; in other words, market data typically employed in this type of study bear levels of noise that prevent the theorem from holding exactly. If this is the case, elasticity estimates may be biased with respect to their true values.

In this paper we analyze the ability of the approach to recover the technology features when the dataset taken to estimation reflects real-world characteristics comparable to those found by practitioners in empirical applications. Based on a model of maximization of expected utility of terminal wealth, we first choose the parametric form of the production function and use Monte Carlo simulations to generate its set of parameters for a number of firms with heterogeneous technology. In particular, from the solution of this problem, we generate a pseudo-dataset of netput prices and quantities for heterogeneous firms, coming from different regions and for successive years, such that their features are comparable to those found in data on U.S. agriculture and often used by practitioners in empirical applications. In this regard, the DGP incorporates optimization under uncertainty, prediction errors in prices and quantities of variable netputs, endogenous prices, omitted variable netputs, output and input data aggregation, measurement errors in the observed variables, and unobserved heterogeneity across firms. We calibrate model parameters using datasets (both time-series and cross-sectional) widely employed in practice.

We apply the duality approach to this multi-netput pseudo-dataset, which consists of deriving the system of input demands and output supplies from a profit function approximated by an FFF, and estimate its parameters (and the corresponding elasticities) using traditional econometric methods. Because the initial (primal) production parameters are known to us, we can evaluate the ability of this approach to recover these parameters by transforming the estimated parameters from the dual model into the primal parameters, and then comparing them. This transformation is performed by means of the so-called Hessian identities.
Also, because we know the existing sources errors in variables (such as omitted variables
of noise in the data, we explicitly address or aggregation of netputs). Also, the hetero-
them in the estimation. We deal with serial geneity of the underlying firms is relevant
correlation by estimating the model with data when recovering the underlying production



in first differences. To tackle omitted varia- parameters. Practitioners should be discour-
bles, we employ an instrumental variables ap- aged from estimating elasticities using data
proach in which our instruments are precisely from regions with different technologies. While
the variables we omit in the first place. this practice is commonly used to increase the
Similarly, we use instruments to consider the sample size, we show that it does not reduce
presence of endogeneity in aggregate prices. the root mean squared error of the estimation
In this instance, we also know the source of relative to the initial parameters. Estimation
endogeneity and therefore we can construct using panel data might be an alternative to
the best set of instruments possible. overcome the bias induced by some of the sour-
Results show that the dual approach ap- ces of noise, for example, endogeneity, omitted
plied on a time-series dataset bearing the variables, netput aggregation, and measure-
minimum noise possible, that is, only arising ment error. Importantly, the proposed method
from aggregating firms with heterogeneous to generate pseudo-data can be used by practi-
technology, is able to recover elasticities tioners to investigate the performance of alter-
within the support of the distribution of ini- native methods to deal with the problems that
tial elasticities, and considerably close to the are most likely to afflict their specific datasets.
mean and median of such distributions. Future research may complement this anal-
However, the use of noisy data prevents ysis by assessing the empirical ability of the
the dual approach from providing parameter primal approach in recovering the underlying
estimates that are sufficiently close to their production parameters in the context of noisy
starting values. The root mean squared error, datasets, with the objective of documenting
measuring the average deviation of the esti- whether it performs better than the dual ap-
mated elasticities from their median initial proach under different levels of noise. Other
values, is calculated at 90%, implying that the avenues of inquiry may constitute the assess-
dual approach estimate elasticities are, on av- ment of the dual approach’s performance in
erage, 90% away from the initial values. recovering the production function parame-
Conditional on the dataset, own-price elastic- ters but when employing cross-sectional and
ities require less information from the data to panel data, or the parameters of technical
be estimated with the same level of precision change over time. Also, other sources of
as cross-price elasticities; however, both own- noise can be added such as the treatment of
and cross-price elasticities are inaccurately some variable netputs as quasi-fixed, an al-
recovered. The case of netput elasticities with ternative frequently employed by practi-
respect to quasi-fixed netputs is even more in- tioners to address the lack of price data for
accurate. When the sources of noise are ana- those netputs. Finally, more research is
lyzed individually, the bias in the recovery of needed to understand the contribution of
initial elasticities is smaller than when incor- each source of noise (and with different lev-
porated all together, but it is still sizable. els) to the overall deviation relative to the
Results are robust to different calibrations initial parameters.
of the data structure—specifically, the omis-
sion and aggregation of different sets of net-
puts, as well as the sample of firms used in
estimation. Also, sensitivity analysis shows Supplementary Material
that the common practice of pooling data
from different states and/or regions in order Supplementary material are available at
to increase the degrees of freedom in estima- American Journal of Agricultural Economics
tion yields a similar percentage deviation in online.
the estimated elasticities as in the case of con-
sidering a single and more technologically ho-
mogeneous state. References
Our results suggest that the quality of the
data used for estimation is critical for obtain- Appelbaum, E. 1978. Testing Neoclassical
ing more accurate results, especially to avoid, Production Theory. Journal of
as far as possible, the problems caused by Econometrics 7 (1): 87–102.

Arnade, C., and D. Kelch. 2007. Estimation of Area Elasticities from a Standard Profit Function. American Journal of Agricultural Economics 89 (3): 727–37.
Baffes, J., and U. Vasavada. 1989. On the Choice of Functional Forms in Agricultural Production Analysis. Applied Economics 21 (8): 1053–61.
Ball, V.E. 1985. Output, Input, and Productivity Measurement in U.S. Agriculture, 1948–79. American Journal of Agricultural Economics 67 (3): 475–86.
Ball, V.E. 1988. Modeling Supply Response in a Multiproduct Framework. American Journal of Agricultural Economics 70 (4): 813–25.
Burgess, D.F. 1975. Duality Theory and Pitfalls in the Specification of Technologies. Journal of Econometrics 3 (2): 105–21.
Chambers, R.G., and R.D. Pope. 1994. A Virtually Ideal Production System: Specifying and Estimating the VIPS Model. American Journal of Agricultural Economics 76 (1): 105–13.
Chambers, R.G., and J. Quiggin. 1998. Cost Functions and Duality for Stochastic Technologies. American Journal of Agricultural Economics 80 (2): 288–95.
Chavas, J.-P. 2008. A Cost Approach to Economic Analysis under State-Contingent Production Uncertainty. American Journal of Agricultural Economics 90 (2): 435–46.
Coyle, B.T. 1992. Risk Aversion and Price Risk in Duality Models of Production: A Linear Mean-Variance Approach. American Journal of Agricultural Economics 74 (4): 849–59.
Coyle, B.T. 1999. Risk Aversion and Yield Uncertainty in Duality Models of Production: A Mean-Variance Approach. American Journal of Agricultural Economics 81 (3): 553–67.
Dixon, B.L., P. Garcia, and M. Anderson. 1987. Usefulness of Pretests for Estimating Underlying Technologies Using Dual Profit Functions. International Economic Review 28 (3): 623–33.
Greene, W.H. 2003. Econometric Analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall.
Guilkey, D.K., C.A.K. Lovell, and R.C. Sickles. 1983. A Comparison of the Performance of Three Flexible Functional Forms. International Economic Review 24 (3): 591–616.
Hausman, J.A. 1978. Specification Tests in Econometrics. Econometrica 46 (6): 1251–71.
Huffman, W.E., and R.E. Evenson. 1989. Supply and Demand Functions for Multiproduct U.S. Cash Grain Farms: Biases Caused by Research and Other Policies. American Journal of Agricultural Economics 71 (3): 761–73.
Iman, R.L., and W.J. Conover. 1982. A Distribution-Free Approach to Inducing Rank Correlation Among Input Variables. Communications in Statistics - Simulation and Computation 11 (3): 311–34.
Jorgenson, D.W., and L.J. Lau. 1974. The Duality of Technology and Economic Behavior. Review of Economic Studies 41 (2): 181–200.
Lau, L.J. 1976. A Characterization of the Normalized Restricted Profit Function. Journal of Economic Theory 12 (1): 131–63.
Lence, S.H. 2009. Joint Estimation of Risk Preferences and Technology: Flexible Utility or Futility? American Journal of Agricultural Economics 91 (3): 581–98.
Lim, H., and C.R. Shumway. 1992a. Separability in State-Level Agricultural Technology. American Journal of Agricultural Economics 74 (1): 120–31.
Lim, H., and C.R. Shumway. 1992b. Profit Maximization, Returns to Scale, and Measurement Error. Review of Economics and Statistics 74 (3): 430–8.
Lusk, J.L., A.M. Featherstone, T.L. Marsh, and A.O. Abdulkadri. 2002. Empirical Properties of Duality Theory. Australian Journal of Agricultural and Resource Economics 46 (1): 45–68.
McElroy, M.B. 1987. General Error Models for Production, Cost, and Derived Demand or Share Systems. Journal of Political Economy 95 (4): 737–57.
Miranda, M.J., and P.L. Fackler. 2002. Applied Computational Economics and Finance. Cambridge, MA: MIT Press.
Morgenstern, O. 1963. On the Accuracy of Economic Observations, 2nd ed. Princeton, NJ: Princeton University Press.
Moschini, G.C. 2001. Production Risk and the Estimation of Ex-Ante Cost Functions. Journal of Econometrics 100 (2): 357–80.

O’Donnell, C.J., C.R. Shumway, and V.E. Ball. 1999. Input Demands and Inefficiency in U.S. Agriculture. American Journal of Agricultural Economics 81 (4): 865–80.
Pennacchi, G. 2008. Theory of Asset Pricing. Boston: Pearson Addison Wesley.
Pope, R.D. 1982. Expected Profits, Price Change, and Risk Aversion. American Journal of Agricultural Economics 64 (3): 581–4.
Pope, R.D., and J.-P. Chavas. 1994. Cost Functions Under Production Uncertainty. American Journal of Agricultural Economics 76 (2): 196–204.
Pope, R.D., and R.E. Just. 1996. Empirical Implementation of Ex-Ante Cost Functions. Journal of Econometrics 72 (1–2): 231–49.
Pope, R.D., and R.E. Just. 1998. Cost Function Estimation under Risk Aversion. American Journal of Agricultural Economics 80 (2): 296–302.
Pope, R.D., and R.E. Just. 2002. Random Profits and Duality. American Journal of Agricultural Economics 84 (1): 1–7.
PRISM Climate Group. 2011. Oregon State University. Available at: http://www.prism.oregonstate.edu/ (accessed September 2016).
Quiggin, J., and R.G. Chambers. 2006. The State-Contingent Approach to Production under Uncertainty. Australian Journal of Agricultural and Resource Economics 50 (2): 153–69.
Rosas, F., and S.H. Lence. 2017. Duality Theory in Empirical Work, Revisited. European Review of Agricultural Economics 44 (5): 836–59.
Rosas, F., B.A. Babcock, and D.J. Hayes. 2015. Nitrous Oxide Emission Reductions from Cutting Excessive Nitrogen Fertilizer Applications. Climatic Change 132 (2): 353–67.
Schuring, J., W.E. Huffman, and X. Fan. 2011. The Impact of Public and Private R&D on Farmers’ Production Decisions: Econometric Evidence for Midwestern States, 1960–2004. Department of Economics Working Paper Series, WP #10021, Iowa State University, Ames IA, U.S.
Shumway, R.C. 1995. Recent Duality Contributions in Production Economics. Journal of Agricultural and Resource Economics 20 (1): 178–94.
Shumway, R.C., and H. Lim. 1993. Functional Form and U.S. Agricultural Production Elasticities. Journal of Agricultural Resource Economics 18 (2): 266–76.
Thompson, G.D., and M. Langworthy. 1989. Profit Function Approximations and Duality Applications to Agriculture. American Journal of Agricultural Economics 71 (3): 791–8.
U.S. Department of Agriculture, Economic Research Service. Agricultural Resource Management Survey. Available at: http://www.ers.usda.gov/data-products/arms-farm-financial-and-crop-production-practices.aspx (accessed September 2016).
U.S. Department of Agriculture, Economic Research Service. Agricultural Productivity in the U.S. Available at: http://www.ers.usda.gov/Data/AgProductivity/.
U.S. Department of Agriculture, National Agricultural Statistics Service (USDA-NASS). 2002 U.S. Agricultural Census. Available at: http://www.agcensus.usda.gov/Publications/2002/index.asp (accessed September 2016).
Wooldridge, J.M. 2003. Introductory Econometrics: A Modern Approach, 2nd ed. Mason, Ohio: Thomson South-Western.

Appendix: How Reliable Is Duality Theory in Empirical Work?

Appendix A Simulation of Initial Production Function Parameters $\alpha_f$ and Quasi-Fixed Netput Quantities $K_{ft}$

The set of production function parameters $\alpha^{*}_{f}$ consists of the submatrices $A_{1f}$, $A_{2f}$, and $A_f$ (formed in turn by $A_{11f}$, $A_{12f}$, and $A_{22f}$). Following Rosas and Lence (2017), $\alpha^{*}_{f}$ is obtained by:

1. Selecting values of the elements of $\alpha$ for a “generic” firm such that the symmetric $((N + M) \times (N + M))$ matrix $A$ is positive-semidefinite.
2. Inducing variation across regions by obtaining “regional” $\alpha_r$ sets as deviations from $\alpha$.
3. Generating parameters in the firm-specific set $\alpha_f$ as deviations from their corresponding regional $\alpha_r$.

To ensure that the matrix $A_f$ and its inverse are positive-semidefinite, we draw the entries of the upper triangular matrix $C_f$, the Cholesky decomposition of matrix $A_f^{-1}$, such that the latter is formed as the matrix product $C_f' C_f$ (Hamilton 1994).

We calibrate the elements in $\alpha_f$ so as to yield a realistic distribution of quantities produced and used ($y^{*}_{ft}$). We do this because $\alpha_f$ is unobservable in the real world, but it determines the size, dispersion, and skewness of the netput quantity variables, which are observable. The skewness of the firm-specific deviations from the “regional” $\alpha_r$ is calibrated by fitting a standard Beta distribution to the county-level data of the census variable “Total sales, values of sales, number of firms”, which serves as a proxy for firm size. The size of the elements in $\alpha_f$ is tackled by inducing positive rank correlation among the Beta random shocks, such that a firm producing high levels of output is more likely to use greater amounts of inputs.

Finally, to calibrate the unobserved dispersion of $\alpha_f$ from $\alpha_r$, we assume that observed yield dispersion in a region is a function of unobserved technology heterogeneity and observed random weather shocks. If all firms used the same technology, the observed yield variability would come only from weather shocks. At the other extreme, where all firms differ but no weather shocks occur, all yield dispersion comes from heterogeneity across firms. Since it is most likely that reality falls somewhere between the two extremes, we estimate the portion of yield variation attributable to heterogeneity across firms. To this end, we use a panel of firm-specific crop yields from the USDA-ARMS database and county-specific weather data (growing season precipitation and temperature) from PRISM over five years, and estimate a fixed-effects model to infer the variability across firms that is not due to weather.

The vector $K_f$ of quasi-fixed netputs is obtained by drawing $R \times F$ Beta distributed random deviates (Rosas and Lence 2017). The Beta distribution allows us to mimic the different levels of skewness observed in the firm-level distribution of these variables. The parameters of the Beta distribution for each region are calibrated based on the 2002 U.S. Agricultural Census variable “Farms & land in farms, approximate land area,” which is chosen to represent the quasi-fixed netput. The region-specific distributions are $K^{*}_{f,r=1} \sim \text{Beta}(0.5679, 6.9707)$; $K^{*}_{f,r=2} \sim \text{Beta}(0.6026, 9.0446)$; and $K^{*}_{f,r=3} \sim \text{Beta}(0.4929, 2.9624)$. The Iman and Conover (1982) method is used to impose a positive correlation with the production function parameters because both $K^{*}_{f}$ and $A_f$ determine the netput quantities. Finally, time variation in each firm’s quasi-fixed netput quantity is induced by incorporating a multiplicative and independent shock centered at one and uniformly distributed, $K^{*}_{ft} = K^{*}_{f}\,\varepsilon_{ft}$, where $\varepsilon_{ft} \sim \text{Uniform}[0.90, 1.10]$. The narrow interval implies low variation in firm size over time, which is meant to represent the observed low dispersion over time of aggregate agricultural area in a region.

Appendix B Random Generation of Expected Variable Netput Prices ($p_{ft}$ and $p_{ft}$)

Firm-specific endogenous ($p_{f,t}$) and exogenous ($p_{f,t}$) netput prices are generated as deviations from the respective “national” prices ($p_{US,t}$ and $p_{US,t}$) discussed in the next two subsections. Using the endogenous case as an example, first, regional prices for the $n$th netput are calculated as $p_{nrt} = p_{n,US,t}\,\delta_r\,\varepsilon_{nrt}$, where $\delta_r$ is a regional indicator with a mean of one across regions ($[\delta_1\ \delta_2\ \delta_3] = [0.90\ 1.00\ 1.10]$), and $\varepsilon_{nrt} \sim [0.95 + 0.10\,\text{Beta}(2, 2)]$ is a mean-one symmetric shock independent

from $\delta_r$ representing random deviations from national prices.$^{27}$ Then, firm-specific random prices are generated as deviations from the respective regional average, $p_{nft} = p_{nrt}\,\varepsilon_{nft}$, where $\varepsilon_{nft} \sim [0.80 + 0.40\,\text{Beta}(2, 2)]$ is a symmetric mean-one shock independent of $\varepsilon_{nrt}$. Firm-specific shocks ($\varepsilon_{nft}$) are relatively small, so as to mimic the small cross-sectional variability observed in netput prices. The calibration of firm-specific shocks ($\varepsilon_{nft}$) implies firm-specific prices with a coefficient of variation of 0.08, which doubles that of firm prices in the USDA-ARMS dataset and, if anything, favors parameter identification. Also to favor identification of parameters in estimation, netput prices are assumed to be continuous and independent from firm size.$^{28}$

B.1. Exogenous “National” Prices ($p^{*}_{US,t}$)

Exogenous “national” prices for the $n$th netput ($p_{n,US,t}$) are generated by assuming that they follow the AR(1) process

$$\ln(p_{n,US,t}) = \theta_{0n} + \theta_{1n} \ln(p_{n,US,t-1}) + \zeta_{n,t} \tag{B.1}$$

where $\theta_{0n}$ and $\theta_{1n}$ are parameters, and $\zeta_{nt} \sim \text{Normal}(0, \sigma^2_{\zeta n})$ is an error term. Table B.1 reports the parameter estimates obtained by fitting regression (B.1) to the observed time series of futures crop prices from the CME, and input prices from Eldon Ball’s (USDA-ERS) dataset. The parameters in Table B.1 are used to simulate prices matching the mean, standard deviation, and serial correlation of the observed series, by (a) setting the value for the first iteration equal to the unconditional mean $\ln(p_{n,US,t=0}) = E[\ln(p_{n,US,t})] = \theta_{0n}/(1 - \theta_{1n})$, (b) taking 10,000 random draws from a $\text{Normal}(0, \sigma^2_{\zeta n})$ distribution, (c) plugging them into expression (B.1) to obtain a log-price series by iteration, and (d) keeping the last 50 values as the desired exogenous “national” netput prices $p_{n,US,t}$.

B.2. Endogenous “National” Prices ($p^{*}_{US,t}$)

Simulated endogenous “national” prices ($p_{US,t}$) are obtained by assuming that aggregate changes in netput quantities across individual firms lead to changes in market prices.$^{29}$ This endogeneity induces correlation between prices and the error term in system (11), which violates the orthogonality assumption required by ordinary least squares. Endogeneity is achieved by postulating the system of isoelastic netput demands and supplies

$$Q_t = \Phi_t\, p^{\eta}_{US,t} \tag{B.2}$$

The $N$-vector $Q_t$ consists of the aggregate output demands and input supplies faced by firms at time $t$, $p^{\eta}_{US,t} \equiv [p^{\eta_1}_{1,US,t}, \ldots, p^{\eta_N}_{N,US,t}]^{T}$ denotes an $N$-vector of “national” netput prices, each raised to the power of the calibrated netput-specific own-price elasticity of demand or supply ($\eta_n$), and $\Phi_t$ is an $(N \times N)$ diagonal matrix of supply and demand netput-specific time-varying positive scalars $\phi_{nt}$. Based on the FAPRI Elasticities Database and other sources, own-price elasticities are set equal to $[\eta_1, \ldots, \eta_N] = [0.25\ 0.21\ 0.75\ 0.90\ 0.87\ 0.85\ 0.83\ 0.80]$.$^{30}$

The objective is to find firm-specific netput prices $p_{ft}$ such that the optimal netput quantities aggregated across firms ($y_t = \sum_f y_{ft}$) equal the aggregate output demands and input supplies faced by firms ($Q_t$). To this end, consider the solution to the firm-specific optimization problem (6), which yields the firms’ optimal aggregate output supplies and input demands:

$$y_t \cong \sum_f \left[ (A_{11f})^{-1} (p_{ft} - A_{1f} + A_{12f} K_{ft}) + w_{ft} \right] \tag{B.3}$$

where $p_{ft}$ is a vector comprising firm-specific prices $p_{nft} \equiv p_{n,US,t}\,\delta_r\,\varepsilon_{nrt}\,\varepsilon_{nft}$ (explained above), and $w_{ft}$ is the vector of heteroskedastic production residuals (11) (e.g., optimization mistakes, weather shocks, deviation of prices from expected values, etc.) independent from the price shocks.

Since the firm-specific time-invariant matrices of production coefficients $A_{1f}$, $A_{12f}$, and $A_{11f}$ are generated by introducing deviations from the respective starting production

27. Analogous procedures are used to compute $p_{ft}$.

28. The observed firm-level prices for a netput in the USDA-ARMS dataset are concentrated in about four to five values in each period, which contrasts with the continuum of firm-level values computed in the simulation.

29. Note that individual firms are price takers, and therefore consider price as exogenous when making decisions.

30. We were unable to find estimates of supply elasticities for inputs. Their values were set relatively high while maintaining them within the inelastic range. Higher input elasticity values would reduce the price effects due to demand changes; hence, the endogeneity we seek to represent would be less important as a source of noise.
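The two-stage construction of firm-specific prices described at the start of appendix B (national price, times a regional indicator and a regional shock, times a firm shock) can be sketched in Python as follows. This is only an illustration and not the authors' code: the function name `firm_prices`, the array shapes, and the use of NumPy's random generator are assumptions, while the regional indicators and the scaled Beta(2, 2) shocks are taken from the text.

```python
import numpy as np

# Regional indicators from the text; they average to one across regions.
DELTA = np.array([0.90, 1.00, 1.10])

def firm_prices(p_national, firms_per_region, seed=0):
    """Map a (T,) national price series into a (T, R, F) array of firm prices.

    Hypothetical helper: the name, shapes, and seed handling are assumptions.
    """
    rng = np.random.default_rng(seed)
    p_national = np.asarray(p_national, dtype=float)
    T, R, F = len(p_national), len(DELTA), firms_per_region
    # Regional shock: 0.95 + 0.10 * Beta(2, 2), symmetric with mean one.
    eps_region = 0.95 + 0.10 * rng.beta(2, 2, size=(T, R))
    # Firm shock: 0.80 + 0.40 * Beta(2, 2), symmetric with mean one.
    eps_firm = 0.80 + 0.40 * rng.beta(2, 2, size=(T, R, F))
    p_region = p_national[:, None] * DELTA[None, :] * eps_region  # (T, R)
    return p_region[:, :, None] * eps_firm                        # (T, R, F)

prices = firm_prices(np.ones(50), firms_per_region=200)
```

Because every multiplicative factor has mean one, the firm-level prices fluctuate around the national series, with the firm shock contributing most of the cross-sectional spread.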

Table B.1. Estimation Results of the OLS Regression Model Used to Generate Random Exogenous “National” Prices from Equation (B.1)

                       n=1       n=2       n=3       n=4       n=5       n=6       n=7       n=8
$\theta_{0n}$          0.031     0.065     0.012     0.031     0.001     0.057     0.041     0.001
                      (0.038)   (0.032)   (0.030)   (0.064)   (0.035)   (0.035)   (0.024)   (0.037)
$\theta_{1n}$          0.680     0.34      0.67      0.902     0.861     0.60      0.843     0.923
                      (0.094)   (0.14)    (0.11)    (0.079)   (0.080)   (0.12)    (0.080)   (0.054)
$\sigma^2_{\zeta}$     0.0680    0.0342    0.0372    0.0340    0.0439    0.0392    0.0207    0.0237

Note: Standard errors shown within parentheses. The estimation is based on observed annual data from 1961 through 2004, that is, there are 44 time series observations in each regression (note that these estimates are then used to generate the simulated 50-year data set).
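Steps (a) through (d) of section B.1 can be illustrated with a short Python sketch. It is a minimal reading of the described procedure rather than the authors' implementation: the function name `simulate_national_prices`, the seed, and the choice of NumPy are assumptions, and the parameter values in the usage line are simply those printed for n = 1 in Table B.1.

```python
import numpy as np

def simulate_national_prices(theta0, theta1, sigma2, n_draws=10_000, n_keep=50, seed=0):
    """Simulate an AR(1) log-price series and return the last n_keep price levels.

    Hypothetical helper: name, defaults, and seed handling are assumptions.
    """
    rng = np.random.default_rng(seed)
    # (b) draw the Normal(0, sigma2) innovations
    shocks = rng.normal(0.0, np.sqrt(sigma2), size=n_draws)
    # (a) start from the unconditional mean of the log price
    log_p = theta0 / (1.0 - theta1)
    path = np.empty(n_draws)
    # (c) iterate expression (B.1)
    for t in range(n_draws):
        log_p = theta0 + theta1 * log_p + shocks[t]
        path[t] = log_p
    # (d) keep the last n_keep values, converted from logs to levels
    return np.exp(path[-n_keep:])

prices_n1 = simulate_national_prices(theta0=0.031, theta1=0.680, sigma2=0.0680)
```

Discarding all but the last 50 of the 10,000 iterations makes the retained series effectively independent of the starting value.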

Table B.2. Calibrated Parameter Values for Market Shocks ($\phi_{nt}$) Used in Equation (B.6)

                       n=1       n=2       n=3       n=4       n=5       n=6       n=7       n=8
$\rho_{0n}$            5.1096    5.2822    5.1794    4.764     4.2696    4.4937    4.4506    4.6259
$\rho_{1n}$            0.5000    0.5000    0.5000    0.5000    0.5000    0.5000    0.5000    0.5000
$\sigma^2_{\xi n}$     0.1779    0.078     0.1406    0.3053    0.3605    0.4664    0.7051    0.269

matrices $A_1$, $A_{12}$, and $A_{11}$, and such deviations are independent from the price shocks ($\varepsilon_{nrt}$ and $\varepsilon_{nft}$) and the production shocks ($w_{ft}$), for a sufficiently large number of firms ($F$) and by the law of large numbers, $y_t$ converges in distribution to a Normal random variable whose mean is

$$y_t = F (A_{11})^{-1} p_{US,t} - F (A_{11})^{-1} A_1 + F (A_{11})^{-1} A_{12} K_t + F w_t \tag{B.4}$$

This expression depends only on the known “average” production parameters and “national” time-$t$ prices $p_{US,t}$, which are the same as those in the isoelastic demand or supply function (B.2) faced by firms. Thus, the time-$t$ endogenous “national” netput prices that clear the markets ($Q_t = y_t$) are obtained by numerically solving for $p_{US,t}$ the system

$$\Phi_t\, p^{\eta}_{US,t} = F (A_{11})^{-1} p_{US,t} - F (A_{11})^{-1} A_1 + F (A_{11})^{-1} A_{12} K_t + F w_t \tag{B.5}$$

which is derived by equating the right-hand sides of expressions (B.2) and (B.4).

According to equation (B.5), the endogenous “national” prices are determined by the time-varying scalars $\phi_{nt}$ comprised in matrix $\Phi_t$. These scalars, whose variability represents systemic shocks, are modeled as autocorrelated and log-normally distributed:

$$\ln(\phi_{nt}) = \rho_{0n} + \rho_{1n} \ln(\phi_{n,t-1}) + \xi_{nt} \tag{B.6}$$

where $\xi_{nt} \sim \text{Normal}(0, \sigma^2_{\xi n})$. Parameters $\rho_{0n}$, $\rho_{1n}$, and $\sigma^2_{\xi n}$ are calibrated so that they yield “national” prices $p_{US,t}$ with descriptive statistics comparable to the observed output and input prices (see table B.2 for the specific parameter values). To generate the shock series, we set $\ln(\phi_{n,t=0}) = \rho_{0n}/(1 - \rho_{1n})$ (i.e., the unconditional mean of $\ln(\phi_{nt})$), take 10,000 draws from a $\text{Normal}(0, \sigma^2_{\xi n})$ distribution, plug them into expression (B.6) to generate iteratively the systematic shocks ($\phi_{nt}$), and keep the last 50 of them as the series of shocks used to solve for the endogenous “national” netput prices $p_{US,t}$ from equality (B.5).$^{31}$

It is worth noting that price variability is critical to recover production parameters because it contributes to the identification of a bigger portion of the production function. The simulated systematic shocks are independent

31. To obtain the calibrated parameters in Table B.2, we rely upon equation (B.5) because the $\phi_{nt}$ shocks are not directly observed, but time series of netput prices are. So we first plug the 10,000 exogenous “national” prices $p_{US,t}$ (described in section B.1) into system (B.5), and solve for starting values of $\phi_t$. In this manner we compute time series of $\phi_{nt}$ that allow us to “learn” about their unconditional means and variances, given prices, “average” production parameters, and elasticities. Since the unconditional mean and variance of $\ln(\phi_{nt})$ are given by $E[\ln(\phi_{nt})] = \rho_{0n}/(1 - \rho_{1n})$ and $\sigma^2_{\ln(\phi_{nt})} = \sigma^2_{\xi n}/(1 - \rho^2_{1n})$, respectively, and there are three unknown parameters ($\rho_{0n}$, $\rho_{1n}$, $\sigma^2_{\xi n}$), we arbitrarily fix $\rho_{1n} = 0.5$ to guarantee a stationary AR(1) process, and compute $\rho_{0n}$ and $\sigma^2_{\xi n}$ from the mean and variance for each netput $n$ (as reported in Table B.2).
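The moment-matching step in footnote 31 amounts to inverting the two AR(1) formulas for the unconditional mean and variance once the autoregressive coefficient is fixed. A minimal Python sketch follows; the function name `calibrate_ar1` is hypothetical, and the target moments in the usage line are not reported in the paper but are backed out here from the n = 1 column of Table B.2 purely for illustration.

```python
def calibrate_ar1(mean_log, var_log, rho1=0.5):
    """Return (rho0, sigma2) so that an AR(1) with slope rho1 has the given
    unconditional mean and variance. Hypothetical helper, not the authors' code."""
    if not -1.0 < rho1 < 1.0:
        raise ValueError("rho1 must lie in (-1, 1) for a stationary AR(1)")
    rho0 = mean_log * (1.0 - rho1)           # from E[ln(phi)] = rho0 / (1 - rho1)
    sigma2 = var_log * (1.0 - rho1 ** 2)     # from Var[ln(phi)] = sigma2 / (1 - rho1^2)
    return rho0, sigma2

# Illustrative targets implied by Table B.2 for n = 1 (mean 10.2192, variance 0.2372).
rho0, sigma2 = calibrate_ar1(10.2192, 0.2372)
```

Fixing the slope at 0.5 leaves exactly two unknowns for two moments, which is why the calibration has a unique solution for each netput.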

from each other because random draws from $\text{Normal}(0, \sigma^2_{\xi n})$ are independent; however, when plugged into system (B.6) correlation between national prices is induced through matrix $A_{11}$. This DGP ultimately generates national netput prices that exhibit higher variance and lower cross-correlations than the CME futures crop prices and Eldon Ball’s input prices. These two features favor identification in estimation when prices are explanatory variables, as they are in our case.

Appendix C. Simulation of Noisy Dataset

The following subsections provide details about the generation of the noisy data, including the procedures used to maximize expected utility, introduce price and production shocks, aggregate across netputs, and incorporate measurement errors.

C.1. Expected Utility Maximization

Uncertainty introduces noise because the duality theorem assumes a deterministic problem whose solution is generally different from the expected utility case. We solve optimization problem (3) for the vector of expected variable netput quantities $y_{ft}(p_{ft}, K_{ft}; \lambda_{ft}, W_{ft,0}; \alpha_f)$ conditional on expected netput prices, quasi-fixed netput quantities, the levels of absolute risk aversion and initial wealth, and the starting production parameters.

Optimization problem (3) is solved by employing numerical methods and Gaussian quadrature. In the present application, the numerical integration must take into account that the objective function is multi-dimensional, and that the uncertainty stems from random variables that have nonstandard distributions and are correlated with each other. Given these requirements, we created a routine to calculate nodes and weights used in the objective function approximation.$^{32}$ We then used MATLAB’s fmincon function to optimize the approximated objective function, by passing the necessary first- and second-order conditions for optimization to the numerical routine as equality and inequality constraints, respectively. The solution is the vector of expected netput quantities for each firm and time, which we denote as $y_{ft}$.

C.2. Realized Price and Production Shocks

Firms solve the maximization problem given a set of output prices that reflects their expectations of harvest prices. It is commonly accepted that prediction errors make this difference relevant. Even in the presence of forward contracts, it might be the case that not all of the production is sold under this type of arrangement. In the case of input prices, some prices might not be known at the beginning of the production period, especially for inputs purchased during the growing season. We model this feature by assuming that realized log-prices are equal to the log-prices used for optimization plus noise,

$$\ln(p_{ft}) = \ln(p_{ft}) + \varepsilon_t \tag{C.1}$$

with realized prices on the left-hand side and the prices used for optimization on the right-hand side. The $N$-vector $\varepsilon_t$ consists of realizations of $\text{Normal}(0, \sigma^2_{\varepsilon})$ random variables with $\sigma^2_{\varepsilon} = 0.22$ for outputs (Lence 2009), and $\sigma^2_{\varepsilon} = 0.12$ for inputs, implying smaller deviations from decision values for inputs than for outputs. Shocks $\varepsilon_t$ are systematic in that they affect all firms by the same proportion at a given time.

We also let actual netput quantities ($y_{ft}$) differ from the optimal quantities that solve problem (3) ($y_{ft}$), for example, due to uncertain events in agricultural production, such as weather, as follows:

$$y_{ft} = y_{ft} + g(y_{ft})\, v_{ft} \tag{C.2}$$

with actual quantities on the left-hand side and optimal quantities on the right-hand side. Shock $v_{ft}$ is a realization of the random variable controlling production errors ($w_{ft}$) given by expression (5).

Finally, we introduce contemporaneous negative correlation between quantity and price shocks with a coefficient equal to $-0.3$ (Rosas, Babcock, and Hayes 2015), and positive correlation with a coefficient of 0.9 within quantities and within prices. These correlations are induced by means of the Iman and Conover (1982) method.

C.3. Aggregation across Netputs

Technology processes employ a variety of inputs to produce several outputs; however, data available to practitioners are usually not as disaggregated. In some cases, even if data

32. We generated four independent log-normal nodes and weights for each of the three output price random variables using the MATLAB function qnwnorm (Miranda and Fackler 2002), which calculates standard Normal nodes and weights. Similarly, based on the function qnwbeta, which calculates standard Beta nodes and weights, we computed four independent beta nodes and weights in the interval of interest for the three output quantity random variables. Then, using the Iman and Conover (1982) method, we imposed correlations directly to the nodes (correlation between output prices and quantities equal to -0.30, and correlation within them equal to 0.90); these transformations do not affect the weights.
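The four-node log-normal quadrature described in footnote 32 can be illustrated without the MATLAB toolbox. The sketch below swaps in NumPy's Gauss-Hermite routine for `qnwnorm`; it is not the authors' routine, the function name `lognormal_nodes` and the values of `mu` and `sigma` are hypothetical, and the correlation step (Iman and Conover 1982) is omitted.

```python
import numpy as np

def lognormal_nodes(mu, sigma, n=4):
    """Quadrature nodes and weights for X with ln X ~ Normal(mu, sigma^2).

    Hypothetical helper built on NumPy's probabilists' Gauss-Hermite rule.
    """
    # hermegauss uses the weight function exp(-x^2 / 2), so its weights sum
    # to sqrt(2 * pi); dividing makes them standard Normal probabilities.
    x, w = np.polynomial.hermite_e.hermegauss(n)
    w = w / np.sqrt(2.0 * np.pi)
    # Map standard Normal nodes into log-normal nodes; weights are unchanged.
    return np.exp(mu + sigma * x), w

nodes, weights = lognormal_nodes(mu=0.0, sigma=0.2)
approx_mean = float(weights @ nodes)   # approximates exp(mu + sigma^2 / 2)
```

An expectation of any smooth function of the price is then approximated by the weighted sum of that function evaluated at the nodes, which is what makes the expected-utility objective cheap to evaluate inside an optimizer.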

can be obtained for several inputs and out- initial wealth measured as total net assets
puts, they are aggregated to preserve degrees (TNAft) is strongly associated to the value of
of freedom, or because they are not the ob- production (VPft) in the USDA-ARMS data-
jective of the study. To incorporate this fea- base. More specifically, panel A of table D.1

Downloaded from https://academic.oup.com/ajae/article-abstract/101/3/825/5251971 by UNIVERSITAT DE BARCELONA. Biblioteca user on 20 July 2019


ture, we compute value-weighted aggregates shows that the coefficient estimates of the fol-
across netput quantities and prices: lowing regression fitted with such data are
highly significant33
X
ðC:3Þ yift ¼ n2Xi
winft ynft
ðD:1Þ TNAft ¼ c0 þ c1 VPft þ c2 VP2ft þ sft
X
(C.4) $p_{ift} = \sum_{n \in X_i} w_{inft}\, p_{nft}$

where $X_i$ is the $i$th subset of netputs, and $w_{inft} \equiv (p_{nft}\, y_{nft}) / (\sum_{n \in X_i} p_{nft}\, y_{nft})$ is netput $n$'s share of subset $X_i$'s value.

C.4. Measurement Error in Prices and Quantities

Measurement error is a common problem in datasets available to researchers and induces bias and inconsistency in parameter estimation. Efforts to quantify the level of errors in the data include Morgenstern (1963), who identifies a 10% standard error in national income data and reports an 8% measurement error in the input and output figures of the U.S. Department of Commerce's state-level Food and Kindred Products data. Lusk et al. (2002) study the consequences of applying duality theory using variables measured with error, and Lim and Shumway (1992a, 1992b) analyze violations of maintained hypotheses such as profit maximization, convex technology, and regressive technical change. Based on this literature, we add noise to the series. This noise is distributed as standard Beta(2, 2), and we calibrate its interval to yield the desired standard deviations of 0.05 around the "true" value for netput prices, 0.08 for variable netput quantities, and 0.10 for quasi-fixed netputs. These standard deviations are smaller than or equal to the ones reported in the literature, especially in the case of prices.

Appendix D. Simulation of Initial Wealth ($W_{ft,0}$) and Coefficient of Risk Aversion ($\lambda_{ft}$)

A firm's initial wealth is postulated to be a function of its value of production, because

Following Wooldridge (2003), heteroskedasticity in the residual term $\tau_{ft}$ is modeled as $\ln(\sigma^2_{\tau ft}) = \delta_0 + \delta_1 VP_{ft} + \delta_2 VP_{ft}^2 + \varepsilon_{ft}$; panel B of table D.1 reports the corresponding estimates.

Based on the parameter estimates for regression (D.1), denoted by hats, the firm- and time-specific initial wealth ($W_{ft,0}$) is generated as follows:

Step 1: Obtain the value of production of firm $f$ at time $t$, calculated as $VP_{ft} = y_{ft}^{T} p_{ft}$. Endogenous prices $p_{ft}$ are described in appendix B, and netput quantities $y_{ft}$ are the solution to problem (3) under risk neutrality. Risk neutrality is assumed at this stage because solving the expected utility problem requires conditioning on initial wealth, which is what we are trying to calculate. Netput quantities $y_{ft}$ are used only to compute the firm's initial wealth and nowhere else in the analysis.

Step 2: Take a draw from Normal(0, $\hat{\sigma}^2_{\varepsilon}$) to obtain $\varepsilon_{ft}$, and use it to calculate $\hat{\sigma}^2_{\tau ft} = \exp(\hat{\delta}_0 + \hat{\delta}_1 VP_{ft} + \hat{\delta}_2 VP_{ft}^2 + \varepsilon_{ft})$.

Step 3: Take a draw from Normal(0, $\hat{\sigma}^2_{\tau ft}$) for the error term $\tau_{ft}$, and plug it into the analog of equation (D.1) to obtain initial wealth as $W_{ft,0} = \hat{\gamma}_0 + \hat{\gamma}_1 VP_{ft} + \hat{\gamma}_2 VP_{ft}^2 + \tau_{ft}$.

Values of the absolute risk aversion coefficient $\lambda_{ft}$ are computed as the ratio between a relative risk aversion coefficient uniformly distributed in the interval [2, 4] (Pennacchi 2008) and the terminal wealth $W_{ft,1}$ (the initial wealth $W_{ft,0}$ plus firm- and time-specific profits).
33 TNA is computed as "value of total farm financial assets" minus "total farm financial debt," and VP is calculated as "all crops – value of production" plus "all livestock – value of production."
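The share-weighted price aggregate in (C.4) and the Beta(2, 2) measurement noise described in section C.4 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the sketch assumes that the noise enters multiplicatively, as a proportion of the true value, which the text does not state explicitly. The interval half-width relies on the fact that a standard Beta(2, 2) on [0, 1] has mean 1/2 and variance 1/20.

```python
import numpy as np


def aggregate_price(p, y):
    """Share-weighted aggregate price of one netput subset, as in (C.4).

    p, y: prices and quantities of the netputs that belong to the subset.
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    value = p * y
    w = value / value.sum()  # each netput's share of the subset's value
    return float(w @ p)


def beta_noise(size, sd, rng):
    """Zero-mean Beta(2, 2) noise rescaled to a target standard deviation.

    Stretching a standard Beta(2, 2) from [0, 1] to [-h, h] multiplies its
    standard deviation (1 / sqrt(20)) by 2h, so h = sd * sqrt(5) gives sd.
    """
    h = sd * np.sqrt(5.0)
    return (2.0 * rng.beta(2.0, 2.0, size=size) - 1.0) * h


def add_measurement_error(x, sd, rng):
    """Perturb a series with Beta(2, 2) noise of standard deviation sd.

    ASSUMPTION: the error is proportional to the true value.
    """
    x = np.asarray(x, dtype=float)
    return x * (1.0 + beta_noise(x.shape, sd, rng))
```

Under these assumptions, a call such as `add_measurement_error(prices, 0.05, np.random.default_rng(0))` would perturb a price series with the 0.05 standard deviation quoted for netput prices, and the draws stay within about 11.2% of the true value because the Beta distribution has bounded support.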

Table D.1. Parameter Estimates of Regression (D.1), and the Form of Its Heteroskedasticity.

A. Dependent variable: Total Net Assets (TNA)

                              Region 1   Region 2   Region 3
Explanatory variables:
  Constant ($\gamma_0$)        0.724      0.843      0.892
                              (0.040)    (0.058)    (0.058)
  VP ($\gamma_1$)              1.279      1.138      0.574
                              (0.064)    (0.062)    (0.042)
  VP$^2$ ($\gamma_2$)          0.066      0.019      0.010
                              (0.008)    (0.002)    (0.001)
$R^2$                          0.160      0.195      0.092
Note: VP = Value of Production. Standard errors appear in parentheses.

B. Dependent variable: $\ln(\hat{\sigma}^2_{\tau ft})$, where $\hat{\sigma}^2_{\tau ft}$ is the sample estimate of $\sigma^2_{\tau ft}$

                              Region 1   Region 2   Region 3
Explanatory variables:
  Constant ($\delta_0$)        2.062      1.925      1.5069
                              (0.045)    (0.058)    (0.048)
  VP ($\delta_1$)              1.544      0.964      0.416
                              (0.071)    (0.063)    (0.035)
  VP$^2$ ($\delta_2$)          0.105      0.026      0.006
                              (0.008)    (0.002)    (0.001)
$\hat{\sigma}^2_{\varepsilon}$  4.343      4.710      4.040
$R^2$                          0.150      0.113      0.080
Note: VP = Value of Production. Standard errors appear in parentheses.
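Steps 1 to 3 of appendix D and the risk aversion calculation can be sketched as below, using the Region 1 point estimates from table D.1. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function and variable names are hypothetical, the coefficients are taken with the signs printed in the table, the value of production is supplied directly rather than solved from problem (3), and its units are not stated in this section. The sketch also does not guard against non-positive terminal wealth.

```python
import numpy as np

# Region 1 point estimates copied from table D.1 (signs as printed there).
WEALTH_COEFS = (0.724, 1.279, 0.066)   # panel A: constant, VP, VP^2
LOGVAR_COEFS = (2.062, 1.544, 0.105)   # panel B: constant, VP, VP^2
LOGVAR_ERROR_VAR = 4.343               # panel B: variance of the log-variance error


def simulate_initial_wealth(vp, rng,
                            wealth_coefs=WEALTH_COEFS,
                            logvar_coefs=LOGVAR_COEFS,
                            logvar_error_var=LOGVAR_ERROR_VAR):
    """Draw initial wealth for each value of production in vp (steps 2 and 3)."""
    vp = np.asarray(vp, dtype=float)
    c0, c1, c2 = wealth_coefs
    d0, d1, d2 = logvar_coefs

    # Step 2: draw the log-variance error and build the firm-specific variance.
    e = rng.normal(0.0, np.sqrt(logvar_error_var), size=vp.shape)
    resid_var = np.exp(d0 + d1 * vp + d2 * vp**2 + e)

    # Step 3: draw the heteroskedastic residual and evaluate the wealth equation.
    resid = rng.normal(0.0, np.sqrt(resid_var))
    return c0 + c1 * vp + c2 * vp**2 + resid


def absolute_risk_aversion(initial_wealth, profit, rng):
    """Relative risk aversion drawn from U[2, 4], divided by terminal wealth."""
    terminal_wealth = np.asarray(initial_wealth, dtype=float) + np.asarray(profit, dtype=float)
    relative_ra = rng.uniform(2.0, 4.0, size=terminal_wealth.shape)
    return relative_ra / terminal_wealth
```

For example, `simulate_initial_wealth([0.5, 1.0], np.random.default_rng(1))` returns one wealth draw per firm-period; the input values here are hypothetical and chosen only to show the call.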
