This chapter explains the methodology adopted by the researcher for the collection
and analysis of data. The tools of data collection, sources of data and scientific
methods of data analysis have been discussed in detail. Hypotheses have also been
framed as per the objectives of the study.
The universe of the study includes both foreign multinational companies¹ and
domestic Indian companies listed on the Bombay Stock Exchange in the year 2009. The
financial data relating to these companies have been compiled from the electronic
database PROWESS developed by the Centre for Monitoring Indian Economy
(CMIE). PROWESS is the largest database providing financial information for large
and medium domestic as well as foreign companies operating in India. The database
contains detailed information for over 27,122 companies (www.prowess.cmie.com).
These comprise all companies traded on India's major stock exchanges and several
others, including the central public sector enterprises (CMIE, 2012). The Prowess
database is built from annual reports, quarterly financial statements, stock
exchange feeds and other reliable sources. CMIE normalizes this database to
enable inter-company as well as inter-temporal comparisons, which further enhances
its usefulness to researchers and analysts.
¹ The words "corporation" and "company" are used as synonyms throughout the present study.
Following sectors are covered by PROWESS:
As the study aims to attain its objectives through firms operating at the micro level,
the most suitable unit of analysis, as drawn from the review of literature, is the
company (firm or corporation). The sample of the study comprises both
foreign multinational and Indian domestic companies. Although these companies were
operating in various sectors such as the “chemical sector”, “food sector”, “machinery
sector”, “metal sector”, “non-metallic sector”, “textile sector”, “transport sector”
and “miscellaneous sector”, it was decided to limit the present study to only
three sectors, namely the “chemical sector”, “food sector” and “machinery sector”.
First, these sectors represent 60 per cent of the data available on Indian
manufacturing, whereas the data available for the other sectors were not sufficient to
form a reliable comparative base. Secondly, an attempt has been made to study
those companies whose financial information was available for at least 5 consecutive
years, whereas the majority of the companies in the left-out sectors failed to meet
this criterion. The sampled companies include all companies operating in these three
sectors of Indian manufacturing.
In line with the objectives of the study, secondary data were considered the most
suitable for drawing results. As the study relates to the post-liberalization period,
a period of 18 years from 1992 to 2009 was chosen
to attain the objectives. 1992 was selected as the base year because the reforms were
initiated in India in 1991. The period of the study is justified because it
covers the post-reform years, when large-scale policy reforms were introduced
in industrial and foreign trade policy.
In order to study the dynamics of company performance, the data
pertain to financial indicators of both foreign multinational and
domestic companies operating in India, drawn from the Profit and Loss Accounts and
Balance Sheets in the annual reports of the sampled companies.
3.5 COLLECTION OF SECONDARY DATA
Among offline data sources, many national as well as regional libraries, such as those of
the National Council of Applied Economic Research, New Delhi; Indian Council of
Social Science Research, New Delhi; Jawaharlal Nehru University, New Delhi;
IIM Udaipur; Punjabi University, Patiala; Guru Nanak Dev University, Amritsar; and
Panjab University, Chandigarh, have been consulted from time to time for the
collection of secondary data.
Data Arrangement
Pooled data have been used for the purpose of the present study. The data
set contains observations on n individuals (companies) over t periods of
time. In simple words, each individual carries observations from the first year to
year t. Hence, the total number of observations in the pooled data set is n × t. The
data arrangement is presented in Table 3.1. The illustration
shows various variables for two sets of companies, i.e. foreign and Indian,
over a 5-year period for 5 companies: I T C Ltd., V S T Industries Ltd., Nestle
India Ltd., Glaxo-Smithkline Healthcare Ltd. and Agro Tech Foods Ltd.
TABLE 3.1
S. No.  COMPANY               YEAR  STATUS  INDUSTRY  SALES  PROFIT
10.     Agro Tech Foods Ltd.  1993  Indian  Food      143.5  20.0
15.     Agro Tech Foods Ltd.  1994  Indian  Food      163.5  15.32
20.     Agro Tech Foods Ltd.  1995  Indian  Food      153.5  16.20
25.     Agro Tech Foods Ltd.  1996  Indian  Food      183.5  18.52
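The pooled arrangement of n companies over t years can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually used: the function name `build_pooled_rows` is hypothetical, the 1992 to 1996 window is assumed, and the financial columns are omitted rather than filled with invented figures. With the years as the outer loop, the serial numbers happen to agree with the rows of Table 3.1 (for example, serial number 10 is Agro Tech Foods Ltd. in 1993).

```python
from itertools import product


def build_pooled_rows(companies, years):
    """Return one row per (year, company) pair, numbered from 1.

    With n companies and t years the result has n * t rows.
    """
    return [
        {"s_no": i, "company": company, "year": year}
        for i, (year, company) in enumerate(product(years, companies), start=1)
    ]


companies = [
    "I T C Ltd.",
    "V S T Industries Ltd.",
    "Nestle India Ltd.",
    "Glaxo-Smithkline Healthcare Ltd.",
    "Agro Tech Foods Ltd.",
]
years = range(1992, 1997)  # assumed 5-year window, for illustration only

rows = build_pooled_rows(companies, years)
print(len(rows))  # n * t observations
print(rows[9])    # the tenth pooled observation
```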
Following the same pattern, the data for all the Indian and multinational corporations have
been arranged. The data set was prepared in Excel file format and then used for
generating statistical output in EViews, a software package that provides statistical,
forecasting and modeling tools through an object-oriented
interface. Furthermore, Statistical Package for Social Sciences (SPSS 16.0) has also been
used to analyse the data.
To attain the objectives of the study, the following hypotheses have been designed:
H7 = There is no difference in the mean export intensity of Foreign Multinational
Corporations and Domestic companies operating in India in the Chemical sector.
H16 = There exists a difference in the mean Research and Development intensity of
Foreign Multinational Corporations and Domestic companies operating in India
in the Food sector.
The data collected from different sources have been analyzed with various statistical,
econometric and accounting tools and techniques. These techniques are
discussed below:
The application of the t-test is based on the assumption that the samples have been randomly
drawn from normally distributed populations with unknown variances². This
assumption requires that the variables considered take values that vary randomly.
Furthermore, the value of one variable should be
independent of the value of other variables. The t-test further assumes random sampling
without any selection bias. Therefore, if a researcher knowingly selects samples
with properties that best suit the requirements of the study and compares these values
with other samples, the conclusions drawn from such non-random sampling will be neither
reliable nor generalizable. According to the type of data considered in the
study, the researcher has to select the appropriate form of the t-test. The forms of the
t-test are generally the one-sample t-test, the independent two-sample t-test, and the
dependent t-test for paired samples.
² If the population variances (σ²) are known, a z-test can be applied and there is no need
to estimate the variances.
In statistical terminology, t-test is a statistical hypothesis test in which the test statistic
follows a Student's t-distribution if the null hypothesis is true (Wikipedia, 2009).
The Student's t-test is used for determining the statistical significance of the difference
between two sample means, and for confidence intervals for the difference between two
population means. In probability and statistics, Student's t-distribution (t-distribution) is a
probability distribution that arises in the problem of estimating the mean of a normally
distributed population. The Student's t-distribution is a special case of the generalized
hyperbolic distribution. The general shape of the t-distribution can be
depicted with the help of the following diagram.
The above diagram shows the normal distribution of the data. Like other probability
distributions, the total area under the curve of the t-distribution is equal to one. As the
number of degrees of freedom increases, the shape of the t-distribution converges to that
of the standard normal distribution. In the above diagram the Student's t-distribution has
been depicted by the blue curve, whereas the normal distribution has been
depicted by the red curve.
Researchers frequently use the one-sample t-test or the two-sample t-test. The one-sample
t-test determines whether the mean of a normally distributed population equals the value
specified in the null hypothesis, whereas the two-sample
t-test tests the null hypothesis that the means of two normally
distributed populations are equal. As the data in the present study relate to two
groups of companies, i.e. Indian and multinational, the independent two-sample t-test
was found to be most suitable.
$$H_0: \mu_1 = \mu_2 \quad \text{or} \quad \mu_1 - \mu_2 = 0$$
S_{x1x2} is an estimator of the common standard deviation of the two samples. It is
defined so that its square is an unbiased estimator of the common
variance whether or not the population means are the same. In these formulae, n =
number of participants, 1 = group one, 2 = group two. n − 1 is the number of degrees of
freedom for either group, and the total sample size minus two (that is, n1 + n2 − 2) is the
total number of degrees of freedom, which is used in significance testing. To
test the second assumption, homogeneity of variances, Levene's test for
homogeneity of variances was applied.
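As a hedged illustration of the definitions above, the sketch below computes the pooled standard deviation, the t-statistic and the n1 + n2 − 2 degrees of freedom using the standard pooled two-sample form, which is assumed here to be the one intended. The function name `pooled_t_test` and the two small samples are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(sample1, sample2):
    """Independent two-sample t-test assuming equal population variances.

    Returns (t statistic, pooled standard deviation, degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # unbiased sample variance (divides by n - 1)
    s2_sq = variance(sample2)
    df = n1 + n2 - 2
    # Pooled estimator: its square is an unbiased estimator of the common variance.
    pooled_sd = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df)
    t_stat = (mean(sample1) - mean(sample2)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t_stat, pooled_sd, df


# Hypothetical values for two groups of companies (illustration only).
group_one = [2.0, 4.0, 6.0]
group_two = [1.0, 2.0, 3.0]

t_stat, pooled_sd, df = pooled_t_test(group_one, group_two)
print(round(t_stat, 4), round(pooled_sd, 4), df)
```

The resulting statistic would then be compared with the critical value of the t-distribution at the chosen significance level and `df` degrees of freedom.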
The degrees of freedom are modified to account for the unequal sample sizes and
the unequal variances as well as small sample sizes.
The Standard Error does not ‘pool’ the sample variances to estimate a common
population variance.
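Welch’s t-test uses the following equation to derive conclusive results from data with
unequal variances.
$$t = \frac{\text{Mean}_1 - \text{Mean}_2}{\sqrt{\dfrac{\text{Variance}_1}{\text{Sample Size}_1} + \dfrac{\text{Variance}_2}{\text{Sample Size}_2}}}$$
Under Welch’s t-test, a corrected number of degrees of freedom is used to assess the
significance of the t-statistic computed as usual. This number of degrees of freedom is
determined by applying the formula given hereunder:
$$\text{Welch's d.f.} = \frac{\left(\dfrac{\text{Variance}_1}{\text{Sample Size}_1} + \dfrac{\text{Variance}_2}{\text{Sample Size}_2}\right)^2}{\dfrac{\left(\text{Variance}_1 / \text{Sample Size}_1\right)^2}{\text{Sample Size}_1 - 1} + \dfrac{\left(\text{Variance}_2 / \text{Sample Size}_2\right)^2}{\text{Sample Size}_2 - 1}}$$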
However, while calculating the degrees of freedom under Welch’s t-test, the researcher must
keep in mind that the value cannot be larger than n1 + n2 − 2 and cannot be smaller than
the lesser of n1 − 1 and n2 − 1. The application of Welch’s t-test is based on two
assumptions: first, that the samples are independent of each other; and second, that the
samples are drawn from normal populations.
However, there may arise cases where the assumptions of Welch’s t-test are not
satisfied. In those cases, the researcher can overlook the second assumption of
drawing samples from normal populations, but in no case can the test be applied where
the samples turn out to be dependent on each other.
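A minimal sketch of Welch's t-statistic and its corrected degrees of freedom is given below, together with a check of the bounds on the degrees of freedom. The function name `welch_t_test` and the sample values are hypothetical; the formulas are the standard Welch forms, assumed to match those used in the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(sample1, sample2):
    """Welch's t-test: no pooling of variances, corrected degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # Variance1 / Sample Size1
    v2 = variance(sample2) / n2  # Variance2 / Sample Size2
    t_stat = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, welch_df


# Hypothetical samples with visibly different spreads (illustration only).
group_one = [2.0, 4.0, 6.0]
group_two = [1.0, 2.0, 3.0]

t_stat, welch_df = welch_t_test(group_one, group_two)
n1, n2 = len(group_one), len(group_two)

# The corrected degrees of freedom lie between min(n1 - 1, n2 - 1) and n1 + n2 - 2.
within_bounds = min(n1 - 1, n2 - 1) <= welch_df <= n1 + n2 - 2
print(round(t_stat, 4), round(welch_df, 4), within_bounds)
```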
3.7.3 Levene's Test for Homogeneity of Variances
Levene’s test is preferred over other tests because Levene (1960)
proposed comparing the mean values of absolute deviations from sample means rather
than variances. Schultz (1983) considered Levene's test to be among the best of the tests
for determining differences in variation. Also, Hines and O’Hara Hines (2000)
termed it a widely used and robust test. Furthermore, Milliken and Johnson (1984)
also recommended the use of Levene's test, subject to the condition that there is
confidence that the data are nearly normal or the data set is very large.
$$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$$
The significance of W is tested against F(α, k − 1, N − k), where F is a quantile of
the F distribution with (k − 1) and (N − k)
degrees of freedom, and α is the chosen level of significance (usually 0.05 or 0.01).
Rejection of the null hypothesis implies that Student’s t-test cannot be relied upon to give
accurate results. In such cases, Welch’s t-test, which does not assume equal
variances and therefore remains valid, is highly recommended.
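The W statistic can be sketched as follows, assuming the standard form of Levene's test in which each observation is replaced by its absolute deviation from its group mean and a one-way analysis of variance is run on those deviations. The function name `levene_w` and the two groups of values are hypothetical.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W using absolute deviations from group means.

    k = number of groups, N = total number of observations.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(
        len(zi) * (zm - z_grand_mean) ** 2 for zi, zm in zip(z, z_group_means)
    )
    within = sum(
        (value - zm) ** 2 for zi, zm in zip(z, z_group_means) for value in zi
    )
    # Undefined when every deviation equals its group mean (within == 0).
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data: the second group is twice as spread out as the first.
group_one = [1.0, 2.0, 3.0, 4.0, 5.0]
group_two = [2.0, 4.0, 6.0, 8.0, 10.0]

w = levene_w(group_one, group_two)
print(round(w, 4))
```

The value of W is then compared with the F quantile at (k − 1) and (N − k) degrees of freedom; only a W above that quantile leads to rejection of the hypothesis of equal variances.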
The simple meaning of ratios is to express one number in terms of another. A ratio is
regarded as a statistical yardstick which attempts to compare and measure relationship
between two or more variables. In finance, the term accounting ratios is used to describe
relationship between the figures shown in financial statements i.e. Balance Sheet and
Profit and Loss Account. In the present study various ratios have been used wherever
required to determine the relationship between the variables considered under different
sectors.
The term "trend analysis" refers to collecting data and attempting to spot a
pattern, or trend, in the data. As the data related to financial indicators have been processed
for a period of 18 years, trend analysis has been carried out in order to
identify the change over this period while making a comparative analysis of the financial
indicators.
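One simple way to express a trend is the percentage change from one year to the next. The sketch below applies it to the SALES figures shown for Agro Tech Foods Ltd. in Table 3.1; the function name `year_on_year_change` is hypothetical, and the choice of percentage change is an assumption, since the text does not state which trend measure was used.

```python
def year_on_year_change(series):
    """Map each year (except the first) to its percentage change on the previous year."""
    years = sorted(series)
    return {
        current: (series[current] - series[previous]) / series[previous] * 100
        for previous, current in zip(years, years[1:])
    }


# SALES values for Agro Tech Foods Ltd. as listed in Table 3.1.
sales = {1993: 143.5, 1994: 163.5, 1995: 153.5, 1996: 183.5}

changes = year_on_year_change(sales)
for year, change in changes.items():
    print(year, round(change, 2))
```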
Logistic regression analysis (LRA) is based on probabilities associated with the values
of Y. For simplicity, and
because it is the case most commonly encountered in practice, we assume that Y is
dichotomous, taking on values of 1 (i.e., the positive outcome, or success) and 0 (i.e.,
the negative outcome, or failure). For theoretical, mathematical reasons, LRA is based
on a linear model for the natural logarithm of the odds (i.e., the log-odds) in favor of Y
= 1 (Dayton, 1992).
The logistic function is
$$f(z) = \frac{1}{1 + e^{-z}}$$
The input is z and the output is ƒ(z). The logistic function is useful because it can take
as an input any value from negative infinity to positive infinity, whereas the output is
confined to values between 0 and 1. The variable z represents the exposure to some set
of independent variables, while ƒ(z) represents the probability of a particular outcome,
given that set of explanatory variables. The variable z is a measure of the total
contribution of all the independent variables used in the model and is known as
the logit.
The logit is defined as
$$z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k$$
where β0 is called the "intercept" and β1, β2, β3, and so on, are called the "regression
coefficients" of x1, x2, x3 respectively. The intercept is the value of z when the values of
all independent variables are zero (e.g. the value of z in someone with no risk factors).
Each of the regression coefficients describes the size of the contribution of that risk
factor. A positive regression coefficient means that the explanatory variable increases
the probability of the outcome, while a negative regression coefficient means that the
variable decreases the probability of that outcome; a large regression coefficient means
that the risk factor strongly influences the probability of that outcome, while a near-zero
regression coefficient means that that risk factor has little influence on the
probability of that outcome.
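The relation between the logit and the resulting probability can be sketched as below. The intercept, coefficients and variable values are hypothetical numbers chosen only to show the direction of the effects, and the function names `logit` and `logistic` are illustrative.

```python
from math import exp


def logistic(z):
    """Map any real z to a probability strictly between 0 and 1."""
    # Two branches avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def logit(intercept, coefficients, values):
    """Linear combination z = b0 + b1*x1 + b2*x2 + ..."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical model: one positive and one negative coefficient.
b0 = -1.0
betas = [0.5, -2.0]

p_base = logistic(logit(b0, betas, [2.0, 0.0]))     # z = 0
p_more_x1 = logistic(logit(b0, betas, [4.0, 0.0]))  # x1 raised, z = 1
p_more_x2 = logistic(logit(b0, betas, [2.0, 1.0]))  # x2 raised, z = -2

print(p_base, p_more_x1 > p_base, p_more_x2 < p_base)
```

Raising the variable with the positive coefficient pushes the probability above its baseline, while raising the variable with the negative coefficient pulls it below.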
Instead of finding the best-fitting line by minimizing the squared residuals, as in
OLS regression, logistic regression uses a different approach: Maximum Likelihood (ML).
ML is a way of finding the smallest possible deviance between the observed and
predicted values (analogous to finding the best-fitting line) using calculus (specifically,
derivatives). With ML, the computer runs successive "iterations" in which it tries different
solutions until it reaches the smallest possible deviance, or best fit.
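The iterative idea can be illustrated with plain gradient ascent on the log-likelihood of a one-predictor logistic model. This is a deliberately simple sketch: statistical packages typically rely on more refined iterative schemes, and the data, learning rate, iteration count and function names here are all hypothetical.

```python
from math import exp, log


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of a binary outcome under the model p = logistic(b0 + b1*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(b0 + b1 * x)
        total += y * log(p) + (1 - y) * log(1 - p)
    return total


def fit_by_gradient_ascent(xs, ys, rate=0.01, iterations=5000):
    """Each iteration moves the coefficients along the derivative of the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        errors = [y - logistic(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += rate * sum(errors)
        b1 += rate * sum(e * x for e, x in zip(errors, xs))
    return b0, b1


# Hypothetical, non-separable data so that a finite optimum exists.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]

ll_start = log_likelihood(0.0, 0.0, xs, ys)
b0, b1 = fit_by_gradient_ascent(xs, ys)
ll_fitted = log_likelihood(b0, b1, xs, ys)
print(ll_start < ll_fitted, b1 > 0)
```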
$$R^2_{\text{McFadden}} = 1 - \frac{LL(B)}{LL(0)}$$
McFadden’s R-square tends to be smaller than the R-square of OLS regression. Because the
Likelihood Ratio Index (LRI) depends on the ratio of the beginning and ending log-likelihood
functions, it is very difficult to "maximize the R2" in logistic regression, and values
between 0.2 and 0.4 are considered highly satisfactory (McFadden, 1979).
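A small sketch of the index follows, where LL(0) is the log-likelihood of an intercept-only model and LL(B) that of the fitted model. The function names and the fitted log-likelihood value are hypothetical; only the ratio itself follows the definition above.

```python
from math import log


def null_log_likelihood(ys):
    """LL(0): log-likelihood when every case gets the overall share of ones."""
    n = len(ys)
    ones = sum(ys)
    p = ones / n  # assumes both outcomes occur, so 0 < p < 1
    return ones * log(p) + (n - ones) * log(1 - p)


def mcfadden_r2(ll_model, ll_null):
    """Likelihood Ratio Index: 1 - LL(B) / LL(0)."""
    return 1.0 - ll_model / ll_null


ys = [0, 0, 1, 0, 1, 1]           # hypothetical binary outcomes
ll_null = null_log_likelihood(ys)
ll_model = 0.7 * ll_null          # hypothetical fitted log-likelihood

r2 = mcfadden_r2(ll_model, ll_null)
print(round(r2, 4))
```

Because both log-likelihoods are negative and the fitted one is closer to zero, the index lies between 0 and 1, with 0 meaning no improvement over the intercept-only model.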
Heteroscedasticity
Most basic forms of models assume that the errors or
disturbances ui have the same variance across all observation points. However, when the
variance of the errors differs at different values of the independent variables,
heteroscedasticity is present. Heteroscedasticity is reflected in the
residuals estimated from a fitted model. To deal with this problem,
heteroscedasticity-consistent standard errors are used to allow the fitting of a model
containing heteroscedastic residuals. One such approach is White's (1980) estimator, whose
associated test explicitly examines forms of heteroscedasticity, i.e. the relation of u²
with all independent variables (Xi), squares of independent variables (Xi²) and all cross
products (XiXj for i ≠ j).
The present study makes use of White Heteroscedasticity-Consistent Covariances to deal
with any heteroscedasticity in the data, as this approach is also
particularly suitable for large sample sizes.
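To show what a heteroscedasticity-consistent standard error changes, the sketch below fits a one-predictor OLS line and compares the classical standard error of the slope with White's basic (HC0) version. It is restricted to a single predictor so that no matrix algebra is needed; the data and the function name `ols_with_white_se` are hypothetical, and the software used in the study may apply a different small-sample variant.

```python
from math import sqrt
from statistics import mean


def ols_with_white_se(xs, ys):
    """Simple OLS with classical and White (HC0) standard errors for the slope."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    dx = [x - x_bar for x in xs]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (y - y_bar) for d, y in zip(dx, ys)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Classical: one common error variance for all observations.
    classical_se = sqrt(sum(u * u for u in residuals) / (n - 2) / sxx)
    # White (HC0): each squared residual enters with its own weight.
    white_se = sqrt(sum(d * d * u * u for d, u in zip(dx, residuals))) / sxx
    return slope, classical_se, white_se


# Hypothetical observations (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.5, 2.0, 5.5, 3.5]

slope, classical_se, white_se = ols_with_white_se(xs, ys)
print(round(slope, 4), round(classical_se, 4), round(white_se, 4))
```

The coefficient estimate is the same in both cases; only the standard error, and hence the significance test, differs.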
Multicollinearity
Variance Inflation Factor (VIF): The Variance Inflation Factor (VIF) quantifies
the severity of multicollinearity in an ordinary least squares regression analysis. It
provides an index that measures how much the variance of an estimated
regression coefficient is inflated because of collinearity. Usually, VIF values
ranging from 4 to 10 indicate the presence of higher multicollinearity between the
predictors (Rogerson, 2001; and Pan & Jackson, 2008). In order to check for
multicollinearity in the data, VIFs have been used in all regression
models.
Tolerance: Tolerance is a measure of collinearity reported by most statistical
programs such as SPSS. A small tolerance value indicates that the variable under
consideration is almost a perfect linear combination of the independent variables
already in the equation and that it should not be added to the regression equation
(Cohen et al., 2003). All variables involved in the linear relationship will have a
small tolerance. If a low tolerance value is accompanied by large standard errors
and non-significance, multicollinearity may be an issue (Fox, 1991). The measure
of tolerance is given as
$$\text{Tolerance}_j = 1 - R_j^2$$
where R_j² is the coefficient of determination obtained by regressing the j-th independent
variable on all the other independent variables, so that VIF_j = 1 / Tolerance_j.
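For the special case of two predictors, the R² from regressing one on the other equals their squared correlation, so tolerance and VIF can be sketched without matrix algebra. The function names and data below are hypothetical; with more than two predictors a full auxiliary regression would be needed for each variable.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def tolerance_and_vif(x1, x2):
    """Two-predictor case: Tolerance = 1 - r^2 and VIF = 1 / Tolerance."""
    r = pearson_r(x1, x2)
    tolerance = 1.0 - r * r  # undefined VIF if the predictors are perfectly collinear
    return tolerance, 1.0 / tolerance


# Hypothetical predictor values (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

tolerance, vif = tolerance_and_vif(x1, x2)
print(round(tolerance, 4), round(vif, 4))
```

A VIF computed this way can be read against the 4 to 10 range cited above.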
This study has wide applicability, as its data cover all the Indian
and foreign corporations operating in India that are listed on the Bombay Stock
Exchange. Moreover, as the study measures the performance of
multinationals in India and compares it with that of their Indian
counterparts, it provides a base for policy makers to estimate the impact of
foreign companies on domestic companies. This will help them in formulating
policies that keep in mind the interests of domestic companies. The study will further help
Indian entrepreneurs to identify the areas where foreign companies are not performing
well or pose less competition, and hence guide them to
invest in such areas and earn profits.
Limitations have always been a part and parcel of any analytical research work. This
study is also not free from the ambit of the same. Some of the limitations are listed
below:
1. The study is based on secondary data; therefore, it suffers from all the
limitations of research based on secondary data.
2. As the topic of the research is too comprehensive to cover all the units and
sectors in the universe in the given time frame, the study remained
confined to three main corporate sectors of India, i.e. chemicals, food and
machinery. Therefore, the findings of the study may not be generalized to the
excluded sectors, and separate research is required for these
sectors.
3. Due to limitations of time and resources, the study excluded the tertiary sector, which
constitutes a significant share of Indian GDP. As policy guidelines favor an
increasing share of foreign participation in the Indian service sector,
research in this area can be pursued in future.
4. Limitations of time also ruled out the collection of primary
data on managers' perceptions of the Indian policy framework concerning the
smooth growth of MNCs for mutual benefit.