
QUANTITATIVE PROJECT

Submitted By: Rajib Ali

FEBRUARY 7, 2019
1. Assessment of Outliers

Barnett and Lewis (1994) define outliers as "observations or subsets of observations which appear to be inconsistent with the remainder of the data". The presence of outliers in a data set can seriously distort the estimation of regression coefficients and hence lead to unreliable results (Verardi & Croux, 2008).

Before assessing outliers, we first run a frequency distribution to check whether any values are missing in the data entered into SPSS. The frequency distribution is tabulated in SPSS using the minimum and maximum statistics of all latent variables, to detect values that fall outside the defined SPSS value labels.
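The same screen can be sketched outside SPSS as well. Below is a minimal illustration in Python; the data, the DataFrame name `items` and the 1-5 valid range are assumptions for demonstration, not taken from the original data file. The actual SPSS results follow in Table 01.

    import numpy as np
    import pandas as pd

    # Illustrative data: 250 responses to 20 five-point Likert items (1-5).
    rng = np.random.default_rng(0)
    items = pd.DataFrame(rng.integers(1, 6, size=(250, 20)),
                         columns=[f"item{i}" for i in range(1, 21)])

    # Frequency-style screen: valid/missing counts and min/max per item.
    screen = pd.DataFrame({
        "valid": items.notna().sum(),
        "missing": items.isna().sum(),
        "minimum": items.min(),
        "maximum": items.max(),
    })

    # Items whose observed values fall outside the defined value labels (1-5),
    # e.g. the mistyped 55, 33 and 44 of USE5, EOU6 and ATT4 in Table 01.
    print(screen[(screen["minimum"] < 1) | (screen["maximum"] > 5)])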
Table 01: Minimum and maximum statistics of the three mistyped items

                 USE5    EOU6    ATT4
N     Valid       250     250     250
      Missing       0       0       0
Minimum             1       1       1
Maximum            55      33      44

For these three items, looking at the minimum and maximum values revealed typing errors. Additionally, there were no missing values in any of the 20 items.

There are two types of outliers which a researcher can check in the data: univariate and multivariate outliers. A univariate outlier is a data point with an extreme value on one variable, whereas a multivariate outlier is a combination of unusual scores on at least two variables. Both types of outliers can influence the outcome of statistical analyses. We therefore focus on the multivariate outlier assessment, for which we run the Mahalanobis distance (D²) test. The Mahalanobis distance is the distance of a case from the centroid of the remaining cases, where the centroid is the point created at the intersection of the means of all the variables. At p = 0.001 with 20 degrees of freedom, the chi-square threshold is 45.315. The 15 cases listed in Table 02 had Mahalanobis values greater than this threshold and were therefore considered outliers.

Table 02: Cases identified as multivariate outliers

No.   Case ID
1     119
2     231
3     176
4     17
5     150
6     164
7     200
8     38
9     71
10    3
11    30
12    12
13    11
14    68
15    88

After removing these cases, the remaining sample is 235 cases, which we use for the further screening of the data.
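For reference, here is a minimal sketch of the D² computation, assuming the item scores sit in a pandas DataFrame (such as the hypothetical `items` above); it follows the textbook definition rather than any particular SPSS internals.

    import numpy as np
    import pandas as pd
    from scipy.stats import chi2

    def mahalanobis_d2(df: pd.DataFrame) -> pd.Series:
        """Squared Mahalanobis distance of each case from the centroid."""
        x = df.to_numpy(dtype=float)
        diff = x - x.mean(axis=0)                        # deviation from centroid
        cov_inv = np.linalg.inv(np.cov(x, rowvar=False))
        d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        return pd.Series(d2, index=df.index, name="D2")

    # Critical value at p = 0.001 with df = number of items; for 20 items
    # this is 45.315, the threshold used above.
    threshold = chi2.ppf(1 - 0.001, df=20)
    # kept = items[mahalanobis_d2(items) <= threshold]   # drop flagged cases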

2. Normality

In a normal distribution, the measures of central tendency (mean, median and mode) all fall at the same mid-line point; they are equal. According to Coakes and Steed (2001), data are considered good when they are normally distributed, without noticeable skewness, and shaped like a bell. Norusis (1997) suggested that a simple method of testing the normality of data is to look at the histogram of the residuals. Skewness and kurtosis are also used to test normality: when the skewness and kurtosis values of a variable lie between -2 and +2, its data are considered normally distributed.

Looking at the histogram of each variable, all skewness and kurtosis values lie between -2 and +2, which indicates that the data are normally distributed. The histograms are approximately bell-shaped as well. [Histograms of the individual variables not reproduced here.]
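The same rule of thumb can be checked numerically. The sketch below assumes the composite scores of the five variables sit in a pandas DataFrame named `df` (a hypothetical name); note that scipy's estimators omit the small-sample corrections SPSS applies, so the values can differ slightly from the SPSS output.

    from scipy.stats import kurtosis, skew

    # df: assumed DataFrame of the five composite variable scores.
    # scipy's kurtosis is excess kurtosis (normal = 0), as reported by SPSS.
    for col in df.columns:
        s, k = skew(df[col]), kurtosis(df[col])
        print(f"{col}: skewness={s:+.3f}, kurtosis={k:+.3f}, "
              f"within -2..+2: {abs(s) <= 2 and abs(k) <= 2}")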

3. Multicollinearity
According to Hair et al. (2010), multicollinearity refers to the degree of relationship between the independent variables used in the model; it is the occurrence of high intercorrelations among the independent variables in a multiple regression model. Multicollinearity can lead to skewed or misleading results when a researcher or analyst attempts to determine how well each independent variable predicts or explains the dependent variable in a statistical model. In general, multicollinearity leads to wider confidence intervals and less reliable probability values (p-values) for the independent variables.

There are two methods for checking the correlation between variables: the Pearson correlation test and the VIF (Variance Inflation Factor). Here we test for multicollinearity using the VIF method only. If a VIF value is greater than 5, we consider the variables to be strongly correlated; if it is less than 5, there is no strong correlation among the variables. The other threshold is tolerance, which must be greater than 0.20.

Table 03: Coefficients(a)

                   Unstandardized     Standardized
                   Coefficients       Coefficients                     Collinearity Statistics
Model              B      Std. Error  Beta            t       Sig.     Tolerance   VIF
1  (Constant)      .628   .206                        3.045   .003
   Usefulness      .029   .058        .027            .504    .615     .601        1.663
   Ease of Use     .107   .067        .088            1.597   .112     .551        1.814
   Observability   .129   .044        .139            2.915   .004     .730        1.370
   Attitude        .588   .055        .632            10.773  .000     .484        2.065
a. Dependent Variable: Intention

Table 03 shows the VIF and tolerance values. The VIF values of usefulness, ease of use, observability and attitude are 1.663, 1.814, 1.370 and 2.065 respectively, which indicates that there is no multicollinearity between the variables. Similarly, the tolerance levels of usefulness, ease of use, observability and attitude are 0.601, 0.551, 0.730 and 0.484 respectively, all greater than the threshold. Both indicators show that there is no multicollinearity among the variables.
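As a sketch of the same diagnostic in Python, using statsmodels' variance_inflation_factor (tolerance is simply 1/VIF); the DataFrame `df` holding the four predictor columns is an assumed input.

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    def vif_table(X: pd.DataFrame) -> pd.DataFrame:
        """VIF and tolerance per predictor, with an intercept in the design."""
        Xc = sm.add_constant(X)
        arr = Xc.to_numpy(dtype=float)
        rows = []
        for i, name in enumerate(Xc.columns):
            if name == "const":
                continue                      # the intercept itself has no VIF
            vif = variance_inflation_factor(arr, i)
            rows.append({"variable": name, "VIF": vif, "tolerance": 1.0 / vif})
        return pd.DataFrame(rows)

    # Assumed column names, mirroring Table 03:
    # vif_table(df[["Usefulness", "EaseofUse", "Observability", "Attitude"]])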

4. Non-Response Bias
According to Berg (2005), non-response bias refers to the errors one is likely to make when estimating a population characteristic from a sample of survey data. Sometimes, in survey sampling, individuals chosen for the sample are unwilling or unable to participate in the survey. Non-response bias is the bias that results when respondents differ in meaningful ways from non-respondents. Non-response is often a problem with mail surveys, where the response rate can be very low.

We use the independent-samples t-test to assess non-response bias. According to Pallant (2010), the significance value of Levene's test for equality of variances should be greater than 0.05 in order to conclude that there is no significant difference between the groups, i.e. no non-response bias in the data. We have assumed that the first half of the data was collected early and the second half was collected late.

Table 04: Independent Samples Test

                                         Levene's Test for Equality of Variances
                                         F        Sig.
Usefulness     Equal variances assumed   .658     .418
Ease of Use    Equal variances assumed   .000     .987
Observability  Equal variances assumed   .491     .484
Attitude       Equal variances assumed   .342     .559
Intention      Equal variances assumed   .157     .692

Table 04 shows that the significance levels of usefulness, ease of use, observability, attitude and intention are 0.418, 0.987, 0.484, 0.559 and 0.692 respectively. All are greater than 0.05, which means that there is no non-response bias in the data.
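A sketch of the same test with scipy, under the split-half early/late assumption stated above; `df` is again the assumed DataFrame of variable scores.

    from scipy.stats import levene

    # df: assumed DataFrame of the five variable scores, in collection order.
    half = len(df) // 2
    early, late = df.iloc[:half], df.iloc[half:]    # split-half proxy groups
    for col in df.columns:
        f, p = levene(early[col], late[col], center="mean")  # mean-based, as in SPSS
        print(f"{col}: F={f:.3f}, Sig.={p:.3f} -> "
              f"{'no bias indicated' if p > 0.05 else 'inspect groups'}")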

5. Common Method Variance

Common method variance is defined as "variance that is attributable to the measurement method rather than to the construct of interest". Researchers largely agree that in self-reported survey methods, common method variance should be a major concern (Spector, 2006; Podsakoff et al., 2003; Lindell & Whitney, 2001). We use Harman's single-factor test to assess this bias.

The analysis yielded seven factors that together explained a cumulative 73.047 percent of the variance, with the first (largest) factor explaining 41.436 percent of the total variance. Since this is less than 50 percent, no single factor accounts for the majority of the covariance amongst the predictor and criterion variables (Kumar, 2012; Podsakoff et al., 2012). Therefore, common method bias is unlikely to inflate the relationships between the variables of the study and is not an issue.

Table 05: Total Variance Explained

             Initial Eigenvalues                    Extraction Sums of Squared Loadings
Component    Total    % of Variance  Cumulative %   Total    % of Variance  Cumulative %
1            9.116    41.436         41.436         9.116    41.436         41.436
2            1.990     9.045         50.481
3            1.543     7.012         57.494
4            1.101     5.006         62.500
5             .836     3.800         66.299
6             .790     3.590         69.889
7             .695     3.158         73.047
8             .654     2.972         76.019
9             .627     2.849         78.868
10            .547     2.489         81.356
11            .538     2.447         83.803
12            .466     2.120         85.923
13            .439     1.994         87.917
14            .427     1.941         89.859
15            .370     1.683         91.541
16            .359     1.632         93.174
17            .305     1.385         94.559
18            .295     1.339         95.898
19            .266     1.209         97.107
20            .239     1.089         98.195
21            .212      .965         99.160
22            .185      .840         100.000
Extraction Method: Principal Component Analysis.
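The core of Harman's check is the share of variance carried by the first unrotated component, which can be sketched directly from the item correlation matrix; `x`, an assumed cases-by-items array, is the only input.

    import numpy as np

    def first_factor_share(x: np.ndarray) -> float:
        """Variance share of the first unrotated principal component."""
        corr = np.corrcoef(x, rowvar=False)   # PCA on standardised items
        eig = np.linalg.eigvalsh(corr)        # eigenvalues, ascending
        return eig[-1] / eig.sum()            # > 0.50 would signal method bias

    # A value like 0.414 (41.4%), as reported above, stays under the 50% cut-off.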

6. Descriptive Statistics (Mean and Standard Deviation)

The mean is the average of a set of numbers: a calculated "central" value, obtained by adding up all the numbers and dividing by how many there are. The standard deviation is a quantity expressing by how much the members of a group differ from the mean value of the group; in other words, it is a measure of how spread out the numbers are.

Table 06: Descriptive Statistics

                     N      Mean     Std. Deviation
Usefulness           235    3.4929   .60148
EaseofUse            235    3.6993   .54160
Observability        235    3.7191   .71152
Attitude             235    3.8032   .70887
Intention            235    3.8426   .65958
Valid N (listwise)   235
The mean value of usefulness, ease of use, observability, attitude and intention is 3.4929, 3.6993,
3.7191, 3.8032 and 3.8426 respectively. Additionally, the standard deviation of usefulness, ease
of use, observability, attitude and intention is 0.60148, 0.54160, 0.71152, 0.70887 and 0.65958
respectively.
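Table 06 can be reproduced in one step with pandas; note that pandas' std uses the sample formula (ddof = 1), the same as SPSS. The DataFrame `df` is the assumed input.

    # df: assumed DataFrame of the five variable scores (235 cases).
    summary = df.agg(["count", "mean", "std"]).T
    summary.columns = ["N", "Mean", "Std. Deviation"]
    print(summary.round(4))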

7. Correlation
In statistical terms, a correlation is a mathematical measure of the strength of association between two quantitative variables. The statistic most often used to measure the strength of a linear association between two variables is Pearson's correlation coefficient. According to Hair et al. (2010), a correlation coefficient of 0.90 or above indicates the existence of substantial correlation between the exogenous latent constructs.
Table 07: Correlations
Usefulness EaseofUse Observability Attitude Intention
Usefulness Pearson Correlation 1
EaseofUse Pearson Correlation .535** 1
Observability Pearson Correlation .331** .448** 1
Attitude Pearson Correlation .594** .616** .483** 1
Intention Pearson Correlation .495** .554** .493** .769** 1
**. Correlation is significant at the 0.01 level (1-tailed).

From Table 07 we can see that the maximum correlation coefficient is 0.769, between attitude and intention, which is below 0.90. Therefore, we can conclude that there is no excessively strong correlation among the variables.
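The matrix in Table 07 is a single call in pandas; the one-tailed significance of an individual pair can be checked with scipy (the alternative= argument requires scipy 1.9 or later). `df` is the assumed DataFrame of variable scores.

    from scipy.stats import pearsonr

    # df: assumed DataFrame of the five variable scores.
    print(df.corr(method="pearson").round(3))   # the full matrix of Table 07

    # One-tailed test for a single pair, e.g. attitude vs. intention:
    r, p = pearsonr(df["Attitude"], df["Intention"], alternative="greater")
    print(f"r={r:.3f}, one-tailed p={p:.4f}")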

8. Regression Analysis
Regression analysis is used to study the relationship between two or more variables. The regression technique observes changes in the dependent variable in response to changes in the independent variables. The parameters of the regression equation are obtained using the least squares method.

Independent variable: in regression analysis, the independent variable represents the input that produces the changes in the dependent variable.

Dependent variable: the dependent variable represents the output, based on the values of the independent variables.

Multiple linear regression is an extension of simple linear regression in which there is a single dependent (response) variable (Y) and k independent (predictor) variables. We therefore use multiple regression analysis, because we have one dependent variable (intention) and four independent variables (usefulness, ease of use, observability and attitude).
Table 08: Coefficients(a)

                   Unstandardized     Standardized
                   Coefficients       Coefficients                     Collinearity Statistics
Model              B      Std. Error  Beta            t       Sig.     Tolerance   VIF
1  (Constant)      .628   .206                        3.045   .003
   Usefulness      .029   .058        .027            .504    .615     .601        1.663
   Ease of Use     .107   .067        .088            1.597   .112     .551        1.814
   Observability   .129   .044        .139            2.915   .004     .730        1.370
   Attitude        .588   .055        .632            10.773  .000     .484        2.065
a. Dependent Variable: Intention
The beta coefficients of the independent variables usefulness, ease of use, observability and attitude are 0.027, 0.088, 0.139 and 0.632 respectively. Among the four variables, only observability and attitude are statistically significant at the 5 percent level; usefulness and ease of use have no significant effect on intention, because their significance values are greater than 0.05.
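A sketch of the same model with statsmodels OLS; the standardised Beta column of Table 08 can be recovered by refitting on z-scored variables. The DataFrame `df` with the five variable columns is an assumed input.

    import statsmodels.api as sm

    # df: assumed DataFrame with the five variable columns named as below.
    predictors = ["Usefulness", "EaseofUse", "Observability", "Attitude"]
    X = sm.add_constant(df[predictors])
    fit = sm.OLS(df["Intention"], X).fit()
    print(fit.summary())                      # unstandardised B, t and Sig.

    # Standardised (Beta) coefficients via z-scored variables:
    z = (df - df.mean()) / df.std()
    beta = sm.OLS(z["Intention"], sm.add_constant(z[predictors])).fit()
    print(beta.params[predictors])            # compare with Beta in Table 08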
