Submitted by:
MAJID ALI KHOWAJA
ROLL NO: 2K6/SOC/42
BS (Hons) Part-IV-2009
EXECUTIVE SUMMARY
As the allotted topic is entitled “BUSINESS USING STATISTICAL
METHODS, A CASE STUDY OF TALUKA MIR PUR BATHORO,
THATTA,” this thesis report covers the basic concepts of statistics and
business, followed by a case study of the footwear preferences of the males
and females of Mir Pur Bathoro. The study is intended to help the proprietor
of a footwear house, Mr. Atta-ullah Khattri of Aashu Footwear House,
taluka Mir Pur Bathoro. The report highlights the statistical method of
collecting information about an area of interest in order to improve business
policy, which no doubt also benefits the customer. Initially a form is
distributed among 50 males and 50 females of the city, and the collected
data are summarized in tabular form. Then the chi-square test, a probability
test widely used in testing a statistical hypothesis (for example, the
likelihood that a given statistical distribution of results might be reached in
an experiment), is applied and a conclusion is drawn, which will help the
footwear house owner manage his business accordingly. Thus the purpose of
the thesis, business using statistical methods along with a case study, is
accomplished. The thesis ends with the conclusion and future scope of the
case study.
CONTENTS

CHAPTER#1 STATISTICS
Probability
Estimation
Hypothesis testing
Bayesian methods
Experimental design
Time series and forecasting
Nonparametric methods
Statistical quality control
Sample survey methods
Decision analysis

CHAPTER#2 INTRODUCTION
Types of Business
Manufacturing firms
Merchandisers
Service enterprises

BIBLIOGRAPHY
STATISTICS
Definition
Statistics is the science of collecting, analyzing, presenting, and interpreting
data. The branch of mathematics that deals with the relationships among
groups of measurements and with the relevance of similarities and
differences in those relationships is known as statistics.
Descriptive statistics
Descriptive statistics are tabular, graphical, and numerical summaries of
data. The purpose of descriptive statistics is to facilitate the presentation and
interpretation of data. Most of the statistical presentations appearing in
newspapers and magazines are descriptive in nature.
Descriptive statistics > Numerical measures
A variety of numerical measures are used to summarize data. The
proportion, or percentage, of data values in each category is the primary
numerical measure for qualitative data. The mean, median, mode,
percentiles, range, variance, and standard deviation are the most commonly
used numerical measures for quantitative data.
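As an aside (not part of the original text), the common numerical measures named above can be computed with Python's standard-library statistics module; the data set below is invented purely for illustration:

```python
# Common numerical measures for quantitative data, computed with the
# standard-library statistics module. The data set is illustrative only.
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation
value_range = max(data) - min(data)

print(mean, median, mode, variance, std_dev, value_range)
```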
Probability
Probability is a subject that deals with uncertainty. In everyday terminology,
probability can be thought of as a numerical measure of the likelihood that a
particular event will occur. Probability values are assigned on a scale from 0
to 1, with values near 0 indicating that an event is unlikely to occur and
those near 1 indicating that an event is likely to take place.
Probability > Events and their probabilities
Oftentimes probabilities need to be computed for related events. For
instance, advertisements are developed for the purpose of increasing sales of
a product. If seeing the advertisement increases the probability of a person
buying the product, the events “seeing the advertisement” and “buying the
product” are said to be dependent.
Probability > Random variables and probability distributions
A random variable is a numerical description of the outcome of a statistical
experiment. A random variable that may assume only a finite number or an
infinite sequence of values is said to be discrete; one that may assume any
value in some interval on the real number line is said to be continuous.
Probability > Special probability distributions
The most widely used continuous probability distribution in statistics is the
normal probability distribution. The graph of every normal distribution is
the familiar bell-shaped curve.
Estimation
It is often of interest to learn about the characteristics of a large group of
elements such as individuals, households, buildings, products, parts,
customers, and so on. All the elements of interest in a particular study form
the population. Because of time, cost, and other considerations, data often
cannot be collected from every element of the population.
Estimation > Estimation procedures for two populations
The estimation procedures can be extended to two populations for
comparative studies. For example, suppose a study is being conducted to
determine differences between the salaries paid to a population of men and a
population of women.
Hypothesis testing
Hypothesis testing is a form of statistical inference that uses data from a
sample to draw conclusions about a population parameter or a population
probability distribution. First, a tentative assumption is made about the
parameter or distribution; this assumption is called the null hypothesis and
is denoted by H0.
Bayesian methods
The methods of statistical inference previously described are often referred
to as classical methods. Bayesian methods (so called after the English
mathematician Thomas Bayes) provide alternatives that allow one to combine
prior information about a population parameter with information contained
in a sample to guide the statistical inference process.
Experimental design
Data for statistical studies are obtained by conducting either experiments or
surveys. Experimental design is the branch of statistics that deals with the
design and analysis of experiments. The methods of experimental design are
widely used in the fields of agriculture, medicine, biology, marketing
research, and industrial production.
Experimental design > Analysis of variance and
significance testing
A computational procedure frequently used to analyze the data from an
experimental study employs a statistical procedure known as the analysis of
variance. For a single factor experiment, this procedure uses a hypothesis
test concerning equality of treatment means to determine if the factor has a
statistically significant effect on the response variable.
Experimental design > Regression and correlation analysis
Regression analysis involves identifying the relationship between a
dependent variable and one or more independent variables. A model of the
relationship is hypothesized, and estimates of the parameter values are used
to develop an estimated regression equation. Various tests are then
employed to determine if the model is satisfactory.
If the error term in the regression model satisfies the four assumptions
noted earlier, then the model is considered valid.
Nonparametric methods
The statistical methods discussed above generally focus on the parameters of
populations or probability distributions and are referred to as parametric
methods. Nonparametric methods are statistical methods that require fewer
assumptions about a population or probability distribution and are applicable
in a wider range of situations.
Decision analysis
Decision analysis, also called statistical decision theory, involves procedures
for choosing optimal decisions in the face of uncertainty. In the simplest
situation, a decision maker must choose the best decision from a finite set of
alternatives when there are two or more possible future events, called states
of nature, that might occur.
INTRODUCTION
Business plays a vital role in the life and culture of countries with industrial
and postindustrial (service- and information-based) free market economies
such as the United States. In free market systems, prices and wages are
primarily determined by competition, not by governments. In the United
States, for example, many people buy and sell goods and services as their
primary occupations. In 2001 American companies sold in excess of $10
trillion worth of goods and services. Businesses provide just about anything
consumers want or need, including basic necessities such as food and
housing, luxuries such as whirlpool baths and wide-screen televisions, and
even personal services such as caring for children and finding
companionship.
TYPES OF BUSINESS
There are many types of businesses in a free market economy. The three
most common are
• Manufacturing firms
• Merchandisers
• Service enterprises
Manufacturing firms
Manufacturing firms produce a wide range of products. Large manufacturers
include producers of airplanes, cars, computers, and furniture. Many
manufacturing firms construct only parts rather than complete, finished
products. These suppliers are usually smaller manufacturing firms, which
supply parts and components to larger firms. The larger firms then assemble
final products for market to consumers. For example, suppliers provide
many of the components in personal computers, automobiles, and home
appliances to large firms that create the finished or end products. These
larger end product manufacturers are often also responsible for marketing
and distributing the products. The advantage that large businesses have in
being able to efficiently and inexpensively control many parts of a production
process is known as economies of scale. But small manufacturing firms may
work best for producing certain types of finished products. Smaller end-
product firms are common in the food industry and among artisan trades
such as custom cabinetry.
Merchandisers
Merchandisers are businesses that help move goods through a channel of
distribution, that is, the route goods take in reaching the consumer.
Merchandisers may be involved in wholesaling or retailing, or sometimes
both.
A wholesaler is a merchandiser who purchases goods and then sells them to
buyers, typically retailers, for the purpose of resale. A retailer is a
merchandiser who sells goods to consumers. A wholesaler often purchases
products in large quantities and then sells smaller quantities of each product
to retailers who are unable to either buy or stock large amounts of the
product. Wholesalers operate somewhat like large, end product
manufacturing firms, benefiting from economies of scale. For example, a
wholesaler might purchase 5,000 pairs of work gloves and then sell 100
pairs to 50 different retailers. Some large American discount chains, such as
Kmart Corporation and Wal-Mart Stores, Inc., Serve as their own
wholesalers, these companies go directly to factories and other
manufacturing outlets, buy in large amounts, and then warehouse and ship
the goods to their stores.
The division between retailing and wholesaling is now being blurred by new
technologies that allow retailing to become an economy of scale. Telephone
and computer communications allow retailers to serve far greater numbers of
customers in a given span of time than is possible in face-to-face interactions
between a consumer and a retail salesperson. Computer networks such as the
Internet, because they do not require any physical communication between
salespeople and customers, allow a nearly unlimited capacity for sales
interactions known as 24/7, that is, the Internet site can be open for
transactions 24 hours a day, seven days a week, and for as many transactions
as the network can handle. For example, a typical transaction to purchase a
pair of shoes at a shoe store may take a half-hour from browsing, to fitting,
to the transaction with a cashier. But a customer can purchase a pair of shoes
through a computer interface with a retailer in a matter of seconds.
Computer technology also provides retailers with another economy of scale
through the ability to sell goods without opening any physical stores, often
referred to as electronic commerce or e-commerce. Retailers that provide
goods entirely through Internet transactions do not incur the expense of
building so-called brick-and-mortar stores or the expense of maintaining
them.
Service enterprises
Service enterprises include many kinds of businesses. Examples include dry
cleaners, shoe repair stores, barbershops, restaurants, ski resorts, hospitals,
and hotels. In many cases service enterprises are moderately small because
they do not have mechanized services and limit service to only as many
individuals as they can accommodate at one time. For example, a waiter may
be able to provide good service to four tables at once, but with five or more
tables, customer service will suffer.
In recent years the number of service enterprises in wealthier free market
economies has grown rapidly, and spending on services now accounts for a
significant percentage of all spending. By the late 1990s, private services
accounted for more than 21 percent of U.S. spending. Wealthier nations
have developed postindustrial economies, where entertainment and
recreation businesses have become more important than most raw material
extraction such as the mining of mineral ores and some manufacturing
industries in terms of creating jobs and stimulating economic growth. Many
of these industries have moved to developing nations, especially with the
rise of large multinational corporations. As postindustrial economies have
accumulated wealth, they have come to support systems of leisure, in which
people are willing to pay others to do things for them. In the United States,
vast numbers of people work rigid schedules for long hours in indoor
offices, stores, and factories. Many employers pay high enough wages so
that employees can afford to balance their work schedules with purchased
recreation. People in the United States, for example, support thriving travel,
theme park, resort, and recreational sport businesses.
Overview
Chi-square is a nonparametric test of statistical significance for bivariate
tabular analysis (also known as cross-breaks). Any appropriately performed
test of statistical significance lets you know the degree of confidence you
can have in accepting or rejecting a hypothesis. Typically, the hypothesis
tested with chi-square is whether or not two different samples (of people,
texts, etc.) are different enough in some characteristic or aspect of their
behavior that we can generalize from our samples that the populations from
which our samples are drawn are also different in that behavior or
characteristic. A nonparametric test, like chi-square, is a rough estimate of
confidence; it accepts weaker, less accurate data as input than parametric
tests (like t-tests and analysis of variance) and therefore has less status in the
pantheon of statistical tests. Nonetheless, its limitations are also its strengths;
because chi-square is more forgiving in the data it will accept, it can be used
in a wide variety of research contexts.
Chi-square is used most frequently to test the statistical significance of
results reported in bivariate tables, and interpreting bivariate tables is
integral to interpreting the results of a chi-square test, so we'll take a look at
bivariate tabular (cross-break) analysis.
Suppose we survey 50 males and 50 females, chosen as randomly as
possible, and ask them, “On average, do you prefer to wear sandals,
sneakers, leather shoes, boots, or something else?” using the model form:
Name:
Sex:
Age:
Zodiac:
Occupation:
Footwear Choice:
Sandals
Sneakers
Leather shoes
Boots
Others
and measure the dependent variable to test our hypothesis that there is some
relationship between them. Bivariate tabular analysis is good for asking
questions of the following kinds.

         Sandals   Sneakers   Leather shoes   Boots   Others
Male
Female
hypothetically causal values on the independent variable to their effects, or
values on the dependent variable. How we arrange the values on each axis
should be guided iconically by our research question/hypothesis. For
example, if values on an independent variable were arranged from lowest to
highest value on the variable and values on the dependent variable were
arranged left to right from lowest to highest, a positive relationship would
show up as a rising left-to-right line. (But remember, association does not
equal causation: an observed relationship between two variables is not
necessarily causal.)
Each intersection/cell of a value on the independent variable and a value on
the dependent variable reports the result of how many times that
combination of values was chosen/observed in the sample being analyzed.
(So we can see that cross-tabs are structurally most suitable for analyzing
relationships between nominal and ordinal variables. Interval and ratio
variables will have to first be grouped before they can “fit” into a bivariate
table.) Each cell reports, essentially, how many subjects/observations
produced that combination of independent and dependent variable values.
So, for example, the top left cell of the table above answers the question:
“How many males in Mir Pur Bathoro prefer sandals?”
         Sandals   Sneakers   Leather shoes   Boots   Others
Male        6         17           13           9        5
Female     13          5            7          16        9
Reporting and interpreting cross-tabs is most easily done by converting raw
frequencies (in each cell) into percentages of each cell within the values/
categories of the independent variable. For example, in the footwear
preferences table above, total each row, then divide each cell by its row
total, and multiply that fraction by 100.
         Sandals   Sneakers   Leather shoes   Boots   Others    N
Male       12         34           26          18       10     50
Female     26         10           14          32       18     50
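The row-percentage conversion just described can be sketched in Python; the counts are those of the observed-frequency table above, and the code itself is an illustration, not part of the original study:

```python
# Converting each row of raw observed frequencies into percentages of
# its row total, as in the percentage table above.
categories = ["Sandals", "Sneakers", "Leather shoes", "Boots", "Others"]
observed = {
    "Male":   [6, 17, 13, 9, 5],
    "Female": [13, 5, 7, 16, 9],
}

percentages = {
    sex: [100 * count / sum(row) for count in row]
    for sex, row in observed.items()
}

for sex in observed:
    print(sex, dict(zip(categories, percentages[sex])))
```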
[Bar charts: male and female footwear preference percentages by category
(Sandals, Sneakers, Leather shoes, Boots, Others).]
representativeness) is drastic. So we should provide that total N at the end
of each row/independent variable category (for reliability and to enable the
reader to assess our interpretation of the table's meaning).
With this limitation in mind, we can compare the patterns of distribution of
subjects/observations along the dependent variable between the values of the
independent variable: e.g., compare the footwear preferences of the males
and females of Mir Pur Bathoro. (For some data, plotting the results on a
line graph can also help you interpret the results: i.e., whether there is a
positive (/), negative (\), or curvilinear (∨, ∧) relationship between the
variables.)
Table 3 shows that within our sample, roughly twice as many females
preferred sandals and boots as males, and about three times as many men
preferred sneakers as women and twice as many men preferred leather
shoes. We might also infer from the “Others” category that females within
our sample had a broader range of footwear preferences than did males.
sample, unless we submit our results to a test of statistical significance. A
test of statistical significance tells us how confidently we can generalize to a
larger (unmeasured) population from a (measured) sample of that population.
2. Data must be reported in raw frequencies (not percentages);
3. Measured variables must be independent;
4. Values/categories on independent and dependent variables must be
mutually exclusive and exhaustive;
5. Observed frequencies cannot be too small.
an “un-codable” category.) In any case, we must include the results
for the whole sample.
4. Furthermore, we should use chi-square only when observations are
independent: i.e., no category or response is dependent upon or
influenced by another. (In linguistics, this rule is often fudged a bit.
For example, if we have one dependent variable/column for
linguistic feature X and another column for number of words
spoken or written (where the rows correspond to individual
speakers/texts or groups of speakers/texts which are being
compared), there is clearly some relation between the frequency of
feature X in a text and the number of words in a text, but it is a
distant, not immediate, dependency.)
5. Chi-square is an approximate test of the probability of getting the
frequencies we've actually observed if the null hypothesis were
true. It's based on the expectation that within any category, sample
frequencies are normally distributed about the expected population
value. Since (logically) frequencies cannot be negative, the
distribution cannot be normal when expected population values are
close to zero, since the sample frequencies cannot be much below
the expected frequency while they can be much above it (an
asymmetric/non-normal distribution). So, when expected
frequencies are large, there is no problem with the assumption of a
normal distribution, but the smaller the expected frequencies, the
less valid are the results of the chi-square test. We'll discuss later
how expected frequencies are derived from observed frequencies.
Therefore, if we have cells in our bivariate table which show very
low raw observed frequencies (5 or below), our expected
frequencies may also be too low for chi-square to be appropriately
used. In addition, because some of the mathematical formulas used
in chi-square use division, no cell in your table can have an
observed raw frequency of 0.
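An illustrative way to screen data against these last two requirements (no zero cells, and wariness about observed counts of 5 or below) is a few lines of Python; the counts are those of the footwear study, and the sketch is not part of the original thesis:

```python
# Screening the observed footwear counts against the chi-square
# pre-conditions discussed above: no cell may be 0, and cells of 5 or
# below warn that expected frequencies may be too small.
observed = [
    [6, 17, 13, 9, 5],   # Male
    [13, 5, 7, 16, 9],   # Female
]

cells = [cell for row in observed for cell in row]
no_zero_cells = all(cell > 0 for cell in cells)
low_cells = [cell for cell in cells if cell <= 5]  # candidates for concern

print(no_zero_cells, low_cells)
```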
Collapsing values
A brief word about collapsing values/categories on a variable is necessary.
First, although categories on a variable, especially a dependent variable,
may be collapsed, they cannot be excluded from a chi-square analysis. That
is, you cannot arbitrarily exclude some subset of your data from your
analysis. Second, a decision to collapse categories should be carefully
motivated, with consideration for preserving the integrity of the data as it
was originally collected. (For example, how could we collapse the footwear
preference categories in our example and still preserve the integrity of the
original question/data? We can't, since there's no way to know if combining,
e.g., boots and leather shoes versus sandals and sneakers is true to our
subjects' typology of footwear.) As a rule, we should perform a chi-square
on the data in its un-collapsed form; if the chi-square value achieved is
significant, then we may collapse categories to test subsequent refinements
of our original hypothesis.
scientific standards. For our purposes, we'll set a probability-of-error
threshold of 1 in 20, or p < .05, for our Footwear Study.)
statistically significant relationship exists between our variables.
Chi-square derives a representation of the null hypothesis, the ‘all other
things being equal’ scenario, in the following way. The expected frequency
in each cell is the product of that cell's row total multiplied by that cell's
column total, divided by the sum total of all observations. So, to derive the
expected frequency of the “Males who prefer Sandals” cell, we multiply the
top row total (50) by the first column total (19) and divide that product by
the sum total (100): ((50 × 19) / 100) = 9.5. The logic of this is that we are
deriving the expected frequency of each cell from the union of the total
frequencies of the relevant values on each variable (in this case, Male and
Sandals), as a proportion of all observed frequencies (across all values of
each variable). This calculation is performed to derive the expected
frequency of each cell, as shown in Table 5 below.
                    Sandals   Sneakers   Leather shoes   Boots   Others   Total
Male (observed)        6         17           13            9       5       50
Male (expected)        9.5       11           10           12.5     7
Female (observed)     13          5            7           16       9       50
Female (expected)      9.5       11           10           12.5     7
Total                 19         22           20           25      14      100

Expected value = (cell's column total) × (cell's row total) / (sum total of all
observations)
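The expected-value formula above can be applied mechanically; this short Python sketch (illustrative, not from the thesis) reproduces the expected rows of Table 5:

```python
# Expected frequency of each cell: (row total * column total) / grand total.
observed = {
    "Male":   [6, 17, 13, 9, 5],
    "Female": [13, 5, 7, 16, 9],
}

col_totals = [sum(col) for col in zip(*observed.values())]  # [19, 22, 20, 25, 14]
grand_total = sum(col_totals)                               # 100

expected = {
    sex: [sum(row) * c / grand_total for c in col_totals]
    for sex, row in observed.items()
}

print(expected["Male"])  # same as the female row, since both N = 50
```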
[Bar charts: expected male and female frequencies by footwear choice
(1 = Sandals, 2 = Sneakers, 3 = Leather shoes, 4 = Boots, 5 = Others).]
As we originally obtained a balanced male/female sample, our male and
female expected scores are the same. This usually will not be the case. We
now have a comparison of the observed results versus the results we would
expect if the null hypothesis were true. We can informally analyze this table,
comparing observed and expected frequencies in each cell (males prefer
sandals less than expected), across values on the independent variable (males
prefer sneakers more than expected, females less than expected), or across
values on the dependent variable (females prefer sandals and boots more
than expected, but sneakers and shoes less than expected). But so far, the
extra computation doesn't really add much more information than
interpretation of the results in percentage form. We need some way to
measure how different our observed results are from the null hypothesis. Or,
to put it another way, we need some way to determine whether we can reject
the null hypothesis, and if we can, with what degree of confidence that we're
not making a mistake in generalizing from our sample results to the larger
population.
Logically, we need to measure the size of the difference between the pair of
observed and expected frequencies in each cell. More specifically, we
calculate the difference between the observed and expected frequency in
each cell, square that difference, and then divide that squared difference by
the expected frequency. The formula can be expressed as ((O − E)² / E).
absolute value of differences. If we didn't work with absolute values, the
positive and negative differences across the entire table would always add up
to 0. (You really understand the logic of chi-square if you can figure out why
this is true.) Dividing the squared difference by the expected frequency
essentially removes the expected frequency from the equation, so that the
remaining measures of observed/expected difference are comparable across
all cells.
So, for example, the difference between observed and expected frequencies
for the Male/Sandals preference is calculated as follows:
Table 6. Male and Female of Mir Pur Bathoro, Footwear Preferences:
Observed and Expected Frequencies & Chi-Square.
                       (O − E)² / E              Chi-square value
Male/Sandals:          ((6 − 9.5)² / 9.5)        = 1.289
Male/Sneakers:         ((17 − 11)² / 11)         = 3.273
Male/Leather shoes:    ((13 − 10)² / 10)         = 0.900
Male/Boots:            ((9 − 12.5)² / 12.5)      = 0.980
Male/Others:           ((5 − 7)² / 7)            = 0.571
Female/Sandals:        ((13 − 9.5)² / 9.5)       = 1.289
Female/Sneakers:       ((5 − 11)² / 11)          = 3.273
Female/Leather shoes:  ((7 − 10)² / 10)          = 0.900
Female/Boots:          ((16 − 12.5)² / 12.5)     = 0.980
Female/Others:         ((9 − 7)² / 7)            = 0.571
Total chi-square value (sum of cell values)      = 14.026
The total chi-square value for Table 1 is 14.026.
need to know is the probability of getting a chi-square value of a minimum
given size even if our variables are not related at all in the larger population
from which our sample was drawn. That is, we need to know how much
larger than 0 (the absolute chi-square value of the null hypothesis) our
table’s chi-square value must be before we can confidently reject the null
hypothesis. The probability we seek depends in part on the degrees of
freedom of the table from which our chi-square value is derived.
Degrees of freedom
Mechanically, a table's degrees of freedom (df) can be expressed by the
following formula:

df = (r − 1)(c − 1)

That is, a table's degrees of freedom equals the number of rows in the table
minus one, multiplied by the number of columns in the table minus one. (For
1 × 2 tables: df = k − 1, where k = number of values/categories on the
variable.) Degrees of freedom are an issue because of the way in which
expected values in each cell are computed from the row and column totals of
each cell. All but one of the expected values in a given row or column are
free to vary (within the total observed, and therefore expected, frequency of
that row or column); once the free-to-vary expected cells are specified, the
last one is fixed by virtue of the fact that the expected frequencies must add
up to the observed row and column totals (from which they are derived).
df = (rows − 1) × (columns − 1) = (2 − 1) × (5 − 1) = 1 × 4 = 4
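The degrees-of-freedom computation, together with a comparison of our obtained chi-square value against the conventional critical value for df = 4 at p = .05 (9.488, taken from a standard chi-square table, not from the thesis), can be sketched as:

```python
# df = (rows - 1) * (columns - 1) for our 2 x 5 footwear table, then a
# comparison of the obtained chi-square value against the df = 4,
# p = .05 critical value (9.488, from a standard chi-square table).
rows, columns = 2, 5
df = (rows - 1) * (columns - 1)

chi_square = 14.026
critical_value = 9.488
significant = chi_square > critical_value

print(df, significant)
```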
Measures of Association
While the theoretical or practical importance of a statistically significant
result cannot be quantified, the relative magnitude of a statistically
significant relationship can be measured. Chi-square allows us to make
decisions about whether there is a relationship between two or more
variables; if the null hypothesis is rejected, we conclude that there is a
statistically significant relationship between the variables. But we frequently
want a measure of the strength of that relationship, an index of degree of
correlation, a measure of the degree of association between the variables
represented in our table (and data). Luckily, several related measures of
association can be derived from a table's chi-square value.
For tables larger than 2 × 2 (like our Table 1), a measure called ‘Cramer's
phi’ is derived by the following formula (where N = the total number of
observations, and k = the smaller of the number of rows or columns):

Cramer's phi = the square root of (chi-square divided by (N × (k − 1)))

So, for our Table 1 (2 × 5), we would compute Cramer's phi as follows:

N(k − 1) = 100 × (2 − 1) = 100
chi-square / 100 = 14.026 / 100 = 0.14
square root of 0.14 = 0.37

The result is interpreted as a Pearson r (that is, as a correlation coefficient).
For 2 × 2 tables, a measure called ‘phi’ is derived by dividing the table's
chi-square value by N (the total number of observations) and then taking the
square root of the quotient. Phi is also interpreted as a Pearson r.
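The Cramer's phi computation above can be verified in a few lines of Python (an illustrative sketch, not part of the original thesis):

```python
# Cramer's phi = sqrt(chi-square / (N * (k - 1))), with k the smaller of
# the number of rows or columns (here k = 2 for our 2 x 5 table).
import math

chi_square = 14.026
N = 100
k = 2

cramers_phi = math.sqrt(chi_square / (N * (k - 1)))

print(round(cramers_phi, 2))
```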
A complete account of how to interpret correlation coefficients is
unnecessary for present purposes. It will suffice to say that r² is a measure
called shared variance. Shared variance is the portion of the total behavior
(or distribution) of the variables measured in the sample data which is
accounted for by the relationship we've already detected with our chi-square.
For Table 1, r² = 0.137, so approximately 14% of the total footwear
preference story is explained/predicted by biological sex.
Computing a measure of association like phi or Cramer's phi is rarely done
in quantitative linguistic analyses, but it is an important benchmark of just
‘how much’ of the phenomenon under investigation has been explained. For
example, Table 1's Cramer's phi of 0.37 (r² = 0.137) means that there are
one or more variables still undetected which, cumulatively, account for and
predict 86% of footwear preferences. This measure, of course, doesn't begin
to address the nature of the relation(s) between these variables, which is a
crucial part of any adequate explanation or theory.
Conclusion
Business can be well managed and enhanced using sociological statistical
methods. The case study of male and female footwear preference helps the
owner of Aashu Footwear House, Mr. Atta-ullah Khattri of taluka Mir Pur
Bathoro, explain up to 14% of his customers' preferences. By stocking
footwear according to the observed male and female preferences he can
exploit only that 14%; the remaining 86% of the business management
picture is hidden in other variables, which can be found, and hence the study
can be further extended, as described in the future scope.
Future Scope
As the conclusion tells us, only 14% of the total footwear preference story is
explained/predicted by biological sex, and hence the business can be
managed using this statistical approach only up to 14%. The remaining 86%
of the footwear-stocking decision at Aashu Footwear House is still unknown;
i.e., the thesis can be extended further by exploring the undetected variables
(unused variables in the model can also help) which account for the rest of
the 86% of footwear preference.
BIBLIOGRAPHY