
A Practical Guide to the Preparation, Drafting and Submission of Dissertations in Finance

Gerald Pollio, Ph.D.

Preliminary – Not to be quoted without the author's permission

(2009)
1. Introduction

This Guide is intended to acquaint you with the purpose, sources and structure of your finance dissertation.

Most graduate business schools require students to produce a dissertation in partial fulfilment of the requirements for the MBA degree. Many students resent this requirement and would, if they could, take additional courses or, better still, produce a Business Project instead. This is the practice of some business schools, though the vast majority still require production of what might best be described as an 'academic' dissertation.

We place the word academic in inverted commas to emphasise that dissertations are not, strictly speaking, formal academic studies. Business education is, after all, an applied subject, and students accordingly expect the topic of their dissertation to emphasise practical relevance. The two are of course not mutually exclusive: most business school dissertations combine academic rigour with practical relevance, in that students are expected to produce output that meets or exceeds established academic norms while addressing a topic that will advance understanding of a narrowly defined business issue.

Student hostility towards the dissertation requirement is understandable but, as we hope to show, misguided. Throughout their course of study students face assessments of various sorts, some oral, some written, some as part of a group exercise, others as individual assignments. What these assessments have in common is that they are all set by the student's lecturers, with the choice, if any, confined to the limited range of topics on offer. A dissertation is the only assessment whose subject is determined more or less entirely by the student.

Students, of course, have the benefit of their Supervisor’s advice, designed to improve
their proposal and ensure that it can be completed within the time required. Only on very
rare occasions will a Supervisor reject the student’s topic and then only because it is too
broad and thus unlikely to be completed within the time allotted. Supervisors seldom
reject out of hand a dissertation topic, since we all recognise that a topic of the student’s
own choice is the best motivator for getting on with the work.

A logical place to begin our discussion is with the concept of research. Most students find the task daunting, especially international students arriving from countries where the prevailing approach to education differs, in some cases quite radically, from that of the United Kingdom. Yet the process is far less forbidding than you might imagine, not least because, perhaps without even being aware of it, most students have already produced some fairly sophisticated research results of their own.

Consider the following: the university at which you are studying was not chosen
randomly; you will have reviewed the websites of a number of different business schools
that were of interest to you. You will have narrowed the focus by concentrating on those
where you meet all of the requirements, whether in respect of prior academic
accomplishment or linguistic proficiency. You will have determined whether tuition
costs are reasonable, and whether you can afford major ancillary expenses such as
housing, food and transportation. You will also have investigated whether you are able to
work, and if so, how many hours are both permissible and consistent with successfully
completing your course of study.

The answers to all of these questions will have come from a careful and detailed assessment of material from whatever source or sources you had access to, and quite possibly from discussions with one or more students who attended the university you are considering. These people will also be a source of valuable information concerning the additional costs you will incur as part of acquiring your degree, what type of work is available locally and what rates of pay are likely to be. Armed with this information, you will select from among the many post-graduate institutions you investigated the one that best meets all of your requirements.

As will be seen, the very same process applies when drafting your dissertation.

2. Research

What is research? The definition favoured by the UK’s Quality Assurance Agency is
also one of the most comprehensive:

Research … is to be understood as original investigation undertaken in order to gain knowledge and understanding. It includes work of direct relevance to the needs of commerce and industry, as well as to the public and voluntary sectors; scholarship; the invention and generation of ideas, images, performances and artefacts including design, where these lead to new or substantially improved insights; and the use of existing knowledge in experimental development to produce new or substantially improved materials, devices, products and processes, including design and construction. It excludes routine testing and analysis of materials, components and processes, for example, for the maintenance of national standards, as distinct from the development of new analytical techniques. It also excludes the development of teaching materials that do not embody original research.

The quotation actually refers to the production of doctoral dissertations though, with all necessary changes, it applies equally to MBA dissertations. Knowledge is not produced in isolation; it builds upon the scholarship and research efforts of others, hence the centrality of the Literature Survey that is mandatory for all dissertations. The Survey provides the platform upon which the dissertation rests, pointing to the best way to approach a given topic; it also summarises the results of recent scholarship against which the findings of your dissertation can and will be measured.

The QAA definition highlights the critical importance of originality, by which they mean
either a new or unique contribution to existing knowledge or the generation of results that
extend, revise or supplant existing scholarship. A somewhat weaker standard is applied
to the production of a Master’s dissertation. You will not be expected to produce original
research in the sense just described. Originality here means you will select and address a
topic of interest or importance applying principles and techniques learned as part of your
post-graduate education. In effect, the dissertation demonstrates the depth of your
understanding of the analytical materials acquired during your course of study and your
ability to apply the associated tools constructively and effectively to a topic of direct
interest or relevance to you.

It also means something else of equal importance, which applies to whatever material you are required to prepare and submit for evaluation, namely, that all such work must have been conceived and produced by you. We are, of course, referring here to plagiarism. This does not mean that you cannot refer to work produced by others, only that whenever you do, you must acknowledge its source. This applies as much to direct quotations, which you will be required to indicate with the use of inverted commas, as to paraphrases and summaries of other people's ideas. The failure to do so is not merely a breach of research ethics; it amounts to intellectual theft that can result in immediate failure of the work presented and possible termination of your studies here.

There are various aspects of research that you must understand as a prelude to writing your dissertation; many of the associated terms are used casually, or carelessly, even by those who should know better. It seems desirable, therefore, to provide rigorous definitions of these terms to prevent you from falling into the same trap. We begin with the term concept. Concepts are abstractions – products of the mind – that identify some aspects of reality as forming a class made up of things that are similar – or at least alike enough – for purposes of theorising about them.

A concrete example should help here. Mammals are warm blooded animals that incubate
their offspring internally and give birth to living young. There are of course many
different types of mammals, from mice to hippopotami, some of which are carnivores,
others herbivores, while yet others eat both meat and vegetation. These variations within
the broader category are, however, irrelevant to many biological theories; for many
theoretical purposes, all mammals are alike. Although concepts are abstract, and not
always directly measurable, all scientific concepts have empirical (observable)
counterparts. We may not be able to see, touch or smell the concept of mammal, but we
can see, touch, and smell animals that are classified as mammals.

An indicator is any observable measure of a concept. While scientific research is undertaken to test a theory or thesis, actual research is necessarily based on the examination of indicators. There is, for example, a strong presumption in the management literature that training improves productivity, which in turn benefits the firm's bottom line. What we observe are the effects (indicators) of training on profitability, and if profitability does improve we may reasonably conclude that training contributed to improved financial performance.
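To make the idea of an indicator concrete, here is a minimal sketch in Python using entirely hypothetical figures: an observable profitability measure is regressed on an observable training measure. Neither variable is the underlying concept itself; each is only an indicator of it, and the variable names and numbers are invented for illustration.

from scipy import stats

# Hypothetical firm-level observations (illustrative only).
training_spend = [120, 340, 90, 410, 250, 180, 300, 60]    # spend per employee
profit_margin = [4.1, 6.8, 3.9, 7.2, 5.5, 4.6, 6.1, 3.2]   # per cent

# Regress the profitability indicator on the training indicator.
result = stats.linregress(training_spend, profit_margin)
print(f"slope = {result.slope:.4f}, p-value = {result.pvalue:.4f}")
# A positive slope with a small p-value is consistent with, though it does not
# prove, the claim that training improves financial performance.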

Theories and theses produce specific predictions about relationships that do (or do not)
exist among a given set of indicators. A prediction about the relationship that exists
among indicators is known as a hypothesis; it is the hypothesis that serves as the specific
research focus. A theory is meant to have general relevance, applicable across time and
space; a thesis, by contrast, lacks generality and typically applies to a specific situation
occurring in a specific time period.

There is a long-running debate about the function of theory and how best to test its accuracy. For some, theory serves three broad purposes equally: explanation, prediction and control. To fulfil these requirements, the assumptions upon which the theory depends should be realistic, making it possible to determine whether the theory has broad, limited or no explanatory power at all. If it does, we should be able to use the theory to forecast future outcomes: for example, a reduction in taxes or an increase in public expenditure is expected to stimulate economic activity by some multiple of the initial fiscal injection. If the prediction is valid, then the information can be used to influence or control subsequent economic activity. When the economy is heading towards recession, governments should pursue expansionary fiscal policies to limit the decline in total output.
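To make 'some multiple of the initial fiscal injection' concrete, the following minimal sketch applies the textbook closed-economy multiplier, 1/(1 – MPC); the marginal propensity to consume and the size of the injection are assumptions chosen purely for illustration.

mpc = 0.8                   # assumed marginal propensity to consume
injection = 10.0            # assumed fiscal injection, in billions
multiplier = 1 / (1 - mpc)  # simple closed-economy Keynesian multiplier
print(f"multiplier = {multiplier:.1f}, total impact = {multiplier * injection:.0f}bn")
# Output: multiplier = 5.0, total impact = 50bn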

For others, the true test of a theory is prediction alone; whether the assumptions used to generate the theory are realistic or not is immaterial. Consider the following example, often used to support this assertion. A billiard player is about to knock the eight ball into the corner pocket. Will she succeed? Other than guessing the outcome, we could formulate a simple theory about the likelihood of success. One such theory would be to assume that the player knows trigonometry and uses it to determine the correct angle at which the white cue ball should strike the black eight ball to ensure it falls into the designated pocket. As a description of how (most) people approach the game of billiards it is probably hopelessly inaccurate; in other words, the assumptions underpinning the theory, taken at face value, are totally unrealistic. But as a theory of whether the eight ball will drop into the corner pocket, there is none better.

Critics of this position point out that realistic assumptions are the foundation of sound research. If one of the conclusions of the model is shown to be false, it logically follows that at least one of the assumptions must be false; if the assumptions are not realistic (plausible), it is hard to learn from the failure of the conclusion. Note, significantly, that 'realistic' or 'plausible' are not synonymous with 'true.' It is thus unclear exactly what additional information is imparted by the use of 'realistic' assumptions. Suppose we substitute 'more' for 'less' realistic assumptions; in what way will we gain in understanding? If our objective is to determine how best to place a billiard ball in a given pocket, with which plausible assumption(s) should we replace the model's unrealistic assumptions, and will that significantly improve the model's predictive accuracy?

A second, more focused, example may help clarify the point while pulling together the broader issue of how to go about testing hypotheses which, after all, is the primary objective of any dissertation. In Principles of Finance courses students are taught the Efficient Markets Hypothesis (EMH). The theory presupposes that market participants make use of all publicly available information to value assets, real estate or shares of stock, for example. Indeed, the EMH goes so far as to assert that it does not matter whether all traders act rationally; so long as some investors act rationally, the market as a whole will exhibit rationality.

One of the main predictions of the EMH is that share prices will incorporate all known
information about the firm so that only the arrival of new information will cause prices to
change. Since new information (‘news’) is unpredictable, so too will be the behaviour of
share prices, thus providing an explanation for the apparent random character of their
movements. If so, the best prediction of tomorrow’s share price is today’s price. An
equally important implication is that active investment management is pointless, since it
will not be able to produce extraordinary returns; investors would do better to invest in
indexed funds that track the movements of a broad market index.
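One common way this implication is examined empirically is to check whether daily returns are serially correlated; under the random-walk view the first-order autocorrelation should be close to zero. The sketch below is a minimal illustration using a simulated price series; in an actual dissertation you would use real closing prices for your chosen shares or index.

import numpy as np

rng = np.random.default_rng(0)
# Simulated prices that follow a random walk (for illustration only).
prices = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1000)))

returns = np.diff(np.log(prices))
lag1 = np.corrcoef(returns[:-1], returns[1:])[0, 1]
print(f"first-order autocorrelation of daily returns: {lag1:.3f}")
# A value close to zero is consistent with the claim that the best forecast of
# tomorrow's price is simply today's price.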

How can we be sure that departures from rationality will be corrected by a small number
of rational, knowledgeable traders? The mechanism that ensures that shares are correctly
valued is known as arbitrage; rational investors who spot pricing errors will buy (sell) the
under- (over-) priced asset until the correct valuation is achieved.

But how do we know this well-informed individual or group of individuals exists, and if they do, how did they come by their extraordinary knowledge of asset valuation? The simple fact is that the existence of well-informed arbitrageurs, and the way they acquire their omniscience, is nothing more than a theoretical construct, an assumption that is not directly verifiable but which helps to explain why financial markets exhibit such a high degree of efficiency. We could, of course, dispense with this assumption, but then we would have to find another explanation for why markets are efficient.

A final important consideration concerns causation. In principle, the concept is straightforward: two variables are causally related if and only if it can be shown that the behaviour of one variable (the dependent variable) is directly influenced by a second variable, known as the independent variable. It is much more difficult to establish causation empirically; simply placing one variable on the right-hand side of an equation and calling it the independent variable is too facile.

Consider one of the examples given above: that budget deficits can be used to stimulate economic activity during recessionary periods. Since the size of the budget deficit affects, as well as being affected by, the state of the economy, we can never be sure which effect predominates, that is, whether fiscal policy is having the predicted impact. Numerous causality tests exist, most of which are too advanced to be used for dissertations here. The point to remember is that it is wrong to assert the existence of causal relationships without any empirical or external sanction for the claim.
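For readers curious what such a test looks like in practice, the sketch below applies the Granger causality test from the statsmodels library to two simulated series. The test asks only whether past values of one series help predict the other, which falls well short of establishing true causation, but it does provide some empirical sanction for (or against) a claimed causal link. The data are simulated for illustration and are not real fiscal or output figures.

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)
deficit = rng.normal(size=200)
output = rng.normal(size=200)
output[1:] += 0.5 * deficit[:-1]   # output partly depends on last period's deficit

# Column order matters: the test asks whether the second column helps predict
# the first, i.e. whether 'deficit' Granger-causes 'output'.
data = np.column_stack([output, deficit])
results = grangercausalitytests(data, maxlag=2)
p_value = results[1][0]["ssr_ftest"][1]   # p-value of the lag-1 F-test
print(f"lag-1 F-test p-value: {p_value:.4f}")
# A small p-value suggests past deficits help predict output; it does not, by
# itself, settle which way the causal influence runs.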

3. Getting Ready: The Proposal

To get you started, the following exhibit highlights the main points you need to know to develop your Proposal. The full UWIC report from which this exhibit is extracted is available on your student portal; it would be to your advantage to read that document before embarking on your Proposal, as it answers the most frequently asked student questions.

Exhibit 1
Summary of Proposal Guidelines Red Book UWIC Revised Edition 2007
with explanatory notes

Word count: The proposal is ‘2500 words of which the literature review should represent around
1500 words.’

I Provisional Title

‘This should include an initial sentence that clearly encompasses the purpose and aims of the
dissertation’. You should be able to establish in which organisation/s the primary data is to be
collected and what the research aims to do.

II Brief review of literature

‘The aim is to have a focused and critical conclusion.’


‘The literature review is an essential guide to other stages’. Plan your review into themes that
relate directly to the research question and can direct the following research aims and objectives.

III Aims and objectives

‘These should flow directly from the focus of the literature review.’ ‘Clearly specify your
research objectives.’

IV Statement of the design and methodology

‘Overall design, validity and reliability. The justification will need to also refer to your title, aims and objectives, together with issues of access [primary data] and time'.

V Sources and acquisition of data

‘It must be shown how these relate to the research title and design. They must provide evidence
that access has been negotiated.’ [Email or letter from the organisation/s where the primary
research is to be gathered].

‘Ensure that ethical issues have been considered' [never collect data, record interviews or meetings without prior and clear permission being granted]. ‘The method must be consistent with the research design'.
‘Field work methods must be specified e.g. participative observation, interviews and questionnaires'

VI Method of data analysis

‘A clear and reasoned distinction needs to be made between deductive quantitative techniques
and inductive qualitative techniques. The selection needs to be consistent with the research design
and field work methods.’
VII Form of presentation

‘Usually in written form. Additionally, indicate if graphs, charts or tables are also to be included.'

VIII Timetable

‘Ensures that the work has been planned; a guide to the amount of work to be done and to how much time should be spent on each section – time limitation can be a reason for selecting certain methods.'

References

The Harvard System:

A reference is cited in the main body of the text by inclusion of the author's name and date of publication, e.g. Greaves (2006) wrote about the size of……….

If you are using a direct quotation you must include quotation marks and the page number of the reference in the main body of your text, e.g. Rooney and Owen (2006, p. 5) maintain that teamwork: “supports the values of the organisation”.

If there are more than two authors you should name the first author only, e.g. (Kotler et al., 2006).

References are then listed alphabetically on a references page at the end of your dissertation.

Examples:

Single author: Keegan, W. J. (1989) Global Marketing Management. London: Prentice Hall

Two authors: Gerard, S. & Harman, D. (2005) Winning the League. Liverpool: Penguin
Source: Michael L Nieto, LSC (2009).

You will be assisted with the production of your dissertation by your Supervisor, a
knowledgeable member of faculty with a background in your chosen subject area. Before
being assigned a Supervisor, you are required to take a course in which the purpose,
scope and nature of research and research methods are discussed and clarified. These
new tools are designed to supplement the knowledge you acquired in classroom lectures;
together they provide the essential preparation to undertake production of your
dissertation.

The first step in this process will be topic selection. By the time you have reached this
stage, you will already have decided in which area of business you intend to specialise,
and will similarly have considered those topics within your subject area that are of
greatest interest.

Your Research Methods course will help to sharpen your thoughts, after which you will be required to produce a formal research proposal, that is, a brief description setting out the primary and, if any, secondary objectives of your dissertation and how you intend to proceed with your study. The Proposal amounts to a statement of intent; it is possible to alter or amend your original submission, usually after preliminary consultation with your Supervisor.

The title of your proposed dissertation provides a clear statement as to what you intend to
do. Best practice is to put the title in the form of a question; if you intend to answer the
question posed in the title with reference to a specific company or industry, you may add
an indication to that effect, for example, ‘Does Pay for Performance Affect Profitability?
The Walt Disney Company as a Case Study.’

In preparation for submitting your Proposal you will have done some preliminary
research to determine whether there are sufficient resources to produce your dissertation
within the time allotted. You will provide a general overview of how you intend to
develop your analysis (methodology) and discuss the main objectives of the study. Since
you have neither completed your research nor developed the data to support your
analysis, it will be difficult at this stage to set out a definitive statement of either. Only as
your literature review and research proceed, will it be possible to firm up the details of
your dissertation.

Practically, this means that you will have the opportunity to revise aspects of your study
as may be necessary, though ideally the changes will be modest and fully in keeping with
the originally stated intentions. If more radical changes are required, it would be best to
discuss the proposed modifications with your Supervisor.

Finally, you are required to provide a time line setting out a work schedule. The time you
allocate to the different sections of your study should reflect their relative importance, an
indication of which is shown in the following exhibit (2). The Proposal should be
comprehensive enough to provide a clear statement of your research intentions,
amounting to no more than 1,500-2,000 words, but not so detailed as to deprive you of
flexibility to modify your approach as your research proceeds.

The Proposal will then be graded, and comments provided to assist you in transforming your initial ideas into a workable dissertation that can be produced within the required time period. If, as sometimes happens, your Proposal is failed, this is nothing to be too concerned about, as your Supervisor will have provided detailed comments which, if followed, should lead to a passing grade upon resubmission.

An MBA dissertation is a challenging task for most post-graduate students, but it can also be extremely rewarding. First, it is a piece of research that you have produced largely or entirely on your own. Second, it is something that can be shown to, among others, prospective employers to demonstrate the depth of your understanding of an important financial topic; you will have few other examples of your post-graduate work that can serve this purpose.

Exhibit 2
LSC/STM MBA Dissertation Assessment Form

Criteria | Weighting | Mark | Comments

1. Purpose: a clear statement of the purpose of the dissertation, e.g. reasons for the investigation; statement of problems; purpose of the study (weighting: 15)

2. Literature review: critical review of the literature, e.g. use of relevant literature; evidence of understanding the ideas expressed; development and extent of application (weighting: 20)

3. Methods: appropriate use of methods, e.g. stated reasons for using the type of methods; description of methods; appropriateness and extent of application (weighting: 15)

4. Data: presentation and analysis, e.g. description and setting of the study; presentation of the results; analysis of the findings (weighting: 20)

5. Interpretation and conclusion: e.g. analysis of findings with reference to purpose of study; issues from the literature review; practical application and areas for further research (weighting: 20)

6. Presentation: e.g. structure, language, visuals, logic and coherence (weighting: 10)

Final Mark: 100

There are several general points you should bear in mind when developing or outlining a dissertation topic. You must use information and knowledge learned during your studies in order to form suitable research titles. The most important thing is to develop a dissertation topic that you feel comfortable with and are confident can be completed within the time allotted and to the required standard. Tutors and/or Supervisors can provide you with helpful information.

The following pointers provide useful guidance on how to develop your dissertation topic or title.

• It may be belabouring the point, but dissertation topics should be based in the real
world of your field of study.

• Topics should be based on your area(s) of interest. There is no point in asking


your Supervisor to choose a topic for you, as they will refuse to do so. Because
you will have to carry out the research, it is essential you feel comfortable with
the topic.

• You should be knowledgeable about your topic, as you are more likely to
complete successfully a dissertation when you have a strong interest in or direct
knowledge of its subject matter. You might consider in this connection
researching a topic that would be helpful to your prospective career.

• You must ensure your chosen dissertation topic is up to date; new knowledge is
constantly being produced and must not be ignored, otherwise your dissertation
will appear outdated, with an attendant negative impact on your final grade. A
good dissertation topic, accordingly, must be up to date and reflect current
practices; MBA dissertations are not historical exercises, though you may refer to
or summarise relevant background information if that would help to better
understand or clarify points being made in your dissertation.

• You should rule out dissertation topics that are too difficult for you to research. Many students select dissertation topics in areas that are considered to be interesting or trendy, but are not actually of interest to them. Students imagine that more difficult topics will impress their Supervisors and ultimately receive a better grade than they would have got with a topic of more direct interest or relevance to them. It seldom works out that way, not least because it is difficult to get good marks in a subject which you lack the competence to write about.

• One of the best ways to develop a sound and interesting dissertation topic is to
think about issues that you have discussed and learned during your course of
study. Consider particular content areas or subjects that you studied within your
modules and especially those that stimulate ideas that might help throw light on
the research questions and topics that you are attempting to formulate.

• You might consider rereading one or two previously assigned textbooks or articles. Revisiting previously acquired knowledge and ideas can often help formulate an interesting dissertation topic; it can also provide you with an indication of whether yours is a good topic or not. You will be able to gauge for yourself whether you can easily obtain relevant information to support your research if you decide to pursue that particular topic.

It might be useful at this point to illustrate, using an actual case study, the process by
which initial, tentative ideas are transformed into an acceptable proposal and ultimately a
satisfactory dissertation.

Against the backdrop of the large number of recent and past corporate failures −
beginning with Enron and WorldCom and culminating in the financial difficulties
recently experienced by many large multinational banks − several students thought it
might be worthwhile to investigate the importance of auditing failures as a possible
contributing factor.

Students expressing an interest in pursuing various aspects of this topic either were keen to focus on accounting and auditing issues or were professional auditors wanting to know more about the practices of the 'Big 5' accounting firms.1 Many indicated the topic connected with their professional development and thought (correctly) it would enhance their career prospects, a perfectly valid reason for choosing a particular dissertation topic.

Their next task was to narrow the issue to a relevant topic that could be completed within
the required time period. Out of these preliminary discussions there emerged a number
of interesting and important topics that served both the students’ immediate academic
needs and longer term professional interests.

One of the first of these proposals took as its point of departure Enron's failure and sought to explain why Arthur Andersen, the firm's auditor, had failed to detect the company's growing financial difficulties. This suggested two hypotheses: (1) Enron deliberately withheld pertinent information from its auditors, concealing questionable financial transactions in so-called Special Purpose Vehicles (SPVs), legal entities that benefited the company by reducing transparency; or (2) Andersen's practice may have concentrated on companies making similarly extensive use of SPVs and thus facing similar financial issues. Andersen may also have advised on how best to structure and manage such vehicles, which raises the spectre of a potential conflict of interest.

Both hypotheses were perfectly reasonable; the more immediate question was how to go about testing them. The first could be approached through the numerous official investigations and research reports that have addressed all aspects of Enron's failure, including the importance of auditing failures.

1 The Big 5 auditing firms are: Arthur Andersen (now defunct), Price Waterhouse Coopers, Ernst and Young, Deloitte and Touche, and KPMG.

The second involved developing data on the companies audited by Andersen to determine
whether its client list differed fundamentally from that of its competitors. This in turn
suggested two possible approaches:

• Establish whether Andersen specialised in companies operating in the same


industries as Enron. The student was encouraged to pursue this line of inquiry
because preliminary research indicated that Andersen did indeed have a larger
concentration of its clients in the oil and gas industries than did other major
accounting firms, meaning they could have been as vulnerable financially as
was Enron.

• Determine whether Andersen’s clients exhibited a risk profile different from


the firms audited by the other Big 5 accounting firms. Auditors serve two
main shareholder functions − assurance and insurance − the former
confirming the accuracy of the client’s financial statements, and the latter
ensuring the availability of financial resources needed to cover any damages
arising from auditing failures. The largest accounting firms would seem to
offer investors the strongest guarantees on both scores, in which case they
deserve the premiums they are reported to earn over lower tier accounting
firms.

There is of course no reason a priori to suppose their client lists differed significantly
from each other. Indeed, it is fairly common to assume that the Big 5 accounting firms
were more or less homogeneous, and thus pretty much interchangeable; the results of the
analysis would disclose not only whether there were specific reasons for Andersen’s
failures but also whether clients do in fact view their auditors comparably.

Two additional issues arose in connection with pursuing this proposal:

• How to go about collecting the necessary data for the dissertation, and

• After having gathered the data, how could one be sure that any differences detected were significant?

The first issue was resolved by choosing a relatively large sample of companies in a number of different industries, then sorting each by its auditing firm; the second required the use of statistical techniques capable of determining whether any observed differences could be interpreted as significant or whether they were more likely to reflect chance, a by-product of the size of the sample chosen.
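The statistical techniques referred to above can take many forms; one common choice is sketched below with purely hypothetical figures: a two-sample t-test on a simple risk indicator (debt-to-equity ratios, invented for the example) comparing Andersen clients with clients of the other Big 5 firms.

from scipy import stats

# Hypothetical debt-to-equity ratios for the two client groups.
andersen_clients = [1.8, 2.4, 3.1, 2.9, 2.2, 3.5, 2.7]
other_big5_clients = [1.2, 1.6, 1.9, 1.4, 2.0, 1.7, 1.5]

# Welch's two-sample t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(andersen_clients, other_big5_clients, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value indicates that the difference in average leverage is unlikely
# to be due to chance alone, given the sample sizes used.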

This highlights yet another important trade-off that must be faced as you develop your proposal and ultimately your dissertation. The larger the sample, the greater confidence you may have that the observed data are indicative of real differences between the two groups you are studying. On the other hand, unless you have access to large economic, financial and corporate databases, it will take considerable time to develop your data − conceivably more than you have for producing your dissertation − meaning that you may have to make do with a smaller sample, thus reducing the confidence with which you can present your findings and conclusions. This is where statistical procedures come into the picture, and we shall have more to say about them later in this Guide.
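The following minimal sketch illustrates why sample size matters: for a simple mean, the 95 per cent margin of error shrinks only with the square root of the number of observations, so quadrupling the sample roughly halves it. The standard deviation used is an arbitrary assumption.

import math

sample_std = 10.0   # assumed standard deviation of the indicator being measured
for n in (25, 100, 400, 1600):
    margin = 1.96 * sample_std / math.sqrt(n)   # 95% margin of error for a mean
    print(f"n = {n:4d}: margin of error = ±{margin:.2f}")
# Quadrupling the sample size only halves the margin of error.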

Before leaving this subject, we should mention a number of other hypotheses that students have pursued in this specific area of research; a sketch of the event-study approach commonly used to test hypotheses of this kind follows the list.

• Around the time of Enron’s growing financial difficulties, did Andersen’s other
clients experience a more negative share price impact compared with firms
audited by the other Big 5 accounting firms?

• Did Andersen’s growing legal difficulties affect the clients of the other large
accounting firms, and if so which were most directly and strongly affected?

• Why were the financial data of so many companies restated during this period?

• Did clients that reported dismissing Andersen experience positive stock price
reactions around the time of the dismissal?

• Did clients that remained with Andersen experience negative stock price reactions in response to announcements by other firms that they were dismissing Andersen?
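Hypotheses of this kind, concerning share-price reactions around specific dates, are usually examined with an event study: compare each client's return around the announcement with a benchmark return and cumulate the difference. The sketch below is a minimal market-adjusted version using hypothetical daily returns for a single firm; the figures are invented, and a full study would estimate a market-model benchmark over a prior period and average the results across many client firms.

import numpy as np

# Hypothetical daily returns over an 11-day window centred on the announcement.
firm = np.array([0.002, -0.001, 0.000, 0.001, -0.002,
                 -0.035, -0.012, -0.004, 0.001, 0.000, 0.002])
market = np.array([0.001, 0.000, 0.001, 0.000, -0.001,
                   -0.003, -0.001, 0.000, 0.001, 0.000, 0.001])

abnormal = firm - market   # market-adjusted abnormal returns
car = abnormal.sum()       # cumulative abnormal return (CAR) over the window
print(f"CAR over the event window: {car:.2%}")
# A clearly negative CAR around the announcement supports the hypothesis of a
# negative share-price reaction; testing whether the average CAR across firms
# differs significantly from zero completes the analysis.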

Some idea of the full range of suitable MBA dissertation topics is given by the list below, drawn from dissertations approved by the University of Wales, one of the institutions that will be awarding your degree. Note, in particular, not only the wide range of topics represented but also the geographic scope of the dissertations.

• Competition between the Hong Kong and Shanghai IPO Markets


• Determinants of Capital Structure: Cross-Sectional and Panel Analysis for UK
non-Financial Firms
• Inter-cultural Differences in Internet Marketing Communications
• Reform of the Foreign Exchange Rate Regime and Exchange Rate
Misalignment in China
• Capital Structure and Financial Crisis in Malaysia
• The Impact of the Asian Financial Crisis on the Relationship Between
Financial Development and Economic Growth
• The New Basel Accord: Implications for the UK Residential Mortgage Market
• Determinants of Japanese Commercial Banks’ Profitability, 1995-2003
• Financial Derivatives and the Exposure of US Banks
• Culture, Economic Development and the Financial Sector: an essay
• Economic Crises and the Financial Sector: Empirical Evidence from Turkey
• Bank Efficiency in the Nigerian Commercial Banking Sector
• Bank Off-Balance Sheet Business and Risk Exposure Taxes
• Effects of Fixed Assets Revaluation on Stock Returns: Evidence from Greece
• An exploratory study of supply chain integration over the Internet

• Recipes for Western Fast-food success in China – a value chain perspective
• Enterprising environment for Entrepreneurs
• E-commerce facilities within Greek retailers
• How managers can increase Employees' Motivation within the retail services without increasing financial costs
• The relevance of Strategic Statements to the delivery of shareholder value
• Virtually working: an examination of the experience of home tele-workers in
North East Wales
• Are solicitors client focused?
• International expansion by franchising
• Marketing effectiveness: the case of the Brunei Islamic Trust Fund
• The training function of the International Joint Venture

Finally, the following exhibit illustrates a number of important things about your
dissertation; it will repay careful study as it answers most of the basic questions students
have concerning the structure and content of their dissertation.

Exhibit 3
MBA Dissertation Pointers

Structure | Word Count Guide | Key Elements
Opening Section (Abstract <300)
• Title page: should be written as a question and should be indicative of the subject, e.g. 'How can an organisation develop an international team? A focus on…'
• Declaration and statement of own work / Supervisor sign-off
• Acknowledgements
• Abstract: this cannot be written until the end. It should be a short outline that summarises all sections. It needs to focus on the study question, methods used and the key findings in particular.
• Table of contents
Introduction (2000-2500): What is the area or problem you are investigating, and why is it important to the research community, any company and you? (purpose)
Requirements:
• Background / context of the study. Use authoritative sources and facts and figures to provide evidence of trends and importance in this area

• A clear focus on a research issue or statement of any problem area and possible causes that will be investigated if it is a company investigation.

• An introduction to any company. This should be a crafted overview relating to the topic area to some degree and not just dumped from the company website or an article.

• A clear statement of Aims and Objectives E.g.

Aim: (example)

To explore strategies to develop international teams

Objectives are the key elements that underpin the aim.


Therefore:

Objectives (Example)

To Explore IHRM models of the universalism school, cultural school


and key international cultural models

To examine the importance of recruitment and selection in the context of building international teams

To explore the characteristics of a successful international team

To examine culturally aligned training for these teams

• The results expected /aims of the research

• Introduce the theoretical / conceptual framework that you will be exploring (plain English). It identifies the main theoretical points and themes that are RELEVANT to the dissertation and informs the reader how you intend to fulfil your research aims by use of research methods, e.g. secondary data or primary data analysis

• Brief outline of the subsequent chapters

Literature Review (5500-7500): This is a theoretical exploration of the existing knowledge that is RELEVANT to the area you are investigating
• What research exists?
• How does this impact on your research problem?
• What relevant theories and frameworks does it supply to your
problem?

• The literature review focuses on similar and contrasting


perspectives that researchers or academics have used to
approach this, or similar research areas. As such, you have to
identify the strengths and weaknesses of such approaches.

• Use only relevant studies, focus on their main findings and


conclusions

• You should be able to consider the most appropriate areas,


justify these, and use them to inform / create your research
methodology

• *The literature review is not a brief summary of all the books


and articles you have read. This is called an annotated
bibliography.

Methodology (1500-2000): An in-depth discussion of how the study will be undertaken and how it fits with your research question

• What methodology will be used?


• How will it be done?
• Why was the method selected? (justification for the final
selection)
• Why were other approaches discounted?
• Strengths and weaknesses of the approach
• Design of research instrument

Design of research instrument


• Clear indication of the design features and any questions to be
asked in a survey

• Describe the data analysis techniques to be employed. E.g.


sampling techniques, frameworks, size and type of an
appropriate survey

• Description of subjects involved, settings for the study and any


variables that may be encountered

If you are using a particular concept / model, then it should be critically
discussed and incorporated here.

For in-company investigative dissertations where you have access to the operation – current system described:
• Description of the operation and its processes
• A perspective on the problem
• Factors that may be the cause of problems
• Identification of internal documents, operation procedures and policies
Note: These two sections will add to the word count, so adjust the Literature Review and RM sections to compensate. The word count and structure should be adapted to suit the dissertation and its focus.

For in-company investigative dissertations where you have access to the operation – current system analysed:
Investigations into the company, the operation and its systems. This may involve application of a model or method that you discussed in the literature review and research methods, e.g. observing an operator carrying out a job task, asking questions after the operation and comparing this to the required standards of performance. This might show that there is a skill gap in operatives that is causing quality problems.

Data Analysis and Findings (3500-4500): What did you find?

One or possibly two chapters on data analysis and findings.

What have you discovered, and what reasons do you propose for why this may be?

Presenting the findings


1) Restate the actual research question for the reader and show
frequency tables and graphed data. Interpret the data for each
data set. Make selected links back to the LR
2) You may be able to cross-tabulate; make correlations from one
data set to another IF you have set up the personal information
and demographics of the population first. From this you may be
able to identify patterns and trends.
3) Discuss the results in plain English
4) Build on and qualify the overall conclusions linking back to the
LR and key facts from the analysis.

Discuss whether your results successfully met the conditions needed to test your research question or hypothesis, e.g. in a postal survey, how many people returned the questionnaire? Are those who did not return it different in some way?
Conclusions (2000-3000): Discuss in narrative form your conclusions, linking back to your research question and literature review so that a clear argument and thesis can be identified throughout the work.

• Use clear statements on the conclusions reached


• Was your hypothesis or theory supported in your findings?
• Answer the questions you raised in the introduction.
• How do the results compare with theory and good practice discussed in the literature?
• How do your results compare to those of other research /
academic studies?
• What has surprised you about the results?
• What limitations may there be to your study?
• How does it add to knowledge in the field?
• What further research would you recommend to develop your
work in the area of knowledge and research?
Recommendations (<1000): Recommendations should be concise statements and include time scales as necessary.
Bibliography: All sources in the dissertation need to be referenced using the Harvard method.

The order of referencing is as follows:


1) Books
2) Journals and articles
3) Web sites with full URL address and dates accessed.
Appendices (number these using Roman numerals): Use sparingly! Appendices are to be used for relevant information that would spoil the flow of the report; a good example of use is the inclusion of any survey questionnaire.
Source: David Greenshields (LSC, 2009)

4. The Introduction and Literature Review

The first section of your dissertation is the Introduction, in which you state the basic objectives of the study; these include both primary and secondary objectives, the former relating to the main topic(s) of your dissertation, the latter to important side issues that flow out of the analysis of the main area of interest. The Introduction also serves to indicate how you intend to pursue your chosen topic by providing a discussion of the methodology you will be using to implement the study.

Students sometimes find it easier to write the Introduction last, a decision that is less odd than it sounds. After all, it is much easier to draft the introduction when you know how things turned out than when they are still pretty much up in the air. This is an important point to bear in mind, since the introduction frames the remainder of the dissertation. The reader is more likely to be drawn into the study if the approach, objectives and method are clearly, carefully and confidently stated than if the introduction is carelessly or ambiguously worded.

The second broad area is the Literature Review or Survey. This is one of the most
critical sections of the dissertation as it provides the foundation upon which your study
ultimately rests. Despite its importance there is considerable ambiguity as to how best to
develop the Survey. In some instances students compile what amounts to little more than
an annotated bibliography, that is, they list numerous publications that relate to their topic
and provide brief summaries of each. This, of course, misses the point and value of such
a Survey.

For one thing, if pursued correctly it will open your eyes to valuable sources of
information that can be used to inform your study. It will point you in the direction of
what has been written in your area of research and will provide alternative perspectives
on many of the issues you will be covering in your dissertation, some of which will
confirm your conclusions, while others will not.

Part of your job will be to separate the more from the less relevant studies, the more from the less reliable findings. Relevant and reliable do not refer to whether previous research agrees with the findings of your study, but rather to the quality of the findings, that is, how well they have held up to subsequent research. Findings that have not been (or cannot be) replicated should be considered suspect, while those that have been validated repeatedly are worthy of more serious consideration.

Before considering how best to evaluate the evidence you are developing in connection
with your dissertation, it is worth considering the different types of information resources
that are available, above all, their advantages and limitations. There are five principal
literature resources: popular press (including electronic documents); practitioner books
and compendia; practitioner journals; academic books and compendia; and academic
journals. We now consider each in turn.

Popular press: The popular press consists of widely read business publications
and magazines. It includes publications such as the Financial Times and Wall Street
Journal, both international newspapers, and magazines such as The Economist,
Euromoney, Business Week, Institutional Investor and Forbes.

The main value of this literature is timeliness and accessibility: global business and
financial newspapers provide real time information concerning all aspects of national and
international business and global financial market trends. The information provided is
generally factual, practical and intended for the use of decision makers. Business
magazines, because they publish only weekly or monthly, take a slightly longer
(thoughtful) view of international political, business and financial developments and
often provide informed commentary, frequently by guest contributors, designed to
provide perspective on recent or prospective developments.

The main weakness of the popular press is the same as its main strength, namely its currency. Currency means that much of the information provided is incomplete or uneven. Some issues are covered in considerable depth, including commentary provided either by staff or guest contributors or in editorials, while others rate only one column inch of space. By definition, news equates to new information, which is often fragmentary, inaccurately reported or simply wrong. Leading newspapers and magazines often publish apologies for inaccuracies in the original stories, or corrections that revise or update information previously provided.

There are other difficulties that also affect the value of such information for use in dissertations. Most business newspapers and magazines are written for their main readership, business men and women, so they often colour information to maximise its appeal to that audience; their editorials equally tend to adopt pro-business stances. It is absurd to imagine that such publications are (or can be) value-free; the best one can hope for is that their journalists offer a wide range of alternative or competing perspectives, from which readers can draw their own conclusions about the value or relevance of the points of view being expressed.

Most such publications make every attempt to ensure that the information they provide is complete, if not accurate. Since they are not scientific publications, there is no reason to expect scientific rigour in most of the information they provide. Only the most mundane information – standardised measures of business, financial or economic performance, for example – is likely to be presented reasonably accurately; less familiar or more technical information is not always well covered or explained, nor is any attempt made to distinguish between the relevance of different concepts. Trendy, rather than serious, ideas tend to be emphasised, partly because one of their main functions is to publicise the new and the controversial, not to concentrate on concepts that have longer term significance, promote sound business practices or add to the firm's bottom line.

The Internet is a source of information on the widest imaginable range of subjects. Business and financial information is easily accessible thanks to search engines that do most of the work. There are so many sources of information that the competitive ideal noted above is met well beyond what is possible within a given news organisation. On the other hand, the available information lacks even the most elementary safeguards with respect to accuracy or completeness.

The classic example here is Wikipedia. In its original form, Wikipedia entries could be edited online by anyone, qualified or not, regardless of whether the 'corrected' information was more or less accurate than the information it replaced. Wikipedia recognised this difficulty and now imposes a time delay on the revision of individual entries. Of course this does nothing to correct the basic flaw in its design. Information is useful only if it is accurate. In the Wikipedia world, ideology or personal views are considered to be of equal value to research driven and supported by facts; the lack of intellectual controls means that no information so supplied can be regarded as authoritative, even when it is. It is best to avoid such information; most Supervisors will do more than raise an eyebrow should information from this source be listed in footnotes or the bibliography.

Practitioner Books and Compendia: This category comprises such an amazing diversity of publications that it is difficult to know exactly how to describe and evaluate them. Many of the books falling into this category are of the popular variety, written by authors keen to share their experience or point(s) of view with the reader. They vary widely in terms of quality and value, so the best way to approach them is with a high degree of scepticism.

Unless the authors have an established track record in the area or areas in which they are writing, it would be best to avoid them altogether, for the simple reason that, no matter how deeply felt the author's concern for the topic being addressed, there is simply no way the reader can assess the validity of the results or how widely they can be applied. This is not to say the genre is worthless; there are examples of such books having changed thinking or influenced organisational practices, but these are the exceptions.

Practitioner Journals: These avoid some, but not all, of the pitfalls of practitioner books, and most such journals subject submitted articles to some form of peer review. Their main advantage is that they are highly readable and do not presume any specialised knowledge on the part of the reader; the Harvard Business Review is a prime example of the genre. They also deal mainly with 'live' business issues and, as they tend to be authored either by current practitioners or academics, the arguments tend to be presented with greater objectivity than in the popular press.

The weaknesses of this literature derive mainly from a predisposition to include articles
dealing with issues such as philosophy or other equally arcane areas that lack immediacy
or that drift into such generalities as to be practically worthless in terms of applicability
or relevance. A second shortcoming is the overriding importance editors of such journals
attach to clarity and uniformity of style. This emphasis produces a number of unfortunate
consequences.

For one thing, editors are assigned to see an article through from submission to publication. These editors are typically better informed about the editorial requirements of the journal than about the content of the articles being submitted. In the course of rewriting articles they can, and often do, sacrifice relevant material or lose the main point(s) being made by the author for the sake of clarity. Many authors put up with this interference because of the journal's prestige and wide readership, the latter in particular being of considerable importance to aspiring academics or consultants looking to add to their reputations and hence their client lists.

Even so, it would be wrong to dismiss such journals out of hand. Many articles are written by acknowledged experts in their field, with the material presented in a form that is far more accessible than where the research underpinning the article was originally published. The Journal of Corporate Finance is another good case in point: like the Harvard Business Review, it invites distinguished academics, lawyers and business practitioners to contribute articles or to participate in roundtable discussions, invariably on finance or related topics. The presentations may include technical material, but it is presented in such a way as to ensure that it can and will be easily understood; this is especially useful for business students who may lack the background to follow the more technical analytical or statistical issues characteristic of articles appearing in learned journals.

Used wisely, such journals can be a source of valuable and useful information; however,
they are best used as a complement to and not a substitute for the more scholarly articles
that should form the core of the dissertation’s Literature Review.

Academic Books and Compendia: Academia places considerable importance on the production of research articles and less on the production of specialised monographs or books. Many authors write books to summarise a relatively large body of their previous research or to produce textbooks. Much of the material contained in specialised monographs will have been published previously; in many instances the results of prior research are incorporated in their entirety in the text, the main difference being that dated information will have been replaced with more recent data. A new introduction and conclusion are often added to tie the disparate research reports together and, where required, some of the material will have been updated to take account of recent developments or research in the study's main field.

Such publications provide little information beyond what could have been obtained from reading the original articles. There is nothing wrong with this practice, and from the research consumer's point of view it may actually be beneficial in that most of the relevant material will be available in a single source.

We might also include in this category official publications, such as those issued by
central banks (the Federal Reserve Bank, and its regional banks [Federal Reserve Bank of
New York, Federal Reserve Bank of Boston, and so forth]; the Bank of England,
Financial Services Agency, and other OECD central banks); government agencies (US
Department of Commerce or the UK Department of Trade and Industry) and publications
of international organisations such as the International Monetary Fund, the World Bank,
regional development banks and the Bank for International Settlements. Reports and
publications of local Chambers of Commerce and other trade associations also fall within
this category.

Central banks and other official bodies provide statistical information, both current and
historical, covering a wide range of data; they also publish general reports intended
for popular consumption as well as more specialised publications, such as financial or
policy reviews that focus on matters of current interest, and working papers
prepared by professional staff members. Most central banks and international financial
institutions produce similar policy and research documents, mostly written in English but
frequently translated into other European languages, mainly French. Publications of the
European Union are available in the languages of member countries.

Academic Journals: The principal advantages of academic journal articles are that they
are written by professionals with (usually) considerable knowledge of their subject, and
that they are peer reviewed, meaning that other specialists will have had the
opportunity to assess whether their research findings are worthy of publication. This
process ensures higher standards of objectivity than apply to the other literature
categories we have reviewed. Authors accordingly are more likely to produce higher
quality articles that tend to focus more on narrower academic concerns than on current
issues or fads that tend to dominate the popular press.

This is not true of all academic journals; there are several excellent publications that
combine the highest academic standards with a focus on contemporary issues. Two of
the best are published by the Brookings Institution in Washington, D.C.: the Brookings
Papers on Economic Activity, one series devoted to macroeconomic issues, the other to
microeconomic issues. Prominent academics are invited to analyse topical issues at
forums held twice-yearly. Discussants are nominated to comment on individual reports,
and their remarks are included alongside the main article. Audience members, too,
contribute to the discussion, and the final publication contains summaries of their
comments as well.

The weaknesses of the academic literature are the obverse of its strengths. To ensure
objectivity, the editorial process is typically very long; the time from submission to
publication can be as long as two years, to the detriment of accessibility. To remedy this
shortcoming many authors publish so-called working papers, which can be accessed at
leading academic websites such as the Social Science Research Network (SSRN).

This, however, poses the same problems that we noted above: neither the fact of having
been included on the SSRN’s website, nor the academic affiliation of the author, can
guarantee the quality of the publication; indeed, some authors do not even bother to
indicate their affiliation. This poses few if any problems for academics, equipped as they
are (or should be) to separate the wheat from the chaff. For consumers of academic
research, the inability to properly assess whether an article has any merit at all suggests
that all such studies should be viewed cautiously.

One technique widely favoured by consumers as a quality measure is the number of times
a given article has been accessed, with higher downloading frequencies taken as an
indicator of intrinsic merit. The flaw here is the equation of interest with importance.

It is interesting to note that the SSRN does list the ten most popular articles measured by
the number of times they were accessed in a given time period. In each case, the most
popular articles were also written by some of the best known academics in their fields. It
is the combination of frequency and author, not frequency alone, that gives this measure
any degree of credibility. Against this backdrop, peer review in leading journals is still
the soundest guide to the worth of any individual publication.

Finally, we might note that there are a huge number of publications of extremely variable
quality that could be included in this category. The fact of peer review here is no
guarantee of the value of the final publication. True, editors go to great lengths to ensure
that reviewers are qualified and competent, and often provide a comprehensive list of the
members of their editorial boards. There should, however, be no presumption that any of
these will have seen, let alone commented on, a particular article; in some instances the
editor has the final say as to whether or not to publish a given article, a far cry from the
highest standards of the best academic literature.

5. Evaluation of Published Research Resources

What conclusions can we draw about the value of the different research resources used by
students in the preparation of their dissertations? With respect to business literature,
there is a general consensus that the closer the methodology approaches that favoured by
the ‘softer’ social sciences (communications, decision making, motivation, leadership,
and so forth) the weaker is their claim to scientific validity; this conclusion holds true
equally for several other areas of business research. The principal exception is economic
or financial research, where the favoured method of analysis more closely approximates
to that of the natural or physical sciences.

Financial economists typically use published information; most other disciplines have to
generate their own data, usually based on surveys, for which the design and data
collection are often more time consuming than the analysis. Statistical procedures are
employed, though not always of the same degree of sophistication, or applied with the
same competence, as those used by economic modellers; nor is the full set of results
always published, making it difficult for other researchers to evaluate the claim(s) being
made.

Against this backdrop, you will need help to determine how best to interpret the value of
the information being generated: how, in other words, you can best establish the
relevance and reliability of the information being developed. Abelson2 suggests the
acronym MAGIC as the best way to judge the worth of information.

M = Magnitude: The important point here is: how large are the effects being
reported? How reliable and broadly based are they? In each case, the more impressive
the findings, the more reliable they may be taken to be.

A = Articulation: How well is the research story being told? Does it consider both sides
of the issue fairly, and does it do so in a sound and reasoned form?

G = Generality: Do the findings have wide implications or are they specific to a
particular point in time or to a specific set of circumstances? How well are the claims
being made supported?

I = Interestingness (or perhaps, better still, importance), the ability of the research
findings to influence how other people view the topic, even the potential to alter or
change their beliefs concerning the phenomenon under investigation.

C = Credibility, that is, is the argument being put forward theoretically and
methodologically sound? Have alternative perspectives been confronted? Are the data
too good to be true: do the results depend upon statistical procedures correctly applied
that rule out the possibility of the observed outcome occurring by chance or do they
depend upon personal observation or experience only?

2 R. P. Abelson, Statistics as Principled Argument (Erlbaum, 1995).

These arguments provide a sound way forward for assessing the merits of individual
publications. A better approach would be to look at the specific findings within the
context of a broader body of research, the ultimate purpose of the Literature Review. It is
of course possible that your topic will have generated only a very small number of
publications, in which case the ‘competitive test’ will obviously fail. More likely, you
will have access to a very large body of research, as tends to be true in virtually all areas
of finance. In which case, you will have to exercise great care and skill in weeding out
those that are worth serious consideration from those that are not or, more generally,
knowing when it is appropriate to use one particular source over another potential source.

Specialised newspapers or magazines will not be as reliable as articles found in scholarly
journals, as the latter are reviewed by a panel of individuals who are experts in their field,
while the former will have been approved by an editor who may or may not have
specialised knowledge of the subject matter to which the article refers. Our point is not
to argue in favour of the superiority of one source over the other: they both are relevant
so long as you understand their limitations.

Newspapers and magazines are the main source of current information, of fast-breaking
developments occurring across the business world. It will be months, even years, before
academics scrutinise these issues by applying high-powered research techniques,
statistical analysis, for example. That is the fundamental difference between journalism
and research, and this distinction should be remembered when using information derived
from these different sources.

One final point is worth stressing here: the distinction between advocacy and research; it
is not unusual in the world of business to find the former masquerading as the latter. This
is another important characteristic that differentiates academic from non-academic
sources. As we have seen, the rules of the academic game are fundamentally different
from those applying in the non-academic world. Academics are meant to generate
research results that explain or clarify aspects of relevant business or financial topics, and
to present their findings fairly and objectively. Editors of scholarly journals will weed
out from submissions material that fails to meet these requirements; if authors
refuse to alter or amend their articles, the editor has the right to refuse publication.

Errors will inevitably occur, but where we are talking about serious research they are
more likely to be unintentional, the result of an oversight on the author’s, reviewer’s or
editor’s part, than the product of a deliberate attempt at deception. Deception can and
does happen, but the risks are much lower in the scholarly than in the more popular
literature.

Why? Because no comparable standards apply elsewhere, where authors are free to
express their views, no matter how controversial they may be. Indeed, controversy is
likely to stimulate newspaper or magazine sales and thus revenues, which after all is the
purpose of for-profit news media. This does not rule out the use of advocacy-based
financial or business research; many of the best known management or financial theories
have their origins in research that went against the grain of prevailing views, though this
is not always the case. It is best to regard advocacy literature with suspicion, because the
intent is not necessarily
to enlighten but rather to convince, often of a point of view that does not command broad
support.

Academic research, though widely (and correctly) conceded to be the most objective
source of information, is not without problems of its own. These limitations apply
especially in respect of business research and other ‘soft’ sciences. Even within business
subjects, there is considerable dispersion not only in terms of methodology but also in
terms of interpretation.

The empirical literature tends to follow a common structure. A theory of the
phenomenon to be investigated is adumbrated and then tested, typically using statistical
analysis. This approach is most common in economics and finance research, but
increasingly is finding its way into management and marketing research as well. A key
difference is that in the latter disciplines there are few theories as well developed as those
in economics or finance. Moreover, while economic and financial research makes use of
numerous large data bases from which to test hypotheses, among management and
marketing disciplines the data tends to be developed ad hoc, that is, primary data based
upon surveys or interviews of varying size, or experiments, with the data thus generated
tested for the ‘associations’ or ‘differences’ the research is designed to identify.

We shall have much more to say later about surveys and interviews; here we are
concerned primarily with the limitations of these approaches and how these limitations
affect any generalisations that can be drawn from the test results. The first question that
we need to answer is: How do we know we have found anything of significance?
Significance can be defined either in its everyday sense of ‘importance’ or in its
narrower, more technical statistical sense, that is, whether the observed results could have
arisen by chance.

Here we need to draw a distinction between business studies research on the one hand,
and financial and economic research on the other. Analysts argue that the former (unlike
the latter) tends to suffer from three main shortcomings: (1) the lack of research
replication; (2) the inability to cumulate or generalise the research results; and (3) the
faulty interpretation of statistical significance.3 The first point is important because it
stands in marked contrast to the way research in the physical or natural sciences is
conducted, where research findings are expected to be replicated to establish the
robustness of the original findings. If the original results cannot be verified by
subsequent research, then the validity of the initial findings should be dismissed.

Nor does the growing use of high-powered statistical techniques change things all that
much: statistical significance is not always proof of support for the hypothesis being
tested. Statistical results can be, and often are, misused, either because the researcher
fails to understand the limitations of the methodology, or because crucial assumptions
underpinning the statistical model have not been met, or through some combination of
the two. Replication permits detection and correction of such errors; with the current

3 The following discussion is adapted from John Kmetz, The Skeptic’s Handbook: Consumer Guidelines
and a Critical Assessment of Business and Management Research (2002).

research ethos discouraging such an approach, much of the management research
literature should be regarded with caution.

That economic or financial research avoids all of the shortcomings of research methods
favoured in other business studies areas is only partially true. Much of the received
wisdom, based upon ‘strong’ assumptions concerning human behaviour, has been
questioned, while the application of new, less restrictive paradigms has challenged many
of the disciplines’ most basic conclusions.

Two assumptions in particular lie at the heart of much economic and financial research:
(1) individuals behave rationally, that is, when confronted with choices they select the
one that best serves their interest (self-interest); and (2) individuals have access to
identical information. The first assumption minimises the possibility that psychological
forces can and do influence human economic behaviour; once this assumption is relaxed,
so-called Behaviouralist models can begin to unravel phenomena that appear to defy
traditional economic analysis, speculative ‘bubbles’, for example. Relaxing the second
assumption undermines the traditional market model: if buyers and sellers do not have
access to identical information, the better informed party can exploit the other, with an
attendant negative impact on economic efficiency.

These challenges have forced researchers to question their respective approaches and, by
drawing attention to existing weaknesses, have caused practitioners to confront them. In
economics and finance, asymmetric information and the research it has generated have
proved to be of immense value, with many of the insights having been incorporated into
the traditional curriculum. The other criticisms have fared less well, but that has not
deterred researchers from pursuing their research programmes, even though the balance
of evidence makes clear there is much less there than meets the eye.

We raise these issues so as to provide a more balanced view of the quality of the
informational resources available to business students. Research methods and results in
all disciplines, even in the physical sciences, have been and continue to be questioned.
But this does not change the fact that much of traditional theory still has considerable
merit and accordingly is still taught. Indeed, exceptions and anomalies are the rule, not
the exception, in all areas of science, and that is how things should be.

Karl Popper, the eminent philosopher, pointed out long ago that the struggle to
accommodate anomalous evidence is an important aspect of the accumulation of
scientific knowledge. This struggle, Popper points out, rarely leads to the complete
abandonment of the existing conceptual framework. Rather, it typically results in
modifications to the current paradigm capable of accommodating observed anomalies,
while retaining those features of the intellectual framework that still fit.

Business students can rest assured that the research upon which their dissertations are
based is still valid. By highlighting the limitations of different informational resources
we do not mean to suggest that all such resources lack any value at all; as we have seen,
they also have their advantages. The point is that we need to take the good with the bad
to arrive at a balanced view of what the literature has to offer. Business students are
fortunate in that the available literature resources are vast. The aim is to ensure you make
the best use of what is available, hence the emphasis we place on research methods, of
which this Guide forms a part.

6. Methodology

If you are preparing a dissertation in any subject area other than finance, you will be
required to generate primary data. Primary data are produced through surveys or
interviews or some combination of the two. Some finance dissertations also involve the
production of primary data; here, however, the survey results are intended to supplement
the basic objective of the dissertation.

For example, one recent dissertation sought to determine what impact adoption of EU
accounting rules would have on the reported profitability of leading Turkish
manufacturing companies. The author’s main hypothesis was that the shift would result
in lower earnings than would have been reported under traditional accounting standards.
The results were mixed, so he decided to supplement his financial analysis by
interviewing analysts employed in leading local brokerage and accounting firms. He was
under no obligation to do so, but was convinced that the additional information he would
develop would illuminate the ambiguous results of the financial analysis. It did.

As you will have noticed from Exhibit 2, there are two principal dimensions to the
Methodology section. (1) It should provide an in-depth discussion of how the study will
be undertaken and how that approach addresses the question your research is intended to
answer. There are several different approaches that could be used to develop the
information in your analysis, and you will need to justify your particular approach: why
it was chosen in preference to alternative ways of collecting data, that is, why you believe
it provides the best fit for your chosen topic. And (2), if you are using a survey, it should
provide a detailed account of how the survey was designed.

6.1 Surveys

A survey is nothing more than a systematic method of collecting information from a
selected group of people who are asked a series of questions.4

(a) When Should I Use a Survey?

You employ a survey when it is faster, easier, or less expensive to use than other
methods. Sometimes other data collection methods are preferable. For example, to
determine the number of people using a clinic, you could simply count the number of
signatures on the sign-in sheet, or examine the daily records; no survey is required to
obtain such information.

Nor is there a need to undertake a survey when the information exists in some other form
that can easily be accessed, for example, in archives, records, or databases. Using such
data can save you time, money, and effort.

4 Much of the following discussion, examples and some charts and tables are adapted from: Houston, Survey Handbook
(Organizational Systems Division, Total Quality Leadership Office of the Under Secretary of the Navy).

(b) Survey Preparation

What is the purpose of the survey?

Surveys are used for many purposes, and these influence their form and content. Some of
these include:

• To obtain information that can be used to verify the main predictions of the
theoretical literature

• To identify organizational strengths and weaknesses

• To target areas in need of improvement

• To assess the effectiveness of new or existing policies or programs

What specific information is needed?

To meet the purpose of the survey, identify the topics or issues of interest and the forms
of information needed. If, for example, you are interested in determining the importance
of maintaining current dividends, you might ask questions about how often and under
what circumstances dividends are increased. You might also ask people to compare the
value of dividend payout with alternative methods of returning cash, and so forth. If the
objective is to determine future actions, you might ask respondents to identify what
factors influence future payout decisions.

Who will be surveyed?

Identify the types of people who can provide the information you are interested in
developing. Do they belong to a particular group (students, managers), a single category
within that group (post-graduate business students or middle managers) or do they come
from a variety of categories (under- and post-graduate students, all managers of Barclay’s
Southwark branch)?

How will the survey be administered?

There are three main ways to conduct surveys – face-to-face interviews, telephone
interviews, and written surveys, conducted either by post, email or group sessions – with
the method chosen capable of providing sufficient information as quickly, efficiently and
economically as possible.

Interviews only make sense when you need to collect detailed information from a
relatively small group of people. Interviews can be used to explore issues and options to
a greater extent than written surveys.

What resources will be needed?

You will be responsible for designing and implementing the survey. On the other hand,
people who respond to the survey should be considered as a resource in terms of number,
time invested, and information provided.

What survey items will be used?

Occasionally it is possible to use existing surveys, but you must consider ways in which
they can be updated to reflect any relevant changes that will have taken place between the
two surveys. For post graduate business students, this is practical only where previous
research surveys are deposited in the School’s library. It may be possible to access
completed UWIC dissertations, but unless they are online you will have to visit the
library in Cardiff, which may not always be practicable. It is best, therefore, to think in
terms of an original survey, though the specific questions you intend to ask in the survey
may be gleaned from prior studies uncovered in your Literature Review.

How will survey information be analyzed and reported?

Once the data have been collected, you must consider the best way they can be organized
and interpreted. There are numerous possibilities: some relatively simple and straightforward
– tables, frequency distributions, line graphs, bar charts, pie charts, or histograms – while
others, only slightly more complex, may involve the calculation of averages (means) and
their associated standard deviations, medians and modes. The analysis dictates the best
way to present your data. If, for example, you intend to apply formal statistical
procedures, then you may want to present summary measures only; if the tabulated
information constitutes the main data then you may want to use more expansive formats.

All survey results are intended to measure differences or associations. What impact does
training have on worker productivity? Does additional training lead to even better
performance? In the first instance we are interested in determining how big the effect, if
any, is of requiring employees to attend training sessions. One way to do this would be
to calculate average worker productivity, usually measured as output per hour worked,
before and after attending training sessions. Does it matter whether the training occurs
during work hours, after hours or on the weekend? Are the results sensitive to whether
employees attending after work sessions are or are not paid?

In each case, the answer depends upon the size of the effect. If, after completing the
training session, the improvement amounts to only 1.75 per cent, is that large enough to
confirm the value of the training sessions? Assume further that paying staff to attend
out-of-hours training sessions improves productivity by 4.75 per cent, compared with 1.5
per cent if employees are not paid. In the first example, the effect appears too small to
confirm the claim that training matters; on the other hand, the difference in productivity
between employees who are and are not paid to attend training sessions appears big
enough to support the contention that payment matters. Survey results could be
supplemented with interviews asking employees what difference
payment had on their attitudes towards the value of training.

A word of caution: even large differences can sometimes be misleading, especially where
sample sizes are relatively small. The observed difference between paying and not paying
employees to attend after-hours training sessions may simply have arisen by chance.
Later on, various statistical procedures will be described that can be used to determine
whether such results could have arisen by chance.

How many people need to respond to the survey?

What is of interest is not the sample results but rather whether we can draw any
meaningful inferences concerning the larger population from which the sample was
drawn. That really is the point of a survey: in most cases it is physically impossible or
prohibitively expensive (or both) to interview each person in the group you are interested
in. Samples of inappropriate size can lead to misleading results, inaccurate
interpretations, and ineffective actions.
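
The required sample size depends on how precisely you want to estimate the quantity of interest. As a rough, hedged illustration only, the sketch below applies the standard formula for estimating a proportion, n = z^2 p(1-p) / e^2, together with a finite-population correction; the function name and the figures are purely illustrative and are not part of any prescribed procedure.

import math

def sample_size_for_proportion(margin_of_error, z=1.96, p=0.5, population=None):
    # Standard formula for estimating a proportion: n = z^2 * p * (1 - p) / e^2,
    # with an optional finite-population correction for small populations.
    n = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

print(sample_size_for_proportion(0.05))                  # about 385 for +/-5% at 95% confidence
print(sample_size_for_proportion(0.05, population=500))  # about 218 if the whole population is only 500 people

In other words, precision, not the sheer size of the population, is what mainly drives the number of responses you need.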

(c) Constructing Survey Items

Survey questions should be:

Clearly written. Statements should be short, to the point and easy to read. Jargon,
technical terms or unfamiliar acronyms should be avoided.

Concise. Get to the point as quickly as possible: wordy questions are distracting and
could easily defeat the purpose of the survey.

Specific. Focus on one idea at a time. Each item should collect information on a single
behavior, attitude, opinion, event or subject.

Explicit. Do not force people to guess about what is being asked. Be sure they
understand what information you want by explicitly stating so. If necessary, highlight or
underline what is needed by way of an answer.

Selecting Response Formats

Along with the statements and questions, you need to provide methods for people to give
their answers. Typically, survey items are used to ask people how much they agree with
some statement, how important something is, or how often something happens.

Rating Scales. Surveys often ask that products or services be rated according to some
scale. Some survey items present statements and ask people to rate how much they agree
or disagree with the statements.

For example: on a scale of one to five, indicate whether your supervisor encourages
subordinates to participate in important decisions.

1 = Strongly disagree; 2 = Disagree; 3 = Neither agree nor disagree; 4 = Agree;
5 = Strongly agree.

When creating rating scales, ensure that the end points (or anchors) are equal and
opposite in meaning. Failure to do so runs the risk of biasing the survey responses.

Note that when the survey results are tabulated you will obtain a mean rating of,
say, 4.25; in terms of the way the above rating scale is defined, that would indicate that
most respondents took the view that their supervisors did encourage their participation in
making important decisions. You may also calculate the associated standard deviation,
which indicates the extent to which individual responses deviate from the mean value.
The higher the standard deviation, the greater is the dispersion of responses, and vice
versa. Note, finally, that the mean and standard deviation are measured in the same units,
hence the preference for this statistic over the variance.5
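
By way of a purely hypothetical illustration, the short Python sketch below computes the mean rating and standard deviation for a handful of made-up responses to the five-point item above; the numbers carry no significance beyond showing the calculation.

import statistics

# Hypothetical responses to the five-point agreement item described above
responses = [5, 4, 4, 5, 3, 4, 5, 5, 4, 4]

mean_rating = statistics.mean(responses)   # 4.3 for this made-up data
std_dev = statistics.stdev(responses)      # sample standard deviation, about 0.68
print(f"Mean rating: {mean_rating:.2f}; standard deviation: {std_dev:.2f}")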

Ranking Items. Another common response choice is to ask respondents to rank-order a
list of options in terms of some factor, importance, for example. These data help to
prioritise what is most important to respondents. Thus, if speed of service and cost of a
restaurant meal are ranked higher than quality of food and variety on the menu, efforts
can be focused on those aspects most important to respondents. Consider the following
example:

Please rank the following five objectives in terms of importance by marking
a 1 next to the most important objective, a 2 next to the second most
important objective, and so forth:

--------Achieving a quick success
--------Increasing the amount of output
--------Reducing the price charged to customers
--------Reducing the work backlog
--------Reducing the number of defects

Selecting Options. This response format presents a list of statements or options to which
respondents are asked to circle one or more items that apply to them. This format is
similar to the ranking question, but does not require survey respondents to put things in
any particular order; this kind of question is easier to respond to than a rank-order
question, for in many cases respondents are unable to prioritise when they feel that
everything is of equal importance.

5 The standard deviation, usually denoted by the lower case Greek letter sigma (σ), is the square root of the variance;
hence the variance is the square of the standard deviation and thus is measured in the square of the data unit. For
example, if the data are measured in per cent, then so too will be the standard deviation, while the variance will be in
per cent squared, a unit of measure that is not easily grasped or understood.

Comments and Open-ended Questions. The fourth question/response type allows
respondents to provide additional comments or other information in response to general
questions. These questions usually leave blank space where respondents can write
whatever is important in their own words and format. Examples are listed below.

Do you have any suggestions on how we can improve classroom lectures?

Are there any products or services you need that we do not currently provide?

Is there anything else you would like us to know?

Demographic Questions. Demographic information is used mainly to segment
respondents into narrower groups based on specific characteristics such as age, level of
education, marital status or salary level. Segmentation is important if one of the purposes
of the survey is to determine whether significant differences in responses exist between
groups.

Which one of the following age categories do you currently fall into?

--------Under 20
--------20-29
--------30-39
--------40-49
--------50-59
--------Over 59

What is the highest educational level you have attained?

-----Less than secondary school diploma
-----Secondary school diploma
-----Associate Degree
-----Bachelor’s Degree
-----Master’s Degree
-----Doctoral Degree

What is your marital status?

-----Single (never married)
-----Married
-----Divorced/Separated
-----Widowed

What is your current position and salary?

Demographic items can be included that ask people to identify themselves, that is, give
their names, where they live or personal information relevant to the survey. The more
detail you require, the more intrusive people perceive the survey to be. People who feel
uncomfortable providing detailed personal information are unlikely to answer questions
honestly or may decline to answer them at all. To assuage privacy concerns, you should
include a description of how the survey answers will be used and a promise to keep
individual responses anonymous. ‘The information you are providing me with will be
used exclusively in an MBA dissertation designed to test the importance customers attach
to the quality of service they receive at this supermarket, and for no other purpose.’

(d) Reviewing Items

After developing a set of potential survey questions and response scales, review them to
make sure that they are:

Relevant to the purpose of the survey. Items that stray from the purpose will not
provide the information needed. You must have a specific reason why an item is being
asked in a survey. Always focus on the purpose of the survey and the type of information
needed to support that purpose. Carefully match the items to your survey purpose to
ensure that they address the issues that have been identified.

Appropriate for the individuals being surveyed. Do not include items that people do
not have the knowledge to answer. For example, store customers could probably answer
questions about a store’s layout; but they would not be able to answer questions about a
store’s compliance with health or safety regulations.

Capable of providing the appropriate type of results. Anticipate how the information
being developed will be summarised. Summaries should provide the types and level of
information required by the survey users. Will the results be presented in simple bar
charts or subjected to further analysis? How much detail will be required to meet the
information needs of those using the survey? If, for example, survey users are interested
in general impressions only, it is a waste of time to calculate averages to seven decimal
places. If, on the other hand, precise distinctions among quality features are required,
then just providing a list of verbatim comments is unlikely to be helpful either.

Check items to ensure that they are not:

Ambiguous. Avoid words or phrases that can be easily misinterpreted.

Overlapping. Avoid presenting response choices that overlap. Overlapping choices can
lead to confusion on the part of the respondent and difficulty in interpreting information.

Circle the number that best represents the number of hours per week you spend on
preparing your assignments:

1. None at all
2. Less than one hour per week
3. One to two hours per week
4. Two to three hours per week
5. Three to four hours per week
6. Four or more hours per week

Double-barreled. Avoid having respondents address two different issues in the same
item.

Did your teacher listen carefully to your question and did s/he answer it promptly?

Leading. Avoid giving clues that point to the desired answer, or limiting the answers to
those desired.

Do you think you are being forced to spend too much time on your accounting
lectures?

Redundant. Avoid duplication, that is, asking the same question more than once.

(e) Administering the Survey

Written surveys are typically administered by mail and in-person.

Mail-Out Surveys. The most common method of administration is to mail surveys to the
customer sample with a stamped reply envelope. When using the mail-out process, allow
time for the survey to get to its destination, time for the respondent to complete it, and
time for the survey to be returned. Surveys mailed to distant (i.e., overseas) addresses
will obviously require considerably more time for their return than those posted locally.

In-Person Surveys. In-person (or face-to-face) surveys can be done in different ways. One
way is to ask respondents to complete an on-the-spot survey; use only very short surveys
in this situation. A second way is to ask respondents to come to a particular location to
complete a survey. A third is to visit respondents at their homes or work sites and ask
them to complete the survey.

(f) Analyzing the Survey Results

After surveys have been administered, you need to summarize, analyze, and interpret the
results. This requires sorting and consolidating individual responses to survey items so
that they can be more easily displayed and understood.

Frequency distributions. Frequency distributions are a very simple method of
displaying the variation in responses to survey items. These distributions can be
developed by counting and recording answers according to the response scales used in
the survey. Frequency distributions are typically presented as tables or bar graphs for ease
of interpretation.

Exhibit 4
Example of a Table Showing a Frequency Distribution of Responses to Survey Items

Exhibit 5
Example of a Frequency Distribution Presented as a Graph

Percentages. One of the simplest ways to summarize survey information is with
percentages. Percentages are calculated by dividing the total of a specific response
choice by the total number of responses and multiplying by 100. Percentages can be
displayed using tables, bar graphs, or pie charts.
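
A minimal sketch of how a frequency distribution and the associated percentages might be tabulated is given below; the responses are invented purely for illustration.

from collections import Counter

# Invented answers to a single five-point survey item
answers = [4, 5, 3, 4, 4, 2, 5, 4, 3, 5, 4, 1, 4, 5, 3]

counts = Counter(answers)          # frequency distribution: response value -> count
total = len(answers)

for choice in sorted(counts):
    percentage = counts[choice] / total * 100
    print(f"Response {choice}: {counts[choice]} respondents ({percentage:.1f}%)")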

Exhibit 6
Example of Percentages Presented as Pie Chart:
Areas Where Possible Synergies in Mergers and Acquisitions Exist

Source: KPMG Mergers and Acquisitions Report

Line Graph: A line graph typically presents data organised against time. Shown below
is Tesco’s share price from 15 June 2004 to 16 January 2009, measured in pence per
share.
Exhibit 7
Example of Data Presented as a Graph
Tesco’s Share Price

Source: Yahoo Finance

Sampling

Since surveys can be expensive in terms of printing, mailing, and data entry costs, it is
common to select a subset or small group of people from which to gather data. This
subset or small group is known as a sample. The people targeted to fill out a survey are
chosen from a specific population. A population consists of all members of an
organization or group of people who possess the desired traits, knowledge, experience, or
characteristics of interest to the survey project.

A convenient rule of thumb for selecting the size of a sample when the size of the
population is known (for example, the employees eligible for corporate training sessions)
is to randomly select 10-20 per cent of the members of the population being investigated.

Random sampling means that each member of the population has an equal chance of
being surveyed. Random sampling improves the probability that information obtained
through the survey will represent the responses the entire population would give.
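
A simple random sample can be drawn with nothing more than Python’s standard library; the sketch below is a hedged illustration only, using an invented list of employee identifiers and the 10-20 per cent rule of thumb mentioned above.

import random

# Invented population: 200 employees identified by staff number
population = [f"EMP{n:03d}" for n in range(1, 201)]

random.seed(42)                                    # fixed seed so the illustration is reproducible
sample_size = int(0.15 * len(population))          # 15 per cent, within the 10-20 per cent rule of thumb
sample = random.sample(population, sample_size)    # every member has an equal chance of selection

print(len(sample), "employees sampled:", sample[:5], "...")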

One way to simplify the situation is to think of respondents in terms of segmentation.
Segmentation means to sort respondents into groups based on similar characteristics, such
as relationship to the organization (internal, end-user, supplier), position, or type(s) of
products or services used. Sorting respondents into groups is a commonly used method
to identify and distinguish the experiences and perceptions of distinct groups.
Demographic items are used in surveys to identify respondents so that their data can be
sorted and analyzed as needed.

We conclude this section with an example of a recent survey undertaken to better
understand the factors that influence how Chief Financial Officers (CFOs) at leading US
companies view dividend policy and share buybacks.6 The authors of the survey sent out
questionnaires to leading CFOs who are members of Financial Executives International,
an association that includes both publicly traded and privately owned companies. A
number of different procedures were used to deliver the questionnaire, and incentives
were offered to elicit a high response rate. The authors report a response rate of 16 per
cent, which is more or less typical for these types of surveys. (It would appear that the
incentives had little or no impact on whether executives were prepared to complete and
return the survey.)

Follow-up interviews were conducted by the authors, mainly by telephone, though several
were conducted face-to-face. Interviews lasted between 40 minutes and more than two
hours. The authors report that CFOs were ‘remarkably candid and straightforward’.
Interviewees were deliberately not chosen at random, so that the researchers could capture
cross-sectional differences in firm characteristics and payout policies. Because dividend
cuts are rare (as financial theory predicts), companies that reduced, or contemplated
reducing, their dividends were over-sampled.
6 Brav, Graham, Harvey and Michaely (2005): “Payout Policy in the 21st Century,” Journal of Financial Economics,
77, and Brav, Graham, Harvey and Michaely (2008): “The Effect of the May 2003 Dividend Tax Cut on Corporate
Dividend Policy: Empirical and Survey Evidence,” National Tax Journal, LXI.

These additional results were then integrated with the survey evidence to ‘reinforce and
clarify the survey responses but occasionally to provide a counterpoint.’ The results,
presented below, are pretty much self-explanatory. In some places they confirm previous
theories as to dividend policy – firms regard dividend policy as being extremely
important and are reluctant to reduce or suspend dividends unless forced by
circumstances to do so – while offering interesting insights into factors that influence
share buybacks, the principal alternative to dividend payout as a way of returning cash to
shareholders.

The information provided by such surveys adds texture to the more theoretical and
quantitative studies that are characteristic of financial research. It is difficult to assess the
merits of this particular survey. True, great care went into its design and delivery:
questions were pre-tested so as to note how long the survey took to complete and to
provide feedback to ensure the researchers were on the right track. This process resulted
in the rewording of several questions and the deletion of one quarter of the original
content. The authors also tested whether the order in which the questions were asked
mattered (it didn’t), and whether too many sub-parts might result in ‘burn-out’, that is,
affect the quality of the responses (again, it didn’t).

Despite the care that went into its preparation, and the substantial costs incurred, we
cannot conclude it is the last word on the subject. Previous surveys, equally carefully
crafted, provide conflicting evidence on a number of other important financial issues
(capital budgeting techniques, for example), though whether these differences are
meaningful is another question. Even large differences in the reported results do not
constitute evidence that the indicated effect could not have arisen by chance. The survey
is thus a source of good news, in that many venerable propositions concerning dividend
payout policy were vindicated, and of bad news, in that many important conclusions found
in the relevant literature are not supported by CFO views. Again, additional research will
be needed before we can confirm (or reject) the significance of these findings.

Exhibit 8
Financial Executive Views Concerning Payout Policy

For each issue, the first entry summarises executives’ views on dividends, the second on
share repurchases.

Historical level
  Dividends: Very important. Do not cut dividends except in extreme cases.
  Repurchases: Historical level is not very important.

Flexibility
  Dividends: Sticky, inflexible, smoothed through time.
  Repurchases: Very flexible. Smoothing not needed.

Consequence if increased
  Dividends: Little reward for increasing.
  Repurchases: Stock price increases when a repurchase plan is announced.

Consequence if reduced
  Dividends: Big market penalty for reducing or omitting.
  Repurchases: Little consequence from one year to the next, though firms try to complete plans.

Target
  Dividends: Most common target is the level of dividends, followed by payout ratio and growth of dividends. Target is viewed as rather flexible.
  Repurchases: Most common target is the dollar amount of the repurchase, a very flexible target.

Relation to external funds
  Dividends: External funds would be raised before cutting dividends.
  Repurchases: Repurchases would be reduced before raising external funds.

Relation to investment
  Dividends: First maintain the historic dividend level, then make incremental investment decisions.
  Repurchases: First make investment decisions, then make repurchase decisions.

Earnings quality
  Dividends: Dividend increases tied to permanent, stable earnings.
  Repurchases: Repurchases increase with permanent earnings but also with temporary earnings.

Substitutes?
  Dividends: At the margin, do not reduce repurchases in order to increase dividends.
  Repurchases: At the margin, reduce dividend increases (not the level) in order to increase repurchases.

Taxes
  Dividends: Tax disadvantage of dividends of secondary importance.
  Repurchases: Tax advantage of repurchases of secondary importance.

Convey information?
  Dividends: Dividends convey information.
  Repurchases: Repurchases convey information.

Signal?
  Dividends: Dividends are not a self-imposed cost used to signal firm quality or separate from competitors.
  Repurchases: Repurchases are not used as a self-imposed cost to signal firm quality or separate from competitors.

Retail investors
  Dividends: Retail investors like dividends even if tax disadvantaged. Retail investors like dividends about the same as institutions like dividends.
  Repurchases: Retail investors like repurchases less than they like dividends.

Institutional investors
  Dividends: Institutions generally like dividends but are not sought out to monitor the firm.
  Repurchases: Institutions generally like repurchases about the same as they like dividends.

Stock price
  Dividends: Not important.
  Repurchases: Repurchase shares when the stock is undervalued by the market.

EPS
  Dividends: Not important.
  Repurchases: Repurchasing in an attempt to increase EPS is very important.

Stock options
  Dividends: Not important.
  Repurchases: Repurchasing to offset dilution is important.

Cash on balance sheet
  Dividends: Not important.
  Repurchases: Used to reduce cash holdings when cash is sufficiently high.

Float or liquidity
  Dividends: Not important.
  Repurchases: Do not repurchase if float is not sufficient.

Mergers and acquisitions
  Dividends: Not important.
  Repurchases: Important.

Takeovers
  Dividends: Not important.
  Repurchases: Important.

Cash cows
  Dividends: Expected to pay dividends.
  Repurchases: Expected to return capital, including repurchasing shares.

If we were starting over…
  Dividends: … we would keep the dividend commitment minimised.
  Repurchases: … we would rely heavily on repurchases to return cash to shareholders.

Non-payers will initiate when…
  Dividends: … earnings become positive and stable; institutions demand dividends; they have fewer investment opportunities available.
  Repurchases: … the market is undervaluing their stock; they have extra cash on the balance sheet; institutions demand repurchases; they have fewer profitable investments; they think that repurchases can increase EPS or offset stock option dilution.

Source: Brav et al. (2005).

6.2 Statistical Inference

When you can measure what you are speaking about and express
it in numbers you know something about it; but when you cannot
measure it, when you cannot express it in numbers, your knowledge
is of a meagre and unsatisfactory kind. Lord Kelvin7

Statistical methods are at the heart of economic and financial analysis. They provide the
basis for some of the discipline’s best known theories and the analytical means to
differentiate sound from flawed explanations. Since this is not meant to be a quantitative
Guide, our discussion will, accordingly, be limited to those topics that are central to a
proper understanding of the way statistical procedures can best be used in MBA
dissertations.

One of the key abstractions used by financial economists is the assumption of certainty,
that is, that only one outcome is possible and that outcome is widely known. Given
certainty, it is easy to predict how a rational, self-interested individual will behave.

Things become considerably more complicated once uncertainty – the possibility of more
than one outcome – is introduced into the analysis. Uncertainty compels us to act ‘upon
opinion rather than knowledge’, as the eminent University of Chicago economist Frank
Knight once observed. If uncertainty is the driving force behind the quest for knowledge,
then statistics provides the means for accessing that knowledge, and it does this by
accounting for uncertainty in a formal, mathematical way.

Seen this way, knowledge may be defined as a competition between different
descriptions of how the world works, with the competition decided by data. Statistical
analysis allows for logical and rigorous management of this competition, by enabling
researchers to draw conclusions about a large number of events or the properties of a
population from a sample of those events or from the population itself.

Classical statistics is based on the use of observed frequencies of different events to make
inferences about the population or to test hypotheses. An essential tool of classical
statistics is the null hypothesis. The importance of the null hypothesis to statistical
inference derives from the fact that it is frequently impossible to prove a hypothesis
directly. To obviate this difficulty, a null hypothesis is constructed that is the
complement of the hypothesis of interest. We use available data to assess the likelihood
that the null hypothesis is true. As the probability that the null hypothesis is true
decreases, it becomes a less likely description of how the world works. At some point,
the null hypothesis is considered to be disproved and is rejected, leaving the original
hypothesis standing.

By scholarly convention, the null hypothesis is rejected when the chance of observing the
data, given that the null hypothesis is true, is less than 5 per cent. The 5 per cent
criterion has been attacked as being arbitrary; some natural and social scientists propose
7 PLA, vol. 1, "Electrical Units of Measurement", 3 May 1883.

even more rigorous cut-off values, say, 1 per cent, though this standard, too, can be
dismissed as being equally arbitrary. The simple fact is that the 5 per cent value remains
the most widely used criterion; this consensus is actually quite beneficial since it means
that 5 per cent has become the more or less universal standard for evaluating statistical
evidence.

A simple example will help to illustrate the way the null hypothesis is used. Suppose you
are given a coin and asked to determine whether it is fair. If the coin is fair, then when it
is tossed we would expect “heads” to come up as often as “tails”: since the coin has two
sides, for each toss the likelihood of one or the other side coming up should be the same.
One toss cannot settle the matter; nor would we reject the idea of it being a fair coin if
two tosses both turned up heads. How would things change if after ten tosses six heads
came up, or after 100 tosses 60 came up as heads? Can we still consider the coin to be
fair? And how do we interpret this information?

From the perspective of classical statistics, the first thing we do is to construct the null
hypothesis, which says that the coin is fair, and then determine the probability of
observing 60 per cent or more heads as a function of the number of tosses. After ten
tosses there is roughly a 38 per cent chance of observing six or more heads; after 100
tosses, the chance of heads appearing 60 per cent or more of the time declines to around 3
per cent. Knowing this, we would reject the null hypothesis that the coin is fair. We have
not proved that the coin is biased, only that, if the coin were fair, the chance of observing
such data would be less than 5 per cent.
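
The probabilities quoted above can be checked directly from the binomial distribution. The sketch below, which assumes the scipy library is available, computes the chance of seeing 60 per cent or more heads from a fair coin after 10 and after 100 tosses.

from scipy.stats import binom

# Probability of observing 60 per cent or more heads from a fair coin (p = 0.5)
for n_tosses, min_heads in [(10, 6), (100, 60)]:
    p = binom.sf(min_heads - 1, n_tosses, 0.5)   # P(X >= min_heads): survival function evaluated at min_heads - 1
    print(f"{n_tosses} tosses: P(at least {min_heads} heads) = {p:.3f}")

# Prints roughly 0.377 and 0.028, consistent with the figures cited in the text.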

Choosing the right test to compare measurements is not quite as straightforward as
standard statistics texts suggest, as you must choose between two broad families of tests.
Many statistical tests are based upon the assumption that the data are sampled from a
Gaussian (or Normal) distribution. Such tests are referred to as parametric tests;
commonly used parametric tests are shown in the first column of the table below and
include the t-test and analysis of variance.

Tests that make no assumptions about the population distribution are referred to as non-
parametric tests. All commonly used nonparametric tests rank the outcome variable
from low to high and then analyze the ranks. These tests are listed in the second column
of the table; they are also called distribution-free tests.

How does one choose between parametric and nonparametric tests? Sometimes it is very
easy to do so, but not always. You would choose a parametric test if you were sure the
data you collected are sampled from a population that follows a Gaussian distribution (at
least approximately).8

Nonparametric tests should be selected in the following three situations:

(1) If the outcome is a rank or a score then the population is clearly not Gaussian;

8 There are several tests that can be used to determine whether the data you are studying follow a normal
(Gaussian) distribution. The best known of these are the Kolmogorov-Smirnov and Shapiro-Wilk tests.

(2) If some values are ‘off the scale,’ that is, too high or too low to measure. Even if the
population is Gaussian, it is impossible to analyze such data with a parametric test since
you don't know all of the values. Using a nonparametric test with these data is simple.
Assign values too low to measure an arbitrary very low value and assign values too high
to measure an arbitrary very high value. Then perform a nonparametric test. Since the
nonparametric test only knows about the relative ranks of the values, it won't matter that
you didn't know all the values exactly; and

(3) When the data are measurements, and you are sure that the population is not normally
distributed. If the data are not sampled from a Gaussian distribution, consider whether
you can transform the values to make the distribution become Gaussian. You might, for
example, take the logarithm or reciprocal of all values. Financial researchers long ago
concluded that stock returns are not normally distributed, but when converted to
logarithms they correspond more closely to a Gaussian distribution.
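
As a hedged illustration of the point about log returns and normality testing, the sketch below converts a short series of invented share prices into log returns and applies the Shapiro-Wilk test mentioned in the footnote above; in practice you would use a much longer price history (numpy and scipy are assumed to be available).

import numpy as np
from scipy.stats import shapiro

# Invented daily closing prices for a share
prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1, 101.9, 104.2, 105.0, 103.7, 106.1])

log_returns = np.diff(np.log(prices))    # continuously compounded (log) returns

stat, p_value = shapiro(log_returns)     # Shapiro-Wilk test of normality
print(f"Shapiro-Wilk p-value: {p_value:.3f}")
# A p-value below 0.05 would lead us to reject normality; above 0.05 it gives no grounds to do so.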

It is not always easy to decide whether a sample comes from a Gaussian population. If
you collect many data points (more than 100), look at the distribution of data and it will
be fairly obvious whether the distribution is approximately bell shaped (Gaussian). With
a smaller number of data points, it will be difficult to tell by inspection alone whether the
data are Gaussian; even formal tests cannot always discriminate between Gaussian and
non-Gaussian distributions. You should look at previous data as well.

It bears repeating that what matters most is the distribution of the overall population, and
not the distribution of your sample. In deciding whether a population is Gaussian, look at
all available data, not just data in the current experiment. Consider the source of scatter.
When the scatter comes from the sum of numerous sources (with no one source
contributing most of the scatter), you would expect to find a roughly Gaussian
distribution.

When in doubt, some people choose a parametric test – because they aren't sure the
Gaussian assumption is violated – while others choose a nonparametric test for precisely
the opposite reason.

Does it matter whether you choose a parametric or nonparametric test? The answer
depends on the sample size. Four cases are worth considering in this connection:

• Large samples. What happens when you use a parametric test with data from a
non-Gaussian population? The Central Limit Theorem ensures that parametric
tests work well with large samples even if the population is non-Gaussian. In
other words, parametric tests are robust to deviations from Gaussian distributions,
so long as the samples are large. The snag is that it is impossible to say how large
is large enough, as it depends on the nature of the particular non-Gaussian
distribution. Unless the population distribution is really weird, you are probably
safe choosing a parametric test when there are at least two dozen data points in
each group.

• Large samples. What happens when you use a nonparametric test with data from a
Gaussian population? Nonparametric tests work well with large samples from
Gaussian populations. The p values tend to be a bit too large, but the discrepancy
is small. In other words, nonparametric tests are only slightly less powerful than
parametric tests with large samples.

• Small samples. What happens when you use a parametric test with data from non-
Gaussian populations? You can't rely on the central limit theorem, so the p value
may be inaccurate.

• Small samples. When you use a nonparametric test with data from a Gaussian
population, the p values tend to be too high. The nonparametric tests lack
statistical power with small samples.

Large data sets, in short, present no problems. It is usually easy to tell if the data come
from a Gaussian population, but it doesn't really matter because the non-parametric tests
are so powerful and the parametric tests are so robust. Small data sets, on the other hand,
create a dilemma. It is difficult to tell if the data come from a Gaussian population, and it
matters: non-parametric tests are not powerful and parametric tests are not robust.
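
To make the trade-off concrete, the short Python sketch below runs a parametric and a
nonparametric test on the same pair of small samples; the data are simulated purely for
illustration, so the group names and parameter values are assumptions rather than anything
drawn from this Guide.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two small simulated samples, e.g. monthly returns (per cent) for two groups of funds
group_a = rng.normal(loc=5.0, scale=2.0, size=12)
group_b = rng.normal(loc=6.5, scale=2.0, size=12)

t_stat, p_parametric = stats.ttest_ind(group_a, group_b)      # assumes Gaussian populations
u_stat, p_nonparametric = stats.mannwhitneyu(group_a, group_b,
                                             alternative='two-sided')  # uses ranks only

print(p_parametric, p_nonparametric)

With samples this small the two p-values can differ noticeably, which is precisely the dilemma
described above.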

With many tests, you must choose whether you wish to calculate a one- or two-sided p
value (also known as a one- or two-tailed p value). Let's review the difference in the
context of a t-test (see below). The p value is calculated for the null hypothesis that the
two population means are equal, and any discrepancy between the sample means is due to
chance. If the null hypothesis is true, the one-sided p value is the probability that two
sample means would differ as much as was observed (or further) in the direction
specified by the hypothesis just by chance, even though the means of the overall
populations are actually equal. The two-sided p value also includes the probability that
the sample means would differ that much in the opposite direction (that is, the other
group has the larger mean). The two-sided p value is, not unexpectedly, twice the one-
sided p value.

A one-sided p value is appropriate when you can state with certainty (and before
collecting any data) that there either will be no difference between the means or that the
difference will go in a direction you can specify in advance (that is, you have specified
which group will have the larger mean). If you cannot specify the direction of any
difference before collecting data, then a two-sided p value is more appropriate. When in
doubt, select a two-sided p value.

If you select a one-sided test, you should do so before collecting any data and you need to
state the direction of your experimental hypothesis. If the data go the other way, you
must be willing to attribute that difference (or association or correlation) to chance, no
matter how striking the data.

Exhibit 9
Selecting a Statistical Test

The appropriate test depends on the type of data and the goal of the analysis. For each goal, the
four entries below give the test for: (1) measurement data from a Gaussian population; (2) rank,
score or measurement data from a non-Gaussian population; (3) binomial data (two possible
outcomes); and (4) survival times.

Describe one group: (1) Mean, SD; (2) Median, interquartile range; (3) Proportion; (4) Kaplan-Meier
survival curve.

Compare one group to a hypothetical value: (1) One-sample t test; (2) Wilcoxon test; (3) Chi-square
or binomial test**.

Compare two unpaired groups: (1) Unpaired t test; (2) Mann-Whitney test; (3) Fisher's test
(chi-square for large samples); (4) Log-rank test or Mantel-Haenszel*.

Compare two paired groups: (1) Paired t test; (2) Wilcoxon test; (3) McNemar's test; (4) Conditional
proportional hazards regression*.

Compare three or more unmatched groups: (1) One-way ANOVA; (2) Kruskal-Wallis test; (3) Chi-square
test; (4) Cox proportional hazard regression**.

Compare three or more matched groups: (1) Repeated-measures ANOVA; (2) Friedman test; (3) Cochrane
Q**; (4) Conditional proportional hazards regression**.

Quantify association between two variables: (1) Pearson correlation; (2) Spearman correlation;
(3) Contingency coefficients**.

Predict value from another measured variable: (1) Simple linear regression or nonlinear regression;
(2) Nonparametric regression**; (3) Simple logistic regression*; (4) Cox proportional hazard
regression*.

Predict value from several measured or binomial variables: (1) Multiple linear regression* or
multiple nonlinear regression**; (3) Multiple logistic regression*; (4) Cox proportional hazard
regression*.

6.2.a Chi-Square

One important set of statistical tests allows us to test for deviations of observed
frequencies from expected frequencies. To introduce these tests, we will start with a
simple example: we want to determine if a coin is fair. In other words, are the odds of
tossing the coin heads-up the same as tails-up? We conduct an experiment by flipping the
coin 200 times; it landed heads-up 108 times and tails-up 92 times. At
first glance, we might suspect that the coin is biased because heads turned up more often
than tails. However, to determine whether the observed differences are significant or
could have arisen by chance we utilise a chi-squared test.

To perform a chi-square test - or for that matter any other statistical test - we must first
formulate the null hypothesis. In the example to hand, our null hypothesis is that for each
toss the coin should be equally likely to land heads-up or tails-up each time. The null
hypothesis allows us to state expected frequencies: for 200 tosses, we would expect 100
heads and 100 tails.

The following table summarises the results of our experiment:

Heads Tails Total


Observed 108 92 200
Expected 100 100 200
Total 208 192 400

The observed values are those we gather ourselves; the expected values are those
frequencies expected based on our null hypothesis. We sum the rows and columns as
shown in the table. It is always a good idea to make sure that the row totals equal the
column totals (both total to 400 in this example).

Statisticians have devised the chi-square test as a way to determine if a frequency
distribution differs from the expected distribution. Chi-squared is calculated according to
the following formula:

χ2 = ∑ (observed – expected)2 / expected

We have two classes to consider in this example, namely, heads and tails.

χ2 = (108 – 100)2/100 + (92 – 100)2/100 = (8)2/100 + (–8)2/100 = 0.64 + 0.64 = 1.28

We next consult a table of critical values of the chi-squared distribution; a portion of the
table is presented below.

df/prob. 0.99 0.95 0.90 0.80 0.70 0.50 0.30 0.20 0.10 0.05
1 0.00013 0.0039 0.016 0.064 0.15 0.46 1.07 1.64 2.71 3.84
2 0.02 0.10 0.21 0.45 0.71 1.39 2.41 3.22 4.60 5.99
3 0.12 0.35 0.58 1.00 1.42 2.37 3.66 4.64 6.25 7.82
4 0.3 0.71 1.06 1.65 2.20 3.36 4.88 5.99 7.78 9.49
5 0.55 1.14 1.61 2.34 3.00 4.35 6.06 7.29 9.24 11.07

The left-most column lists the degrees of freedom (df), which are determined by
subtracting one from the number of classes. In the example to hand, we have
two classes (heads and tails), so the number of degrees of freedom is 1, and our chi-squared
value is 1.28.

Now, move across the row for 1 df until we find critical values that bound our chi-squared
value of 1.28: in this case, 1.07 (corresponding to a probability of 0.30) and 1.64
(corresponding to a probability of 0.20). Interpolating gives an estimated probability of
roughly 0.27. In other words, if the coin were fair, the probability of observing a departure
from the expected frequencies at least as large as 108 heads in 200 tosses is about 27 per
cent. Because this probability is well above the conventional 0.05 cut-off, we cannot reject
the null hypothesis, and we conclude that the data are consistent with the coin being fair.
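
The same calculation is easy to reproduce in Python; the sketch below feeds the coin-toss
counts from the example into scipy's chisquare function, which returns the test statistic and
the exact p-value (about 0.26, close to the 0.27 interpolated from the table).

from scipy.stats import chisquare

observed = [108, 92]          # heads, tails
expected = [100, 100]         # expected frequencies under the null hypothesis of a fair coin

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(stat, p_value)          # chi-squared = 1.28, p roughly 0.26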

6.2.b Testing the Difference Between Two Means: The t-Test

To introduce this technique, let’s revert to the simple example given above that training
affects worker productivity and by extension firm profitability. After collecting the data
we conclude that before receiving training the average hourly level of productivity
amounted to 3.882 units, with a variance of 2.743; the corresponding figures after training
were 5.353 and 2.985. We would like to know whether the observed differences in output
could have arisen by chance or are reflective of the impact of training on productivity. 9

We noted above that the magnitude of the effect by itself should not be taken as evidence
that the differences are significant. Of course, the larger the sample, the greater is the
likelihood of the difference being significant. In the case to hand, let us assume we are
using a fairly small sample of, say, 17 workers. If we have followed correct procedure,
then our sample should be representative of the population of the firm’s workers who
received on-the-job training. Taking the data at face value, the benefits of training appear
quite considerable, with the productivity of workers receiving training rising by 37 per
cent compared to the productivity of untrained workers.

We next need to establish whether we can rule out the possibility this improvement could
have arisen by chance, which would, of course, signify that training has no impact at all
on productivity. To do so we make use of three critical assumptions: (1) the two
populations from which the samples were drawn have the same variance (homogeneity of
variance); (2) the underlying populations follow a Gaussian distribution; and (3) each
value is sampled independently from all other values (random sampling). As a general
matter small violations of the first two assumptions do not matter all that much; that the
third assumption is satisfied is much more important if the results are to have validity.

To determine whether the observed differences are significant we calculate the following
statistic:

t = (estimated value- hypothesised value)/estimated standard error of the statistic.

The null hypothesis is that training does not affect the average level of worker
productivity, in which case the hypothesised value is zero.

9
More formally, we could have used either the z-test or the t-test. If the underlying population follows a
Gaussian distribution and its variance is known, we should apply the z-test; if, by contrast, the variance is
unknown and the population distribution is Gaussian or the sample is very large, then the t-test is the
correct procedure to use.

The first step in this process is calculation of the difference between the two means:
M1 – M2 = 5.3523 – 3.8824 = 1.470.
Since the hypothesised value is zero, there is no need to subtract it from the statistic. The
next step is calculation of the standard error of the statistic, S(M1 – M2). Any statistical
text will show that the standard error of the difference between two means in the
population is given by:

S(M1 – M2) = √(σ2/n1 + σ2/n2)

To derive this quantity, we estimate σ2 and use that estimate in place of the corresponding
population value in the formula. Since by assumption the population variance is the same
in both groups, we estimate it by averaging the two sample variances: MSE = (2.743 +
2.985)/2 = 2.864, where MSE is our estimate of σ2. Since n is 17 in each group, we adjust
MSE to reflect the sample sizes according to:

S(M1 – M2) = √(2MSE/n) = √(2 × 2.864/17) = 0.5805.

The next step is to compute the value of t by inserting these values into the formula given
above: t = 1.47/0.5805 = 2.533. Lastly, we compute the probability of getting a value as
large as or larger than 2.533, or as small as or smaller than –2.533. To do this, we first need
to know the number of degrees of freedom, that is, the number of independent estimates of
the variance upon which the MSE is based. This value is determined as (n1 – 1) + (n2 – 1),
where n1 is the sample size of the first group and n2 is the sample size of the second group;
since n1 = n2 = 17, the number of degrees of freedom amounts to 32. We can now use the
t-distribution to find the probability we are looking to establish.

As noted above, we use a one-tailed test where we believe there either will be no
difference between the means or that the difference will go in a direction specified in
advance (in other words, we have specified which group will have the larger mean); with
t = 2.533 and 32 degrees of freedom, only about one time in 100 could the observed
difference have arisen by chance. Therefore we can reject the null hypothesis that the
difference in means amounts to zero; the sample data confirm that training does have a
favourable impact on productivity.
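
Readers who wish to replicate the arithmetic can do so with a few lines of Python; the sketch
below works from the summary statistics quoted in the example and mirrors the pooled-variance
procedure described above.

import numpy as np
from scipy import stats

# Summary statistics from the worked example
mean_untrained, var_untrained, n_untrained = 3.8824, 2.743, 17
mean_trained, var_trained, n_trained = 5.3523, 2.985, 17

mse = (var_untrained + var_trained) / 2          # pooled estimate of the common variance
se_diff = np.sqrt(2 * mse / n_trained)           # standard error of the difference, 0.5805

t_stat = (mean_trained - mean_untrained) / se_diff      # roughly 2.533
df = (n_untrained - 1) + (n_trained - 1)                # 32 degrees of freedom
p_one_sided = 1 - stats.t.cdf(t_stat, df)               # roughly 0.008, about 1 in 100

print(t_stat, df, p_one_sided)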

To complete our discussion, we illustrate in the exhibit below how t-tests have been used
in financial research, in the case to hand an investigation into the differences in several
variables widely regarded as differentiating failed from viable banks; the data cover a
sample of Jamaican banks sorted by whether they did or did not fail between 1992 and
1993.

Exhibit 10
Mean of Selected Variables, Non-Failed and Failed Jamaican Banks, 1992-1993

Variables                                          Non-Failed      Failed     t-statistic

Gross Capital/Risk Assets                             18.69        -18.46         1.8
Loan Loss Reserve/Gross Loans                          6.78         17.78        -3.1b
Total Operating Expense/Total Operating Revenue       96.87        152.65        -2.9a
Return on Assets                                      -0.03         -8.28         2.0b
Liquid Assets/Total Assets                            40.34         45.20        -0.8
Log Total Assets                                       0.53          0.43         0.2
Change in Loans/GDP                                    0.02         -0.16         2.7a

Source: Daley, Mathews and Whitfield, “Too-Big-To-Fail: Bank Failure and Banking Policy in Jamaica,”
Cardiff Business School Working Paper, E2006/4 (January).

With one exception the data relate to a number of financial variables specific to each
bank in the sample. The results are pretty much as would be expected: failed banks have
much less capital relative to risk assets, a lower return on assets, much higher loan loss
provision and operating expense ratios, and are generally smaller than viable banks.
Once these differences are subjected to formal statistical analysis, we observe that neither
size, liquidity nor capital adequacy matters; that is, the differences noted in the table for these
variables appear to have arisen by chance. The efficiency and return measures, on the other
hand, all appear to reflect genuine differences between the two categories of banks.

To summarise:

We use a one-tailed test when the research hypothesis predicts a significant difference
between two groups and the direction of the difference.

Hours spent studying improves finance exam results.

We use a two-tailed test when the research hypothesis predicts a significant difference
between two groups, but not the direction of the difference.

Male and female students differ in the number of hours they devote to
revising for their finance exam.

The critical value for the rejection of the null hypothesis is calculated differently
depending upon whether the hypothesis is one- or two-tailed.

6.2.c Testing the Difference of More than Two Means: Analysis of Variance

Where we are interested in determining whether two means differ significantly from each
other, the appropriate statistical procedure is to use the t-test. Where, on the other hand,
we are interested in establishing whether significant differences exist among three or

more means, then the appropriate procedure is to use analysis of variance or ANOVA as
it is commonly called. Why?

To answer this question we need to investigate more closely the meaning of a p-value.
When interpreting a p-value, we may conclude there is a significant difference between
groups if the p-value is small enough, with 5 per cent typically used as the cut-off value.
In this case, 5 per cent is the significance level, or the probability of a type I error – the
chance of incorrectly rejecting the null hypothesis, that is, incorrectly concluding that an
observed difference did not occur just by chance or, more simply, the chance of
concluding that there is a difference between two groups when in fact there is no such
difference. If multiple t-tests are carried out then the type I error rate will increase with
the number of comparisons made.

To illustrate the point, consider a study with four groups, which gives six possible pair-wise
comparisons; the number of comparisons is given by 4C2 = 4!/[2!2!] = 6, where 4! =
4x3x2x1. If the chance of a type I error in one such comparison is 0.05, then the chance
of not committing a type I error is 1.0 - 0.05 = 0.95. If the six comparisons are assumed
independent, then the chance of not committing a type I error in any of them is 0.95^6
= 0.74. Hence the chance of committing a type I error in at least one of the comparisons
is 1 – 0.74 = 0.26, which is the overall type I error rate for the analysis. Therefore there
is a 26 per cent overall type I error rate even though for each individual test the type I
error rate is 5 per cent. ANOVA is used to avoid this error.10
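
The arithmetic behind this inflation is easy to verify; the sketch below computes the number of
pair-wise comparisons and the resulting family-wise type I error rate for the four-group example.

from math import comb

groups = 4
alpha = 0.05

comparisons = comb(groups, 2)                        # 4C2 = 6 pair-wise comparisons
familywise_error = 1 - (1 - alpha) ** comparisons    # 1 - 0.95 to the power 6, roughly 0.26

print(comparisons, familywise_error)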

To understand how ANOVA is used in practice consider the following example.

Suppose we are interested in establishing whether the application of three different
fertilisers significantly affects farm yields. This could be done, for example, by a field
experiment in which each fertiliser is applied to 10 plots; the 30 plots are later harvested
with the crop yield being calculated for each plot. We now have three groups of ten yield
estimates, as shown in Exhibit 11.

Inspection of Exhibit 11 suggests that different fertilisers do appear to have a significant
impact on yields; from our previous discussion, it should be clear that these differences
could signify either that the choice of fertiliser matters or that the observed differences
arose by chance. The only way we could know for sure would be to subject the data to
formal statistical analysis.

10
Bewick, Cheek and Ball, “Statistics Review 9: One Way Analysis of Variance,” Critical Care, 8 (130-
136).

Exhibit 11
Yield per Plot for Thirty Plots Treated with Fertiliser

The variability in a set of data quantifies the scatter of the data points around the mean.
To calculate a variance, we first calculate the mean, then the deviation of each point from
the mean. Deviations will be both positive and negative, though their sum will be zero
(this follows directly from how the mean was calculated in the first place). This will be
true regardless of the size of the data set or the amount of variability within it;
accordingly, the ‘raw’ deviations do not provide a useful measure of variability. If
instead the deviations are squared before summation, then this sum is a useful measure of
variability, one that increases the greater the scatter of the data points around the
mean. This quantity is referred to as a sum of squares (SS).

To illustrate how this is done, consider the following chart (Exhibit 12), which presents
the basic data as well as the mean and the deviation of each point from the mean. Note that
the fertiliser applied to each plot is not indicated. The SS cannot, however, be used as a
comparative measure between groups, because it will clearly be influenced by the
number of data points in the group; the more data points, the greater the SS. Instead, this
quantity is converted to a variance by dividing by n − 1, where n equals the number of
data points in the group:

s2 = SS/(n − 1)

A variance is therefore a measure of variability, taking account of the size of the dataset.

You might ask, why use n – 1 rather than n? If we wish to calculate the average squared
deviation from the mean (i.e., the variance) why not simply divide by n? The reason is
that we do not actually have n independent pieces of information about the variance. The
first step was to calculate a mean (from the n independent pieces of data collected). The
second step is to calculate a variance with reference to that mean. If n − 1 deviations are
calculated, it is known what the final deviation must be, for they must all add up to zero
by definition. So we have only n − 1 independent pieces of information on the variability

about the mean. Consequently, it makes more sense to divide the SS by n − 1 than by n to
obtain an average squared deviation around the mean. The number of independent pieces
of information contributing to a statistic is referred to as the degrees of freedom.
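
The n − 1 convention is easy to see in practice; in the sketch below the yields are made-up values
for a single group of ten plots, and numpy's ddof argument switches between the two divisors.

import numpy as np

yields = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0, 4.7, 4.1, 5.3])  # illustrative values only

deviations = yields - yields.mean()
print(deviations.sum())                   # essentially zero, as noted above

ss = (deviations ** 2).sum()              # the sum of squares (SS)
print(ss / len(yields))                   # divides by n, i.e. np.var(yields, ddof=0)
print(ss / (len(yields) - 1))             # divides by n - 1, i.e. np.var(yields, ddof=1)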

Exhibit 12
Yield per Plot by Plot Number

In an ANOVA, it is useful to express the measure of variability in terms of its two
components: a sum of squares, and the degrees of freedom associated with the
sum of squares. Returning to the original question: what is causing the variation in yield
between the 30 plots of the experiment? Numerous factors are likely to be involved, such
as differences in soil nutrients between the plots, differences in moisture content, other
biotic and abiotic factors, as well as the fertiliser applied to the plot. However, it is only
the last of these that is of interest to us, so we will divide the variability between plots
into two parts: (1) the portion due to applying different fertilisers, and (2) the remainder,
which is due to all of these other factors.

To illustrate the principle behind partitioning the variability, first consider two extreme
datasets. If there was almost no variation between the plots due to any of the other
factors, so that nearly all variation was due to the application of the three fertilisers, then
the data would follow the pattern shown below.

• The first step would be to calculate a grand mean, that is, the mean value of all the
data points, and there is considerable variation around this mean.

• The second step is to calculate the three group means we wish to compare: that is,
the means for the plots given fertilisers A, B and C.

Once these means are fitted, little variation is left around the group means; in other
words, fitting the group means has removed or explained nearly all the variability in the
data. This has happened because the three means are distinct.

Now consider the other extreme, in which the three fertilisers are, in fact, identical. Once
again, the first step is to fit a grand mean and calculate the sum of squares. Second, three

group means are fitted, only to find that there is almost as much variability as before.
Little variability has been explained. This has happened because the three means are
relatively close to each other (compared to the scatter of the data). The amount of
variability that has been explained can be quantified directly by measuring the scatter of
the treatment means around the grand mean. In the first of the two examples, the
deviations of the group means around the grand mean are considerable, whereas in the
second example these deviations are relatively small.

Exhibit 13
Variability Around the Grand Mean

Having explained the principles behind an analysis of variance, let us analyse whether the
observed response differences reflect the impact on yields of applying the different
fertilisers or are due entirely to chance. This analysis requires two inputs from the
researcher, but we must first convert the data into two broad categories: Yields and
Fertilisers, with the first column sub-divided by the three different fertilisers and the
yields obtained from each.

Exhibit 14
Variability Around Three Treatment Means

The fertiliser variable is categorical, and in this sense the values 1, 2 and 3 are arbitrary.
YIELD, by contrast, is continuous, its values representing true measurements. Response
data are usually continuous, while explanatory variables may be continuous or categorical or both.

(a) The question

The question to be answered is: ‘Does fertiliser affect yield?’

This question focuses on two variables: YIELD, the data we wish to explain, and
FERTIL, the variable we hypothesise might do the explaining. YIELD therefore is the
response (or dependent) variable, and FERTIL the explanatory (or independent)
variable. It is important that the data variable is on the left hand side of the formula, and
the explanatory variable on the right hand side. It is the right hand side of the equation
that will become more complicated as we seek progressively more sophisticated
explanations of our data.

Exhibit 15
Analysis of Variance with One Explanatory Variable

(b) Output

The primary piece of output is the ANOVA table, in which the partitioning of SS and df
has taken place. This will either be displayed directly, or can be constructed by you from
the output given. The total SS has been partitioned between treatment (FERTIL) and
error, with a parallel partitioning of degrees of freedom. Each of the columns ends with
the total of the preceding terms. The calculation of the SS is displayed in Exhibit 17.
Columns M, F and Y give the grand mean, the fertiliser mean and the plot yield for each
plot in turn.

Column MY represents the deviations from the grand mean for each plot. If these values
are squared and summed, then the result is the total SS of 36.44. FY then represents the
deviations from the group mean for each plot; these values when squared and summed
give the error SS. Finally, MF represents the deviations of the fertiliser means from the
grand mean; squaring and summing gives the treatment SS. Dividing each sum of squares by its
corresponding df gives the mean square. Comparison of the two mean squares gives the
F-ratio of 5.70. The probability of getting an F-ratio as large as 5.70 or larger, if the null

hypothesis is true, is the p-value of 0.009. That is sufficiently small to conclude that these
fertilisers probably do differ in efficacy.11
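
To see where the 0.009 comes from, the sketch below converts the F-ratio and its degrees of
freedom into a p-value; the mean squares used are those implied by the figures given in this
example (an error mean square of 0.949 and, therefore, a treatment mean square of roughly 5.41).

from scipy import stats

treatment_ms = 5.41          # treatment mean square implied by the example
error_ms = 0.949             # error mean square (EMS) quoted later in this section

f_ratio = treatment_ms / error_ms          # roughly 5.70
p_value = stats.f.sf(f_ratio, 2, 27)       # upper-tail area of the F(2, 27) distribution

print(f_ratio, p_value)                    # p is roughly 0.009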

Exhibit 16
The F-distribution of 2 and 27 Degrees of Freedom
(The area to the right of 5.7 represents the probability the F-ratio is at least 5.7, and
is 0.009 of the total area under the curve.)

To construct a confidence interval, both the parameter estimate and the variability in that
estimate are required. In this case, the parameters estimated are means: we wish to know
the true mean yield to be expected when we apply fertiliser A, B or C, which we will
denote μA, μB and μC, respectively. These represent true population means, and as such
we cannot know their exact values, but our three treatment means represent estimates of
these three parameters.

The reason why these estimates are not exact is the unexplained variation in
the experiment, as quantified by the error variance, which we previously met as the error
mean square and will refer to as s2. The 95% confidence interval for a population mean
is:

group mean ± t × s/√n

where t is the two-tailed 5 per cent critical value of the t-distribution with the error degrees of
freedom (27 here), s is the pooled standard deviation and n is the number of plots in the group (10).

11
If none of the fertilisers influenced yield, then the variation between plots treated with the same fertiliser
would be much the same as the variation between plots given different fertilisers. This can be expressed in
terms of mean squares: the mean square for fertiliser (FMS) would be the same as the mean square for error
(EMS), namely, FMS/EMS = 1. The ratio of the two mean squares is the F-ratio, and is the end result of
the ANOVA. Even if the fertilisers are identical, the ratio is unlikely to exactly equal one; it could by chance
take on a whole range of values. The F-distribution represents the range and likelihood of all possible F-
ratios under the null hypothesis that the fertilisers are identical.

Exhibit 17
Calculating the SS and DF

The key point is where our value for s comes from. If we had only the one fertiliser, then
all information on the population variance would come from that one group, and s would be
the standard deviation for that group. In this instance, however, there are three groups, and
the unexplained variation has been partitioned as the error mean square. This uses all the
information from all three groups to provide an estimate of the unexplained variation, and
the degrees of freedom associated with this estimate are 27, much greater than the 9
which would be associated with the standard deviation of any one treatment. So the value
of s used is √EMS = √0.949 = 0.974. This is also called the pooled standard deviation.
Hence the 95% confidence intervals are as shown in Exhibit 18. These intervals,
combined with the group means, are an informative way of presenting the results of this
analysis, because they give an indication of how accurate the estimates are likely to be.
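
As an illustration, the sketch below computes a 95% confidence interval for one treatment mean
using the pooled standard deviation; the group mean of 4.7 is purely illustrative, while the error
mean square, degrees of freedom and group size are those quoted above.

import numpy as np
from scipy import stats

group_mean = 4.7             # illustrative treatment mean
error_ms = 0.949             # error mean square from the ANOVA
error_df = 27                # error degrees of freedom
n_per_group = 10             # plots receiving each fertiliser

pooled_sd = np.sqrt(error_ms)                          # 0.974
t_crit = stats.t.ppf(0.975, error_df)                  # two-tailed 5% critical value
half_width = t_crit * pooled_sd / np.sqrt(n_per_group)

print(group_mean - half_width, group_mean + half_width)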

Exhibit 18
Constructing Confidence Intervals

6.2.d Regression Analysis

In many finance applications we make use of correlation, the extent to which two
variables move together, positively or negatively, or are independent of each other.
Recall that in portfolio theory, we use the degree of correlation to determine the impact
adding an additional asset will have on the portfolio variance: the lower (or, better yet,
the more negative) the correlation, the greater will be the diversification potential
associated with the new asset.

In reality, correlation signifies nothing more than the existence of a gross association
between two variables; it emphatically says nothing about the direction of causation, that
is, whether changes in one variable lead to changes in the second variable. Financial
theory is, however, very much concerned with both the strength of the relationship and
the direction of causation. One of the statistical procedures favoured by financial
analysts is regression analysis, because it ostensibly addresses both concerns
simultaneously. Regression analysis can be conceptualised as an extension of the co-
variance/correlation concept. The key difference is that regression analysis presupposes
that one variable, the independent variable (X), “causes” changes in the dependent
variable (Y).

It is important to stress that this causal relationship is conjectural only. For example,
economic theory suggests that a change in the budget balance (resulting, say, from an
increase in public expenditure) will cause national income to rise. It is, of course,
possible that changes in nominal income caused the increase in the budget deficit, owing
to the existence of the so-called automatic stabilisers: any decline in economic activity
will cause unemployment payments to rise while simultaneously reducing the level of tax
receipts. A regression of GDP growth on the budget deficit presupposes that the latter
causes the former, when in reality the reverse may be true.

Of course, we could regress the current change in GDP against the lagged change in the
budget deficit. Here, there is no question as to the direction of causation: the current
state of the economy cannot cause past changes in the budget balance. If there is any
relationship between the two variables then, given this specification, the direction of
causation must run from budget deficits to changes in national income.

The simplest way to comprehend what regression analysis is all about is to plot two series
against each other on a graph, with the values of the dependent variable (Y) shown on the
ordinate (Y-axis) and the values of the independent variable (X) shown on the abscissa
(X-axis). What regression analysis does is to fit a line through the data that minimises
the sum of the squared deviations of a given point from the regression line; hence the
alternative descriptor ordinary least squares or OLS. The standard output of regression
analysis consists of two parameters, a measure of the variability of the two parameter
estimates, a goodness-of-fit statistic and a measure of how tightly individual data points
cluster around the regression line.

Regression analysis is used to quantify the relationship between two variables that are of
interest to the researcher. We might, for example, hypothesise that earnings are an
increasing function of education: the more years of schooling you have, the higher on
average we expect your income to be. Now if we plot the data on the two series collected
for a large sample of people randomly chosen, the resulting scatter of points should rise
from the origin to the northeast corner of the graph. This positive association could be
rationalised in two ways: (1) higher productivity (and thus higher income) goes hand in
hand with higher education, and (2) people require compensation for investing time in
studying, given they have the alternative of earning income or spending time with their
families or friends (leisure).

Having plotted the data for a large sample of individuals, we would now like to estimate
precisely how much additional income one might expect to earn from additional years of
schooling. The relationship will not be perfect, as other factors that
could affect earnings will have been omitted from our bi-variate analysis; these omitted
influences are typically described as “noise.” Thus earnings are a function of years of
schooling (education) and noise.

Regression analysis produces a line that describes the relationship between the two series
by minimising the sum of squared errors, the square of the difference between actual and
predicted values of the dependent value (earnings in our example). This criterion has
several important advantages over other possible “fit” criteria. For one thing, it is easy to
employ computationally: “When one expresses the sum of squares mathematically and
employs calculus techniques to ascertain the value of (the regression coefficients) a and b
that minimise it, one obtains expressions for a and b that are easy to evaluate with a
computer using only the observed values of education and earnings in the data sample.”

And for another, “it also has attractive statistical properties under plausible assumptions
about the error term”12, namely, the resulting estimators will be unbiased (meaning that
the estimate will produce values corresponding to the ‘true’ mean of the population from
which the sample was drawn), consistent (the tendency for the estimator to converge to
the true population parameter as the number of observations increases) and efficient (having
the lowest variance).

12
Sykes, “An Introduction to Regression Analysis,” Inaugural Coase Lecture (published as Chicago Working Papers in
Law & Economics, No.20: 1993).

The first regression output, the intercept, is the point at which the estimated regression
line cuts the Y-axis; the second is the slope of the regression line. The intercept can be
interpreted in one of two ways: it is the value of the dependent variable when the
value of X is zero or, alternatively, and following directly from the way the parameter is
defined, it is the difference between the average value of Y and the slope-adjusted
expected value of X.

The slope measures both the direction and the magnitude of the relation. If the two series
are positively correlated, the slope coefficient will have a positive sign; if the series are
inversely related, the slope coefficient will be negative. The magnitude of the slope
coefficient can be interpreted in the following way: a unit change in the independent (X)
variable causes the dependent (Y) variable to change by the amount b. As the formulas
given in Exhibit 19 indicate, the slope of the regression and the degree of correlation
between the two variables are closely linked.

Each of these coefficient estimates has an associated standard error. If we make the
assumption that the estimated intercept and slope coefficients are normally distributed,
the parameter estimates can be combined with the associated standard errors to obtain a t-
statistic. The t-statistic measures whether the relationship is statistically significant in the
sense defined above; that is, the likelihood that the coefficient of interest differs reliably
from zero (or from some other value).

While there are tables that provide critical values of t (thresholds that must be met or
surpassed to reject the null hypothesis), broadly speaking, for regressions estimated using
120 or more observations, t > 1.66 (2.36) signifies that 95 (99) times out of 100 the
coefficient will differ from zero or, equivalently, that only 5 (1) times out of 100 would a
coefficient this large be observed if its true value were zero. For smaller samples, the
t-statistic has to be larger to indicate statistical significance.

The goodness of fit measure is known as the coefficient of determination, and is
designated as R2. The coefficient of determination measures the proportion of the
variation in the Y variable that is explained by the X variable. R2 is a function of the
correlation between the two variables, but unlike the correlation coefficient, R2 is
bounded by zero and one. If the X variable explains none of the variation in the Y
variable – the two series are unrelated – R2 will equal zero; the closer the R2 is to one, the
greater is the strength of the relationship – regardless of whether it is positive or negative
– between the two variables.
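
The formulas summarised in Exhibit 19 below are straightforward to reproduce; the sketch
computes the slope, intercept and R2 for a small, made-up schooling-and-earnings sample (the
figures are illustrative only and are not drawn from any real dataset).

import numpy as np

schooling = np.array([10, 12, 12, 14, 16, 16, 18, 20])   # years of schooling (X)
earnings = np.array([18, 22, 25, 27, 30, 34, 38, 45])    # annual earnings, 000s (Y)

b = np.cov(schooling, earnings, ddof=1)[0, 1] / np.var(schooling, ddof=1)  # slope
a = earnings.mean() - b * schooling.mean()                                 # intercept
r_squared = np.corrcoef(schooling, earnings)[0, 1] ** 2                    # goodness of fit

print(a, b, r_squared)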

Most economic or financial relationships are dependent upon more than one explanatory
variable. For example, we might expect the debt/total capital ratio to be a function of the
volatility of the firm’s cash flow, the level of the firm’s outstanding debt and the
company’s tax rate. In that case, we will want to know both the separate and the combined
impact of each of these variables on the debt ratio.

Exhibit 19
Regression Outputs and Their Interpretation

Measure                      Calculation                        Interpretation

OLS Regression               Y = a + bX                         A linear relationship between the independent
                                                                variable X and the dependent variable Y.

Slope of the Regression (b)  σYX/σ2X                            Measures the change in Y given a unit change
                                                                in X.

Regression Intercept (a)     μY – b(μX)                         The value of Y when X equals zero.

R2 of the Regression         ρ2XY = b2σ2X/σ2Y                   Measures the proportion of the variation in Y
                                                                that is explained by X.

Standard Error of the                                           A measure of the spread around the intercept
Intercept (SEa)                                                 term. Used to assess the probability that the
                                                                estimate does not equal zero [t = a/SEa > t(0.95)].

Standard Error of the                                           A measure of the spread around the estimated
Slope (SEb)                                                     slope coefficient. Used to assess the
                                                                probability that the estimated coefficient does
                                                                not equal zero [t = b/SEb > t(0.95)].

Multiple Regression          Y = a + b1X1 + b2X2 + b3X3 + b4X4  Allows for a relationship between the
                                                                dependent variable and more than one
                                                                independent variable.

Adjusted R2                  R2 – [(k – 1)/(n – k)](1 – R2)     Corrects for the tendency of the R2 to rise as
                                                                the number of independent variables increases.

The coefficient of multiple determination measures goodness-of-fit,13 the estimated
regression coefficients provide a measure of the direction and strength of the relationship
between each of the independent variables and the dependent variable (holding the
influence of the other independent variables constant), and each regression coefficient
can be tested using the t-statistic to determine whether it differs significantly from zero.
In this form, the analysis is known as a multiple regression.

To summarise: there are five key steps in the successful implementation and use of
regression analysis:

1. Specify the variables in the model and the exact form of the relationship between
them.
• In most instances a linear approximation will do; in others, more complex
relationships may have to be employed. Inspection of a plot of the dependent
variable against each explanatory variable should clarify the appropriate functional form.
13
There is a tendency for the R2 to increase as the number of independent variables included in the analysis increases.
To counter this tendency, the R2 is reduced by the fraction (k – 1)/(n – k) of the unexplained variation (1 – R2), where k
is the number of independent variables and n is the number of observations included in the analysis. The result is known
as the adjusted R2, and is usually denoted with a bar above the statistic.

2. Collect data on the variables in the analysis.
• Data are derived from multiple sources ranging from government statistics to
corporate annual reports. There are numerous economic and financial data bases
available that now greatly simplify data collection.

3. Estimate the parameters of the model.
• Various regression packages are available; the one favoured by students is part of
the EXCEL package.

4. Test statistically the utility of the model developed, and verify whether the assumptions
of the linear regression model have been satisfied.
• The diagnostic tests used to establish whether the assumptions of the regression
model have been met are typically a standard feature of the output; where the
assumptions are violated, some regression packages apply the appropriate
corrections automatically.

5. Use the model for prediction.
• Predictive accuracy is tested either (1) in-sample, that is, estimating the values of
the dependent variable given the values of the independent variables used in the
analysis; or (2) out-of-sample, where a sample of withheld data is used to test how
well the model predicts values it did not see in estimation. In either case, the values
of the independent variables are set at their actual levels. This is not a very stringent
test of the model’s forecasting ability as in most instances the input variables are
estimates, not known quantities. In reality, it is the combination of the model and the
forecaster’s skill that determines the usefulness of the final model for predictive purposes.
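
A minimal sketch of steps 3 to 5, using the statsmodels library rather than EXCEL, is given
below; the data are the illustrative schooling and earnings figures used earlier, and in practice
you would substitute your own dataset and explanatory variables.

import numpy as np
import statsmodels.api as sm

schooling = np.array([10, 12, 12, 14, 16, 16, 18, 20])   # illustrative data only
earnings = np.array([18, 22, 25, 27, 30, 34, 38, 45])

X = sm.add_constant(schooling)          # adds the intercept term
model = sm.OLS(earnings, X).fit()       # step 3: estimate the parameters

print(model.summary())                  # step 4: coefficients, t-statistics, R2 and diagnostics

new_X = np.column_stack([np.ones(2), [13, 17]])   # new values of schooling, with a constant
print(model.predict(new_X))                       # step 5: predicted earnings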

6.2.e Assumptions of the Regression Model

To ensure that the results of the analysis are correctly interpreted, it is essential that all of
the assumptions underpinning the model are verified; otherwise, the results obtained may
not stand up under closer scrutiny, that is, the inferences drawn may not be valid or
generalisable.

One of the most important assumptions concerns the residuals εi: the εi represent the
difference between the regression predictions and the actual data. Where the assumptions
concerning the residuals are verified, ordinary least squares provides the best estimates of
the population coefficients. But what happens when these assumptions are violated? More
to the point, we need to know how to recognise when they are violated, assess the
implications of the violation, and understand the techniques that can be used to correct
violations of these assumptions.

Assumptions of the Multiple Linear Regression Model

The "ideal" conditions for estimation and inference in regression analysis are:

1. The expected value of the disturbances is zero. E(εi) = 0. The regression line passes
through the conditional means of Y given X values. This also implies that Y has a linear
relationship with the explanatory variables.

2. The disturbances, εi; have a constant variance equal to σ2ε.

3. The disturbances, εi, are normally distributed.

4. The disturbances, εi, are independent.

5. The explanatory variables are not highly correlated with each other.

The Regression Residuals

As noted above, we can write the fitted sample regression as:

^Yi = b0 + b1X1i + b2X2i + … + bKXKi

where ^Yi is the regression's predicted value of the ith observation of the dependent
variable. If we denote Yi to be the actual ith observation of the dependent variable, then
the term

Yi - ^Yi

represents the difference between the actual and predicted values of the dependent
variable. This is called the residual for observation i and is written as

ei = Yi – ^Yi

These sample residuals can be used to approximate the population disturbances εi, and
hence to examine the assumptions concerning the population disturbances.

After any regression estimation, an analysis should be undertaken to assess whether the
model assumptions are verified or not. The residuals can be assessed graphically through
what is known as a residual plot, which is nothing more than a scatter plot of the
residuals. Graphical techniques often involve personal judgement about whether
violations are occurring; when you look at a residual plot, a "good" regression will
generate the following three properties:

(a) The average of the residuals will equal zero; this is simply a result of the least
squares solution, which forces it to be true.

(b) If conditions 1 through 3 above hold, then the residuals should be randomly
distributed about their mean of zero; in other words, there should be no systematic pattern
evident in the residuals.

(c) If conditions 1 through 4 noted above hold, then the residuals should look like random
numbers drawn from a normal distribution.

Remember: the residuals represent the errors of your regression, and the errors should be
totally random. When checking the assumptions, the following residual plots should be
examined (a short sketch follows the list):

• Plot the residuals against each explanatory variable.

• Plot the residuals against the predicted values.

• If the data are a time series, the residuals should also be plotted against time.
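
A minimal sketch of these diagnostic plots, using matplotlib and the illustrative regression
fitted earlier, might look as follows; the variable names are assumptions of the sketch, not the
output of any particular package.

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

schooling = np.array([10, 12, 12, 14, 16, 16, 18, 20])   # illustrative data, as before
earnings = np.array([18, 22, 25, 27, 30, 34, 38, 45])
model = sm.OLS(earnings, sm.add_constant(schooling)).fit()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

axes[0].scatter(schooling, model.resid)        # residuals against the explanatory variable
axes[0].axhline(0)
axes[0].set_xlabel('schooling')
axes[0].set_ylabel('residual')

axes[1].scatter(model.fittedvalues, model.resid)   # residuals against the predicted values
axes[1].axhline(0)
axes[1].set_xlabel('fitted value')
axes[1].set_ylabel('residual')

plt.show()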

Each of these plots can be used to find violations of different assumptions. The following
chart depicts a residual plot from a regression of value on size (a real-estate example); the
figure plots the residuals against the explanatory variable size.

The residuals appear to be fairly random and also fairly normal. If anything, there appears
to be a slight tendency for the residuals to decrease as size increases. In reality, there is no
trend in the residuals (as the reference line shown in the chart indicates). This is an example
of a residual plot where no assumptions appear to be violated.

Consider, by contrast, the following chart where a violation is probably occurring.

Notice the hump-shaped pattern in the residuals; the residuals are no longer random. In
the previous two figures we used the actual residuals in our residual plots. If you are
using SPSS, the software allows you to save the standardised residuals: these are simply
the residuals divided through by their standard deviation. The use of standardised
residuals is most helpful when assessing whether the residuals are normally distributed.
The following figure plots the standardized residuals for the real estate regression.

We know that if something is distributed normally, then most of the values (99%) should
lie within three standard deviations of the mean and 95 per cent of the values should lie
within two standard deviations of the mean. In the above chart only 2 residuals are
beyond three standard deviations from the mean and only three are outside two standard
deviations from the mean. These residuals appear to be normally distributed.

Is the Relationship Linear?

Using Plots to Assess Linearity Assumption

The first assumption of a regression model is that there is a linear relationship between
the dependent variable and the explanatory variables. We can assess this assumption by
looking at a residual plot. The residual plot shown below alerts us to a violation of this
assumption, as the residuals have a clear quadratic pattern. This systematic pattern in the
residuals suggests that the relationship between the explanatory variable and the dependent
variable is not linear, but rather curvilinear.

Corrections for Violations of the Linearity Assumption

Fixing this type of violation is not always obvious. A violation simply means that there is
a curvilinear relationship between x and y. To correct this violation we need to perform a
curvilinear transformation. The most common technique is to try a transformation and
look at the residual plot again. If the violation has been corrected, proceed no further;
otherwise, continue trying other curvilinear relationships. In the example noted above,
there appears to be a quadratic pattern to the residuals, so we should try a polynomial
transformation of degree 2. After performing this transformation, we examine the residual
plots against x and x2 once again.

We see no systematic pattern between the residuals and x or x2. It appears we have
corrected the violation of the linearity assumption.
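
One way to implement the degree-2 transformation, sketched here with simulated data, is simply
to add a squared term to the set of explanatory variables and refit the model before examining
the residual plots again.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 30)
y = 2 + 0.5 * x + 0.3 * x**2 + rng.normal(scale=1.0, size=x.size)   # simulated curvilinear data

X_quadratic = sm.add_constant(np.column_stack([x, x**2]))   # include both x and x squared
model = sm.OLS(y, X_quadratic).fit()

print(model.resid.round(2))   # inspect (or plot) the residuals against x and x squared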

Is the Variance Around the Regression Line Constant?

Using Plots to Assess the Assumption of Constant Variance

The second assumption of the regression model is that the residuals have constant
variance equal to σ2ε. In a plot of the residuals against any explanatory variable, the
residuals should appear random with constant variance. If there appears to be a change in
the variance, then the assumption of constant variance in the residuals may be violated. In
a residual plot, non-constant variance (commonly called heteroskedasticity) is indicated
by a "cone-shaped" pattern in the residuals as shown in the chart below.

Note that as the explanatory variable gets larger, the variance in the residuals also gets
larger. This residual plot shows non-constant variance in the residuals. When we have
non-constant variance, the use of regression suffers two major drawbacks.

• The estimates of the regression coefficients are no longer minimum variance. The
standard errors of the regression coefficients are larger than they should be.

• The estimates of the standard errors of the coefficients are biased. Therefore,
hypothesis testing about population coefficients may lead to misleading results.

There are several tests for non-constant variance. One particular test has proven to be
very powerful in detecting non-constant variation. Here is the structure for this hypothesis
test:

Null Hypothesis: Variance of the residuals is constant.

Alternative Hypothesis: Variance of the residuals is not constant.

The test statistic for this test is

Q = [6n/(N2 – 1)]1/2 (h – n + ½)

where n is the sample size, and h is

h = ∑ i·^εi2 / ∑ ^εi2

where ^εi is the residual from the ith observation in the regression equation. The decision
rule for this test is

Reject if : Q > zα

Accept if : Q ≤ zα

where α is the level of significance for the test and zα is chosen from the standard normal
distribution with an upper tail area equal to α. The test assumes that all of the
observations in the regression have been ordered in increasing variance. Typically, it is
assumed that the variance in the residuals increases as the value of one of the explanatory
variables increases. Thus the data need to be arranged in increasing order with regard to
this explanatory variable.

Corrections for Non-Constant Variance

There are several ways to correct non-constant variance. All of these corrections involve
transforming the dependent variable, y. This will sometimes make the regression results
difficult to interpret. Here are three of the most common correction strategies:

• In place of the dependent variable, y, use the variable ln(y). The natural log is
less variable and may correct the problem. Remember the natural log only works
on positive values; if your dependent variable has zero or negative values you cannot
perform this transformation.

• In place of the dependent variable, y, use the variable √y. The square root is less
variable and may correct the problem. Remember the square root only works on
non-negative values; if your dependent variable has negative values you cannot
perform this transformation. This correction works best when your dependent variable
is a count variable.

• If the variance in the residuals is believed to be proportional to some function of one of
the explanatory variables, then that explanatory variable may be used to stabilise the
regression. Suppose we believe that

σ2i = σ2 xi2

that is, the variance of the residuals is related to xi. Then we simply divide the entire
regression through by xi. This gives us the following regression:

yi/xi = β1 + β0 (1/xi) + εi’

where εi’ = εi/xi are residuals with constant variance. Note that the roles of β0 and β1 have
been reversed in this model. This transformation does not work if xi = 0.

Remember: when you transform your dependent variable, you are actually estimating a
different variable. To get the original forecasts for y back, you are going to have to use
the estimates from your transformed model and then back out the corresponding y.

Consider the following model:

Y = β0 + β1x

Suppose this model results in non-constant variance in the residuals. If we choose
option 1 to correct this violation, the model that will be run is

ln(Y) = β0 + β1x

Suppose you get the following results:

ln(Y) = 8.74 + 0.00537x

What is the estimated value of y when x = 300? We first use our transformed model to get:

ln(Y) = 8.74 + 0.00537(300) = 10.351

Then we have to exponentiate this to get back Y:

Y = e10.351 = 31,288

To make sure you have fixed the non-constant variance problem, you should always produce a
residual plot for the transformed model and make sure that the "cone" shape has been
removed.
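
The back-transformation step is easy to get wrong, so a short numerical check can help; the
sketch below simply reproduces the arithmetic of the example.

import numpy as np

intercept, slope = 8.74, 0.00537       # coefficients of the log-transformed model
x = 300

log_prediction = intercept + slope * x      # 10.351, a prediction of ln(Y)
prediction = np.exp(log_prediction)         # back out Y itself, roughly 31,288

print(log_prediction, prediction)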

Are the Disturbances Normally Distributed?

Using Plots to Assess the Assumption of Normality

The residual plot of the standardised residuals versus the predicted values can be used to
assess whether the residuals are normally distributed. Remember, for a normal distribution
about 68 per cent of the values should be within one standard deviation of the mean, and 95
per cent of the values should be within two standard deviations of the mean. If you find many
standardised residuals that are more than two standard deviations from the mean, then
the residuals may not be normal. The easiest way to check the normality of the residuals,
if you are using SPSS, is to save the standardized residuals from your regression, and
construct a histogram of the residuals. Under the histogram option in SPSS, check the
box "Display Normal Curve," then click OK. The histogram will show the distribution of
the residuals with a superimposed normal curve: if the residuals are normally distributed,
the histogram should follow the normal curve.

The following histogram shows residuals which are not normally distributed (the SPSS
legend reports Std. Dev = 0.99, Mean = 0.00, N = 35).

We see that the residuals do not follow the normal curve very well.

Corrections for Normality

The assumption of normally distributed residuals is not always necessary to estimate a
regression. It is necessary when making inferences from small samples. In large
samples, the normality assumption is not important because the Central Limit Theorem
shows that the sampling distribution of the estimators is approximately normal.

When the normality assumption is violated and we have a small sample, we need ways to
fix the violation. Some of these corrections involve transformations of the dependent
variable known as Box-Cox transformations. These are more advanced corrections and are
not covered in this Guide; interested readers should consult an econometrics text.

In truth this is typically the last violation we worry about with the residuals. Sometimes
other violations make the residuals appear non-normal when in fact they are. In the above
histogram, there is clearly an outlier which needs to be dealt with. Once this outlier has
been corrected, the residuals of the regression may in fact be normal. The most obvious
way to avoid this violation is to have a larger sample. Once the sample is large enough,
the assumption of normality in the residuals does not really matter.

Extreme Values?

The objective of the least-squares method is to minimise the error sum of squares, SSE.
In doing so, the method seeks to avoid large distances between the data Yi and the
regression prediction ^yi. If any large distances do exist, the regression line can be
substantially pulled toward the influential point in order to reduce that distance. From
this we can see one downfall of the least-squares approach: extreme points (outliers) are
given proportionally the most weight in the construction of the regression equation.

When a sample data point has a value which is much different from the other values in the
data, it is called an outlier. Outliers can be both good and bad. Outliers do provide some
information: they identify the total possible variability in the data. This is a good thing.
On the other hand, the presence of outliers can cause misleading and confusing regression
results. Either way, it is important to recognise the presence of an outlier.

The following charts show how an outlier can affect a regression. The top figure displays
a regression plot from some sample data that has no outliers. The bottom figure displays a
regression plot of the same data in which one observation has been changed to an outlier.

Note that the one outlier has an effect on the regression line. The regression equation has
been displayed for both samples. In the bottom figure, the outlier has shifted the entire
regression upward, the intercept term having risen from 1.18 to 1.70. On the other hand,
the outlier has had no effect on the slope of the regression. Why? Because the outlier is in
the middle of the sample data. The outlier also causes a big change in the quality of the
regression, with the R2 dropping from 0.97 to 0.25. Instead of explaining 97 per cent of
the variance in Y, the regression now explains only about one quarter.

In this example, the outlier had no effect on the slope of the regression line, though this
will not always be the case. In short, the position of the outlier in the data matters.
Consider the following situation. Suppose that instead of having the outlier at the middle
data value, it is associated with a high X value.

This time the outlier has produced a change in both the intercept and the slope of the
regression line. Outliers which cause this are called leverage points (think of a lever
twisting the line).

Identifying Outliers

Now that we have seen the effects of outliers, we need a method for identifying them.
The preferred method makes use of standardised residuals. Remember, standardised
residuals, εis, are simply the residuals of a regression, εi, divided through by the standard
deviation of the residuals, σεi. The variance of the standardised residuals is 1. Why? If
the residuals are normal with mean zero, dividing each residual by its standard deviation
makes the standardised residuals distributed as a standard normal.

To identify an outlier, we simply look for standardised residuals which are large in
absolute value. Given that these standardised residuals are distributed as a standard
normal, we should only find about 5 per cent of them larger than two in absolute value:
any observation that has a standardised residual larger than two in absolute value can be
classified as an outlier.
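
In practice this check takes only a couple of lines; the sketch below flags observations whose
standardised residuals exceed two in absolute value, reusing the illustrative regression fitted
earlier.

import numpy as np
import statsmodels.api as sm

schooling = np.array([10, 12, 12, 14, 16, 16, 18, 20])   # illustrative data, as before
earnings = np.array([18, 22, 25, 27, 30, 34, 38, 45])
model = sm.OLS(earnings, sm.add_constant(schooling)).fit()

standardised = model.resid / model.resid.std(ddof=1)   # residuals divided by their SD
outliers = np.where(np.abs(standardised) > 2)[0]       # positions of candidate outliers

print(np.round(standardised, 2))
print(outliers)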

Another measure often used to identify outliers is the studentised residual. To compute
studentised residuals, the residual εi is again standardised, but with a different standard
deviation: the standard deviation σε for the ith observation is calculated with the ith
observation removed. If the ith observation is unusual, this will be reflected in the
residual and not in σε, so any unusual observations should be easy to locate. Studentised
residuals follow a t-distribution with n - K - 1 degrees of freedom, and a simple t-test can
be performed to determine whether a specific observation is an outlier.
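
To make the mechanics concrete, the following short sketch (in Python, using the statsmodels library) fits a simple regression to some made-up data, extracts both standardised and studentised residuals, and flags observations whose standardised residual exceeds two in absolute value. The data and variable names are purely illustrative.

import numpy as np
import statsmodels.api as sm

# Illustrative data: y regressed on a single explanatory variable x,
# with one observation deliberately made unusual
rng = np.random.default_rng(0)
x = rng.normal(size=60)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=60)
y[10] += 4.0

results = sm.OLS(y, sm.add_constant(x)).fit()
influence = results.get_influence()

std_resid = influence.resid_studentized_internal    # standardised residuals
stud_resid = influence.resid_studentized_external   # studentised (deleted) residuals

# Flag observations whose standardised residual exceeds two in absolute value
print("Possible outliers:", np.where(np.abs(std_resid) > 2)[0])

The same residuals can of course be obtained in SPSS or Excel; the point is simply that the calculation is routine once the regression has been estimated.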

What to do with Unusual Observations?

Now that we know how to identify outliers, we need to decide what to do with these
observations. Remember, not every unusual observation is an outlier. Sometimes other
violations, such as non-constant variance or non-linearity, may make an observation
look like an outlier when it really isn't. Also, even if an observation is an outlier, you will
not always want simply to delete it: the information an outlier conveys may be important.
Here are some basic rules for dealing with outliers.

If the data value is simply incorrect (typos), then this observation should be dropped from
the analysis. If possible the correct data value should be found and included in the data
set.

If the data value is correct and is just unusual, then the choice of what to do with this
observation is uncertain. It simply depends on why the value is unusual and how
important this value is to the analysis. If the outlier is in a range of the data that is beyond
the main focus of the analysis, then dropping the data observation is the appropriate thing
to do. However, this is purely a judgement call, and should be avoided if possible.

Are the Residuals Independent?

Autocorrelation

One violation that frequently occurs in time-series analysis is a lack of independence in
the residuals: disturbances in adjacent time periods are often correlated. If your regression
has a negative residual in time period t, there is a good chance that you will see another
negative residual in time period t + 1. The relationship between residuals is represented
by:

εi = ρεi-1 + ui

where εi is the residual in period i, εi-1 is the residual in period i - 1, ρ is the serial
correlation coefficient (or autocorrelation coefficient), and ui represents a disturbance that
is independent over time. Our regression model is then written as

Yi = β0 + β1Xi + εi

where εi = ρεi-1 + ui.

When residuals display autocorrelation, the estimated standard errors of the coefficients
(the βs) are biased (typically understated when the autocorrelation is positive), so that
confidence interval estimates and hypothesis tests do not generate reliable results. The
autocorrelation coefficient, ρ, determines the strength of the relationship between residuals
across time periods. Like any other correlation coefficient, the autocorrelation coefficient
can take on any value between -1 and 1. Values close to -1 or 1 indicate a strong
relationship over time; values close to 0 indicate independence in the residuals. To test for
autocorrelation, we perform a Durbin-Watson test.

A Test for First-Order Autocorrelation

The Durbin-Watson test is a widely used test for autocorrelation. In most applications,
if autocorrelation is present, it will be positive autocorrelation (ρ > 0). For this reason,
the Durbin-Watson test is set up as a hypothesis test for positive autocorrelation. The test
is set up as:

Null Hypothesis: ρ = 0

Alternative Hypothesis: ρ > 0

where ρ is the autocorrelation coefficient. If the null hypothesis is accepted, then
autocorrelation is deemed not to be a problem. If the null is rejected, then we have
evidence that autocorrelation exists and we may need to correct it.

The Durbin-Watson test statistic is computed as

d = Σ(εi - εi-1)² / Σεi²

where εi is the residual for observation i, εi-1 is the residual for observation i - 1, and the
sum in the numerator runs from the second observation onwards.

When the residuals are independent, d is approximately equal to 2. When the residuals
are positively correlated, d < 2. The decision rule for this test takes the form:

Reject Null if: d < dL(α; n; K)

Accept Null if: d > dU(α; n; K)

where dL(α; n; K) and dU(α; n; K) are obtained from Durbin-Watson tables (printed at the
back of most statistics textbooks). These values depend on the level of significance, α, the
size of the sample, n, and the number of explanatory variables in the regression, K.

In the Durbin-Watson test there is a range of values for d where the test is inconclusive.
This range is:

dL(α; n; K) ≤ d ≤ dU(α; n; K)

If the test statistic falls in this range, we cannot be sure whether autocorrelation is a
problem. In that case the best approach is to go ahead and correct for autocorrelation. If
the results of the regression change, then the correction needed to be performed and
autocorrelation was a problem; if the results do not change very much, then the correction
was unnecessary and autocorrelation was not a problem. SPSS can calculate the
Durbin-Watson statistic; it is located under the statistics option in the regression dialog box.
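
As an illustration, the sketch below (Python, statsmodels; the data are simulated purely for the example) estimates a regression whose disturbances are built to be positively autocorrelated and then computes the Durbin-Watson statistic from the residuals.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Simulated time series with positively autocorrelated disturbances (rho = 0.7)
rng = np.random.default_rng(1)
n = 80
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + u

results = sm.OLS(y, sm.add_constant(x)).fit()
d = durbin_watson(results.resid)
print(f"Durbin-Watson d = {d:.2f}")   # values well below 2 point to positive autocorrelation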

Correction for First-Order Autocorrelation

When the residuals are autocorrelated, this typically means that some important variable
has been omitted from the regression. One way to fix autocorrelation is to find this
variable and include it in the regression.

Another way to correct first-order autocorrelation (correlation across adjacent time
periods) is to transform the original time-series variables in the regression so that the
regression uses independent disturbances. Let's look at this transformation.

We start with our original regression

Yi = β0 + β1Xi + εi

where the residual, εi, has first-order autocorrelation

εi = ρεi-1 + ui

To remove the autocorrelation, the following transformations are used on both
the dependent and explanatory variables:

Y*i = Yi - ρYi-1

X*i = Xi - ρXi-1

for observations 2 through n. For the first observation we use the following
transformations

Y*1 = √(1 - ρ²) Y1

X*1 = √(1 - ρ²) X1

We then estimate the new regression

Y*i = β0(1 - ρ) + β1X*i + ui

in which the disturbances ui are independent; note that the intercept of the transformed
regression is β0(1 - ρ), while the slope coefficient β1 is unchanged.
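
The transformation just described is easy to apply directly. The following minimal sketch (Python/NumPy) quasi-differences the two series for a given value of ρ; in practice ρ would be estimated first, for example from the Durbin-Watson statistic as r = 1 - d/2, and the transformed series re-estimated by ordinary least squares.

import numpy as np

def ar1_transform(y, x, rho):
    """Quasi-difference y and x as described above: the first observation is
    rescaled by sqrt(1 - rho**2), later observations become y_t - rho*y_{t-1}
    and x_t - rho*x_{t-1}."""
    y_star = np.empty_like(y, dtype=float)
    x_star = np.empty_like(x, dtype=float)
    y_star[0] = np.sqrt(1 - rho ** 2) * y[0]
    x_star[0] = np.sqrt(1 - rho ** 2) * x[0]
    y_star[1:] = y[1:] - rho * y[:-1]
    x_star[1:] = x[1:] - rho * x[:-1]
    return y_star, x_star

# Example usage with an illustrative estimate of rho:
# y_star, x_star = ar1_transform(y, x, rho=0.35)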

A third option to correct autocorrelation is to add a lagged value of the dependent
variable as an explanatory variable. In this case your regression would take the form

Yi = β0 + β1Xi + β2Yi-1 + εi

This option works well when we have large samples.

Test for Autocorrelation with a Lagged Dependent Variable

When we use lagged values of the dependent variable as an explanatory variable, the
Durbin-Watson test for autocorrelation is no longer valid. To test for autocorrelation we
have to switch to Durbin's h-test. The hypotheses for this test are:

Null Hypothesis: ρ= 0

Alternative Hypothesis: ρ > 0

where ρ is the autocorrelation coefficient. If the null hypothesis is accepted, then
autocorrelation is deemed not to be a problem. If the null is rejected, then we have
evidence that autocorrelation exists and we may need to correct it.

The test statistic is computed as

h = r √(n / (1 - n s²b1))

where n is the sample size, r is an estimate of the autocorrelation coefficient ρ, and s²b1
is the estimated variance of the regression coefficient on the lagged dependent variable
(Yi-1). The decision rules for this test are:

Reject Null if: h > zα

Accept Null if: h ≤ zα

where zα is taken from the standard normal distribution. A quick estimate of ρ is
calculated as

r = 1 - d/2

where d is the Durbin-Watson statistic.
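
Since the h statistic is simple to compute by hand once the regression output is available, a small helper such as the one below (Python; the input values are illustrative, not taken from any particular regression) is all that is required.

import numpy as np

def durbins_h(d, n, var_lagged_coef):
    """Durbin's h: r * sqrt(n / (1 - n * Var(b_lag))), where r = 1 - d/2 and
    Var(b_lag) is the estimated variance of the coefficient on the lagged
    dependent variable. The statistic is undefined when n * Var(b_lag) >= 1."""
    r = 1 - d / 2
    return r * np.sqrt(n / (1 - n * var_lagged_coef))

# Illustrative values: d = 1.6, n = 60, variance of the lagged-Y coefficient = 0.004
print(round(durbins_h(1.6, 60, 0.004), 2))   # compare with the 5 per cent critical value of 1.645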

Is Multicollinearity a Problem?

Consequences of Multicollinearity

For a regression with K explanatory variables, we hope that the explanatory variables
are highly correlated with the dependent variable. At the same time, however, we also
hope that the explanatory variables are NOT highly correlated with each other. When
the explanatory variables are correlated with each other we have a problem of
multicollinearity. The seriousness of this problem depends on the degree of correlation
between the explanatory variables: high correlations may result in highly unstable
least-squares estimates of the regression coefficients. The presence of severe
multicollinearity brings about the following problems:

• The standard errors of the regression coefficients become unusually large. As a
result, individual t-tests report values that are too small, and we may conclude that
coefficients are zero when that is in fact not the case.

• The regression coefficients become unstable; the signs of the regression
coefficients may even get switched around. Dropping one explanatory variable
may lead to large changes in the coefficients for all the other variables in the
analysis.

Detecting Multicollinearity

There are several ways to detect multicollinearity.

1. Compute the correlations between the explanatory variables. Because multicollinearity
exists when the explanatory variables are highly correlated, calculating these
correlations should identify the problem. The cutoff for a 'high' correlation is generally
thought to be somewhere around 0.50 in absolute value; this is a simple rule of thumb.
To calculate correlations in SPSS, simply choose correlate: bivariate in the 'analyse'
menu.

There is a serious limitation to this approach: you can only see whether the explanatory
variables are directly correlated with one another. If you have three explanatory variables
X1, X2 and X3, pairwise correlations establish the relationship between only two of these
variables at a time. What cannot be captured is the relationship between one explanatory
variable and a combination of the other variables: if X1 is highly correlated with X2 + X3,
calculating pairwise correlations will not capture this effect even though multicollinearity
exists.
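
Both the pairwise correlations and the combined effect just described are easy to check. The sketch below (Python, using pandas and statsmodels on made-up data) prints the correlation matrix of the explanatory variables and then runs an auxiliary regression of X1 on X2 and X3; a high R2 in that auxiliary regression signals exactly the kind of multicollinearity that the pairwise correlations can miss.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Made-up data for three explanatory variables, with X1 built to depend on X2 + X3
rng = np.random.default_rng(2)
X = pd.DataFrame({"X2": rng.normal(size=100), "X3": rng.normal(size=100)})
X["X1"] = 0.6 * (X["X2"] + X["X3"]) + 0.4 * rng.normal(size=100)

# 1. Pairwise correlations between the explanatory variables
print(X[["X1", "X2", "X3"]].corr().round(2))

# 2. Auxiliary regression: how well do X2 and X3 jointly explain X1?
aux = sm.OLS(X["X1"], sm.add_constant(X[["X2", "X3"]])).fit()
print(f"Auxiliary R-squared: {aux.rsquared:.2f}")   # a high value signals multicollinearity

This auxiliary-regression check is essentially the idea behind the variance inflation factor described in most econometrics texts.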

2. The second method involves close inspection of the regression output.
Multicollinearity can be indicated by a large F statistic combined with small t statistics:
the F test says the overall regression is significant, yet each individual βi fails its
individual t-test, owing to the inflated standard errors that remove all the power from
the t-tests.

It should be noted that neither of these methods is foolproof when it comes to
establishing the existence of multicollinearity. Consider the case where only some of the
t statistics are small but the F statistic is large: it is unclear whether this is an indication
that multicollinearity is a problem.

Correction for Multicollinearity

One easy method for correcting multicollinearity is to remove those explanatory variables
that are highly correlated with the other explanatory variables. There is, however, an
obvious downside to this approach: you are removing all of the information contained in
the dropped variables, which can lead to substantial changes in your regression estimates.
There are more sophisticated ways to correct multicollinearity, though they are beyond
the scope of this Guide; any standard econometrics text can point you in the right
direction.

One final point: you do not have to correct multicollinearity if you are using regression
analysis for forecasting purposes only. Multicollinearity does not limit a regression’s
ability to predict, nor does it affect a regression's ability to obtain a good fit (high R 2).
Correcting multicollinearity matters only if you are using regression for inference and
coefficient estimation.

6.2.e Logistic (Logit) Regression

Before leaving the topic, we should note a special form of regression that is widely used
in financial analysis, namely, logistic (or logit) regression. In most regression
applications both the dependent and independent variables are continuous, that is, they
assume a wide range of values. In other applications the dependent variable is categorical,
that is, it takes on only one of two values (alive/dead, win/lose, bankrupt/viable,
defaulted/current, union/non-union, Tory/Labour and so forth). In such cases we are
interested not so much in predicting the actual value as in establishing the likelihood
(probability) that an unclassified observation will fall into one or the other category. In
finance, such research goes back to the 1970s when Edward Altman of the Stern School
of Business (New York University) first developed his famous Z-Score model of
corporate bankruptcy.14

Altman’s model uses financial ratios to sort companies into financially distressed or
financially viable categories with a reasonably high degree of accuracy. His model is still
widely used, even though critics note that its exclusive dependence on corporate data has
two significant drawbacks: (1) corporate data are frequently published with a lag,
sometimes as long as a year, which means that the classification prediction may be based
upon data that no longer reflect the financial reality of the company, thus biasing the
analysis; and (2) it ignores non-corporate data, which may have a significant bearing on
the correct classification.15

In his original formulation Altman used a technique known as Discriminant Analysis; in
more recent years analysts have come to favour logistic regression over discriminant
analysis for a number of reasons, even though both approaches start from a common
premise, namely, that the categories of outcome in the dependent variable must be
mutually exclusive.

• Logistic regression is much more relaxed and flexible in its assumptions than is
discriminant analysis. Unlike the latter, logistic regression does not require the
independent variables to be normally distributed, linearly related or of equal
variance within each group. The greater flexibility of logistic regression argues
strongly in its favour, though it has been noted that 'when (the) assumptions
regarding the distribution of predictors are met, discriminant analysis may be a
more powerful and efficient analytic strategy.'16
• Even though logistic regression does not have many assumptions, and thus is
usable in more instances, it does require a larger sample size: at least fifty cases
per independent variable may be required to ensure accurate hypothesis
testing. A good rule of thumb is to use logistic regression when the dependent
variable is categorical and the sample to be used in the analysis is large.

More formally, logistic regression treats the distribution of outcomes in a probabilistic
manner, that is, the occurrence of the phenomenon under study is evaluated in terms of a
probability, which takes on values between zero (no chance of the event occurring) and
one (the occurrence is certain). In other words, the outcome of the analysis is the
likelihood of the event of interest occurring: say, there is a 67 per cent chance that
company X will fail within the next year.

14 Edward I. Altman, "Financial Ratios, Discriminant Analysis and the Prediction of Corporate
Bankruptcy," Journal of Finance, 23(4), 1968, 589-609.

15 These limitations led some analysts to develop Contingent Claims (option) models that make use of all
available information; the main problem with this approach is that it produces an estimate of the probability
of default but without any clear understanding of the factors that contributed to the estimate. Other
financial analysts favour use of a hybrid model that combines observable variables with the unobservable
inputs of Contingent Claims models, though it is arguable whether the addition of the latter − subsumed, in
principle, in the former − adds anything of real value.

16 Tabachnick and Fidell, Using Multivariate Statistics (HarperCollins, 1996).

If the probability of the phenomenon occurring (say, default) is PA and PB is the
probability of the absence of the phenomenon, then PA + PB = 1 (that is, PB = 1 - PA), and

PA = exp(ZA)/(1 + exp(ZA)), where

ZA = β0 + β1X1 + β2X2 + ... + βnXn + ε.

The variable ZA is a measure of the total contribution of all the risk factors used in
the model and is known as the logit. β0 is called the intercept, and the βi are called the
regression coefficients of the risk factors Xi. The intercept is the value of ZA when the
risk factors are zero. Each of the regression coefficients describes the size of the
contribution of that risk factor: a positive sign signifies that the risk factor increases the
probability of the outcome, while a negative coefficient indicates that the risk factor
decreases it. A large (small) coefficient means that the risk factor has a strong (weak)
influence on the probability of the outcome.

The greater the value of ZA, the greater the probability that the event will occur: as ZA
approaches infinity, PA approaches one, indicating a high probability of the event
happening; when, by contrast, ZA approaches negative infinity, PA approaches 0. When
ZA = 0, PA = 0.5, meaning there is a 50:50 chance of the event occurring.

One of the standard outputs of logistic regression is the Odds Ratio (OR) associated with
each control (independent) variable. The 'odds' of an event is defined as the probability
of the event occurring divided by the probability of the event not occurring. For example,
let the probability of an event occurring be 0.8 (p = 0.8), so that the probability of failure
is q = 1 - p = 0.2. The odds of 'success' are then odds(success) = p/q = 0.8/0.2 = 4, that is,
the odds of success are 4 to 1. The odds of failure would be q/p = 0.2/0.8 = 0.25; while
the result may look odd, all that is being said is that the odds of failure are 1 to 4. In other
words, the odds of success and the odds of failure are reciprocals of each other:
1/4 = 0.25 and 1/0.25 = 4.

To compute an Odds Ratio we need one more element: a comparison between two
groups. This is best accomplished by way of another example. Suppose that seven out of
ten males are admitted to a business school compared with three out of ten females. The
probabilities for admitting a male student are p = 7/10 = 0.7 and q = 1 - p = 1 - 0.7 = 0.3.
The same probabilities for females are p = 3/10 = 0.3 and q = 1 - 0.3 = 0.7.

We can now use these probabilities to calculate the admission odds for both male and
female students: odds(male) = 0.7/0.3 = 2.3333; odds(female) = 0.3/0.7 = 0.4286. The
odds ratio for admission is then OR = 2.3333/0.4286 = 5.44, which says that the odds of
being admitted if the applicant is a male student are 5.44 times greater than if the
applicant were female.

In some statistical packages a 'logistic' command reports results in terms of odds ratios
while a 'logit' command reports results in terms of coefficients. A logit is defined as the
natural log of the odds: logit(p) = log(odds) = log(p/q). In fact, there is a direct
relationship between the coefficients produced by logit and the odds ratios produced by
logistic.

Logistic regression is in effect ordinary regression analysis using the logit as the
response (dependent) variable: log(p/q) = β0 + β1X. This means that the coefficients in
the logistic regression (βi) are in terms of log odds, that is, the coefficient β1 implies that
a one unit change in the independent variable (X) results in a β1 unit change in the log of
the odds.

The equation log(p/q) = β0 + β1X can be expressed in terms of odds by getting rid of
the log; this is done by exponentiating both sides of the equation: p/q = e^(β0 + β1X). The
end result of these manipulations is that the odds ratio can be computed by raising e to
the power of the logistic coefficient: OR = e^β1.

To make the point clear let's revert to the business school example mentioned above,
where we sorted students by whether or not they gained entrance to the business school.
Assume a logistic equation was estimated and the following result obtained:

log(p/(1 - p)) = -0.847 + 1.694596X,

where X stands for gender (1 = male, 0 = female). According to our previous discussion,
OR = e^β1 = e^1.694596 = 5.44, which (as we saw above) indicates that a male student is
nearly 5.5 times more likely to be admitted to our hypothetical business school than is a
female student.17
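
The relationship OR = e^β1 is easy to verify. The sketch below (Python, statsmodels) fits a logistic regression to twenty made-up observations constructed to match the admission proportions used above (seven of ten males and three of ten females admitted); the estimated slope is approximately 1.6946 and exponentiating it recovers the odds ratio of roughly 5.44.

import numpy as np
import statsmodels.api as sm

# Made-up data matching the illustration: 10 males (7 admitted), 10 females (3 admitted)
gender = np.array([1] * 10 + [0] * 10)                 # 1 = male, 0 = female
admitted = np.array([1] * 7 + [0] * 3 + [1] * 3 + [0] * 7)

logit_results = sm.Logit(admitted, sm.add_constant(gender)).fit(disp=0)
b0, b1 = logit_results.params
print(f"Intercept: {b0:.3f}, gender coefficient: {b1:.4f}")   # about -0.847 and 1.6946
print(f"Odds ratio: {np.exp(b1):.2f}")                        # about 5.44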

To better understand how logistic regression can be used and interpreted, we review the
results of a recent article that explored why consumers choose non-traditional finance
companies (finance or loan companies, excluding loans with auto manufacturing
companies or mortgage brokers) in preference to traditional lenders (commercial banks,
savings banks or savings and loan associations) when obtaining credit.18 The data are
drawn from the US Survey of Consumer Finances for the year 2004. The classification
was based upon whether the individual had borrowed from a financial institution that did
(traditional) or did not (non-traditional) accept deposits; the sample consisted of more
than 4,500 respondents.

The estimated regression is as follows:19

ZA = - 1.43 + 0.73X1 + 0.28X2 + 0.10 X3 – 0.02X4,

17 The above example was derived from the University of California's Academic Technology Service.

18 Jeffrey Dew, "Credit Crunched? The Relationship Between Credit Denials and the Use of Alternative
Financial Institutions," Consumer Interests Annual, 54, 122-126.

19 Shown here are only those variables significant at the 5 per cent level or better (p ≤ 0.05).
where all variables are as defined below. Taken at face value, the single most important
determinant of whether a respondent sought financing at a non-traditional financial
institution was whether s/he had been denied credit within the past five years. The
following table shows how the probabilities vary with each of the variables included in
the analysis. The odds ratios are to be understood as comparing, for example, respondents
who had been denied credit with those who had not: an individual who had been denied
credit had roughly twice the odds of holding an account with a non-traditional financial
institution compared with someone whose credit application had not been rejected. As
expected, the estimated regression coefficients and the odds ratios tell exactly the same
story about the relative importance of each variable.

Variable                                                Odds Ratio

X1 = Denied credit within past five years               2.08
X2 = 1 if credit is 'good', zero if 'bad'               1.19
X3 = Number of individuals in household                 1.11
X4 = Age of respondent                                  0.98

If we now wished to use the model to determine the likelihood that an individual 23 years
old, denied credit in the past five years, with a good credit record and living on his own,
would use a non-traditional lender, we would simply insert these values into the equation:

ZA = -1.43 + 0.73(1) + 0.28(1) + 0.10(0) - 0.02(23) = -0.88,

so that PA = exp(-0.88)/(1 + exp(-0.88)) = 0.29, that is, there is a 29 per cent chance that
this individual will have an account with a non-traditional financial institution. Note the
effect of the credit-record variable: had this individual had a poor credit history instead,
ZA would fall to -1.16 and the probability would decline to about 24 per cent
[PA = exp(-1.16)/(1 + exp(-1.16))].
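
The calculation above is easily reproduced. The short sketch below (Python) simply evaluates the logistic function PA = exp(ZA)/(1 + exp(ZA)) at the two values of ZA computed for the hypothetical respondent.

import numpy as np

def logistic_probability(z):
    """Convert a logit value Z into a probability P = exp(Z) / (1 + exp(Z))."""
    return np.exp(z) / (1 + np.exp(z))

# The hypothetical 23-year-old respondent described in the text
z_good_credit = -1.43 + 0.73 * 1 + 0.28 * 1 + 0.10 * 0 - 0.02 * 23   # = -0.88
z_poor_credit = -1.43 + 0.73 * 1 + 0.28 * 0 + 0.10 * 0 - 0.02 * 23   # = -1.16
print(round(logistic_probability(z_good_credit), 2))   # about 0.29
print(round(logistic_probability(z_poor_credit), 2))   # about 0.24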

We can explore further some of the properties of logistic regression by reverting to the
example given above concerning the characteristics of failed and viable Jamaican banks.
In our previous discussion individual variables were tested pair-wise, that is, the
differences in means between the two groups were compared and a t-test applied to
determine whether the observed differences could have arisen by chance. The main
problem with the pair-wise approach is that it neglects possible interactions among the
variables; only after such effects are controlled for can we establish the contribution of
individual variables towards explaining the phenomenon under investigation.

We noted above that ordinary least squares is ideal where the dependent variable is
measured continuously and logistic regression where the dependent variable takes on
either a binomial or multinomial form, as in the case to hand. The logistic analysis takes
the following form:

Z = β0 + β1X + β2Y + β3Q,

where X is a vector of bank financial characteristics, Y a vector of other bank-specific
characteristics and Q a vector of macroeconomic variables. Included in X would be
capital adequacy, asset quality, liquidity and earnings variables; included in Y would be
variables that measure the efficiency (inefficiency) of bank management, size, bank risk,
audit status and ownership (foreign vs. local); and in Q variables such as real GDP growth.

The final version of the model is shown below and includes only those variables that
were significant at the 5 per cent level or better; the accompanying table defines the
variables and indicates the level of significance of each of the coefficients shown.

Z = 1.01 + 0.10X1 + 0.05Y1 + 1.36Y2 - 10.15Q.

Variable                                                      Definition                                                                 t-statistic (* = significant at 0.05; ** = 0.01)

Intercept                                                                                                                                (0.20)
X1: Change in gross capital/risk assets, lagged two periods   (Long-term debt + equity)/(loans + leases)                                 (2.1)*
Y1: Management inefficiency, lagged one period                Total operating expense/(net interest revenue + other operating income)   (2.7)*
Y2: Size, lagged two periods                                  Log of total assets                                                        (2.7)**
Q1: Real GDP growth, lagged three periods                     GDP growth in constant 1986 prices                                         (-2.9)**

From the fairly long list of potential variables the only ones shown to bear a statistically
significant relationship to bank failure are lagged changes in capital adequacy,
management inefficiency, bank size and the growth of real GDP. Real GDP growth is
shown to reduce the probability of failure; as the economy improves, so too does the
performance of the banking sector as a whole. All the other variables, including bank
size, are shown to increase the risk of failure. If bank size is taken to indicate an
expansion of loans and investments, then that expansion appears to generate a subsequent
decline in asset quality, contributing to bank failure. The positive association of declining
capital adequacy and poor management performance with failure hardly requires comment.

Of equal (or perhaps greater) interest is the extent to which the model was able to
correctly classify failed banks. The results indicate a high degree of accuracy, with an
overall correct classification rate of 96 per cent, though the model appears to have a better
record identifying viable (97.8 per cent) than non-viable (81.2 per cent) banks. While the
model's ability to classify viable banks accurately remains high, there is a marked decline
in its ability to identify failures further in advance, with the correct percentage falling to
roughly 57 per cent one, two and three years before bankruptcy. In other words, the model
is almost as likely to miss a potentially bankrupt bank as it is to identify one correctly.

One of the main uses of regression analysis is prediction, using the estimated relationship
to forecast future values of the dependent variable. In our discussion of regression
analysis we assumed the dependent variable pertained to an economic or financial
magnitude, but as we have seen there is no reason to restrict its application solely to those
types of variables. Logistic analysis (and its regression variants) generates estimates of
the probability of an outcome falling into one of two or more mutually exclusive
categories, and thus can be used to address a wider set of issues.

Most such analyses claim to be quite good in terms of classification accuracy, the
reported proportion as we have seen often exceeding 90 per cent. But a stronger test of
the predictive power of such models is how well they classify observations that were not
part of the data set used to generate the original results. For this reason, many financial
economists typically withhold part of the sample, testing model accuracy by solving the
estimated equation for the withheld observations and comparing forecasts with actual
outcomes. Of course, in most forecasting
situations the actual values of the input variables will not be known; that is, the test
assumes that the forecaster knows with certainty the correct values of the model’s
independent variables – an extremely unlikely possibility which biases the test procedure
in favour of correctly predicting the variable of interest.
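
A minimal sketch of this holdout approach is shown below (Python, statsmodels, with simulated data standing in for a real sample): the model is estimated on the first two hundred observations and its classification accuracy is then judged on the one hundred observations that were withheld.

import numpy as np
import statsmodels.api as sm

# Simulated data: a binary outcome driven by two explanatory variables
rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(300, 2)))
true_beta = np.array([-0.5, 1.2, -0.8])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

# Estimate on the first 200 observations, test on the 100 withheld
logit_fit = sm.Logit(y[:200], X[:200]).fit(disp=0)
predicted = (logit_fit.predict(X[200:]) > 0.5).astype(int)
accuracy = (predicted == y[200:]).mean()
print(f"Out-of-sample classification accuracy: {accuracy:.0%}")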

Bluntly put, this means that the best forecasts will more likely than not combine
professional judgement with quantitative rigour, as opposed to relying solely on model
output. One of the most interesting demonstrations of this conclusion appeared in an
article written a few years ago in the Wall Street Journal Europe to predict which film
was likely to win the Oscar for best picture for 2005.20

To that end the editors of the paper invited a statistician to design and build a model that
could be used to predict which motion picture had the highest probability of being voted
the Best Film of the year. The model’s forecast was then compared with the predictions
of two respected Hollywood pundits – one the paper’s own film columnist, the other an
independent film reviewer.

The Journal's statistical expert estimated a model similar to the one described above,
whose output consisted of assigning a value between zero and one to each of the five
films nominated in a given year; a value of one implies that a given film is certain to win
an Oscar, while zero implies there is no chance at all of winning. All of the predictions
are conditional on a set of factors known or thought to affect the probability of winning.

Three variables appear in the basic version of the model: (1) the total number of Oscar
nominations racked up by a given film, (2) the number of Golden Globe awards won,
and (3) whether or not a given film is a comedy. The logic of each is fairly
straightforward.

• Past best films typically garnered a large number of overall Oscar nominations;
each additional nomination increases the odds that a given film will win.

20 "A Winning Formula?" The Wall Street Journal Europe (February 25, 2005).
• The number of Golden Globes won, another Hollywood award ceremony that
precedes the Oscars, provides an alternative, independent measure of a given film’s
popularity that in the past has been a good lead indicator of winning.

• And finally, since not a single comedy nominated for an Academy Award over
the past twenty years had won, classification as a comedy, everything else held
constant, drastically reduces a film's chance of success.

How good is the model? And are there any modifications that might improve forecast
accuracy? According to the modeller, the equation is 90 per cent accurate; that is, nine
out of ten times over the past twenty years, it correctly predicted the best film in each of
those years. Of course, statistical models usually perform extremely well within the
estimated sample period, so that a more stringent test is how good the model’s
predictions are outside the period over which it was estimated.

To test the robustness of the model, the three-variable version was tweaked to take into
account two plot devices, namely, whether the film involved a hero riding a horse or
included a leading character with a disability. These modifications did somewhat improve
overall forecast accuracy; the expanded model correctly predicted 19 of the previous 20
best films, one more than the basic model.

The main difficulty with these additions is that they are too ad hoc: they are meant to
correct specific past errors and accordingly are unlikely to outperform the basic model
over time. For example, in its original version the model incorrectly pegged Born on the
Fourth of July as the best film for 1989; when the leading character's disability was
factored into the model, it generated the correct forecast for that year: Driving Miss Daisy.

How well did the model predict the Best Film for 2005? The results of the basic three
variable model predicted that The Aviator would be 2005’s Best Film, assessing the
probability of winning at 84.6 per cent – a virtual certainty. Million Dollar Baby, the
winner predicted by the Journal’s experts, was deemed to have only a negligible (13.5
per cent) chance of winning. According to the article from which these forecasts are
extracted: “should a picture other than The Aviator walk away with Best Picture, it
would be the biggest upset of the past twenty years.”

And the winner was: Million Dollar Baby!

7. Analysis

This section of the dissertation describes the data used in your study and provides a
detailed discussion of the results obtained. If you are relying on secondary data, they
should be described in detail: what each variable measures (UK Gross Domestic
Product), the units in which it is expressed (constant 1990 UK pounds sterling), whether
the data have been transformed and, if so, in what way (converted to logarithms,
differenced, and whether the original or the transformed series are reported), and their
source(s). If your dissertation relies upon survey data or interviews you will be expected
to provide detailed information on the size, structure and response rate of your survey.

Surveys, while used in finance dissertations, are not all that common. Interviews, by
contrast, are more common, as they can help to clarify or elucidate unresolved issues; an
example of the use of interviews for this purpose was described above in connection with
the impact on reported earnings of the shift from traditional to EU accounting standards in
Turkey.

This is an important distinction: samples are intended to describe one or more
characteristics of the larger population from which they were drawn, and thus must
satisfy fundamental statistical criteria; interviews need not. In fact, the number of
interviews can be relatively small, so long as the interviewees are carefully chosen: by
virtue of their position, experience or background they should be able to pronounce on
important issues with authority. There is no presumption here that the views expressed
can be generalised to the profession as a whole, even though such opinions may indeed
command widespread agreement among practitioners.

Much more common is the use of statistical procedures, typically regression analysis.
Many financial or economic issues lend themselves to quantitative analysis, ranging from
whether particular markets can be shown to be informationally efficient (and at what
level of efficiency) to whether asset returns are better described using the Capital Asset
Pricing Model (CAPM) or the Arbitrage Pricing Theory (APT). In the latter case, the key
distinction between the two relates mainly to whether beta (which measures market or
macroeconomic risk) is a comprehensive enough indicator of risk to provide a
satisfactory explanation for a given observed pattern of returns or whether additional,
more specific macro variables are required.

Regression analysis is one of the best methods that can be used to discriminate between
these alternative models. For one thing, it provides a test of whether the estimated beta is
significantly different from zero. And for another, the results indicate how much of the
variance in returns can be explained by that single risk measure in comparison to the
proportion explained using the macroeconomic variables suggested by the APT. The
model that accounts for the greater proportion of the return variance could be considered
superior, though it should be pointed out that the quantitative analysis explains actual not
expected returns, which is what the model was devised to explain; on the other hand,
there are reasons to believe that over the longer term the two should coincide.
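
By way of illustration, the sketch below (Python, statsmodels; the return series are simulated placeholders rather than actual data) estimates a single-factor market-model regression alongside a three-factor macro regression for the same return series and compares their explanatory power.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Placeholder monthly data: bank returns, market returns and three macro factors
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "bank_ret": rng.normal(0.01, 0.05, 60),
    "mkt_ret": rng.normal(0.008, 0.04, 60),
    "gdp_growth": rng.normal(0.5, 0.2, 60),
    "fx_change": rng.normal(0.0, 1.0, 60),
    "interest_rate": rng.normal(3.0, 0.5, 60),
})

capm = sm.OLS(df["bank_ret"], sm.add_constant(df["mkt_ret"])).fit()
apt = sm.OLS(df["bank_ret"],
             sm.add_constant(df[["gdp_growth", "fx_change", "interest_rate"]])).fit()

# Compare explanatory power; the model with the higher adjusted R-squared
# accounts for a larger share of the variance in returns
print(f"CAPM beta: {capm.params['mkt_ret']:.2f}, adjusted R2: {capm.rsquared_adj:.2f}")
print(f"APT adjusted R2: {apt.rsquared_adj:.2f}")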

Let us now demonstrate, by way of an illustration, how regression analysis can be used in
the production of MBA dissertations; we will look at the results presented in a recently
submitted dissertation that compares the performance of the two asset valuation models.
The following exhibits summarise the regression results used to assess the dissertation’s
main objective; the author’s data were derived for a sample of three Thai banks, the
country’s largest and most important financial institutions.

Separate equations were estimated for each bank and for each valuation model, six
regressions in all. The principal criterion used to evaluate the author’s hypothesis was to
establish which model explained a greater proportion of the variance in historic returns,
and whether it did so consistently. Her results for both models were extremely robust.

The three macro variables used in her APT analysis were GDP, the baht/dollar exchange
rate, and local short-term interest rates. All else constant, we might expect stronger
economic growth to be associated with higher returns, owing to higher lending activity
and accordingly bigger margins. The relationship between the other financial variables
and bank returns is less clear. Interest rates are both a cost and a source of revenue, the
first consideration arguing in favour of a negative relationship (higher funding costs
depress lending margins), the latter a positive one: to the extent that higher interest rates
are indicative of stronger economic growth, Thai banks may be able to increase lending
margins to the benefit of their bottom lines.

As to exchange rates, a weaker baht/dollar rate could be consistent with lower returns, by
raising external funding costs and so inducing a decline in earnings; on the other hand, a
weaker baht would positively affect export industries, stimulating borrowing to finance
the expansion of production capacity to accommodate anticipated higher foreign demand.

In each regression the positive impact of rising economic activity on bank returns is
affirmed; the estimated regression coefficients are of roughly the same order of
magnitude and all are highly significant statistically. The results indicate further that
higher interest rates are consistent with higher bank returns, while a decline in the value
of the baht (more baht required to purchase a US dollar) is shown to affect bank returns
negatively, suggesting that higher external funding costs outweigh any expanded lending
opportunities associated with devaluation.

The models are estimated using a relatively large number of observations, and the
results shown below indicate that the estimated equations explain 70-80 per cent of the
variance in returns. The estimated regression coefficients measure the impact of a change
in each of the independent variables on bank returns, holding all other variables constant
(mathematically they are known as partial derivatives). Note that all of the explanatory
variables are statistically significant; generally speaking, t > 2.00 indicates significance at
the 5 per cent level or better, meaning there is less than a 5 per cent chance that the
observed relationship could have arisen by chance.

Exhibit 21
Regression Results

1. Results for CAPM

Bank                           Linear Equation         P-Value     T-stat    β         R2 (%)
Siam Commercial Bank (SCB)     y = 1.6112x + 0.0061    2.96E-32    16.469    1.6112    69.86
Bangkok Bank (BBL)             y = 1.2462x + 0.0023    1.54E-32    16.600    0.0271    70.20
Kasikorn Bank (KBANK)          y = 1.2889x + 0.0036    9.83E-33    16.690    0.0241    70.42

2. Results for APT

Siam Commercial Bank

                          Coefficients    Standard Error    t Stat       P-value
Constant                  72.24829        44.10934          1.63794      0.110148
GDP                       0.00013         0.00003           4.42104      8.68E-05
Δ Exchange Rate (Baht)    -3.75759        0.68122           -5.51599     3.08E-06
Interest Rate             4.46881         1.05956           4.21760      0.000159

Bangkok Bank

                          Coefficients    Standard Error    t Stat       P-value
Constant                  22.7984904      49.5515341        0.4600966    0.6482139
GDP                       0.0002463       0.0000321         7.6695970    0.0000000
Δ ERB                     -4.3289166      0.7652648         -5.6567564   0.0000020
Interest rate             5.6616889       1.1902902         4.7565619    0.0000315

Kasikorn Bank

                          Coefficients    Standard Error    t Stat       P-value
Constant                  54.3365136      33.2736488        1.6330194    0.1111822
GDP                       0.0001110       0.0000216         5.1471101    0.0000096
Δ ERB                     -2.9116377      0.5138721         -5.6660743   0.0000019
Interest rate             4.8544467       0.7992749         6.0735634    0.0000006

Exhibit 22
Summary Statistics for Three APT Regressions

Statistic                                Siam Commercial Bank    Bangkok Bank    Kasikorn Bank
R2                                       0.7332                  0.8002          0.7932
R2 (adjusted for degrees of freedom)     0.7109                  0.7835          0.7759
Standard Error                           15.5176                 17.432          11.706
Number of Observations                   40                      40              40

Source: Supanapasot (2009).

Of course, if there were reasons to do so, we could have tested whether a coefficient
differed significantly from any assumed value, in which case we would compute the
following statistic: t = (regression coefficient - assumed value)/standard error. For
example, suppose we wanted to establish whether the effect of the change in the exchange
rate (Δ ERB) on Kasikorn's return differed significantly from one: we would subtract one
from the estimated regression coefficient and divide the difference by the standard error,
(-2.9116 - 1.000)/0.5139 = -7.61. Since this far exceeds two in absolute value, there is
much less than a 5 per cent chance that a difference of this size from one could have
arisen by chance.
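
The same test is a one-line calculation, as the fragment below (Python) shows for the Kasikorn exchange-rate coefficient.

# Test whether the Kasikorn exchange-rate coefficient differs significantly from one
coefficient, std_error, assumed_value = -2.9116377, 0.5138721, 1.0
t_stat = (coefficient - assumed_value) / std_error
print(round(t_stat, 2))   # about -7.61, well beyond the usual +/-2 threshold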

Turning to the CAPM results, we note that for each bank the betas are all highly
statistically significant, though the estimated beta for Siam Commercial Bank is
considerably higher than for the other two banks in the sample, which are extremely
small; in other words, Bangkok Bank and Kasikorn Bank exhibit little sensitivity to
changes in the overall market while for each 10 per cent increase (decrease) in the market
return SCB’s return increases (decreases) by 16 per cent.

In terms of the way the author set out to test her hypothesis, namely, that the APT
outperforms the CAPM in explaining historic returns, the differences as measured by R2
are comparatively small. True, the APT regressions have uniformly higher R2s than the
CAPM regressions, but the differences are not big enough to settle the issue, a conclusion
the author herself reached.

The use of statistical analysis is not mandatory for finance dissertations but it does, as we
have seen, provide a firmer foundation upon which to base the results of your study. This
conclusion applies only where the appropriate statistical technique was chosen and
correctly employed. In the case to hand, we might note some of the things that were
omitted from the Thai bank analysis, and thus highlight the risks associated with using an
approach with which you are not thoroughly familiar.

The regression model rests upon a number of specific assumptions that all too often are
not spelled out. Practically, this will diminish the confidence the reader has in the
accuracy of the claims being made. For example, a key assumption of the regression
model is that the residuals (the differences between the actual and predicted values) must
be serially uncorrelated and of constant variance (the violation of the latter is known as
heteroskedasticity).21 If the first assumption is violated it could signify that an important
variable was omitted from the analysis; the practical consequence of violating the second
requirement is that it biases the estimated standard errors of the regression coefficients,
typically downward, possibly compromising the validity of the t-tests used to establish
statistical significance. These and other issues were discussed at length above and there is
no need to repeat that discussion. What does bear repeating is that the diagnostic tests
should, as a matter of course, be reported alongside the regression results. In this way the
reader can decide whether you have made your case convincingly.

The key point here is that the use of one or more statistical procedures in a dissertation is
not enough; you must demonstrate a clear understanding of the method, its assumptions
and what to do if these assumptions are violated. Failure to do so can lead to a loss of
points, cancelling out any benefit you may have gained from having tackled your topic in
a formal and rigorous way.

Finance dissertations depend upon the use of data to test and support the hypothesis you
have chosen to investigate. The points noted above apply equally whether the data are
subject to formal statistical analysis or not. For example, many dissertations rely upon
ratio analysis. Financial ratios are normally calculated from information derived either
from data bases or company annual reports, and are then examined to determine whether
any consistent patterns can be identified over time. All too often the ratios, neatly
calculated and summarised in tabular form, are left to speak for themselves; alternatively,
the text merely repeats what is obvious from the table itself. ‘The profit ratio rose in
2003, fell in 2004 and 2005, and then rose again in 2006.’ The obvious question left
unanswered is: Why?

It is insufficient to present ratios without first providing a clear statement of what each
ratio is intended to measure, or to fail to explain the reason(s) behind the observed
patterns. For example, in a recent dissertation analysing the performance of the IBB bank
since its inception, the author noted and explained the trends indicated for a number of
key financial variables: deposit growth, loan growth, expenses, and so forth. The author
noted that the bank had yet to report a positive return on equity, but this failure was
neither explained nor its significance assessed. No matter how favourably some financial
variables behave, an unprofitable bank cannot survive for very long. Ironically, had the
author connected the data shown for expenses and loan losses with operating income, the
answer would have been obvious: both the bank's expense and loan loss ratios were way
out of line with those of other British banks.

21 Two other problems we might mention are multicollinearity and non-linearity. The first signifies that the
independent variables are highly correlated, producing too much 'noise' to detect their separate effects; the
problem can be addressed either by combining the variables or by dropping one. The second is easily detected
and corrected: plot the dependent variable against each independent variable and observe the shape of the scatter.

8. Conclusions, Limitations, Recommendations and References

The purpose of this section of your dissertation is to pull together, in narrative form, the
main findings of your study together with a statement of its limitations and hence
directions, if any, for future research; it may also contain a list of recommendations that
follow from your findings. All of the main elements of this section are clearly and
carefully spelled out in Exhibit 2, though it may be useful to provide brief commentary
on some of the points noted there.

Your conclusion should be expressed clearly and concisely; this section is not intended to
repeat other parts of your dissertation but rather to make clear what you have actually
accomplished.

The most obvious point worth making is whether the hypothesis you set out to test
was, or was not, supported by your analysis. Negative findings can be as important as
positive results, provided the analytical procedures employed are up to the task.

You should also indicate the extent to which your results correspond to those noted in
your Literature Review. If they do, you should say so. If they provide only partial
support, then the reasons need to be spelled out and even more so if they contradict
previous research findings.

At this point it is useful to describe the study’s limitations, if any. One of the main
shortcomings of a dissertation is that the results typically apply to a particular company
or to several companies in a given sector or industry. Until your findings are replicated
for other firms in the same industry or for the sector as a whole or for allied sectors, it
will not be possible to generalise them nor should you attempt to do so. Indeed, this is
one of the reasons why you are encouraged to make recommendations or suggest
extensions to the existing body of research, which now includes your dissertation.

The final component of your dissertation is the bibliography, which lists all the works
you consulted in the preparation of the study and those that were specifically referenced
in the body of your text. You should familiarise yourself with the correct procedure for
citing references.

When referring to books, there should be an indication if there is more than one edition of
the text you used; if so, that edition together with the location and name of the publisher
should be included (e.g., London and New York: Macmillan). It is common to provide
the date of publication within the brief reference used in the text, matching the way the
references are organised in your bibliography, so that you do not have to repeat the date
twice.

The format used for articles is somewhat more complex, but essential if the source is to
be properly identified. If the article appears in a book, the name of the article is set
within quotes, the editor(s) of the book named [Jones and Smith (eds.)] followed by the
italicised title of the book, with the publisher information provided as described above.

For articles appearing in journals, you should again set the title of the article in quotation
marks, followed by the name of the journal (in italics), the edition and volume number of
the journal where the article can be found, as well as its page numbers (386-412).

If you cite information published in newspapers or magazines and the article has an
author, you follow the procedure used above; newspapers do not always have an edition
or volume number, in which case substitute in its place the date when the article appeared
(Sunday, 5 April 2009). Where no author is indicated, as is not uncommon, refer to the
source and its date of publication and list that in your bibliography (Times (2009): “The
G20: Failure or Success?” Sunday, 5 April 2009).

Some dissertations contain appendixes, the value of which is open to question. If your
dissertation must have one or more appendixes, though there is no compelling reason
why it should, then keep to the essentials. In most instances authors provide a record of
the data used, or the survey questionnaire and related information (how or where it was
administered), the number of respondents, the response rate, the number of follow-up
interviews conducted, with whom, how long they took, and so forth. It should be clear
that all of this information could just as easily have been included in the appropriate
sections of the dissertation. In others, it provides summary statistics for the data used in
the analysis, though again it is unclear why such information was not presented in the
main body of the text.

On the other hand, where you have investigated alternative models, or performed failed
experiments, it is probably best to include these in an appendix so as not to clutter the
text with too much information. It is appropriate to refer to these additional tests or
experiments, note the findings and present the results so that anyone interested in seeing
what you have done will be able to do so. In most instances footnotes referring to these
additional results are more than adequate. In short, you should consider carefully the
need to include the supplementary material that is normally confined to appendixes.

The information contained in this Guide is intended to provide you with an overview of
what to expect as you approach the dissertation stage: to eliminate as much of the
mystery as possible, to assist you with topic selection, to explain the role of your
supervisor, to highlight potential sources of information (including an assessment of their
strengths and limitations) and how to evaluate the merits of the alternative sources of
information you are developing, and to describe approaches to research, including a
discussion of statistical inference and the various statistical procedures that have been
used in dissertations, as well as much practical information that (hopefully) will address
many student FAQs.

Forewarned is forearmed.
