
Measurement & Scales

Measurement

Measurement is the process of assigning numbers or labels to objects, persons, states of nature, or events. It is done according to a set of rules that reflect the qualities or quantities of what is being measured.

Scales


Measurement means that scales are used. A scale is a set of symbols or numbers assigned by rule to individuals, their behaviors, or attributes associated with them. Four types of scales are used in research, each with specific applications and properties: nominal, ordinal, interval, and ratio.

Constructs

Constructs are concepts measured with multiple variables.

Nominal Scales

Nominal scales are used to classify objects, individuals, groups, or even phenomena.

Examples: 1) Gender 2) State of residence 3) Country 4) Ethnicity

The categories of a nominal scale are:
• mutually exclusive (each item being classified fits into only one category), and
• collectively exhaustive (every element being classified fits into some category on the scale).

Example of a nominally scaled question (as it might appear on a questionnaire):

What is your class rank at KSU?
1. Freshman
2. Sophomore
3. Junior
4. Senior

The numbers themselves have no meaning (we could have used letters instead); they only identify the possible responses to the question. Thus, in evaluating responses to this question, you cannot use the mean.

Permitted statistics: frequencies (percentages and counts), mode.
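As a small illustration of the statistics permitted for nominal data, the sketch below tallies responses to the class-rank question and reports counts, percentages, and the mode. The response data are invented for illustration and the variable names are hypothetical.

```python
from collections import Counter

# Category codes from the class-rank question; the numbers are labels only.
LABELS = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}

# Hypothetical coded responses from ten students (invented data).
responses = [1, 3, 3, 2, 4, 3, 1, 2, 3, 4]

counts = Counter(responses)
total = len(responses)

# Frequencies: (count, percentage) for each category.
frequencies = {
    LABELS[code]: (n, 100.0 * n / total) for code, n in sorted(counts.items())
}

# Mode: the most frequently chosen category.
mode_code, mode_count = counts.most_common(1)[0]
mode_label = LABELS[mode_code]

for label, (n, pct) in frequencies.items():
    print(f"{label}: {n} ({pct:.0f}%)")
print("Mode:", mode_label)
```

Averaging the codes would produce a number, but it would have no interpretation, which is why the mean is deliberately left out.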

  

A nominal scale is commonly used for obtaining personal data such as gender or the department in which one works, where grouping of individuals or objects is useful, as shown below.

1. Your gender
___Male ___Female

2. Your department
___Production ___Sales ___Accounting ___Finance ___Personnel ___R & D ___Other (specify)

Ordinal Scales
Ordinal scales allow for labeling (or categorization) as in nominal scales, but they also allow for ranking.

Example: Rank these vacation destinations in terms of how much you would like to visit, from 1 to 5, with 1 your most preferred and 5 your least preferred.
1) Bermuda 2) Florida 3) Hawaii 4) Aspen 5) London

Ordinal scales do not provide information on the distance between preferences. I may say I prefer Hawaii the most, followed by Bermuda, London, Florida, and Aspen, when in reality my preferences could be diagrammed like this:

[Figure: the five destinations (Aspen, Bermuda, Florida, Hawaii, London) placed along an axis running from "Prefer More" to "Prefer Less".]

An ordinal scale can provide information about some item having more or less of an attribute than others, but no information on the degree of the difference.

Permitted statistics: frequencies, median, mode.
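To illustrate the statistics permitted for ordinal data, here is a minimal sketch that summarizes the ranks several respondents gave to one destination. The ranks are invented for illustration.

```python
import statistics

# Hypothetical ranks (1 = most preferred, 5 = least preferred) that seven
# respondents assigned to Hawaii. Invented data.
hawaii_ranks = [1, 2, 1, 3, 1, 2, 5]

# Median: the middle value once the ranks are sorted.
median_rank = statistics.median(hawaii_ranks)

# Mode: the rank chosen most often.
modal_rank = statistics.mode(hawaii_ranks)

print("Median rank:", median_rank)
print("Modal rank:", modal_rank)
```

The median and mode depend only on the order of the ranks, not on the unknown distances between them, which is what makes them appropriate here.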

An ordinal scale is used to rank the preferences or usage of various brands of a product by individuals, and to rank-order individuals, objects, or events, as in the example below.

Rank the following personal computers with respect to their usage in your office, assigning the number 1 to the most used system, 2 to the next most used, and so on. If a particular system is not used at all in your office, put a 0 next to it.
____Apple ____Compaq ____Dell Computer ____Packard Bell ____Hewlett Packard ____Comp USA ____IBM ____Sony

Interval Scales

Interval scales:
• contain the information available in ordinal scales (ranking), with the added benefit of the magnitude of the differences between ranks;
• have equal distances between the points of the scale;
• can contain a zero point, but it is arbitrary and not meaningful (0 °C = 32 °F).

Temperature in Celsius or Fahrenheit is an example of an interval scale.

Permitted statistics: mean, median, mode, as well as more advanced tests.
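The arbitrary zero point can be made concrete with a short worked example. The sketch below converts the same temperatures from Celsius to Fahrenheit and shows that equal intervals are preserved while ratios are not. The specific temperatures are chosen only for illustration.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return c * 9.0 / 5.0 + 32.0

a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Equal intervals stay equal after changing units.
diff1_f = b_f - a_f  # 68 - 50
diff2_f = c_f - b_f  # 86 - 68

# Ratios do not survive the change of units, because zero is arbitrary.
ratio_c = b_c / a_c  # 20 / 10
ratio_f = b_f / a_f  # 68 / 50

print("Fahrenheit differences:", diff1_f, diff2_f)
print("Ratio in Celsius:", ratio_c, "Ratio in Fahrenheit:", ratio_f)
```

Because the ratio changes with the unit, a statement such as "20 °C is twice as hot as 10 °C" is not meaningful on an interval scale.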

Example

On a scale of one to five, with five meaning you strongly agree and one meaning you strongly disagree, consider this statement: "I believe my college education has prepared me well to begin my career."

1 Strongly disagree
2 Somewhat disagree
3 Neither
4 Somewhat agree
5 Strongly agree
Ratio Scale

The ratio scale is the most comprehensive scale. It has all of the characteristics of the other three, with the additional benefit of an absolute, meaningful zero point.

Examples: 1) Weight 2) Sales volume 3) Income 4) Age

Permitted statistics: same as with interval data.

Ratio scales are usually used in organizational research when exact numbers on objective (as opposed to subjective) factors are called for, as in the following questions:

How many other organizations did you work for before joining this system?

Please indicate the number of children you have in each of the following categories:
---- below 3 yrs
---- between 3 and 6
---- over 6 yrs but under 12
---- 12 yrs and over

How many retail outlets do you operate?

In survey research, people verbally answer an interviewer's questions.

Types of Error in Survey Research


Total error has two components: random sampling error and systematic error.

Random sampling error is unavoidable; it is the difference between the sample value and the population value. Systematic error (bias) results from mistakes in research design or execution.

Description of Systematic Error


Systematic error has two components: measurement error and sample design error.

Measurement error results from the difference between the information sought and the information obtained. Sample design error results from flaws in sample design or sampling procedures.

Sources of Measurement Differences


  

There are various sources of error that researchers try to minimize.

Random sampling error is always present:
• It is known and can be quantified.
• It becomes smaller with larger samples.

Even with knowledge of the systematic sources of bias, non-systematic (random) error can still occur in the measurement process.
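One common way to quantify random sampling error, and to see it shrink as the sample grows, is the standard error of a sample proportion. The sketch below uses the textbook formula sqrt(p(1 - p) / n), which assumes simple random sampling; the formula and the sample sizes are not taken from these notes and are shown only as an illustration.

```python
import math

def standard_error_of_proportion(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)

# Assumed proportion of 50%, evaluated at three illustrative sample sizes.
p = 0.5
errors = {n: standard_error_of_proportion(p, n) for n in (100, 400, 1600)}

for n, se in errors.items():
    print(f"n = {n}: standard error = {se:.4f}")
```

Each fourfold increase in sample size halves the standard error, so larger samples reduce random sampling error but with diminishing returns. Systematic error is not affected by sample size.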

Measurement differences can be due to:

1. Short-term personal factors: mood swings, fatigue, time constraints, or other transitory factors. Example: in two telephone surveys of the same person, these factors (tired versus refreshed) may cause differences in measurement.
2. Situational factors: calling when somebody is distracted by something versus when they can give their full attention.
3. Variations in administering the survey: voice inflection, nonverbal communication, etc.
4. Sampling of items included in the questionnaire.
5. A lack of clarity in the measurement instrument (measurement instrument error). Example: unclear or ambiguous questions.
6. Mechanical or instrument factors: blurred questionnaires, bad phone connections.

Reliability

Reliability is the degree to which measures are free from random error and thus provide consistent measurements from one administration of the scales to the next.

Key questions for researchers to address:
• Will this measurement instrument provide consistent results over time?
• For scales used in questionnaires: will we get the same results using these scales over time?

Ways to Check Reliability of Measurement Instrument


1. Test-Retest Reliability
• Use the same instrument.
• Administer the test a second time shortly after the first.
• Take the measurement under conditions as close to the original as possible, with the same participants.
If there are few differences in scores between the two tests, then the instrument is stable: it has shown test-retest reliability.
Problems with the test-retest approach:
• It is difficult to get cooperation a second time.
• Respondents may have learned from the first test, so their responses are altered.
• Other factors may be present to alter results (environment, etc.).

2. Equivalent Form Reliability
• This approach attempts to overcome some of the problems associated with test-retest reliability.
• Two questionnaires, designed to measure the same thing, are administered to the same group on two separate occasions (the recommended interval is two weeks). If the scores obtained from these tests are correlated, then the instruments have equivalent form reliability.
• It is tough to create two distinct forms that are equivalent.
• Like test-retest, it is an impractical method and is not often used in applied research.

3. Internal Consistency Reliability
• This is the ability of a measurement instrument to produce similar results using different samples to measure a phenomenon during the same time period.
• This technique assumes equivalence, that is, it concerns how much error may be introduced by using different samples. This can be measured.
• Example: conducting simultaneous studies of different samples.
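For approaches 1 and 2 above, agreement between two administrations is often summarized with a correlation coefficient. The sketch below computes Pearson's r between two sets of scores from the same participants. The use of Pearson's r is an assumption made for illustration (the notes only say the scores should be "correlated"), and the scores are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for six participants on the first and second
# administration of the same instrument (invented data).
first_administration = [4, 2, 5, 3, 1, 4]
second_administration = [4, 3, 5, 3, 1, 5]

r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A value close to 1 suggests a stable instrument. What counts as "close enough" is a judgment call that these notes do not specify.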

Validity

Validity is the ability of a scale or measuring instrument to measure what it is intended to measure.

Example (political polling): the results from the measurement instrument we developed to assess voter intent say that Joe Smith should get 40% of the vote in the upcoming election. He gets 60%. Is our measurement instrument valid?

Forms of Validity
Face Validity

The weakest form of validity.
• The researcher simply looks at the measurement instrument and concludes that it will measure what is intended.
• Thus it is by definition subjective.
• It is something of a red herring: researchers would not use an instrument if they did not think it would be valid.

Content Validity

The degree to which the instrument items represent the universe of the concepts under study.

In plain English: did the measurement instrument cover all aspects of the topic at hand?

Example: Let's say that Amazon wanted to measure customer satisfaction and asked the following questions:
i. Did the merchandise arrive on time?
ii. Was it in good condition?
This measure may lack content validity because it assumes that satisfaction is a function of product delivery alone, without incorporating questions about the shopping experience.

Criterion-Related Validity

• The ability of some measure to correlate with other measures of the same construct.
• The degree to which the measurement instrument can predict a variable known as the criterion variable.

Two subcategories of criterion-related validity:
• Predictive validity: the degree to which the future level of a criterion variable can be forecast by a current measurement scale.
  i. It has great implications for survey research.
  ii. Example: purchase intent.
• Concurrent validity: the degree to which a predictor variable can assess a criterion variable at the same point in time.

Construct Validity

This is the territory of academic researchers.

The ability of a measure to confirm a network of related hypotheses generated from a theory based on the concept. Does the measurement conform to some underlying theoretical expectations?
• If so, the measure has construct validity.
If we are measuring consumer attitudes about product purchases, then:
• Do the measures adhere to the constructs of consumer behavior theory?

Two approaches are used to assess construct validity:
• Convergent validity: a high degree of correlation among different measures intended to measure the same construct.
• Discriminant validity: a low degree of correlation among constructs that are assumed to be unique or distinct.

[Figure: Reliability and Validity on Target]

Attitudes and Scaling




Researchers in marketing (and other behavioral sciences) are concerned with measuring constructs that exist in the minds of research participants (referred to as attitudes) and, as such, are not directly observable by the researcher.

Scaling: procedures to quantitatively measure abstract or subjective concepts (such as attitudes). Scales can measure only one attribute (unidimensional) or more than one (multidimensional).

Graphic Rating Scale

A graphic rating scale:
• presents respondents with a graphic continuum anchored by two extreme points;
• is easy to administer and use;
• is an interval scale;
• is only used in mail, in-person, or Internet studies, and cannot be used in phone studies.

Example: On a scale of 1 to 10, how would you rate your supervisor?

1 Very bad ... 5 All right ... 10 Excellent

Itemized Rating Scales

Definition: respondents provide an answer from a limited number of ordered categories.

Itemized rating scales:
• are used most often in research;
• have been shown to be reliable;
• are easy to develop and administer.

The scales described in the following sections are examples of itemized rating scales.

Forms of Scales
Rank Order Scale

Ranks (from most preferred to least preferred) an object, concept, or person.
• Ordinal scale
• Comparative in nature: items are ranked against one another.
• Easy to administer
• With too many items, tough to do in phone studies
• May not have content validity

Example: Rank the following midsize cars in overall quality. Give the car with the highest quality a one and the car with the lowest quality a four.
___ Honda Accord ___ Volkswagen Passat ___ Toyota Camry ___ Ford Taurus

Semantic Differential

A series of seven-point rating scales with bipolar adjectives, such as good and bad, anchoring the ends (or poles) of the scale. A weight is assigned to each position on the scale. Traditionally, scores are 7, 6, 5, 4, 3, 2, 1, or +3, +2, +1, 0, -1, -2, -3.
• Interval scale
• Found to be one of the best scales for providing actionable information.
• Not often used in telephone interviewing.

Example: Evaluate the Honda Accord on each of the following pairs of attributes:
Exciting ___ : ___ : ___ : ___ : ___ : ___ : ___ Calm
Interesting ___ : ___ : ___ : ___ : ___ : ___ : ___ Dull
Simple ___ : ___ : ___ : ___ : ___ : ___ : ___ Complex
Passive ___ : ___ : ___ : ___ : ___ : ___ : ___ Active
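The two traditional weighting schemes for a semantic differential (7 to 1 and +3 to -3) differ only by a constant shift. The sketch below converts one respondent's 7-point scores to the bipolar form. The ratings are invented, and scoring the left-hand adjective as 7 is an assumption made for illustration.

```python
def to_bipolar(score_7pt: int) -> int:
    """Map a 7..1 position weight to the equivalent +3..-3 weight."""
    if not 1 <= score_7pt <= 7:
        raise ValueError("score must be between 1 and 7")
    return score_7pt - 4

# Hypothetical ratings of the Accord by one respondent (invented data).
# Assumption: 7 is the position closest to the left-hand adjective.
ratings = {
    "Exciting/Calm": 6,
    "Interesting/Dull": 5,
    "Simple/Complex": 3,
    "Passive/Active": 2,
}

bipolar = {pair: to_bipolar(score) for pair, score in ratings.items()}

for pair, weight in bipolar.items():
    print(f"{pair}: {weight:+d}")
```

The bipolar form makes the neutral midpoint explicit as 0, which can be easier to read when plotting a profile across adjective pairs.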

Stapel Scale

• Uses a single adjective as a substitute for the semantic differential when it is difficult to create pairs of bipolar adjectives.
• Measures direction and intensity of feelings simultaneously.
• Tends to be easier to conduct and administer than a semantic differential scale.

Example: Rate the Accord on:
+3
+2
+1
Price
-1
-2
-3

Likert Scale

The respondent specifies a level of agreement or disagreement with statements that express favorable or unfavorable attitudes toward the concept under study.

Example 1: It is more fun to play a tough, competitive tennis match than to play an easy one.
___Strongly Agree ___Agree ___Not Sure ___Disagree ___Strongly Disagree

Example 2: State the extent to which you agree with each of the following statements:
1. My work is very interesting. 1 2 3 4 5
2. I am not engrossed in my work all day. 1 2 3 4 5
3. Life without my work will be dull. 1 2 3 4 5
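Likert items are often combined into a single summated score. The sketch below scores the three statements from Example 2, reverse-coding statement 2 because it is worded negatively. Both the reverse-coding step and the direction of the scale (1 = strongly disagree, 5 = strongly agree) are assumptions made for illustration, since the notes do not specify them. The answers are invented.

```python
SCALE_MIN, SCALE_MAX = 1, 5

# Assumption: statement 2 ("I am not engrossed in my work all day") is
# negatively worded, so its score is reversed before summing.
REVERSED_ITEMS = {2}

def reverse(score: int) -> int:
    """Flip a score around the scale midpoint (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return SCALE_MAX + SCALE_MIN - score

def summated_score(answers: dict) -> int:
    """Sum item scores, reverse-coding the negatively worded items."""
    total = 0
    for item, score in answers.items():
        if not SCALE_MIN <= score <= SCALE_MAX:
            raise ValueError(f"item {item}: score {score} is off the scale")
        total += reverse(score) if item in REVERSED_ITEMS else score
    return total

# Hypothetical answers from one respondent, keyed by statement number.
answers = {1: 5, 2: 1, 3: 4}
total = summated_score(answers)
print("Summated score:", total)
```

With three items on a 5-point scale, the summated score ranges from 3 to 15, with higher values indicating a more favorable attitude toward one's work under the stated assumptions.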

Considerations When Selecting a Scale


Balanced versus Nonbalanced Alternatives
• A balanced scale has the same number of positive and negative categories; a nonbalanced scale is tilted in one direction or the other.
• In my view, balanced scales provide a more honest assessment.

Number of Points on the Scale
• Important in Likert or semantic differential scales.
• There need to be enough increments to capture the spectrum of responses, but too many become cumbersome for the respondent.

Selecting a Rating, Ranking, Sorting, or Purchase Intent Scale
Consider the method of research:
i. A semantic differential is difficult to administer over the phone.
ii. A Likert scale, or a rank order scale with few possibilities, works well on the phone.
iii. Respondents prefer simplicity (nominal or ordinal).

Odd/Even Number of Categories
• An odd number of categories provides an opportunity for a neutral point.
• An even number of categories has no neutral point.
• The debate: proponents of an even number say it forces respondents to take a stand; proponents of an odd number say that respondents can legitimately be neutral.

Forced versus Nonforced Choice
• Nonforced choice means that respondents have the ability to say "do not know" as a reply.
• This may be legitimate: the respondent may not have enough information to reply.
• Or it can be an out for a lazy respondent.
• Forced-choice proponents say that respondents should be required to take a stand.
• The prevailing view is that nonforced choice provides the more valid data.
