
A Comparison of Three Brand Evaluation Procedures

YORAM WIND, JOSEPH DENNY, AND ARTHUR CUNNINGHAM

Abstract The paper reports an empirical examination of the results, reliability, and validity of three attitude measurement procedures: open-ended, paired comparison, and monadic rating. Significant differences among the three approaches were found in the evaluation of two brands on a number of attitudinal dimensions with respect to judgment of "both brands are alike" and the evaluation of "brand A vs. brand B."

Yoram Wind is Professor of Marketing, The Wharton School, University of Pennsylvania. Joseph Denny is a marketing consultant. Arthur Cunningham is Director of New Ventures, Market and Consumer Research, at S. B. Thomas, Inc.

Public Opinion Quarterly © 1979 by The Trustees of Columbia University. Published by Elsevier North-Holland, Inc. 0033-362X/79/0043-0261/$1.75

In his 1977 AAPOR presidential address, Irving Crespi (1977) concluded his observations on the current state of attitude research by stating: "What has been found wanting is the way so much attitude research has been conducted with, on the one hand, reliance on attitude scales of an unwarranted high order of generality and abstraction and, on the other hand, theory that is overly particularized and concrete."

Recognizing the limitations of attitude measurement has long dominated much of the attitude research literature. Hauser (1969), for example, stated it succinctly: "I should like to venture the judgment that it is inadequate measurement, more than inadequate concept or hypothesis, that has plagued social researchers and prevented fuller explanation of the variances with which they are confounded."

Attitude measurement often reflects not only the "true" attitude toward an object but also the respondent's reaction to the specific scale used, the wording of the questions, the context in which the specific measuring instrument was administered, the timing, and other task and situational factors. Many of these effects on attitude measurement have been examined,¹ yet little attention has been given to the explicit comparison of different measurement approaches.

The objective of this paper is to examine three commonly used approaches to the measurement of respondents' attitudes toward objects (for example, brands, political candidates, social issues, and the like). These approaches are: (a) open-ended approach (that is, unaided and unconstrained approach to eliciting attitudes), (b) a paired comparison, and (c) a monadic rating of the object under study on selected attitudinal dimensions.

The researcher is often faced with the question of which of these three attitude measurement approaches he should use. Aside from the conceptual differences among the procedures of open-ended, paired comparison (ranking), and ratings, the researcher in selecting among these procedures should know (1) whether the three methods lead to different results with respect to the identification of differences among objects (e.g., brands) and the preference for object A over object B with respect to attribute i, and (2) whether the three approaches differ in their reliability and validity. This paper explores these two questions.

The comparison of the three approaches, reported here, was undertaken in the context of brand evaluation, but since these approaches are often used in other areas of attitude research (such as political studies, communication research, and social psychological studies), the implication of this study goes beyond the narrow range of brand evaluation.

The Study

Three attitude measurement approaches were used to assess consumers' evaluation of a given brand of a frequently purchased packaged food product and its competitive position. Each of the three measurement approaches covered two dimensions: evaluation of brand A over brand B (for overall preference and on a number of key attributes), and whether both brands were perceived as the same or not.
The three approaches (and a prototypical example from each) are as follows:
1. OPEN-ENDED QUESTIONS

I would like to know your overall opinion of the various brands of product X. Overall would you say that all brands are equally good, that one brand is better than the others, or that one brand stands out as best? If some brands are better than others, which brands are those? If one brand stands out, which brand is that?

2. PAIRED COMPARISON

I would like to know your overall opinion of two brands of product X. They are brand A and brand B. Overall, which of these two brands, A or B, do you think is the better one? Or are both the same?
A is better     B is better     They are the same

3. MONADIC RATING

I would like to know your overall opinion of two brands of product X, brand A and brand B. Thinking of how you would describe your overall opinion of each brand, how would you rate brand A? Would you describe it as
Excellent     Very Good     Good     Fair     Poor

And how would you rate brand B?
Excellent     Very Good     Good     Fair     Poor

¹ See, for example, Phillips (1977), Summers (1970), and Fishbein (1967). An excellent analysis of some of the specific measurement problems as they relate to context effect is offered by Lipset (1977).

Each of these brand evaluation procedures was utilized to establish consumers' overall evaluations of each brand as well as to determine their evaluation of the brands on a specific list of product attributes.

The study, designed to compare these three approaches, was conducted among 600 housewives in a major metropolitan area. Three independent random samples were drawn, and the respondents in each sample received one of three questionnaires corresponding to one of the three brand evaluation procedures. The three sets of questions followed the prototypical questions presented above and provided data on the respondents' evaluation of the two leading brands in the given food category.² This evaluation included overall opinion as well as specific evaluation on nine product attributes, and covered the two dimensions previously mentioned: evaluation of brand A vs. B, and evaluation of whether the brands were perceived as similar or not. The latter dimension was measured directly for the open-ended and paired comparison approaches and indirectly for the monadic evaluation by comparing the ratings of the two brands (i.e., if the two brands received the same rating, they were classified as similar). In addition, the respondents were asked to indicate their degree of confidence (very sure, somewhat sure, or not sure) in their evaluation of the brands on each of the nine attributes. Information on brand purchase behavior and demographic characteristics was also collected.

² The open-ended procedure elicited more than two brands, but for comparison purposes we reported only the results for the two leading brands in the marketplace (one of these was the client's brand and the other the leading competitor). The analysis included only the 200 respondents who mentioned brands A and B. Interviews with respondents who did not mention these two brands were terminated.

To answer the various research questions, the data generated from the three samples were submitted to a number of analytical techniques. Each of these analytical techniques was conducted twice: first for an evaluation of the product category on "whether both brands are alike" and second for evaluation of brand A vs. brand B. The specific plan of analysis included the following steps:

1. To determine whether there are any differences in the results obtained from the three brand evaluation procedures, three analytical approaches were used: (a) a simple cross-tabulation by the three brand evaluation approaches of the percentage of respondents who indicated that both brands are alike and similarly the percentage of respondents indicating that brand A is better than brand B; (b) a multivariate analysis of covariance (Wind and Denny, 1974); and (c) a multiple discriminant analysis.

2. The "reliability" (of relationship) of the three brand evaluation procedures was established by the "split-half" approach, that is, each of the three samples was split randomly and two-group discriminant analysis conducted to establish whether the two subsamples differed significantly.

3. To establish the "predictive power" of the three approaches, the respondents' brand choice behavior (number of brands bought and brand most often bought) was used as the criterion measure, and the evaluation of the brands on the nine attributes (as both brands are alike and brand A stands out) served as the predictor variable in a series of multiple regression analyses. These analyses were undertaken twice: once for the total sample of each brand evaluation procedure to establish the overall R², and again on 80 percent of the observations of each sample with the remaining 20 percent used as a validation sample.
However, given the lack of an external measure of "validity," one can only evaluate the relative predictive power of the three approaches.
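The indirect coding of the monadic ratings described in the study design (two brands that receive the same rating are classified as similar) can be sketched in a few lines. This is only an illustration under stated assumptions: the numeric scores attached to the five scale points, the rule that a higher rating for brand A counts as "A better," the function names, and the sample responses are all hypothetical and are not taken from the study.

```python
# Hypothetical numeric scores for the five-point scale (an assumption, not from the study).
RATING_SCALE = {"Excellent": 5, "Very Good": 4, "Good": 3, "Fair": 2, "Poor": 1}


def code_monadic(rating_a: str, rating_b: str) -> str:
    """Code one respondent's pair of ratings on one attribute.

    Equal ratings are coded as 'same' (the rule stated in the study design).
    Coding the higher-rated brand as 'better' is an assumption made here.
    """
    a = RATING_SCALE[rating_a]
    b = RATING_SCALE[rating_b]
    if a == b:
        return "same"
    return "A better" if a > b else "B better"


def share(codes, label):
    """Proportion of respondents whose code equals `label`."""
    return sum(c == label for c in codes) / len(codes)


# Made-up responses for illustration only; these are not data from the study.
responses = [("Good", "Good"), ("Excellent", "Good"),
             ("Fair", "Very Good"), ("Poor", "Poor")]
codes = [code_monadic(a, b) for a, b in responses]

print(share(codes, "same"), share(codes, "A better"))
```

Applied attribute by attribute, proportions of this kind would correspond to the two dimensions the analysis is run on: "both brands are alike" and "brand A vs. brand B."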
Results
DO THE THREE BRAND EVALUATION PROCEDURES LEAD TO DIFFERENT RESULTS?

An examination of the results of the multivariate analysis of covariance, as summarized in Table 1, suggests that there are significant differences among the three approaches with respect to both the evaluation of the two brands as being alike and of brand A as the preferred brand.


Table 1. Summary of Multivariate Analysis of Covariance

                                           Approximate F Statistic For:
                                           Both Brands      Brand A Preferred
Source of variance                         Are Alike        Over B
Brand evaluation procedure (i)             17.25            2.19
Respondents' degree of confidence (j)       1.64            1.21
Interaction (ij)                            2.01            1.10
Covariates
  Size of household                         1.36            2.35
  Education                                 1.81            1.75
  Age                                       1.51            1.33
  Income                                    0.48            0.74

Table 2 provides some insights as to the nature of the differences among the three approaches. Examination of this table suggests a number of conclusions concerning the ability to differentiate between the brands and evaluate their relative attractiveness.

With respect to brand differentiation:

Ratings, when compared to open-ended responses and paired comparisons, tend to lead to a higher percentage of respondents who view both brands as similar.

Paired comparison tends to lead to the smallest percentage of respondents who view both brands as similar; i.e., of the three procedures tested, it is the one which leads to the greatest brand differentiation.

The open-ended procedure seems to be positioned somewhere between the paired comparison and the rating procedures. In relation to the paired comparison method, on most attributes it results in a somewhat higher percentage of respondents who indicated that both brands are alike.

The three evaluation methods differ in their ability to identify the two brands as being alike. The percentage of cases correctly classified is significantly better than chance.

With respect to evaluation of brand A vs. brand B:

Ratings, when compared to open-ended responses and paired comparison, tend to lead to a higher brand differentiation between the two leading brands with respect to all specific product attributes.

Paired comparison, which leads to the smallest percentage of respondents who view the brands as similar, leads to the highest percentage of respondents who prefer brand A over brand B with respect to the two general measures of preference: "overall evaluation" and "value for money."


Table 2. A Comparison of the Results Obtained by the Three Proceduresᵃ

CROSS TABULATION

                          % of Respondents Indicating         % of Respondents Preferring Brand A
                          Both Brands Are the Same            Over B with Respect to the 9 Attributes
                          Open-     Paired                    Open-     Paired
                          ended     Comparison   Ratings      ended     Comparison   Ratings
Attributes                (1)       (2)          (3)          (1)       (2)          (3)
1. Overall evaluation     0.24      0.23         0.38         0.11      0.18         0.14
2. Availability           0.45      0.21         0.46         0.08      0.05         0.09
3. Price                  0.54      0.52         0.76         0.02      0.07         0.09
4. Nutrition              0.85      0.42         0.71         0.01      0.08         0.09
5. Aroma                  0.57      0.18         0.57         0.06      0.10         0.12
6. Size                   0.87      0.38         0.78         0.02      0.07         0.09
7. Freshness              0.36      0.25         0.49         0.06      0.10         0.13
8. Flavor                 0.34      0.20         0.47         0.06      0.13         0.16
9. Value for money        0.61      0.38         0.72         0.04      0.16         0.09

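TWO-GROUP DISCRIMINANT ANALYSIS

(1) Both Brands Are Alike

Open-ended (1) vs. Paired Comparison (2)
% of cases correctly classified (method i vs. j):    (1) 85%    (2) 75%
Significant discriminating attributes (and their associated discriminant coefficients):
                    (1)     (2)
Size                .87     .37
Nutrition           .85     .42
Aroma               .57     .18
Avail.              .45     .21

Paired Comparison (2) vs. Ratings (3)
% of cases correctly classified (method i vs. j):    (2) 72%    (3) 74%
Significant discriminating attributes (and their associated discriminant coefficients):
                    (2)     (3)
Size                .37     .78
Aroma               .18     .57
Avail.              .21     .46
Value               .37     .71

Open-ended (1) vs. Ratings (3)
% of cases correctly classified (method i vs. j):    (1) 76%    (3) 71%
Significant discriminating attributes (and their associated discriminant coefficients):
                    (1)     (3)
Price               .14     .78
Size                .87     .77
Nutrition           .85     .71
Freshness           .36     .49
Overall Eval.       .24     .37
Flavor              .34     .47

(2) Brand A > Brand B

Open-ended (1) vs. Paired Comparison (2)
% of cases correctly classified (method i vs. j):    (1) 85%    (2) 52%
Significant discriminating attributes (and their associated coefficients):
                    (1)     (2)
Nutrition           .01     .12
Size                .02     .07
Price               .02     .07
Avail.              .08     .05

Paired Comparison (2) vs. Ratings (3)
% of cases correctly classified (method i vs. j):    (2) 76%    (3) 22%
Significant discriminating attributes (and their associated coefficients):
                    (2)     (3)
Avail.              .05     .09
Nutrition           .12     .08

Open-ended (1) vs. Ratings (3)
% of cases correctly classified (method i vs. j):    (1) 88%    (3) 26%
Significant discriminating attributes (and their associated coefficients):
                    (1)     (3)
Nutrition           .01     .09
Size                .02     .09
Price               .02     .09
Flavor              .06     .16

ᵃ The sample base was 200 completed questionnaires for each method; there were no more than 18 nonresponses to any given item.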


The open-ended procedure, when compared to the other two approaches with respect to the evaluation of brand A over B, yielded a smaller percentage of preference for brand A. Overall, all three methods resulted in a relatively small percentage of respondents (less than 20 percent of the total sample) preferring brand A over brand B. This may reflect the current market share of brand A, which is less than 20 percent. The ability to discriminate among the three brand evaluation procedures on preference for brand A over B is not uniform. Brand evaluation via the open-ended approach can be discriminated quite clearly, while brand evaluation on paired comparison cannot be discriminated as well, and ratings not at all.
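As a rough check on whether the gaps between the cross-tabulated percentages in Table 2 exceed sampling noise, a two-proportion z-test can be applied to any pair of cells. This is not the multivariate analysis of covariance or the discriminant analysis reported in the study; it is a simpler stand-in. It also assumes a base of exactly 200 respondents per method, whereas the table note allows up to 18 nonresponses on any item, so the result is approximate. The function name is illustrative.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# "Both brands are the same" on overall evaluation (Table 2):
# ratings 0.38 vs. paired comparison 0.23, assuming n = 200 in each sample.
z, p = two_proportion_z(0.38, 200, 0.23, 200)
print(round(z, 2), p < 0.01)
```

Under these assumptions the difference between ratings and paired comparison on overall evaluation is well beyond what sampling variation alone would be expected to produce, which agrees in direction with the procedure effect reported in Table 1.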
WHICH OF THE THREE METHODS IS THE MOST RELIABLE?

Since no test-retest data were collected, traditional reliability measures cannot be established for the three data collection procedures. A surrogate for reliability was developed, however, by randomly splitting the data from each of the three brand evaluation procedures and analyzing it via two-group discriminant analysis with respect to evaluation of both brands as being alike (or not) on the nine attributes. The results of this analysis, summarized in Table 3, suggest that the three brand evaluation procedures are similar in their "reliability" with the percentage of cases correctly classified not better than chance. The same procedure was repeated with respect to evaluation of brand A vs. brand B. The results of this analysis suggest that, in this context, none of the data collection procedures is very "reliable."
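The logic of this split-half surrogate can be sketched as follows: split one sample at random, then ask how well a classifier can tell the two halves apart. The sketch below is only an approximation of the idea. It swaps in a nearest-centroid rule for the two-group discriminant analysis used in the study, classifies the same cases used to compute the centroids, and runs on randomly generated 0/1 answers rather than the survey data. All names are hypothetical.

```python
import random


def centroid(rows):
    """Mean of each attribute across the given respondents."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def sq_dist(u, v):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def split_half_hit_rates(rows, seed=0):
    """Randomly split `rows` in two and return the share of each half
    that lies closer to its own centroid than to the other half's."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    half1, half2 = shuffled[:mid], shuffled[mid:]
    c1, c2 = centroid(half1), centroid(half2)
    hits1 = sum(sq_dist(r, c1) <= sq_dist(r, c2) for r in half1)
    hits2 = sum(sq_dist(r, c2) < sq_dist(r, c1) for r in half2)
    return hits1 / len(half1), hits2 / len(half2)


# Synthetic sample: 200 respondents, 9 attributes coded 1 = "both brands alike".
rng = random.Random(42)
sample = [[rng.randint(0, 1) for _ in range(9)] for _ in range(200)]
rate1, rate2 = split_half_hit_rates(sample)
print(rate1, rate2)
```

With purely random answers the two halves should be nearly indistinguishable, so both hit rates are expected to sit near 50 percent. Hit rates close to chance therefore indicate that the two random halves yield similar results.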
WHICH OF THE METHODS HAS THE HIGHEST PREDICTIVE POWER?

The validity of the three brand evaluation procedures was assessed by two series of multiple regression analyses. The first set utilized the evaluation of brand A on the nine attributes as the predictor set and the purchase of brand A as the criterion variable. The second set of runs was based on the evaluation of both brands as alike on the nine attributes as the predictor set, and the total number of brands bought as the criterion variable.

The results of the multiple regression analysis on the entire sample are presented in Table 4, which suggests the following conclusions:

Evaluation of brand A as better than brand B on the nine attributes is a reasonably good predictor of actual purchase of brand A. The open-ended procedure seems to be the best of the three approaches in predicting purchase of brand A (R² = .33). There is no significant difference between the predictive efficacy of the other two procedures.

Evaluation of both brands as being alike on the nine attributes provided for all three brand evaluation procedures poor prediction of the number of brands bought. (The hypothesis underlying this analysis was that the more consumers view both brands as similar (on all attributes) the less loyal they will be to a particular brand and will tend to buy more brands.) The predictive efficacy of all three procedures is very poor.

Table 3. A Comparison of the Reliability of Results of the Three Procedures

Two-Group Discriminant Analysis on Two Random Split Samples

                      Both Brands Are Alike                Brand A > Brand B
Correctly             Open-     Paired                     Open-     Paired
Classified            ended     Comparison   Ratings       ended     Comparison   Ratings
Split 1               64%       62%          66%           92%       80%          80%
Split 2               58%       50%          70%           19%       28%          19%
Table 4. Predictive Efficacy of the Three Procedures

Results of Multiple Regression Analysis: Evaluation of brand A on the 9 attributes as predictors of the purchase of brand A

              Open-ended     Paired Comparison     Ratings
R²            .33            .24                   .25
F             (2.45)         (4.12)                (2.91)

Regression Coefficients of Key Variables

Open-ended                     Paired Comparison            Ratings
Flavor                .34      Flavor              .23      Flavor                .18
Value for money       .19      Value for money     .24      Overall evaluation    .12
Overall evaluation    .14      Freshness           .15      Size                 -.16
Size                 -.41                                   Price                 .15
Freshness             .14                                   Availability          .14
                                                            Aroma                 .11

Results of Multiple Regression Analysis: Evaluation of both brands as being similar on the 9 attributes as predictors of the number of brands bought Open-ended F Open-ended Nutrition Paired Comparison Ratings .03 (1.87) Ratings .19 .21 .06 .02 (4.69) (1.97) Regression Coefficients of Key Variables Paired Comparison .20 Price .22

Overall evaluation Nutrition


The results of a split-run validation procedure provided further support for the above conclusions, stressing the better predictive efficiency of the open-ended procedure.
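The split-run validation (fit on 80 percent of the observations, check on the remaining 20 percent) can be illustrated with a deliberately reduced example. The study used multiple regression on nine attribute evaluations; the sketch below assumes a single hypothetical predictor so that it can be written with the standard library alone, and the data are synthetic. Function names and the seed are illustrative.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of variance in ys accounted for by the fitted line."""
    my = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst


def holdout_r2(xs, ys, train_frac=0.8, seed=0):
    """Fit on a random `train_frac` of the cases; return (train R2, holdout R2)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    train_r2 = r_squared([xs[i] for i in train], [ys[i] for i in train],
                         slope, intercept)
    test_r2 = r_squared([xs[i] for i in test], [ys[i] for i in test],
                        slope, intercept)
    return train_r2, test_r2


# Synthetic, noise-free data: the holdout R2 should match the training R2.
xs = list(range(50))
ys = [1 + 2 * x for x in xs]
train_r2, test_r2 = holdout_r2(xs, ys)
print(train_r2, test_r2)
```

With real survey data the holdout R² would normally fall below the training R²; the size of that drop is what a validation sample is meant to reveal.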
Conclusions

Three attitude measurement procedures (open-ended, paired comparison, and ratings) were compared with respect to their ability to discriminate among brands of a frequently purchased food product. The analyses suggest four major conclusions with respect to the attitude measurement procedures as measured in this study:

1. There are significant differences in the results obtained by the three data collection procedures.

2. The three data collection procedures are about equal in reliability and slightly more reliable for the evaluation of "both brands are alike" than for the evaluation of "brand A vs. brand B."

3. The open-ended approach seems to be the best one in predicting actual purchase of a given brand.

4. Paired comparison and rating, both with lower predictive efficacy ("validity"?), tend to overestimate the preference for one brand over others. Ratings seem also to provide the highest percentage of respondents who indicate that all brands are alike.

These results, which were based on a relatively large sample of typical consumers, clearly suggest that the results obtained from a survey and, hence, the conclusions drawn from it depend as much on the researcher's choice of measurement approach as on the real world phenomena. This suggests the need for further research into the characteristics of alternative attitude measurement procedures.

Typically, little attention is given to the explicit evaluation of alternative approaches. Too often the decision to use an open-ended approach, a rating scale, a paired comparison, or any other attitude measurement procedure is a result of the researcher's habit, familiarity with the various procedures, and the past practice of his/her company. Attention should be focused on the conceptual, measurement, and implementation properties of various attitude measurement approaches to better understand their reliability, validity, and biases.
Research efforts in this direction will lead to improvements in the quality of attitude research in general and brand evaluation in particular.
References

Crespi, I. 1977 "Attitude measurement, theory and prediction." Public Opinion Quarterly 41:285-94.
Fishbein, M. (ed.) 1967 Readings in Attitude Theory and Measurement. New York: Wiley.
Hauser, P. 1969 Pp. 122-36 in R. Bierstedt (ed.), A Design for Sociology: Scope, Objectives and Methods. Philadelphia: American Academy of Political and Social Sciences.
Lipset, S. M. 1977 "Polls for the White House and the rest of us." Encounter, Nov. 1977:24-34.
Phillips, D. C. 1977 Knowledge for What? Chicago: Rand-McNally.
Summers, G. F. (ed.) 1970 Attitude Measurement. Chicago: Rand-McNally.
Wind, Y., and J. Denny 1974 "Multivariate analysis of variance in research on the effectiveness of TV commercials." Journal of Marketing Research 11:136-42.
