Sunteți pe pagina 1din 13

MB0050 - Research Methodlogy

What is the significance of research in social and business sciences? According to a famous Hudson Maxim, All progress is born of inquiry. Doubt is often better than overconfidence, for it leads to inquiry, and inquiry leads to invention. It brings out the significance of research, increased amounts of which makes progress possible. Research encourages scientific and inductive thinking, besides promoting the development of logical habits of thinking and organization. The role of research in applied economics in the context of an economy or business is greatly increasing in modern times. The increasingly complex nature of government and business has raised the use of research in solving operational problems. Research assumes significant role in formulation of economic policy, for both the government and business. It provides the basis for almost all government policies of an economic system. Government budget formulation, for example, depends particularly on the analysis of needs and desires of the people, and the availability of revenues, which requires research. Research helps to formulate alternative policies, in addition to examining the consequences of these alternatives. Thus, research also facilitates the decision making of policymakers, although in itself it is not a part of research. In the process, research also helps in the proper allocation of a countrys scare resources. Research is also necessary for collecting information on the social and economic structure of an economy to understand the process of change occurring in the country. Collection of statistical information though not a routine task, involves various research problems. Therefore, large staff of research technicians or experts is engaged by the government these days to undertake this work. Thus, research as a tool of government economic policy formulation involves three distinct stages of operation which are as follows:

gnosis, i.e., the prediction of future developments Research also assumes a significant role in solving various operational and planning problems associated with business and industry. In several ways, operations research, market research, and motivational research are vital and their results assist in taking business decisions. Market research is refers to the investigation of the structure and development of a market for the formulation of efficient policies relating to purchases, production and sales. Operational research relates to the application of logical, mathematical, and analytical techniques to find solution to business problems such as cost minimization or profit maximization, or the optimization problems. Motivational research helps to determine why people behave in the manner they do with respect to market characteristics. More specifically, it is concerned with the analyzing the motivations underlying consumer behaviour. All these researches are very useful for business and industry, which are responsible for business decision making. Research is equally important to social scientist for analyzing social relationships and seeking explanations to various social problems. It gives intellectual satisfaction of knowing things for the sake of knowledge. It also possesses practical utility for the social scientist to gain knowledge so as to be able to do something better or in a more efficient manner. This, research in social sciences is concerned with both knowledge for its own sake and knowledge for what it can contribute to solve practical problems.

What is the meaning of hypothesis? Discuss the types of hypothesis? According to Theodorson and Theodorson, a hypothesis is a tentative statement asserting a relationship between certain facts. Kerlinger describes it as a conjectural statement of the relationship between two or more variables. Black and Champion have described it as a tentative statement about something, the validity of which is usually unknown. This statement is intended to

be tested empirically and is either verified or rejected. It the statement is not sufficiently established, it is not considered a scientific law. In other words, a hypothesis carries clear implications for testing the stated relationship, i.e., it contains variables that are measurable and specifying how they are related. A statement that lacks variables or that does not explain how the variables are related to each other is no hypothesis in scientific sense. Criteria for Hypothesis Construction Hypothesis is never formulated in the form of a question. The standards to be met in formulating a hypothesis:

ntradictory.

Nature of Hypothesis A scientifically justified hypothesis must meet the following criteria: must accurately reflect the relevant sociological fact.

The Need for having Working Hypothesis

be studied. hypothesis suggests which type of research is likely to be most appropriate.

Characteristics of Good Hypothesis 1. Conceptual Clarity 2. Specificity 3. Testability 4. Availability of Techniques 5. Theoretical relevance 6. Consistency 7. Objectivity 8. Simplicity Types of Hypothesis There are many kinds of hypothesis the researcher has to be working with. One type of hypothesis asserts that something is the case in a given instance; that a particular object, person or situation has particular characteristics. Another type of hypothesis deals with the frequency of occurrence or of association among variables; this type of hypothesis may state that X is associated with Y. A certain Y proportion of items e.g. urbanism tends to be accompanied by mental disease or than something are greater or lesser than some other thing in specific settings. Yet another type of

hypothesis asserts that a particular characteristics is one of the factors which determine another characteristic, i.e. X is the producer of Y. hypothesis of this type are called causal hypothesis. Null Hypothesis and Alternative Hypothesis In the context of statistical analysis, we often talk null and alternative hypothesis. If we are to compare method A with method B about its superiority and if we proceed on the assumption that both methods are equally good, then this assumption is termed as null hypothesis. As against this, we may think that the method A is superior, it is alternative hypothesis. Symbolically presented as: Null hypothesis = H0 and Alternative hypothesis = Ha Suppose we want to test the hypothesis that the population mean is equal to the hypothesis mean ( H0) = 100. Then we would say that the null hypotheses are that the population mean is equal to the hypothesized mean 100 and symbolical we can express as: H0: = H0=100 If our sample results do not support these null hypotheses, we should conclude that something else is true. What we conclude rejecting the null hypothesis is known as alternative hypothesis. If we accept H0, then we are rejecting Ha and if we reject H0, then we are accepting Ha. For H0: = H0=100, we may consider three possible alternative hypotheses as follows: Alternative Hypothesis Ha: H0 To be read as follows (The alternative hypothesis is that the population mean is not equal to 100 i.e., it may be more or less 100) (The alternative hypothesis is that the population mean is greater than 100) (The alternative hypothesis is that the population mean is less than 100)

Ha: > H0

Ha: < H0

The null hypothesis and the alternative hypothesis are chosen before the sample is drawn (the researcher must avoid the error of deriving hypothesis from the data he collects and testing the hypothesis from the same data). In the choice of null hypothesis, the following considerations are usually kept in view: that wish to disprove. Thus a null hypothesis represents the hypothesis we are trying to reject, the alternative hypothesis represents all other possibilities. hypothesis because then the probability of rejecting it when it is true is (the level of significance) which is chosen very small. approximately a certain value. the alternative hypothesis in view. Why so? The answer is that on assumption that null hypothesis is true, one can assign the probabilities to different possible sample results, but this cannot be done if we proceed with alternative hypothesis. Hence the use of null hypothesis (at times also known as statistical hypothesis) is quite frequent. Distinguish between Schedules and Questionnaire? Questionnaires are mailed to the respondent whereas schedules are carried by the investigator himself. Questionnaires can be filled by the respondent only if he is able to understand the language in which it is written and he is supposed to be a literate. This problem can be overcome in case of

schedule since the investigator himself carries the schedules and the respondents response is accordingly taken. A questionnaire is filled by the respondent himself whereas the schedule is filled by the investigator. What are the problems encountered in the interview? In personal interviewing, the researcher must deal with two major problems, inadequate response, non-response and interviewers bias. Inadequate response Kahn and Cannel distinguish five principal symptoms of inadequate response. They are: o partial response, in which the respondent gives a relevant but incomplete answer o non-response, when the respondent remains silent or refuses to answer the question o irrelevant response, in which the respondents answer is not relevant to the question asked o inaccurate response, when the reply is biased or distorted and o verbalized response problem, which arises on account of respondents failure to understand a question or lack of information necessary for answering it. Interviewers Bias The interviewer is an important cause of response bias. He may resort to cheating by cooking up data without actually interviewing. The interviewers can influence the responses by inappropriate suggestions, word emphasis, tone of voice and question rephrasing. His own attitudes and expectations about what a particular category of respondents may say or think may bias the data. Another source of response of the interviewers characteristics (education, apparent social status, etc) may also bias his answers. Another source of response bias arises from interviewers perception of the situation, if he regards the assignment as impossible or sees the results of the survey as possible threats to personal interests or beliefs he is likely to introduce bias. As interviewers are human beings, such biasing factors can never be overcome completely, but their effects can be reduced by careful selection and training of interviewers, proper motivation and supervision, standardization or interview procedures (use of standard wording in survey questions, standard instructions on probing procedure and so on) and standardization of interviewer behaviour. There is need for more research on ways to minimize bias in the interview. Non-response Non-response refers to failure to obtain responses from some sample respondents. There are many sources of non-response; non-availability, refusal, incapacity and inaccessibility. Non-availability Some respondents may not be available at home at the time of call. This depends upon the nature of the respondent and the time of calls. For example, employed persons may not be available during working hours. Farmers may not be available at home during cultivation season. Selection of appropriate timing for calls could solve this problem. Evenings and weekends may be favourable interviewing hours for such respondents. If someone is available, then, line respondents hours of availability can be ascertained and the next visit can be planned accordingly. Refusal Some persons may refuse to furnish information because they are ill-disposed, or approached at the wrong hour and so on. Although, a hardcore of refusals remains, another try or perhaps another approach may find some of them cooperative. Incapacity or inability may refer to illness which prevents a response during the entire survey period. This may also arise on account of language barrier.

Inaccessibility Some respondents may be inaccessible. Some may not be found due to migration and other reasons. Non-responses reduce the effective sample size and its representativeness. Methods and Aims of control of non-response Kish suggests the following methods to reduce either the percentage of non-response or its effects: 1. Improved procedures for collecting data are the most obvious remedy for non-response. Improvements advocated are (a) guarantees of anonymity, (b) motivation of the respondent to cooperate (c) arousing the respondents interest with clever opening remarks and questions, (d) advance notice to the respondents. 2. Call-backs are most effective way of reducing not-at-homes in personal interviews, as are repeated mailings to no-returns in mail surveys. 3. Substitution for the non-response is often suggested as a remedy. Usually this is a mistake because the substitutes resemble the responses rather than the non-responses. Nevertheless, beneficial substitution methods can sometimes be designed with reference to important characteristics of the population. For example, in a farm management study, the farm size is an important variable and if the sampling is based on farm size, substitution for a respondent with a particular size holding by another with the holding of the same size is possible. Attempts to reduce the percentage or effects on non-responses aim at reducing the bias caused by differences on non-respondents from respondents. The non-response bias should not be confused with the reduction of sampled size due to non-response. The latter effect can be easily overcome, either by anticipating the size of non-response in designing the sample size or by compensating for it with a supplement. These adjustments increase the size of the response and the sampling precision, but they do not reduce the non-response percentage or bias. Write short Note on Dispersion? Dispersion Dispersion is the tendency of the individual values in a distribution to spread away from the average. Many economic variables like income, wage etc., are widely varied from the mean. Dispersion is a statistical measure, which understands the degree of variation of items from the average. Objectives of Measuring Dispersion Study of dispersion is needed to: 1. To test the reliability of the average 2. To control variability of the data 3. To enable comparison with two or more distribution with regard to their variability 4. To facilitate the use of other statistical measures. Measures of dispersion points out as to how far the average value is representative of the individual items. If the dispersion value is small, the average tends to closely represent the individual values and it is reliable. When dispersion is large, the average is not a typical representative value. Measures of dispersion are useful to control the cause of variation. In industrial production, efficient operation requires control of quality variation. Measures of variation enable comparison of two or more series with regard to their variability. A high degree of variation would mean little consistency and low degree of variation would mean high consistency.

Properties of a Good Measure of Dispersion A good measure of dispersion should be simple to understand. 1. It should be easy to calculate 2. It should be rigidly defined 3. It should be based on all the values of a distribution 4. It should be amenable to further statistical and algebraic treatment. 5. It should have sampling stability 6. It should not be unduly affected by extreme values. Measures of Dispersion 1. Range 2. Quartile deviation 3. Mean deviation 4. Standard deviation 5. Lorenz curve Range, Quartile deviation, Mean deviation and Standard deviation are mathematical measures of dispersion. Lorenz curve is a graphical measure of dispersion. Measures of dispersion can be absolute or relative. An absolute measure of dispersion is expressed in the same unit of the original data. When two sets of data are expressed in different units, relative measures of dispersion are used for comparison. A relative measure of dispersion is the ratio of absolute measure to an appropriate average. The following are the important relative measures of dispersion. 1. Coefficient of range 2. Coefficient of Quartile deviation 3. Coefficient of Mean deviation 4. Coefficient of Standard deviation Range Range is the difference between the lowest and the highest value. Symbolically, range = highest value lowest value Range = H L H = highest value L = lowest value Relative measure of dispersion is co-efficient of range. It is obtained by the following formula. Coefficient of range = (H L) / (H + L) Range is simple to understand and easy to calculate. But it is not based on all items of the distribution. It is subject to fluctuations from sample to sample. Range cannot be calculated for open-ended series. Quartile Deviation Quartile deviation is defined as inter quartile range. It is based on the first and the third quartile of a distribution. When a distribution is divided into four equal parts, we obtain four quartiles, Q1, Q2, Q3 and Q4. First quartile Q1 is point of the distribution where 25% of the items of the distribution lie below Q1, and 75% of the items of the distribution lie above the Q1. Q2 is the median of the distribution, where 50% of the items of the distribution lie below Q2, and 50% of the items of the distribution lie above the Q2. Third quartile Q3 is point of the distribution where 75% of the items of the distribution lie below Q3, and 25% of the items of the distribution lie above the Q3.

Quartile deviation is based on the difference between the third and first quartiles. So quartile deviation is defined as the inter-quartile range. Symbolically, inter-quartile range = Q3- Q1 Quartile Deviation = (Q3- Q1) / 2 Co-efficient of Quartile Deviation = (Q3- Q1) / (Q3 + Q1) Merits of Quartile Deviation 1. Quartile Deviation is superior to range as a rough measure of dispersion. 2. It has a special merit in measuring dispersion in open-ended series. 3. Quartile Deviation is not affected by extreme values. Demerits of Quartile Deviation 1. Quartile Deviation ignores the first 25% of the distribution below Q1 and 25% of the distribution above the Q3. 2. Quartile Deviation is not amenable to further mathematical treatment. 3. Quartile Deviation is very much affected by sampling fluctuations. Mean Deviation Range and quartile deviation do not show any scatter ness from the average. However, mean deviation and standard deviation help us to achieve the dispersion. Mean deviation is the average of the deviations of the items in a distribution from an appropriate average. Thus, we calculate mean deviation from mean, median or mode. Theoretically, mean deviation from median has an advantage because sum of deviations of items from median is the minimum when signs are ignored. However, in practice, mean deviation from mean is frequently used. That is why it is commonly called as mean deviation. Formula for calculating mean deviation = D/N Where D = sum of the deviation of the items from mean, median or mode N = number of items D is mode less meaning values or deviation is taken without signs. Steps 1. Calculate mean, median or mode of the series 2. Find the deviation of items from the mean, median or mode 3. Sum the deviations and obtain D 4. Take the average of the deviations D/N, which is the mean deviation. The co- efficient of mean deviation is the relative measure of mean deviation. It is obtained by dividing the mean deviation by a particular measure of average used for measuring mean deviation. If mean deviation is obtained from median, the co-efficient of mean deviation is obtained by dividing mean deviation by median. The co-efficient of mean deviation = mean deviation / median If mean deviation is obtained from mean, the co-efficient of mean deviation is obtained by dividing mean deviation by mean. The co-efficient of mean deviation = mean deviation / mean If mean deviation is obtained from mode, the co-efficient of mean deviation is obtained by dividing mean deviation by mode. The co-efficient of mean deviation = mean deviation / mode Continuous series

The procedure remains the same. The only difference is that we have to obtain the midpoints of the various classes and take deviations of these midpoints. The deviations are multiplied by their corresponding frequencies. The value so obtained is added and its average is the mean deviation. Merits of Mean Deviation 1. Mean deviation is simple to understand and easy to calculate 2. It is based on each and every item of the distribution 3. It is less affected by the values of extreme items compared to standard deviation. 4. Since deviations are taken from a central value, comparison about formation of different distribution can be easily made. Demerits of Mean Deviation 1. Algebraic signs are ignored while taking the deviations of the items. 2. Mean deviation gives the best result when it is calculated from median. But median is not a satisfactory measure when variability is very high. 3. Various methods give different results. 4. It is not capable of further mathematical treatment. 5. It is rarely used for sociological studies. Standard deviation Standard deviation is the most important measure of dispersion. It satisfies most of the properties of a good measure of dispersion. It was introduced by Karl Pearson in 1893. Standard deviation is defined as the mean of the squared deviations from the arithmetic mean. Standard deviation is denoted by the Greek letter Mean deviation and standard deviation are calculated from deviation of each and every item. Standard deviation is different from mean deviation in two respects. First of all, algebraic signs are ignored in calculating mean deviation. Secondly, signs are taken into account in calculating standard deviation whereas, mean deviation can be found from mean, median or mode. Whereas, standard deviation is found only from mean. Standard deviation can be computed in two methods 1. Taking deviation from actual mean 2. Taking deviation from assumed mean. Formula for Steps 2. Take deviation of the items from the mean ( x-x) 3. Find the square of the deviation from actual mean-x)2 / N 4. Sum the squares of the deviation -x)2 -x)2 / N 6. Take the square root of the average of the sum of the deviation Discrete Series Standard deviation can be obtained by three methods. 1. Direct method 2. Short cut method 3. Step deviation -x)2 / N

Direct method Under this method formula is 2 Mathematical Averages: Arithmetic mean, geometric mean and harmonic mean are mathematical averages. Median and mode are positional averages. These statistical measures try to understand how individual values in a distribution concentrate to a central value like average. If the values of distribution approximately come near to the average value, we conclude that the distribution has central tendency. Arithmetic Mean Arithmetic mean is the most commonly used statistical average. It is the value obtained by dividing the sum of the item by the number of items in a series. Symbolically we say X/n N = the number of items in the series. If x1 x2 x3 xn are the values of a series, then arithmetic mean of the series obtained by (x1 + When frequencies are also given with the values, to calculate arithmetic mean, the values are first multiplied with the corresponding frequency. Then their sum is divided by the number of frequency. Thus in a discrete series, arithmetic mean is calculated by the following formula. fx/ f

If x1 x2 x3 xn are the values of a series, and f1 f2 f3 fn are their corresponding frequencies, Arithmetic mean is calculated by (f1 x1 + f2 x2 + f3x3 + fn xn) / (f1 + f2 + f3 + fn) or fx / f Geometric Mean Geometric mean is defined as the nth root of the product of N items of a series. If there are two items in the data, we take the square root; if there are three items we take the cube root, and so on. Symbolically, GM = n 2 1 ...x .x x n Where x1, x2. ..xn are the items of the given series. To simplify calculations, logarithms are used. Accordingly, GM = Anti log of log x /n) In discrete series GM = Anti log of f . log x / f Harmonic Mean In individual series (1/x) In discrete series f (1/m) N = Total frequency M = Mi values of the class Median

Median is the middlemost item of a given series. In individual series, we arrange the given data according to ascending or descending order and take the middlemost item as the median. When two values occur in the middle, we take the average of these two values as median. Since median is the central value of an ordered distribution, there occur equal number of values to the left and right of the median. Individual series Median = (N+ 1 / 2) th item Median for Discrete Series To find the median of a grouped series, we first of all, cumulate the frequencies. Locate median at the size of (N+ 1) / 2 th cumulative frequency. N is the cumulative frequency taken. Steps 1. Arrange the values of the data in ascending order of magnitude. 2. Find out cumulative frequencies 3. Apply the formula (N+ 1) / 2 th item 4. Look at the cumulative frequency column and find the value of the variable corresponding to the above.. Median for Continuous Series To find the median of a grouped series, with class interval, we first of all, cumulate the frequencies. Locate median at the size of (N) / 2 th cumulative frequency. Apply the interpolation formula to obtain the median Median = L1 + (N/2 m) / f X C L1 = Lower limit of the median Class N/2 = Cumulative frequency/ 2 m = Cumulative frequency of the class preceding the median class f = frequency of the median class C = Class interval Merits of Median 1. Median is easy to calculate and simple to understand. 2. When the data is very large median is the most convenient measure of central tendency. 3. Median is useful finding average for data with open-ended classes. 4. The median distributes the values of the data equally to either side of the median. 5. Median is not influenced by the extreme values present in the data. 6. Value of the median can be graphically determined. Demerits of Median the number of items in a series is numerous. nce the value of median is determined by observation, it is not a true representative of all the values.

Mode

Mode is the most repeating value of a distribution. When one item repeats more number of times than other or when two items repeat equal number of times, mode is ill defined. Under such case, mode is calculated by the formula (3 median 2 mean). Mode is a widely used measure of central tendency in business. We speak of model wage which is the wage earned by most of the workers. Model shoe size is the mostly demanded shoe. Merits of Mode . -ended classes.

Demerits of Mode 1. It is difficult to calculate mode when one item repeats more number of times than others. 2. Mode is not capable of further algebraic treatment. 3. Mode is not based on all the items of the series. 4. Mode is not rigidly defined. There are several formulae for calculating mode. What is Sampling process? A statistical sample ideally purports to be a miniature model or replica of the collectivity or the population. Sampling helps in time and cost saving. If the population to be studied is quite large, sampling is warranted. However, the size is a relative matter. The decision regarding census or sampling depends upon the budget of the study. Sampling is opted when the amount of money budgeted is smaller than the anticipated cost of census survey. The extent of facilities available staff, access to computer facility and accessibility to population elements is another factor to be considered in deciding to sample or not. In the case of a homogenous population, even a simple random sampling will give a representative sample. If the population is heterogeneous, stratified random sampling is appropriate. Probability sampling is based on the theory of probability. It is also known as random sampling. It provides a known non-zero chance of selection for each population element. Simple random sampling technique gives each element an equal and independent chance of being selected. An equal chance means equal probability of selection. Stratified random sampling is an improved type of random or probability sampling. In this method, the population is sub-divided into homogenous groups or strata, and from each stratum, random sample is drawn. Proportionate stratified sampling involves drawing a sample from each stratum in proportion to the latters share in the total population. Disproportionate stratified random sampling does not give proportionate representation to strata. Systematic random sampling method is an alternative to random selection. It consists of taking kth item in the population after a random start with an item form 1 to k. It is also known as fixed interval method. Cluster sampling means random selection of sampling units consisting of population elements.

In Area sampling larger field surveys cluster consisting of specific geographical areas like districts, taluks, villages or blocks in a city are randomly drawn. Multi-stage sampling is carried out in two or more stages. The population is regarded as being composed of a number of second stage units and so forth. That is, at each stage, a sampling unit is a cluster of the sampling units of the subsequent stage. Double sampling and multiphase sampling refers to the subsection of the final sample form a preselected larger sample that provided information for improving the final selection. Replicated or interpenetrating sampling involves selection of a certain number of sub-samples rather than one full sample from a population. Non-probability or non random sampling is not based on the theory of probability. This sampling does not provide a chance of selection to each population element. Purposive (or judgment) sampling method means deliberate selection of sample units that conform to some pre-determined criteria. This is also known as judgment sampling. Quota sampling is a form of convenient sampling involving selection of quota groups of accessible sampling units by traits such as sex, age, social class, etc. it is a method of stratified sampling in which the selection within strata is non-random. Snow-ball sampling is the colourful name for a technique of Building up a list or a sample of a special population by using an initial set of its members as informants.

S-ar putea să vă placă și