International Journal of Psychological Research, 2010. Vol. 3. No. 1.
ISSN impresa (printed) 2011-2084
ISSN electrónica (electronic) 2011-2079
Courvoisier, D.S., & Renaud, O. (2010). Robust analysis of the central tendency, simple and multiple regression and ANOVA: A step by step tutorial. International Journal of Psychological Research, 3(1), 79-88.

Robust analysis of the central tendency, simple and multiple regression and ANOVA: A step by step tutorial.
Análisis robusto de la tendencia central, regresión simple, múltiple y ANOVA: Un tutorial paso a paso.
Delphine S. Courvoisier and Olivier Renaud
University of Geneva
ABSTRACT
After much exertion and care to run an experiment in social science, the analysis of data should not be ruined by an improper analysis. Often, classical methods, like the mean, the usual simple and multiple linear regressions, and the ANOVA, require normality and absence of outliers, which rarely occurs in data coming from experiments. To mitigate this problem, researchers often use ad hoc methods like the detection and deletion of outliers. In this tutorial, we will show the shortcomings of such an approach. In particular, we will show that outliers can sometimes be very difficult to detect and that the full inferential procedure is somewhat distorted by their deletion. A more appropriate and modern approach is to use a robust procedure that provides estimation, inference, and testing that are not influenced by outlying observations but correctly describe the structure of the bulk of the data. It can also give diagnostics on the distance of any point or subject relative to the central tendency. Robust procedures can also be viewed as methods to check the appropriateness of the classical methods. To provide a step-by-step tutorial, we present descriptive analyses that allow researchers to make an initial check on the conditions of application to the data. Next, we compare classical and robust alternatives to ANOVA and regression and discuss their advantages and disadvantages. Finally, we present indices and plots that are based on the residuals of the analysis and can be used to determine whether the conditions of application of the analyses are respected. Examples on data from psychological research illustrate each of these points, and for each analysis and plot, R code is provided to allow readers to apply the techniques presented throughout the article.
Key words:
robust methods; ANOVA; regression; diagnostic; outliers
RESUMEN
A menudo, métodos clásicos como la media, la regresión simple y múltiple, y el análisis de varianza (ANOVA) requieren que los datos se distribuyan normalmente y estén exentos de valores extremos, lo que en la práctica es inusual. Los investigadores típicamente usan métodos como la detección y eliminación de valores extremos para que los datos se ajusten a los requerimientos de los métodos clásicos. En este artículo se muestran las desventajas de tal práctica. En particular, se muestra que los valores extremos algunas veces pueden ser difíciles de detectar, afectando así la interpretación de los resultados. Se propone entonces un método más apropiado y moderno que se basa en procedimientos robustos, en donde los valores extremos no afectan la estimación, permitiendo una interpretación más adecuada de los datos. Se presenta un tutorial paso a paso de un análisis descriptivo que le permite a los investigadores hacer una revisión inicial del método más apropiado para analizar los datos. Luego, se comparan el ANOVA y la regresión tradicionales con sus versiones robustas para discutir sus ventajas y desventajas. Finalmente, se presentan diagramas de los residuales de los análisis que pueden usarse para determinar si las condiciones de aplicación de los análisis son apropiadas. Se usan ejemplos tomados de la investigación en psicología para ilustrar los argumentos acá expuestos, y se presenta código en lenguaje R para que el lector use las técnicas acá presentadas.
Palabras clave:
métodos robustos, ANOVA, regresión, diagnóstico, valores extremos.
Article received/Artículo recibido: December 15, 2009/Diciembre 15, 2009. Article accepted/Artículo aceptado: March 15, 2010/Marzo 15, 2010.
Dirección correspondencia/Mail Address:
Delphine S. Courvoisier, Division of Clinical Epidemiology, HUG, University of Geneva, Switzerland
Email: delphine.courvoisier@hcuge.ch
Olivier Renaud, Methodology and Data Analysis, Dept. of Psychology, FPSE, University of Geneva, Switzerland

INTERNATIONAL JOURNAL OF PSYCHOLOGICAL RESEARCH is included in PSERINFO, CENTRO DE INFORMACIÓN PSICOLÓGICA DE COLOMBIA, OPEN JOURNAL SYSTEM, BIBLIOTECA VIRTUAL DE PSICOLOGIA (ULAPSY-BIREME), DIALNET and GOOGLE SCHOLARS. Some of its articles are in SOCIAL SCIENCE RESEARCH NETWORK, and it is in the process of inclusion in a variety of sources and international databases.
INTRODUCTION
Null hypothesis testing is used in 97% of psychological articles (Cumming et al., 2007). Thus, it is particularly important that such a widely used tool be applied correctly in order to obtain correct parameter estimates and p-values. Often, classical methods, like the mean, the usual simple and multiple linear regressions, and the ANOVA, require normality and absence of outliers, which rarely occurs in data coming from experiments (Micceri, 1989). Some analyses require additional assumptions (e.g., homoscedasticity for ANOVA; see below). When these conditions of application are not respected, parameter estimates, confidence intervals, and p-values are not reliable.

Many researchers use ad hoc methods to “normalize” variables, either by transforming them (e.g., logarithmic transformation) or by deleting outlying observations (e.g., deleting observations more than two standard deviations from the mean). However, there are several problems with these methods. The main problem with transformation is that the scale of the variables becomes harder to interpret. Moreover, it is difficult to be sure that the transformation chosen really restored normality. Finally, transformation may not reduce the number of outliers (Wilcox & Keselman, 2005) and thus solves only part of the problem.

There are also several problems with outlier deletion. The first is that outliers are difficult to detect. The most common procedure considers as outliers all observations that are more than two SD from the mean (Ratcliff, 1993). However, the mean and the standard deviation used to detect outliers are themselves influenced by those outliers, thus yielding inappropriate estimates of central tendency and dispersion (Wilcox & Keselman, 2005; Rousseeuw & Leroy, 1987). Furthermore, by simply removing observations, the standard errors based on the remaining observations are underestimated, thus providing incorrect p-values (Wilcox, 2001). Additionally, while it has been shown that this method leads to only a small overestimation of the population mean, this overestimation depends on sample size (Perea, 1999). Finally, especially in the context of multiple regression, removing outliers variable by variable does not protect against so-called multivariate outliers, which cannot be spotted by any simple method but can have a huge influence on the estimation and on all p-values.

Whereas transformation or deletion of some or all observations changes the data themselves, robust procedures change the estimation of the indices of interest (e.g., central tendency, regression coefficients). Robust procedures are able to provide correct estimates of parameters and p-values, and thus maintain the Type I error rate at its nominal level while keeping almost the same power, even when the conditions of application of the classical test are not respected (see Wilcox, 2003 for a more detailed definition). Compared with classical procedures, robust procedures have several advantages. First, they provide a more correct estimation of the parameters of interest. Second, they allow an a posteriori detection of outliers. With this a posteriori detection, it is possible to check whether classical analyses would have led to the correct values. The only disadvantage of robust procedures is that, when the conditions of application were in fact respected, they are slightly less powerful than classical procedures (Heritier, Cantoni, Copt, & Victoria-Feser, 2009).

Robust procedures are often described by two characteristics. The first is relative efficiency. Efficiency is maximal when the estimator has the lowest possible variance. Relative efficiency compares two estimators by computing the ratio of their efficiencies in a given condition (see below for definitions and indications on the relative efficiency of biweight regression compared to OLS when the errors are Gaussian). The second characteristic is the breakdown point, a global measure of the resistance of an estimator: it corresponds to the largest percentage of outliers that the estimator can tolerate before producing an arbitrary result (Huber, 2004).

One of the reasons why robust procedures are not used more often is that detailed presentations of how to run these analyses are rare. In this tutorial, we will present three step-by-step robust analyses: central tendency and dispersion measures, regression, and ANOVA.
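To see how the two-SD rule can fail in practice, consider the following sketch. The article itself provides R code; this is only an illustrative Python equivalent, run on a small made-up sample (an assumption, not data from the article) in which two outliers inflate the mean and standard deviation enough to mask themselves:

```python
import numpy as np

# Made-up sample: eight ordinary values plus two clear outliers (60, 62).
x = np.array([4, 5, 5, 6, 6, 7, 7, 8, 60, 62], dtype=float)

# Classical rule: flag observations more than 2 SD from the mean.
# The outliers inflate both the mean (17.0) and the SD, masking themselves.
mean, sd = x.mean(), x.std(ddof=1)
classical = np.abs(x - mean) > 2 * sd

# Robust rule: flag observations more than 2 consistent MADs from the median.
med = np.median(x)
mad = 1.4826 * np.median(np.abs(x - med))
robust = np.abs(x - med) > 2 * mad

print(classical.sum())  # 0 -- the two-SD rule flags nothing
print(robust.sum())     # 2 -- both outliers are flagged
```

The masking effect is exactly the problem described above: the estimates used to define "outlying" are themselves contaminated by the outliers.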
ROBUST PROCEDURES: ALTERNATIVES TO SOME CLASSICAL INDICES AND ANALYSES
Each analysis will follow the same presentation and use real data examples. First, we will present the conditions of application of the analysis and the consequences when these conditions are not respected. Second, before any analysis, a graphical exploration of the data is always useful. Specifically, it allows researchers to make a first check on the structure of the data (e.g., normality, presence of outliers, presence of groups); for multivariate data, it also allows them to check whether the variables are correlated. Thus, in this second step, we will explain how to check these conditions with plots and/or tests. For regression and ANOVA, we also present diagnostic tools to check the suitability of the different methods. Third, we will present several robust alternatives to the classical tests and discuss their relative merits and disadvantages.
Central tendency and dispersion measures
Conditions of application of the classical measures. The most well-known measure of central tendency is the mean. The mean provides an optimal estimation of the central tendency only if the variable is normally distributed and without outliers. If the variable is skewed and/or has outliers, the mean will be excessively influenced by the extreme observations. Similarly, the most well-known measure of dispersion is the standard deviation; it too is highly influenced by non-normality and outliers.
Check of conditions of application and graphical exploration. Normality and the presence of outliers can be checked with normality tests and with graphs. Unfortunately, normality tests are often biased (Yacizi & Yolacan, 2007). Moreover, these tests do not specifically detect the presence of outliers. Thus, it is more informative and correct to look at graphs of the variables, such as histograms or boxplots. Figure 1 presents the distribution of the time to the first cigarette for 2435 subjects (Courvoisier & Etter, 2008); for each subject, it represents the self-reported average time in minutes from awakening to smoking the first cigarette. This variable is often used in the psychology of addiction to estimate the degree of dependence on cigarettes and to predict smoking cessation (Baker et al., 2007; Courvoisier & Etter, 2010). As can be expected, the distribution of the time to the first cigarette is not normal and is heavily skewed towards high values.

Figure 1. Boxplot of time to first cigarette [min].

Robust alternatives. There are many alternatives to the mean and standard deviation. However, it is not always clear which alternative should be used. We will first present some of the alternatives and then discuss their relative merits. Alternatives to the mean include the well-known median and trimmed mean, as well as the Winsorized mean publicized by Wilcox (Wilcox, 2001; Wilcox & Keselman, 2005; Wilcox, 2003). Less famous among psychologists are the class of M-estimators. For the central tendency, we will present (Tukey's) biweight estimator.

The median is given by the central value, if the variable has an odd number of observations, or by the mean of the two central values, if the variable has an even number of observations. In other words, if a variable has the values 2, 3, 4, 4, 5, 5, 6, 6, 6, 20, the median will be equal to the mean of the fifth and sixth values (i.e., 5). The mean, however, will be equal to 6.1, which is higher than all observations except one. This illustrates that the mean has a zero breakdown point, since a single outlying observation can heavily modify its value. Several researchers feel that the median discards too many observations (in fact, all except one or two) and prefer to discard a smaller amount of information by using the trimmed mean. Wilcox (2001) proposed to use 20% trimming. To obtain the 20% trimmed mean, the 20% lowest and highest values are removed and the mean is computed on the remaining observations. In our example, these values will be 4, 4, 5, 5, 6, 6, and the 20% trimmed mean will be equal to 5. The Winsorized mean is similar to the trimmed mean, but the lowest (resp. highest) values are not removed; they are replaced by the lowest (resp. highest) untrimmed score. In our example, the values of the variable, also called Winsorized scores, will then be 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, and the 20% Winsorized mean will be equal to 5.

The mean, median, and trimmed mean all either keep or drop observations, and the Winsorized mean replaces values by less extreme values. While these techniques are simple, they all lack a clear rationale for why they deal with observations in this particular way. In contrast, the M-estimators, such as the biweight estimator of the central tendency, weight each observation according to a function selected for its special properties (Yohai, 1987; Maronna, Martin, & Yohai, 2006). The weights depend on a constant that can be chosen by the researcher (for more details, see Heritier et al., 2009). Figure 2 presents the
functions weighting the observations for all the estimators of central tendency presented above.

Figure 2. Weight of the observations for different estimators of central tendency.
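The toy example from the text (values 2, 3, 4, 4, 5, 5, 6, 6, 6, 20) can be verified numerically. The article provides R code; the short sketch below is only an illustrative Python equivalent using NumPy:

```python
import numpy as np

# The toy sample used in the text.
x = np.array([2, 3, 4, 4, 5, 5, 6, 6, 6, 20], dtype=float)

print(x.mean())         # 6.1  (pulled up by the single value 20)
print(np.median(x))     # 5.0

# 20% trimmed mean: drop the 2 lowest and 2 highest of the 10 values.
k = int(0.2 * len(x))
xs = np.sort(x)
print(xs[k:-k].mean())  # 5.0

# 20% Winsorized mean: replace the trimmed values by the
# nearest untrimmed scores instead of dropping them.
w = xs.copy()
w[:k] = xs[k]
w[-k:] = xs[-k - 1]
print(w.mean())         # 5.0
```

Note how one extreme value moves the mean above nearly all observations, while the median, trimmed mean, and Winsorized mean all stay at 5.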
The left side of Table 1 presents the different measures of central tendency for the time to first cigarette variable. As expected for such a skewed distribution, the mean is very high, but this is due to only a few extreme observations. On the contrary, all other, robust measures are much closer to the main body of observations. Similarly to measures of central tendency, there are several measures of dispersion in addition to the standard deviation. They include the Inter-Quartile Range (IQR) and the Median Absolute Deviation (MAD), often used in concert with the median, the Winsorized standard deviation, and M-estimators of the dispersion.

Table 1. Alternatives to measures of central tendency and dispersion.

Central tendency              Dispersion
Mean                 57.02    SD                  140.84
Median               15.00    IQR                  40.00
                              MAD                  19.27
20% trimmed mean     18.61    20% trimmed SD       11.52
20% Winsorized mean  24.17    20% Winsorized SD    21.70
M-estimator mean     15.14    M-estimator SD       18.23
The trimmed standard deviation can of course be computed, but the Winsorized standard deviation is considered more correct for estimating the dispersion and thus for obtaining a p-value in tests comparing central tendencies (Wilcox & Keselman, 2005). The IQR is simply the difference between the first and third quartiles. The MAD is the median of the absolute deviations from the median. In the case of our example, the median is 5 and the absolute deviations from this median are 3, 2, 1, 1, 0, 0, 1, 1, 1, 15. The MAD is the median of these deviations and is equal to 1. The MAD itself is not a consistent estimator of the standard deviation (i.e., it does not converge to the standard deviation when the number of observations increases to infinity and the variable follows a normal distribution in the population). To make it consistent, it must be multiplied by 1.4826 (Heritier et al., 2009). The Winsorized standard deviation is calculated exactly like the standard deviation, but on the Winsorized scores. Finally, M-estimators of dispersion are based on weighted observations, similarly to the M-estimation of central tendency. The right side of Table 1 presents the different measures of dispersion for the time to first cigarette variable.
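For the same toy sample, the IQR and the MAD (with its 1.4826 consistency factor) can be computed as follows. This is again an illustrative Python sketch rather than the article's R code, and the quartile interpolation rule used by `np.percentile` is a default assumption, since the article does not specify one:

```python
import numpy as np

x = np.array([2, 3, 4, 4, 5, 5, 6, 6, 6, 20], dtype=float)

# IQR: difference between the third and first quartiles.
q1, q3 = np.percentile(x, [25, 75])
print(q3 - q1)

# MAD: median of absolute deviations from the median,
# multiplied by 1.4826 to make it consistent for the normal SD.
med = np.median(x)
mad = np.median(np.abs(x - med))  # raw MAD = 1.0, as in the text
print(mad, 1.4826 * mad)
```

Note how even the value 20, which wrecked the mean, contributes only one deviation (15) to the MAD computation and so cannot move the median of the deviations.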
(Dis)advantages. The main advantage of the robust methods is that they are not excessively influenced by (a few) extreme values (i.e., they have a high breakdown point). However, assigning a weight of zero to a large number of observations may be a bit extreme. The M-estimator avoids assigning a zero weight to many observations by downweighting the observations progressively. The only aspect of the M-estimator that could worry substantive researchers is that one must choose the degree of downweighting of the observations. While this gives more flexibility to the method, it may seem too “intuitive”. To compensate for this, most software provides a default value for the parameter quantifying the weights.
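As a sketch of how an M-estimator downweights observations progressively, the following implements a Tukey biweight location estimate by iteratively reweighted averaging. This is an illustrative Python implementation, not the article's R code; the tuning constant c = 4.685 and the MAD-based scale are common default choices assumed here, not values prescribed by the article:

```python
import numpy as np

def biweight_location(x, c=4.685, tol=1e-6, max_iter=50):
    # Tukey biweight M-estimator of location via iteratively
    # reweighted averaging. Assumption: c = 4.685, a common default;
    # the scale is the MAD made consistent by the 1.4826 factor.
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    scale = 1.4826 * np.median(np.abs(x - mu))
    for _ in range(max_iter):
        u = (x - mu) / (c * scale)
        # Biweight function: smooth downweighting, zero beyond |u| = 1.
        w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

data = [2, 3, 4, 4, 5, 5, 6, 6, 6, 20]
# The value 20 receives weight zero; the estimate stays near the
# bulk of the data, well below the mean of 6.1.
print(biweight_location(data))
```

Values close to the current center get weights near 1, values further away get progressively smaller weights, and values beyond c times the scale get weight zero, which is exactly the gradual behavior described above.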
