A Statistics Question

Since my amateur discursion into stochasticity appears to have flushed out all of the mathematically savvy, I'm going to pose a real-life statistics question for you. I have some data that are non-normally distributed (in fact, they really don't seem to fit any distribution well, and, yes, I've tried various transformations and the data still don't fit anything). If the data were normally distributed, I would perform ANOVA to partition the sources of variance (using percent sums of squares).
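
To make the goal concrete, here is a minimal R sketch of the partition described above, run on simulated data rather than the real data set. The factor names A to D, the five replicates per cell, and the effect sizes are all made up for illustration. The design is assumed to be balanced; with an unbalanced design the sequential sums of squares reported by aov depend on the order of the terms.

```r
# Simulated stand-in for the real data: four two-level factors, fully crossed,
# with 5 replicates per cell (all names and effect sizes are hypothetical).
set.seed(1)
d <- expand.grid(A = factor(1:2), B = factor(1:2),
                 C = factor(1:2), D = factor(1:2), rep = 1:5)
d$y <- 2 * (d$A == "2") + 1 * (d$B == "2") + rnorm(nrow(d))

# Full-factorial model: all main effects and all interactions
fit <- aov(y ~ A * B * C * D, data = d)
tab <- summary(fit)[[1]]

# Percent of the total sum of squares attributed to each term and to residuals
pct_ss <- 100 * tab[["Sum Sq"]] / sum(tab[["Sum Sq"]])
names(pct_ss) <- trimws(rownames(tab))
round(pct_ss, 1)
```

The percentages sum to 100 by construction, so each entry can be read as that term's share of the total variation in y.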

Since the data aren't normally distributed, ANOVA is not the right test to use. Because I'm using a full-factorial model with four factors, neither Friedman's test nor any of the one-way non-parametric tests seems to apply either. I should add that, while my data are continuous, I can transform them into ordinal and/or nominal categories if that would help. While logistic regression could circumvent the non-normality problem, I'm unaware of how one partitions variance with that method (if there's a way to do so, please let me know).

Any ideas? If there are any statistics programs or R packages out there that can handle this, please let me know. Don't be shy...

Warning: there is an excellent chance that I don't know what I'm talking about.

The Kruskal-Wallis test is the nonparametric analogue of one-way ANOVA.

Or you could force your data into a Gaussian distribution using ranks, a la the following R code:

# Rank-based inverse normal transform: map ranks into (0, 1), then to normal quantiles
x <- qnorm((rank(x) - 0.5) / length(x))
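
As a rough check of what this transform does, here is a small sketch on simulated skewed data (the exponential sample is only a stand-in, not the poster's data). It suggests the transformed values keep the original ordering and come out close to standard normal. Note that any variance partition done afterwards is on the transformed scale, not the original one.

```r
set.seed(2)
y <- rexp(200)                             # skewed, clearly non-normal stand-in data
z <- qnorm((rank(y) - 0.5) / length(y))    # rank-based inverse normal transform

# The transform is monotone, so the ordering of the observations is preserved
same_order <- all(order(y) == order(z))

# Mean and standard deviation of the transformed values
c(mean = mean(z), sd = sd(z))

# Shapiro-Wilk p-values before and after the transform
c(before = shapiro.test(y)$p.value, after = shapiro.test(z)$p.value)
```

With tied values, rank() assigns average ranks by default, so heavily tied data will not come out as cleanly normal as in this example.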

Here's a paper that might be of service, "Rank-Based Analyses of Linear Models Using R":
http://www.jstatsoft.org/v14/i07/v14i07.pdf

The R package MASS has some other robust methods as well, from what I recall.

By igor eduardo kupfer (not verified) on 07 Jul 2006 #permalink

Perhaps your results are purely random, in which case there would be no way to get them to mean anything.

By eric bloodaxe (not verified) on 08 Jul 2006 #permalink

Do you get significant results from an ANOVA? Violating the assumptions will usually make your test less powerful, so if you still get significance, that's a good sign. ANOVA is generally considered robust to non-normality. Are your variances fairly constant? ANOVA is less resistant to heteroskedasticity than to non-normality.
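
The two checks mentioned here can be sketched in R roughly as follows. The data are simulated with skewed errors purely for illustration, and the factor names are hypothetical. The Fligner-Killeen test is swapped in for the variance check because it is rank-based and does not itself rely on normality; other tests would also do.

```r
set.seed(3)
d <- expand.grid(A = factor(1:2), B = factor(1:2),
                 C = factor(1:2), D = factor(1:2), rep = 1:5)
d$y <- 2 * (d$A == "2") + rexp(nrow(d))    # skewed errors, made-up effect

fit <- aov(y ~ A * B * C * D, data = d)

# ANOVA assumes normal residuals, not a normal raw response
sw <- shapiro.test(residuals(fit))

# Constant variance across the 16 cells of the design
d$cell <- interaction(d$A, d$B, d$C, d$D)
fk <- fligner.test(y ~ cell, data = d)

c(shapiro_p = sw$p.value, fligner_p = fk$p.value)
```

Plots of residuals(fit) against fitted(fit), and a normal Q-Q plot of the residuals, are often more informative than either p-value.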

This posting gives some hints about using proportional odds models as a path to robust ANOVA. For logistic regressions, look especially at the conditional logit functions.

Mike, you've got two copies of the original post.
And they've got different comments.
Read both!

Which distribution do your data of interest follow? You may apply an approximation technique to make your data follow the normal distribution, and then you can apply ANOVA, can't you?

For example, suppose X is a random variable with X ~ Bin(n, p). It can be approximated by a normal distribution by applying Stirling's formula for n!.
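
How close a normal distribution comes to Bin(n, p) can be checked numerically. The sketch below does not go through Stirling's formula; it simply compares the exact binomial CDF with a normal CDF of matching mean and variance, using a continuity correction. The values n = 100 and p = 0.3 are arbitrary choices for illustration.

```r
n <- 100
p <- 0.3
k <- 0:n

# Exact binomial CDF at every possible count
exact_cdf <- pbinom(k, size = n, prob = p)

# Normal approximation with mean n*p, variance n*p*(1-p), continuity-corrected
approx_cdf <- pnorm(k + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

# Largest absolute disagreement between the two CDFs
max_err <- max(abs(exact_cdf - approx_cdf))
max_err
```

The agreement generally gets worse as n shrinks or as p moves toward 0 or 1, which is worth keeping in mind before leaning on the approximation.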

You may also find Minitab helpful and convenient for analysis of variance.