Caveats first. Using appropriate statistics is not always easy. Please do not
blame me when a reviewer caustically refers to the 56 uncorrected t-tests that
you performed on your data as the work of a moron. The techniques explained
here will probably be adequate for univariate experiments and confirmatory tests
of group equality. This is __not__ a statistics book.
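To see why a reviewer might object to dozens of uncorrected tests, here is a minimal sketch using simulated noise rather than real data. The seed, the sample sizes and the choice of the Holm method are assumptions made for illustration only; `p.adjust()` is part of base R.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n_tests <- 56

# Each test compares two samples drawn from the same distribution,
# so any "significant" result here is a false positive.
p_raw <- replicate(n_tests, t.test(rnorm(20), rnorm(20))$p.value)

# Holm's method adjusts the p-values for the number of tests performed.
p_holm <- p.adjust(p_raw, method = "holm")

sum(p_raw < 0.05)   # uncorrected "hits"
sum(p_holm < 0.05)  # hits that survive the correction
```

An adjusted p-value is never smaller than the raw one, so the second count cannot exceed the first.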

Using the `infert` data set and the `brkdn()` function, let's look at the means of age for cases and non-cases.

    > brkdn(age~case,infert)
                     0         1
    Mean      31.49091  31.53012
    Variance  27.60510  27.86189
    n        165.00000  83.00000
    attr(,"class")
    [1] "dstat"

It looks as though the two groups have been age-matched. Try a t-test to see if there is a difference.

    > t.test(subset(infert$age,infert$case == 0),
    +  subset(infert$age,infert$case == 1))

            Welch Two Sample t-test

    data:  subset(infert$age, infert$case == 0) and subset(infert$age, infert$case == 1)
    t = -0.0553, df = 163.766, p-value = 0.956
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -1.439600  1.361177
    sample estimates:
    mean of x mean of y
     31.49091  31.53012
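If `brkdn()` is not available in your session, the same comparison can be sketched with base R alone. This is only one possible equivalent, not the tutorial's own code: the object names `age_by_case` and `welch` are made up here, and the formula interface to `t.test()` stands in for the explicit `subset()` calls.

```r
# infert ships with base R (package datasets), so nothing extra is needed.
data(infert)

# Mean, variance and group size of age for non-cases (0) and cases (1).
age_by_case <- sapply(split(infert$age, infert$case),
                      function(a) c(Mean = mean(a), Variance = var(a), n = length(a)))
print(age_by_case)

# Formula interface: the same Welch test without explicit subsetting.
welch <- t.test(age ~ case, data = infert)
print(welch)
```

The formula form groups `age` by the values of `case`, so it should reproduce the test shown above while saving some typing.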

Bronwyn wants to know if those women in the sample who completed high school were significantly younger than those who did not.

    > brkdn(age~education,infert)
                0-5yrs   6-11yrs   12+ yrs
    Mean      35.25000  32.85000  29.72414
    Variance  40.02273  28.66639  19.19280
    n         12.00000 120.00000 116.00000
    attr(,"class")
    [1] "dstat"

She may have a case here, but first let's do something about the painful typing-in of every subsetting operation. Have a look at the function `group.t.test()`. By calling this function as follows:

`> group.t.test(infert$age,infert$education,"12+ yrs")`

we can specify a grouping factor of high school completion versus everything else. This function also allows us to test two specified groups against one another.
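The definition of `group.t.test()` is not reproduced here, so the following is only a guess at how such a wrapper could work, based on the grouping described above. The name `group_t_sketch`, its arguments and the "not ..." label are all hypothetical, and the real function may differ.

```r
# Hypothetical stand-in for group.t.test(); the real function may differ.
group_t_sketch <- function(x, group, level, ...) {
  other <- paste("not", level)
  # Collapse the factor to two levels: everything else versus `level`.
  two_groups <- factor(ifelse(as.character(group) == level, level, other),
                       levels = c(other, level))
  # Any arguments in ... are handed straight on to t.test().
  t.test(x ~ two_groups, ...)
}

res <- group_t_sketch(infert$age, infert$education, "12+ yrs")
print(res)
```

Note that the level name has to match the factor level exactly, including the space in "12+ yrs". With a misspelled level the sketch would leave only one group, and `t.test()` would stop with an error.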

Before we leave `t.test()`, the ellipsis (...) at the end of the arguments means that you can pass additional arguments to `t.test()`. For example, you might want the 99% confidence interval displayed rather than the default 95% one.

            Welch Two Sample t-test

    data:  age by as.factor(ifelse(infert$education != "12+ yrs", "<12 yrs", "12+ yrs"))
    t = 5.3423, df = 243.997, p-value = 2.103e-07
    alternative hypothesis: true difference in means is not equal to 0
    99 percent confidence interval:
     1.718973 4.969115
    sample estimates:
    mean in group <12 yrs mean in group 12+ yrs
                 33.06818              29.72414

Looks like Bronwyn was right.
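For readers who want to try extra arguments directly, here is a small sketch that calls `t.test()` itself rather than the wrapper. The variable names are invented for this example. `conf.level` and `alternative` are standard `t.test()` arguments; the one-sided test is an assumption about how Bronwyn's directional question might be framed, not something the original analysis ran.

```r
# Two-level grouping: completed high school ("12+ yrs") versus everyone else.
completed <- factor(infert$education == "12+ yrs",
                    levels = c(FALSE, TRUE),
                    labels = c("<12 yrs", "12+ yrs"))

# conf.level widens the reported interval from the default 95% to 99%.
two_sided <- t.test(infert$age ~ completed, conf.level = 0.99)
two_sided$conf.int

# alternative = "greater" tests only whether the "<12 yrs" group is older.
one_sided <- t.test(infert$age ~ completed, alternative = "greater")
one_sided$p.value
```

Changing `conf.level` affects only the interval that is reported, not the t statistic or the p-value.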

For a much more detailed treatment of ANOVAs and other methods, get the VR package or "Notes on the use of R..." on the Contributed documentation page.