# Should I use ANOVA or t-test?

A second study design is to recruit a group of individuals and then split them into groups based on some independent variable. Again, each individual is assigned to one group only. This independent variable is sometimes called an attribute independent variable because the group is split on an attribute that each individual possesses (e.g., level of education; every individual has a level of education, even if it is "none"). Each group is then measured on the same dependent variable, having undergone the same task or condition (or none at all). For example, a researcher is interested in determining whether there are differences in leg strength between amateur, semi-professional, and professional rugby players. The force measured on an isokinetic machine is the dependent variable. This type of study design is illustrated schematically in the figure below.

SAS has the UNIVARIATE, MEANS, and TTEST procedures for the t-test, while the SAS ANOVA, GLM, and MIXED procedures conduct ANOVA.

The ANOVA procedure is able to handle balanced data only, but the GLM and MIXED procedures can deal with both balanced and unbalanced data. For the t-test and one-way ANOVA, it does not matter whether the data are balanced or not.

STATA has the `.ttest` and `.ttesti` commands for the t-test, and the `.anova` and `.manova` commands conduct ANOVA. Note that the STATA `.glm` command is not used for ANOVA.

#### DATA STRUCTURE

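It is useful to read multiple observations from a single data line. Note that the trailing @@ holds the line in SAS, so the next INPUT keeps reading from the same data line instead of moving to a new one.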

LIBNAME js 'c:\data\sas';

DATA js.data1;
INPUT group block $ response @@;
DATALINES;
1 A 34.5 1 B 54.5 1 B 25.8 3 C 54.8
2 B 54.8 3 A 15.8 2 C 14.5 2 A 15.1
...
RUN;

/*******************************
Obs group block response
1 1 A 34.5
2 1 B 54.5
3 1 B 25.8
...
*******************************/
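To make the effect of the trailing `@@` concrete, here is a minimal Python sketch (not SAS) of the same reading logic: the values on the data lines are consumed three at a time, so a single line yields several observations. The function name `read_held_lines` is hypothetical; the values are the two data lines shown above.

```python
def read_held_lines(lines, fields_per_obs=3):
    """Mimic INPUT group block $ response @@: keep reading from the held line."""
    tokens = [tok for line in lines for tok in line.split()]
    records = []
    # Consume the token stream in chunks of (group, block, response)
    for i in range(0, len(tokens) - fields_per_obs + 1, fields_per_obs):
        group, block, response = tokens[i:i + fields_per_obs]
        records.append((int(group), block, float(response)))
    return records

datalines = [
    "1 A 34.5 1 B 54.5 1 B 25.8 3 C 54.8",
    "2 B 54.8 3 A 15.8 2 C 14.5 2 A 15.1",
]
records = read_held_lines(datalines)
for obs, rec in enumerate(records, start=1):
    print(obs, *rec)
```

Each data line contributes four observations here, which matches the listing of the resulting data set.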

The DO statement makes it possible to read more complicated data. You may list particular values in the DO statement (e.g., DO treatment=1,5;) rather than set a range of values (e.g., DO treatment=1 TO 2;). The trailing @ may not be omitted, because it holds the data line for the next INPUT. This tip is especially useful when you type in data for the randomized complete block design (RCB) and the Latin square design (LSD).

DATA js.data2;
DO block=1 TO 3;
DO treatment=1,5;
INPUT response @;
OUTPUT;
END;
END;
DATALINES;
4.91 4.63 4.76 5.04 5.38 6.21
5.60 5.08 4.91 4.63 4.76 5.04
...
RUN;

/**********************************
Obs block treatment response
1 1 1 4.91
2 1 5 4.63
3 2 1 4.76
4 2 5 5.04
5 3 1 5.38
...
**********************************/
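The nested DO loops can be read as "for every block, for every listed treatment, take the next value on the line and write one observation". The following Python sketch (not SAS) imitates that logic under the assumption that each data line holds one full set of block by treatment values; `read_with_do_loops` is a hypothetical helper name.

```python
def read_with_do_loops(lines, blocks=(1, 2, 3), treatments=(1, 5)):
    """Mimic DO block=1 TO 3; DO treatment=1,5; INPUT response @; OUTPUT;"""
    rows = []
    for line in lines:                    # one DATA step iteration per data line
        values = iter(line.split())
        for block in blocks:              # DO block=1 TO 3;
            for treatment in treatments:  # DO treatment=1,5;
                response = float(next(values))  # INPUT response @;
                rows.append((block, treatment, response))  # OUTPUT;
    return rows

rows = read_with_do_loops(["4.91 4.63 4.76 5.04 5.38 6.21"])
for obs, row in enumerate(rows, start=1):
    print(obs, *row)
```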

If the data are arranged in the long format, you need to rearrange them into the wide format.

DATA js.wide1;
SET js.long;
IF period=1;
RENAME response=response1;

PROC SORT DATA=js.wide1;
BY id;
RUN;

...

DATA js.wide;
MERGE js.wide1 js.wide2 ...;
BY id;
RUN;
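The long-to-wide rearrangement amounts to collecting, for each id, the response of every period into its own column. Below is a minimal Python sketch of that idea (not SAS); the helper `long_to_wide` and the data values are hypothetical and only illustrate the shape of the result.

```python
def long_to_wide(long_rows):
    """Turn (id, period, response) rows into one row per id with response1, response2, ..."""
    wide = {}
    for subject_id, period, response in long_rows:
        wide.setdefault(subject_id, {})[f"response{period}"] = response
    # Sort by id, as PROC SORT ... BY id does before the merge
    return dict(sorted(wide.items()))

long_rows = [  # hypothetical values for illustration only
    (101, 1, 5.2), (100, 1, 4.8),
    (100, 2, 5.5), (101, 2, 6.1),
]
wide = long_to_wide(long_rows)
for subject_id, columns in wide.items():
    print(subject_id, columns)
```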

STATA has the `.pkshape` command to transform a data set in Latin square form into the corresponding data set for analysis.

. list, noobs
+---------------+
|id row c1 c2 c3|
|---------------|
|100 1 74 97 54 |
|101 2 54 84 25 |
|102 3 15 57 64 |
+---------------+
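The kind of reshaping requested for a listing like this one can be sketched in Python. This is only an approximation of the idea, assuming that each row's sequence number selects a treatment order such as `abc`, `cab`, or `bca`, and that the columns c1 to c3 are the periods; the helper name `latin_square_to_long` is hypothetical, and the real command may store the results differently (for example with numeric treatment codes).

```python
def latin_square_to_long(rows, order):
    """rows: (id, sequence, [outcome per period]); order[k] is the treatment order of sequence k+1."""
    long_rows = []
    for subject_id, sequence, outcomes in rows:
        treatments = order[sequence - 1]
        for period, (treat, y) in enumerate(zip(treatments, outcomes), start=1):
            long_rows.append({"id": subject_id, "sequence": sequence,
                              "period": period, "treat": treat, "outcome": y})
    return long_rows

listing = [
    (100, 1, [74, 97, 54]),
    (101, 2, [54, 84, 25]),
    (102, 3, [15, 57, 64]),
]
long_rows = latin_square_to_long(listing, order=["abc", "cab", "bca"])
for row in long_rows:
    print(row)
```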

. pkshape id r c1-c3, order(abc cab bca) outcome(y) sequence(row) treat(treat) period(col)

#### T-TEST

One Sample T-Test

The MU0 option specifies the value of the mean under the null hypothesis. The ALPHA option specifies the significance level. The T option in the MEANS procedure runs the t-test of the null hypothesis that the mean is zero.

PROC UNIVARIATE MU0=0 ALPHA=.01;
VAR response;
RUN;

. ttest response=0, level(99)

PROC UNIVARIATE MU0=10 VARDEF=DF NORMAL ALPHA=.05;
VAR response;
RUN;

. ttest response=10

PROC MEANS T PROBT;
VAR response;
RUN;

. ttest response=0

PROC MEANS MEAN STD STDERR T VARDEF=DF PROBT CLM ALPHA=.01;
VAR response;
RUN;
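The statistic behind these one-sample procedures is the sample mean minus the hypothesized mean, divided by the standard error. The Python sketch below computes it with the standard library only; the response values are reused from the data lines earlier in this section purely as sample numbers, and the hypothesized mean of 5 is an assumption made for illustration.

```python
import math
import statistics

def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean = mu0."""
    n = len(values)
    mean = statistics.mean(values)
    # statistics.stdev divides by n - 1, like VARDEF=DF
    std_err = statistics.stdev(values) / math.sqrt(n)
    return (mean - mu0) / std_err, n - 1

response = [4.91, 4.63, 4.76, 5.04, 5.38, 6.21]
t_stat, df = one_sample_t(response, mu0=5.0)  # mu0=5.0 is an illustrative choice
print(f"t = {t_stat:.4f}, df = {df}")
```

The p-value would then come from the t distribution with `df` degrees of freedom, which the statistical packages report directly.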

Paired T-Test

PROC TTEST;
PAIRED pre*post;
RUN;

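. ttest pre=post, level(95)

Note that the STATA `.ttest` command treats two variables as paired unless the `unpaired` option is specified. The SAS PAIRED statement is able to compare multiple pairs; the example below compares a with c, a with d, b with c, and b with d.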

PROC TTEST;
PAIRED (a b)*(c d);
RUN;
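A paired t-test is a one-sample t-test on the within-pair differences, and a specification such as `(a b)*(c d)` pairs every variable on the left with every variable on the right. The Python sketch below illustrates both points; the `pre` and `post` values are hypothetical.

```python
import math
import statistics
from itertools import product

def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1

# (a b)*(c d) expands to a-c, a-d, b-c, b-d
pairs = list(product(["a", "b"], ["c", "d"]))
print(pairs)

pre = [12.0, 15.0, 11.0, 14.0]   # hypothetical measurements
post = [10.0, 14.0, 9.0, 13.0]
t_stat, df = paired_t(pre, post)
print(f"t = {t_stat:.4f}, df = {df}")
```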

Two Independent Samples T-Test

The TTEST procedure reports two T statistics: one under the equal variance assumption and the other for unequal variances. Users have to check the test of equal variances (F test) first. If the null hypothesis of equal variances is not rejected, read the T statistic and its p-value from the pooled analysis. If it is rejected, read the T statistic and its p-value from the Satterthwaite or Cochran/Cox approximation.

PROC TTEST COCHRAN;
CLASS male;
VAR response;
RUN;

. ttest response, by(male)
. ttest response, by(male) unequal
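The quantities involved in this decision can be written out by hand. The following Python sketch computes the folded F statistic (larger sample variance over smaller), the pooled t statistic, the unequal-variance t statistic, and the Satterthwaite degrees of freedom. The function name and the two small samples are hypothetical, and p-values are left to the statistical package.

```python
import math
import statistics

def two_sample_t(x, y):
    """Pooled and unequal-variance t statistics for two independent samples."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    folded_f = max(v1, v2) / min(v1, v2)          # test of equal variances
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    se2 = v1 / n1 + v2 / n2                       # squared standard error, unequal variances
    t_unequal = (m1 - m2) / math.sqrt(se2)
    df_satterthwaite = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return {"folded_f": folded_f, "t_pooled": t_pooled, "df_pooled": n1 + n2 - 2,
            "t_unequal": t_unequal, "df_satterthwaite": df_satterthwaite}

result = two_sample_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # hypothetical samples
for name, value in result.items():
    print(name, round(value, 4))
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, which is why the choice matters most for unbalanced groups with unequal variances.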

STATA is able to conduct the t-test for two independent samples even when the data are arranged in two variables without a group variable. The unpaired option indicates that the two variables are independent, and the welch option asks STATA to produce Welch's approximation of the degrees of freedom. Note that STATA does not provide the Cochran/Cox approximation.

. ttest response1=response2, unpaired level(99)
. ttest response1=response2, unpaired unequal welch
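For orientation, Welch's approximation differs from Satterthwaite's only in how the degrees of freedom are computed: the denominators use n + 1 instead of n - 1, and 2 is subtracted at the end. The Python sketch below is based on that understanding of the formula and should be checked against the STATA manual; the sample sizes and standard deviations are the ones used in the `.ttesti` example later in this section.

```python
def welch_df(var1, n1, var2, n2):
    """Welch's approximate degrees of freedom for unequal variances."""
    a, b = var1 / n1, var2 / n2
    # Satterthwaite would divide by (n - 1) here and would not subtract 2
    return -2 + (a + b) ** 2 / (a ** 2 / (n1 + 1) + b ** 2 / (n2 + 1))

df_welch = welch_df(0.54 ** 2, 30, 1.44 ** 2, 30)
print(f"Welch df = {df_welch:.2f}")
```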

T-Test on Aggregate Data

The FREQ statement in the TTEST procedure can handle aggregate data.

PROC TTEST H0=5 ALPHA=.01;
CLASS smoke;
VAR lung;
FREQ count;
RUN;
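A frequency variable means that each data row stands for `count` identical observations. The Python sketch below shows how the sample size, mean, and standard deviation follow from such aggregated rows; the values of `lung` and `count` are hypothetical and `weighted_summary` is a made-up helper name.

```python
import math

def weighted_summary(values, counts):
    """n, mean, and sample SD when each value represents `count` observations."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    ss = sum(c * (v - mean) ** 2 for v, c in zip(values, counts))
    return n, mean, math.sqrt(ss / (n - 1))

lung = [3.0, 4.0, 5.0]   # hypothetical response values
count = [2, 5, 3]        # hypothetical frequencies
n, mean, sd = weighted_summary(lung, count)
print(n, round(mean, 4), round(sd, 4))
```

The result is the same as expanding the data to one row per observation before computing the statistics.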

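The STATA `.ttesti` command enables you to conduct a t-test using aggregated descriptive statistics. The numbers listed are the number of observations, the mean, and the standard deviation of the first sample, followed by the same three numbers for the second sample in a two-sample test. In a one-sample test, the three numbers are followed by the hypothesized mean.

. ttesti 30 4.5 0.54 5 // One sample T-test (the hypothesized mean 5 is an illustrative value)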
. ttesti 30 4.5 0.54 30 5.0 1.44 // Two sample T-test

#### TWO-WAY ANOVA

Randomized Complete Block (RCB): Treatments are assigned at random within blocks of adjacent subjects, each treatment once per block. The number of blocks is the number of replications. Any treatment can be adjacent to any other treatment, but not to the same treatment within the block.

Again, the ANOVA, GLM, and MIXED procedures conduct two-way ANOVA with identical usage.

PROC GLM;
CLASS treat1 treat2;
MODEL response=treat1 treat2;
RUN;

In the case of the randomized complete block design, you may have only one observation in each cell. Including an interaction term is then meaningless: the interaction uses up all remaining degrees of freedom, leaving none to estimate the error, which produces awkward results. It is noteworthy, however, that the sum of squares due to error (SSE) of the model without interaction is equal to the sum of squares of interaction (SSI).
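The equality of SSE and SSI with one observation per cell can be checked numerically. The Python sketch below decomposes the total sum of squares for a small block by treatment table, reusing the six response values from the DO-loop example earlier in this section (three blocks, two treatments); the function name is hypothetical.

```python
def rcb_sums_of_squares(table):
    """table[i][j] is the single response for block i and treatment j."""
    b, t = len(table), len(table[0])
    grand = sum(map(sum, table)) / (b * t)
    block_means = [sum(row) / t for row in table]
    treat_means = [sum(row[j] for row in table) / b for j in range(t)]
    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    # Interaction sum of squares computed directly from the cell deviations
    ss_inter = sum((table[i][j] - block_means[i] - treat_means[j] + grand) ** 2
                   for i in range(b) for j in range(t))
    # Error sum of squares of the additive model (no interaction term)
    ss_error = ss_total - ss_block - ss_treat
    return ss_total, ss_block, ss_treat, ss_inter, ss_error

table = [[4.91, 4.63], [4.76, 5.04], [5.38, 6.21]]
ss_total, ss_block, ss_treat, ss_inter, ss_error = rcb_sums_of_squares(table)
print(f"SSE = {ss_error:.6f}, SSI = {ss_inter:.6f}")
```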

You may compare group means using the MEANS or the LSMEANS (least squares means) statement. The LSMEANS statement is not available in the ANOVA procedure.

### Why can't I just use t-tests?

The key difference between ANOVA and the t-test is that ANOVA is applied to test the means of more than two groups. In contrast, a t-test is used only when the researcher compares two groups or population samples.

### Why should we use an ANOVA rather than lots of t-tests?

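The t-test determines whether two populations are statistically different from each other, whereas ANOVA determines whether three or more populations are statistically different from each other. Running many pairwise t-tests instead inflates the chance of at least one false positive, while a single ANOVA tests all group means at the stated significance level.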
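As a rough numerical illustration of the multiple-comparison problem with many pairwise t-tests, the Python sketch below counts the pairwise comparisons among k groups and computes the chance of at least one false positive. It treats the tests as independent, which is only an approximation because pairwise tests share data; the function name is hypothetical.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Number of pairwise tests and approximate chance of at least one false positive."""
    m = comb(k_groups, 2)               # k(k-1)/2 pairwise comparisons
    return m, 1 - (1 - alpha) ** m      # assumes independent tests (approximation)

for k in (2, 3, 5):
    m, fwer = familywise_error(k)
    print(f"{k} groups: {m} tests, familywise error about {fwer:.3f}")
```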

### Is ANOVA the same as t-test?

Both the t-test and ANOVA examine whether group means differ from one another. The t-test compares two groups, while ANOVA can compare more than two groups.

### Do ANOVA and the t-test give the same result?

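ANOVA uses the F distribution, whereas a t-test uses the t distribution. When there are exactly two groups, the F statistic of the one-way ANOVA is the square of the pooled-variance t statistic, so the two tests give the same p-value and the same answer to the same question.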
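The relationship between F and t for two groups can be verified directly. The Python sketch below computes the pooled-variance t statistic and the one-way ANOVA F statistic for the same two groups and compares F with the squared t. The group values are borrowed from the data lines earlier in this section and are treated as two independent groups purely for illustration; the function names are hypothetical.

```python
import math
import statistics

def pooled_t(x, y):
    """Two-sample t statistic under the equal variance assumption."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (n_total - len(groups)))

group1 = [4.91, 4.76, 5.38]
group2 = [4.63, 5.04, 6.21]
t_stat = pooled_t(group1, group2)
f_stat = one_way_f([group1, group2])
print(f"t^2 = {t_stat ** 2:.6f}, F = {f_stat:.6f}")
```

The identity holds only for the pooled (equal variance) t statistic and only for two groups; with three or more groups there is no single t statistic to compare with.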