Where does the interpretation of an interaction come from? Let us assume that we have patients in our sample and a linear model (simple or mixed).

For the linear model we have an equation of the form:

Y=A0+A1*X1+A2*X2+A3*X1*X2,

where A0 is the intercept, A1 and A2 are the main effects of X1 and X2 respectively, and A3 is the coefficient of the interaction term.

Let’s say X1 is a factor: Group 1 versus Group 2. The statistical software then codes it as X1=0 for Group 1 and X1=1 for Group 2.

Then let’s assume that X2 is a continuous covariate such as age.

What exactly do A1, A2 and A3 mean?

Assume that all the estimates A1, A2 and A3 are significant. That means that they are significantly different from zero and we can interpret them. In other words, if we repeated our experiment with the same model and a different sample of the same size from the population, A1, A2 and A3 would differ from the estimates we have now, but we would still expect them to be different from zero. This is what significance means here.

So, let’s come back to the meaning of A1, A2 and A3.

Look at the equation again:

Y=A0+A1*X1+A2*X2+A3*X1*X2

If we didn’t have the interaction term then the interpretation of A1 and A2 would be easy. Let’s work it out.

Y=A0+A1*X1+A2*X2.

X1 is coded as 0 for Group 1 and 1 for Group 2 because Group 1 is the reference group. You can always change the reference group, but assume that Group 1 is the reference one. Note that unless you ask the statistical program otherwise, for a character variable it will choose as the reference group the first one in alphabetical order. So, for “A”, “B” coding it will be the “A” group.
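To make the coding concrete, here is a minimal Python sketch of that default. It only mimics the alphabetical rule described above; the labels and variable names are made up for illustration, and a real statistical package may handle the reference group differently.

```python
# Hypothetical group labels for five patients (illustration only).
labels = ["B", "A", "B", "A", "A"]

# Mimic the default described above: the first level in alphabetical
# order becomes the reference group and is coded as 0.
levels = sorted(set(labels))
reference = levels[0]

# Dummy (0/1) coding of the factor: 0 for the reference group, 1 otherwise.
x1 = [0 if label == reference else 1 for label in labels]

print("reference group:", reference)
print("X1 coding:", x1)
```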

Now let’s say we have two patients. Patient 1 is from Group 1 and his/her age is Age1. What is his/her response Y then?

Y1=A0+A1*0+A2*Age1

Let’s say Patient 2 is from Group 2 and his/her age is Age2.

His/her response then is:

Y2=A0+A1*1+A2*Age2.

What is the difference between Y2 and Y1?

Y2-Y1= A0+A1*1+A2*Age2- (A0+A1*0+A2*Age1)=A1+A2*(Age2-Age1).

And what if both are at the same age? Then:

Y2-Y1=A1.

**So here is where our interpretation of A1 comes from!** **A1 is the change in the response Y between two different patients, one from Group 1 and one from Group 2, who have the same Age (when there are no interaction terms!)**

In other words, when we change the group from 1 to 2, Y changes by A1. Depending on the sign of A1, it increases or decreases.
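If you prefer to check this with numbers, here is a small Python sketch. The coefficient values and the function name `predict` are assumptions chosen only for illustration; they are not estimates from any real data.

```python
# Hypothetical coefficients for Y = A0 + A1*X1 + A2*X2 (no interaction).
A0, A1, A2 = 10.0, 2.5, 0.25

def predict(group, age):
    """Response for a patient; group is 0 (Group 1) or 1 (Group 2)."""
    return A0 + A1 * group + A2 * age

age = 40  # both patients have the same age
difference = predict(1, age) - predict(0, age)

# The difference between the groups at the same age is A1.
print("Y2 - Y1 =", difference, "and A1 =", A1)
```

You can change `age` to any other value; as long as both patients share it, the difference stays equal to A1.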

What about A2? X2 is a continuous variable. Therefore, we no longer have a “change of group” effect. We have an “increase of X2 by 1 unit” effect.

Let’s say we have the same two patients as before: Patient 1 from Group 1 with age Age1 and Patient 2 from Group 2 with age Age2. Now additionally assume that Age2=Age1+1, so the difference in age between the two is 1 year.

What is the difference between Y2 and Y1 for those patients now?

Y2-Y1= A0+A1*1+A2*Age2- (A0+A1*0+A2*Age1)=A1+A2*(Age2-Age1)= A1+A2*(Age1+1-Age1)=A1+A2.

And what if they are from the same group? Then if they are both from group 1:

Y2-Y1=A2

And if they are both from Group 2:

Y2-Y1=A1*(1-1)+A2=A2

**And this is the interpretation of A2! A2 is the change in the response Y between two different patients from the same Group when the difference between their ages is 1 year (when there are no interaction terms!). In other words, we can say A2 is the change in the response Y when X2 increases by 1 unit and X1 is kept fixed.**
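The same kind of numerical check works for A2. Again, this is only a sketch: the coefficients, the starting age and the function name `predict` are hypothetical.

```python
# Hypothetical coefficients for Y = A0 + A1*X1 + A2*X2 (no interaction).
A0, A1, A2 = 10.0, 2.5, 0.25

def predict(group, age):
    """Response for a patient; group is 0 (Group 1) or 1 (Group 2)."""
    return A0 + A1 * group + A2 * age

age1 = 40  # age of the first patient; the second one is 1 year older

# One-year difference in age, both patients in the same group.
effect_in_group1 = predict(0, age1 + 1) - predict(0, age1)
effect_in_group2 = predict(1, age1 + 1) - predict(1, age1)

# Without an interaction term both differences equal A2.
print("Group 1:", effect_in_group1)
print("Group 2:", effect_in_group2)
```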

OK, so now we know how to interpret the coefficients in a model without interaction terms. Let’s add the interaction term now and see what changes.

Y=A0+A1*X1+A2*X2+A3*X1*X2

Let’s try to interpret the A1 coefficient as we did before, in the model without the interaction term.

Let’s compute the difference between two people, one from Group 2 and one from Group 1, with the same Age:

Y2-Y1= A0+A1*1+A2*Age+A3*Age*1 - (A0+A1*0+A2*Age+A3*0*Age)=A1+A3*Age

OK… so this is not exactly A1 as before. How do we make it A1? We have to assume that Age is zero, even if this is unrealistic since we do not have people with age zero.

Therefore, the interpretation of A1 is the following:

**A1 is the change in Y when we compare two people from Group 2 and Group 1 with the same Age equal to 0.** To avoid this unwanted interpretation of A1 in that model, we can replace the Age variable X2 with X2’= X2-mean(X2), which is called **mean-centering**. In that case you will have the following interpretation of A1:

**A1 is the change in Y when we compare two people from Group 2 and Group 1 with the same Age, equal to the mean Age.**
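Here is a rough Python sketch of what mean-centering does to the coefficients. It does not fit anything: it only rewrites the same equation in terms of X2’ and checks that the predictions do not change. All numbers and names (`B0` to `B3`, `predict_raw`, `predict_centered`) are assumptions made up for this illustration.

```python
from statistics import mean

# Hypothetical coefficients for Y = A0 + A1*X1 + A2*X2 + A3*X1*X2.
A0, A1, A2, A3 = 10.0, 2.5, 0.25, 0.5
ages = [30, 40, 50, 60]   # hypothetical ages in the sample
mean_age = mean(ages)

def predict_raw(group, age):
    """Model written with the raw Age variable X2."""
    return A0 + A1 * group + A2 * age + A3 * group * age

# Substituting X2 = X2' + mean(X2) into the equation and collecting
# terms gives the coefficients of the mean-centered model.
B0 = A0 + A2 * mean_age   # new intercept
B1 = A1 + A3 * mean_age   # new Group coefficient
B2 = A2                   # Age slope is unchanged
B3 = A3                   # interaction is unchanged

def predict_centered(group, age):
    """Same model written with X2' = X2 - mean(X2)."""
    centered = age - mean_age
    return B0 + B1 * group + B2 * centered + B3 * group * centered

# Both versions describe the same predictions.
same = all(
    abs(predict_raw(g, a) - predict_centered(g, a)) < 1e-9
    for g in (0, 1) for a in ages
)

# The new Group coefficient is the Group difference at the mean Age.
gap_at_mean_age = predict_raw(1, mean_age) - predict_raw(0, mean_age)
print("predictions identical:", same)
print("Group difference at mean Age:", gap_at_mean_age, "new A1:", B1)
```

Note that only the intercept and the Group coefficient change; the Age slope and the interaction coefficient stay the same.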

Analogously, let’s work out the interpretation of A2.

Let’s compute the difference between two people from the same group whose Age differs by 1 year: Patient 1 has age Age1 and Patient 2 has age Age1+1.

If they are both from Group 2:

Y2-Y1= A0+A1*1+A2*(Age1+1)+A3*1*(Age1+1) - (A0+A1*1+A2*Age1+A3*1*Age1)=A2+A3

And if they are both from Group 1:

Y2-Y1= A0+A1*0+A2*(Age1+1)+A3*0*(Age1+1) - (A0+A1*0+A2*Age1+A3*0*Age1)=A2.

**So, the last result gives us the interpretation of A2! It is the change in Y when Age increases by 1 unit, and the Group is not only fixed but must be the reference group, Group 1!**

**And last but not least, what is the interpretation of A3?**

**Well, we already have it from before:**

Y2-Y1= A0+A1*1+A2*(Age1+1)+A3*1*(Age1+1) - (A0+A1*1+A2*Age1+A3*1*Age1)=A2+A3

**This is the modification of the A2 effect (estimated for Group 1) when it is calculated for patients from Group 2. So, the total effect of increasing Age by 1 for Group 2 is equal to A2+A3, and that is how we should interpret A3: the effect of increasing X2 by 1 differs by A3 in Group 2 as compared to the same effect in Group 1 (it is bigger when A3 is positive and smaller when A3 is negative).**
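A quick numerical sketch of the model with the interaction term may help here. The coefficients, the ages and the function name `predict` are hypothetical and serve only as an illustration.

```python
# Hypothetical coefficients for Y = A0 + A1*X1 + A2*X2 + A3*X1*X2.
A0, A1, A2, A3 = 10.0, 2.5, 0.25, 0.5

def predict(group, age):
    """Response for a patient; group is 0 (Group 1) or 1 (Group 2)."""
    return A0 + A1 * group + A2 * age + A3 * group * age

age1 = 40

# Effect of one extra year of age within each group.
age_effect_group1 = predict(0, age1 + 1) - predict(0, age1)  # A2
age_effect_group2 = predict(1, age1 + 1) - predict(1, age1)  # A2 + A3

# The interaction coefficient is the difference between the two effects.
modification = age_effect_group2 - age_effect_group1         # A3

# The Group difference now depends on age: A1 + A3*Age.
group_gap_at_40 = predict(1, 40) - predict(0, 40)
group_gap_at_0 = predict(1, 0) - predict(0, 0)               # A1

print("Age effect in Group 1:", age_effect_group1)
print("Age effect in Group 2:", age_effect_group2)
print("Difference of the two:", modification)
print("Group gap at Age 40 and at Age 0:", group_gap_at_40, group_gap_at_0)
```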

**One last additional word. What if there were no continuous variables in the equation:**

Y=A0+A1*X1+A2*X2+A3*X1*X2

**How would that change our previous interpretation?**

**So X2 is now another factor. Let’s say Gender. Since its levels are Female and Male, unless you ask otherwise it will be coded as Female=0 (reference) and Male=1.**

**Then the interpretation of A1 is:**

**A1 is the change in Y when we compare two people from Group 2 and Group 1 with the same Gender, equal to Female. So, this is the effect of Group for Females.**

**And for A2:**

**This is the change in Y when we compare Males versus Females, both from Group 1 (the reference group).**

**And for A3:**

**This is the modification of the A2 effect (estimated for Group 1) when it is calculated for patients from Group 2. So, the total effect of comparing Males versus Females for Group 2 is equal to A2+A3. The effect of Gender for Group 2 differs by A3 as compared to the same effect of Gender for Group 1 (it is bigger when A3 is positive and smaller when A3 is negative).**
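With two factors there are only four possible combinations, so we can simply list all four predicted values and compare them. The sketch below does this in Python; the coefficient values and the names `predict` and `cells` are assumptions made up for illustration.

```python
# Hypothetical coefficients for Y = A0 + A1*X1 + A2*X2 + A3*X1*X2,
# where X1 is Group (0 = Group 1, 1 = Group 2)
# and X2 is Gender (0 = Female, 1 = Male).
A0, A1, A2, A3 = 10.0, 2.5, 0.25, 0.5

def predict(group, male):
    """Predicted response for one Group/Gender combination."""
    return A0 + A1 * group + A2 * male + A3 * group * male

# Predicted value for each of the four cells.
cells = {(g, m): predict(g, m) for g in (0, 1) for m in (0, 1)}

group_effect_females = cells[(1, 0)] - cells[(0, 0)]   # A1
gender_effect_group1 = cells[(0, 1)] - cells[(0, 0)]   # A2
gender_effect_group2 = cells[(1, 1)] - cells[(1, 0)]   # A2 + A3

print("Group effect for Females:", group_effect_females)
print("Gender effect in Group 1:", gender_effect_group1)
print("Gender effect in Group 2:", gender_effect_group2)
```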

__Homework:__

**Write the interpretation of A1 and A2 for the model without interaction:**

Y=A0+A1*X1+A2*X2,

**when X1 is Group and X2 is Gender.**