How to interpret the interaction term in an lm formula in R?

In R, if I call the lm() function in the following way:

lm.1 = lm(response ~ var1 + var2 + var1 * var2)
summary(lm.1)

This gives me a linear model of the response variable on var1, var2 and the interaction between them. However, how exactly do we interpret the interaction term?

The documentation says it is the "cross" between var1 and var2, but it does not explain what the "cross" actually is.

It would help me to know exactly which numbers R computes in order to include the interaction between the two variables.

r regression

— Enzo

Do you specifically want to know how R builds the design matrix for this formula, or are you more broadly interested in how to interpret such a multiplicative ("interaction") term in the context of the fitted model?

— Momo

I am more interested in interpreting this multiplicative term. For example, if I want to write out a linear formula (a mathematical one, not an R one...), what should I write for the multiplication term?

— Enzo

To understand what the cross means, compute var3 <- var1 * var2 and then take a look at the fit of lm.2 <- lm(response ~ var1 + var2 + var3)
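
A minimal sketch of that check, assuming simulated data (the sample size, coefficients and noise level are made up for illustration). It compares the interaction column of the design matrix with the hand-computed product, and the coefficients of the two fits:

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(1)
n <- 100
var1 <- rnorm(n)
var2 <- rnorm(n)
response <- 1 + 2 * var1 - 1 * var2 + 0.5 * var1 * var2 + rnorm(n, sd = 0.1)

# Model with the interaction written in the formula
lm.1 <- lm(response ~ var1 + var2 + var1 * var2)

# Model with the product computed by hand
var3 <- var1 * var2
lm.2 <- lm(response ~ var1 + var2 + var3)

# The interaction column of the design matrix is the element-wise product
X <- model.matrix(lm.1)
same_column <- all.equal(unname(X[, "var1:var2"]), var1 * var2)

# Both fits give the same coefficients (only the label of the last one differs)
same_coefs <- all.equal(unname(coef(lm.1)), unname(coef(lm.2)))

print(same_column)
print(same_coefs)
```

Note that this holds for numeric variables; for factors, R first expands each factor into dummy columns and then multiplies those.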

— James Stanley

So it is just entry-wise multiplication?

— Enzo

Yes, @Enzo, the cross is literally the two terms multiplied. The interpretation will largely depend on whether var1 and var2 are both continuous (quite hard to interpret, in my opinion) or whether one of them is, e.g., binary categorical (easier to consider). See Peter Flom's answer.

— James Stanley

Answers:

The standard way to write the prediction equation for your model is:

$\hat y = b_0 + b_1*x_1 + b_2*x_2 + b_{12} * x_1 *x_2$

But understanding the interaction is a little easier if we factor this differently:

$\hat y = (b_0 + b_2*x_2) + (b_1 + b_{12}*x_2) * x_1$

With this factoring we can see that for a given value of $x_2$ the y-intercept for $x_1$ is $b_0 + b_2*x_2$ and the slope on $x_1$ is $(b_1 + b_{12}*x_2)$ . So the relationship between $y$ and $x_1$ depends on $x_2$ .
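
As a rough illustration of this factoring, the sketch below fits a model to simulated data (the coefficients and ranges are assumptions chosen for the example) and computes the intercept and slope of the $y$ versus $x_1$ line at a few values of $x_2$. The helper `line_at` is a hypothetical name, not part of any package:

```r
# Simulated data; the true coefficients below are illustrative assumptions.
set.seed(2)
n <- 200
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 5)
y <- 3 + 1.0 * x1 + 2.0 * x2 + 0.5 * x1 * x2 + rnorm(n, sd = 0.5)

fit <- lm(y ~ x1 * x2)
b <- coef(fit)

# Intercept and slope of the y-versus-x1 line at a given value of x2
line_at <- function(x2_value) {
  c(intercept = unname(b["(Intercept)"] + b["x2"] * x2_value),
    slope     = unname(b["x1"] + b["x1:x2"] * x2_value))
}

lines_by_x2 <- t(sapply(c(0, 2.5, 5), line_at))
rownames(lines_by_x2) <- paste0("x2 = ", c(0, 2.5, 5))
print(round(lines_by_x2, 2))
```

Each row is a different straight line in $x_1$; the slope column changes with $x_2$ by an amount governed by $b_{12}$.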

Another way to understand this is by plotting the predicted lines between $y$ and $x_1$ for different values of $x_2$ (or the other way around). The Predict.Plot and TkPredict functions in the TeachingDemos package for R were designed to help with these types of plots.
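
If you prefer to stay in base R, one possible sketch of such a plot uses predict() on a grid. This is not the TeachingDemos interface, just a plain alternative under the assumption of simulated data with made-up coefficients:

```r
# Simulated data; names and coefficients are illustrative assumptions.
set.seed(3)
n <- 200
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 5)
y <- 3 + 1.0 * x1 + 2.0 * x2 + 0.5 * x1 * x2 + rnorm(n, sd = 0.5)
fit <- lm(y ~ x1 * x2)

# Grid of x1 values, and a few fixed x2 values to condition on
x1_grid <- seq(0, 10, length.out = 50)
x2_values <- c(0, 2.5, 5)

# One column of predictions per x2 value
pred <- sapply(x2_values, function(v) {
  predict(fit, newdata = data.frame(x1 = x1_grid, x2 = v))
})

# Non-parallel lines indicate an interaction
matplot(x1_grid, pred, type = "l", lty = 1, col = 1:3,
        xlab = "x1", ylab = "predicted y")
legend("topleft", legend = paste("x2 =", x2_values), lty = 1, col = 1:3)
```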

— Greg Snow

Suppose you get point estimates of 4 for $x_1$, 2 for $x_2$ and 1.5 for the interaction. Then the fitted equation from lm is

$\hat y = b_0 + 4x_1 + 2x_2 + 1.5x_1x_2$

where $b_0$ is the estimated intercept, which lm includes by default.
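
As a quick check on what these numbers imply (the values of $x_2$ used below are chosen only for illustration), group the terms that contain $x_1$:

$4x_1 + 1.5x_1x_2 = (4 + 1.5x_2) x_1$

So the slope on $x_1$ is 4 when $x_2 = 0$ and $4 + 1.5 \cdot 2 = 7$ when $x_2 = 2$; each one-unit increase in $x_2$ raises the slope on $x_1$ by 1.5.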

Is that what you wanted?

— Peter Flom

It is easiest to think about interactions in terms of discrete variables. Perhaps you might have studied two-way ANOVAs, where we have two grouping variables (e.g. gender and age category, with three levels for age) and are looking at how they pertain to some continuous measure (our dependent variable, e.g. IQ).

The x1 * x2 term, if significant, can be understood (in this trivial, made-up example) as IQ behaving differently across the levels of age for the different genders. For example, maybe IQ is stable for males across the three age groups, but young females start below young males and have an upward trajectory (with old females having a higher mean than old males). In a means plot, this would show up as a horizontal line for males in the middle of the graph, and perhaps a 45 degree line for females that starts below the males' line but ends above it.

The gist is that the effect of one variable changes as you move along the levels of the other: the relationship between X2 and Y depends on the value at which X1 is held constant. This interpretation also works with continuous predictor variables, but is not so easy to illustrate concretely. In that case, you might want to take particular values of X1 and X2 and see what happens to Y.
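
A small sketch of the made-up gender by age example, assuming invented cell means (males flat at 100, females rising from 90 to 110) and simulated noise. None of these numbers come from real data:

```r
# Made-up cell means mirroring the example in the text; all values are assumptions.
set.seed(4)
design <- expand.grid(gender = c("male", "female"),
                      age = c("young", "middle", "old"),
                      rep = 1:30)
female_means <- c(young = 90, middle = 100, old = 110)
cell_mean <- with(design,
                  ifelse(gender == "male", 100, female_means[as.character(age)]))
design$iq <- cell_mean + rnorm(nrow(design), sd = 5)

# Two-way model with interaction (reference levels: male, young)
fit <- lm(iq ~ gender * age, data = design)
print(round(coef(fit), 1))

# Observed cell means: rows are genders, columns are age groups
means <- with(design, tapply(iq, list(gender, age), mean))
print(round(means, 1))

# Means plot: a flat line for males, a rising line for females
interaction.plot(design$age, design$gender, design$iq,
                 xlab = "age group", ylab = "mean IQ", trace.label = "gender")
```

Here the coefficient `genderfemale:ageold` is the difference of differences: how much larger the female minus male gap is in the old group than in the young group.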

— Twitch_City