How to Read PROC REG Output in SAS

In the linear regression model, we explain the linear relationship between a dependent variable \mathbold{y} and one or more explanatory variables \mathbold{x_1}, \mathbold{x_2} \dots. In matrix notation we write the model as \mathbold{y = \mu + X \beta + \epsilon}. Here, \mathbold{y} is a vector of dependent variables to be explained, \mathbold{\mu} is the overall mean of the model, \mathbold{X} is a matrix of independent explanatory variables, \mathbold{\epsilon} is a vector of residuals, and \mathbold{\beta} is a vector of parameters to be estimated from the independent variables. In this post, I present an example of how to code linear regression models in SAS.

The usual method of estimating \mathbold{\beta} is Ordinary Least Squares (OLS), which minimizes the sum of the squared residuals. This method leads to the closed form solution for the estimated parameters, \mathbold{\hat{\beta}} = \left( \mathbold{X^T X} \right)^{-1} \mathbold{X^T y}. We assume that the error terms have finite variance and are uncorrelated with the regressors. Under these assumptions, the estimator is unbiased and consistent. Further assuming that the variance is constant across observations, the estimator is also efficient. Wikipedia provides a more thorough treatment of the theory of the linear regression model.
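For readers who want to verify the closed-form algebra outside SAS, here is a minimal sketch in Python with numpy. The data are invented for illustration; the point is only that solving the normal equations recovers the coefficients:

```python
import numpy as np

# Invented illustrative data: y depends exactly on x as y = 2 + 3*x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

# Design matrix with an intercept column, mirroring y = mu + X*beta + eps
X = np.column_stack([np.ones_like(x), x])

# Closed-form OLS solution: beta_hat = (X^T X)^{-1} X^T y
# (np.linalg.solve avoids forming the explicit inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_hat)  # recovers intercept 2 and slope 3
```

Because the invented data contain no noise, the estimates equal the true parameters exactly.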

Fit a linear regression model in SAS

The simplest way to fit linear regression models in SAS is to use one of the procedures that supports OLS estimation. The first procedure you should consult is PROC REG. A simple example is

proc reg data = sashelp.class;
   model weight = height;
run;


SAS Linear Regression Example Code

In the MODEL statement, we list the dependent variable on the left side of the equal sign and the explanatory variables on the right side. This means that the model looks like this

\begin{equation*} \text{Weight} = \mu + \beta \cdot \text{Height} + \epsilon \end{equation*}

The REG procedure produces a lot of output, and it is important to go through it in the right order. First, you should look at the 'Fit Diagnostics' plots. Based on the histogram and QQ plots, does your data look approximately normal? If so, you can go on to look at the 'Analysis of Variance' and 'Parameter Estimates' tables. Here, you can see that we get a very small p-value for the overall model, so the probability of obtaining our data purely by chance is very small. This indicates a good overall model fit. Finally, you should have a look at the parameter estimates and the corresponding t-tests and p-values. In this case, both the Intercept and the parameter for Height are highly significant.
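The overall F test in the 'Analysis of Variance' table is just the ratio of the model and error mean squares, compared against an F distribution. A quick sketch in Python with scipy, using invented sums of squares and degrees of freedom (not the actual sashelp.class values), shows the arithmetic:

```python
from scipy.stats import f

# Invented values standing in for an 'Analysis of Variance' table
SSM, SSE = 7200.0, 2150.0        # Model and Error Sums of Squares
df_model, df_error = 1, 17       # Model and Error degrees of freedom

MSM = SSM / df_model             # Model Mean Square
MSE = SSE / df_error             # Error Mean Square
F = MSM / MSE                    # F statistic for the overall model
p_value = f.sf(F, df_model, df_error)  # P(F_{df1,df2} > F)

print(F, p_value)
```

A p-value this small is what the procedure reports as '<.0001', i.e. evidence that the model explains more variation than chance alone would.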

Remember that you can Control Your Output With ODS Select And Exclude if you are not interested in all the procedure output.

The OLS parameter estimates provided by PROC REG imply that the best fitting linear regression model given the specified variables is

\begin{equation*} \text{Weight} = -143.03 + 3.9 \cdot \text{Height} \end{equation*}

which means that a unit increase in Height implies a 3.9 unit increase in Weight. If an intercept does not make sense in your model, you can suppress it using the NOINT option in the MODEL statement.
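In the closed-form solution, suppressing the intercept simply amounts to omitting the column of ones from the design matrix, so the fitted line is forced through the origin. A sketch with invented data:

```python
import numpy as np

# Invented data generated without an intercept: y = 3*x exactly
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# No column of ones: the model is y = beta*x + eps, through the origin
X = x.reshape(-1, 1)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)  # recovers the slope 3
```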

Using PROC GLM

The linear regression model is a special case of the general linear model, in which the dependent variable is a continuous, normally distributed variable and no class variables appear among the independent variables. Therefore, another common way to fit a linear regression model in SAS is with PROC GLM. Unlike PROC REG, PROC GLM does support a CLASS statement.

proc glm data = sashelp.class;
   model weight = height;
run;


For more material and examples of model fitting with the above procedures, consult the SAS documentation for PROC REG and PROC GLM. Both procedures assume normality. Therefore, you should familiarize yourself with the Normal Distribution.

Linear Regression in IML

The two procedures used in the section above produce a lot of output and information with little code. Nonetheless, it can be a bit confusing how SAS actually calculates these quantities. Therefore, I have written an IML program below that calculates all the quantities from the 'Analysis of Variance' and 'Parameter Estimates' sections above. Admittedly, using three lines of code in one of the above procedures is much simpler than doing this through IML. However, it gives a nice overview of the calculations performed in linear regression.

proc iml;
use sashelp.class;                          /* Open dataset for reading                       */
read all var {'weight'} into y;             /* Read dependent variable into vector y          */
read all var {'height'} into X[c=names];    /* Read independent variable(s) into matrix X     */
close sashelp.class;                        /* Close dataset for reading                      */

df_model = ncol(X);                         /* Model degrees of freedom                       */
X = j(nrow(X),1,1) || X;                    /* Intercept                                      */
df_error = nrow(X) - ncol(X);               /* Error degrees of freedom                       */

beta_hat = inv(t(X)*X) * t(X)*y;            /* Solve normal equations for parameter estimates */
y_hat = X*beta_hat;                         /* Predicted values                               */
res = y - y_hat;                            /* Residuals                                      */

SSM = sum((y_hat - mean(y))##2);            /* Model Sum of Squares                           */
SSE = sum(res##2);                          /* Error Sum of Squares                           */
MSM = SSM / df_model;                       /* Model Mean Square                              */
MSE = SSE / df_error;                       /* Error Mean Square                              */
R_square = SSM / (SSM + SSE);               /* R^2                                            */

F = MSM / MSE;                              /* F test statistic for overall model             */
p_F = 1 - cdf('F',F,df_model,df_error);     /* p-value for overall F test                     */
std_err = sqrt(MSE*vecdiag(inv(t(X)*X)));   /* Standard errors of estimated parameters        */
t = beta_hat / std_err;                     /* t test statistics for estimated parameters     */
p_t = 2 * (1 - cdf('t',abs(t),df_error));   /* p-values for t tests                           */

print ('Intercept' // t(names))[l='Parameters']
      beta_hat[f=best10.2 l='Estimate']
      std_err[f=best10.2 l='Std. Error']
      t[f=best5. l='t Value']
      p_t[f=pvalue6.4 l='p Value'];         /* Print beta values, t-stats and p-values        */
print R_square[f=best10.2 l='R^2'];
print ({'Model', 'Error', 'Corrected Total'})[l='Source']
      (df_model // df_error // df_model+df_error)[f=best10. l='DF']
      (SSM // SSE // SSM+SSE)[f=best10. l='Sums of Squares']
      (MSM // MSE)[f=best10. l='Mean Square']
      F[f=best5. l='F Value']
      p_F[f=pvalue6.4 l='p Value'];         /* Print sums of squares, F test and p-value      */
quit;


As you can see, the PROC IML code example produces the same results as the previous procedures.

Summary

Summing up, the linear regression model is one of the most common statistical models in introductory statistics courses. Moreover, many of the features of the model are essential in other, more complicated models. Therefore, a good understanding of the model will give you an advantage when you fit other classes of linear models.

For further reading, I recommend the book SAS for Linear Models.

Finally, you can download the entire program here.


Source: https://sasnrd.com/sas-linear-regression-example-proc-glm-reg/
