The Fit Regression Model page allows users to perform a variety of regression analyses on the active dataset. Regression outputs returned to the user are confidentialised; however, in most cases there will be no statistically significant impact on inferences drawn from the fitted model. To run a regression, follow these steps:
- Open the Fit Regression Model tool by clicking the "Fit Regression Model" tab under "Data Analysis" on the left-hand side of the Analysis Service web page.
- Choose the family and link function corresponding to the dependent variable you wish to model and click OK.
| Family | Link functions available | Type of dependent variable |
|---|---|---|
| Linear (Robust) | Identity | Continuous variable |
| Binomial | Logit or Probit | Categorical variable with two categories |
| Multinomial | Logit | Categorical variable with at least two categories |
| Poisson | Log | Continuous variable representing count data (non-negative integers) |
- Click the "Dependent Variable" button, and select a suitable variable. Some variables are marked as X-only variables and will be unavailable in this panel regardless of which family is chosen. These are typically variables pertaining to demographic or endogenous characteristics.
- Click the "Explanatory Variable(s)" button, and select one or more variables. These may be either categorical or continuous variables.
- Click "Run Analysis" and wait for results to appear.
- Browse through results by clicking the tabs labelled "Brief Results", "Expanded Summary", and "Graphs".
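The family and link choices in the steps above can be read as a simple lookup. The sketch below is illustrative only and is not the Analysis Service's implementation: it encodes the table of families and links as a dictionary and attaches the standard textbook definition of each link function. The names `FAMILY_LINKS`, `LINK_FUNCTIONS`, `check_family_link`, and `apply_link` are hypothetical and exist only for this example.

```python
import math
from statistics import NormalDist

# Families and their available link functions, copied from the table above.
FAMILY_LINKS = {
    "Linear (Robust)": {"Identity"},
    "Binomial": {"Logit", "Probit"},
    "Multinomial": {"Logit"},
    "Poisson": {"Log"},
}

# Standard textbook definitions of each link g(mu), where mu is the mean
# of the dependent variable. These are assumptions about what the link
# names mean, not code taken from the Analysis Service.
LINK_FUNCTIONS = {
    "Identity": lambda mu: mu,
    "Logit": lambda mu: math.log(mu / (1.0 - mu)),
    "Probit": lambda mu: NormalDist().inv_cdf(mu),
    "Log": lambda mu: math.log(mu),
}


def check_family_link(family, link):
    """Return True if the link is listed for the family in the table."""
    return link in FAMILY_LINKS.get(family, set())


def apply_link(family, link, mu):
    """Apply the link function to a mean value after validating the pair."""
    if not check_family_link(family, link):
        raise ValueError(f"Link {link!r} is not listed for family {family!r}")
    return LINK_FUNCTIONS[link](mu)
```

For example, `check_family_link("Poisson", "Logit")` returns `False` because the table lists only the Log link for the Poisson family, while `apply_link("Binomial", "Logit", 0.5)` evaluates the log-odds of a probability of one half.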
Brief Results
A basic list of regression coefficients along with some goodness-of-fit diagnostics.
Expanded Summary
Provides a more detailed summary of results, including the standard error, z-value, p-value, and significance level for each coefficient.
Graphs
Provides several diagnostic graphs for the requested regression. Graphs can be downloaded as PNGs by clicking "Download Graph #".
The graphs that are provided are:
- Residuals vs predicted values
- Q-Q plot of ordered deviance residuals by quantiles of standard normal.
- Cook's statistic vs standardised hat matrix diagonal (not provided for linear models).
Field Exclusion Rules failed
This means that a combination of variables has been used that has been assessed as a high confidentiality risk. Field Exclusion Rules stop combinations of variables that pose greater confidentiality risks from being used together. Each rule consists of a set of variables (typically geography variables) and a value k. An error is triggered when k of the rule's variables are used together.
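As a rough illustration of how such a rule behaves, the sketch below counts how many of a rule's variables appear in a requested analysis and flags the request once that count reaches k. It is a minimal sketch under stated assumptions, not the Analysis Service's actual logic: the function name, the variable names, and the choice to treat "k or more" as a failure are all assumptions made for this example.

```python
def violates_field_exclusion_rule(selected_variables, rule_variables, k):
    """Return True if at least k of the rule's variables are used together.

    Assumption: using more than k rule variables is also treated as a
    failure; the page only states that the error is triggered at k.
    """
    overlap = set(selected_variables) & set(rule_variables)
    return len(overlap) >= k


# Hypothetical rule: no more than one of these geography variables at once.
geography_rule = {"State", "Region", "Remoteness area"}

# One geography variable plus a non-geography variable: allowed.
allowed_request = ["State", "Age group"]

# Two geography variables together: the rule fails for k = 2.
blocked_request = ["State", "Region", "Age group"]
```

With these invented inputs, `violates_field_exclusion_rule(allowed_request, geography_rule, 2)` is `False` and `violates_field_exclusion_rule(blocked_request, geography_rule, 2)` is `True`.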
Error running (regression type): There is at least one combination of categories for which there are no observations
When this error occurs, the relationship between the dependent variable and each explanatory variable needs to be checked to find which combination of categories has no observations. This can be done by running a series of summary tables, with the dependent variable cross-classified with each of the explanatory variables in turn, and looking for zero totals. Once these zero totals have been located, you can use Create New Variable to merge categories so that no zero totals remain.
An example of this error occurring and how to find what is causing the error:
Running the series of summary tables, with the dependent variable cross-classified with each of the explanatory variables:
The variable that is causing the issue is "Type of shift work". In order to run this regression, a new variable would have to be created using Create New Variable, collapsing categories in the "Type of shift work" variable.
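The same check can be sketched outside the tool. The example below is illustrative only and assumes the data is available as a list of records, which is not how the Analysis Service exposes it. It cross-classifies a dependent variable with one explanatory variable, lists the category combinations with no observations, and then merges categories in the way Create New Variable would be used. The function names, the "Outcome" variable, and all category labels are invented for this example.

```python
from collections import Counter


def find_empty_cells(records, dependent, explanatory):
    """Return (dependent category, explanatory category) pairs with no observations."""
    dep_levels = sorted({r[dependent] for r in records})
    exp_levels = sorted({r[explanatory] for r in records})
    counts = Counter((r[dependent], r[explanatory]) for r in records)
    return [(d, e) for d in dep_levels for e in exp_levels if counts[(d, e)] == 0]


def merge_categories(records, variable, mapping):
    """Return new records with the categories of `variable` recoded via `mapping`."""
    return [{**r, variable: mapping.get(r[variable], r[variable])} for r in records]


# Invented sample data: no record has Outcome "No" with shift type "On call".
records = [
    {"Outcome": "Yes", "Type of shift work": "Fixed"},
    {"Outcome": "No", "Type of shift work": "Fixed"},
    {"Outcome": "Yes", "Type of shift work": "Rotating"},
    {"Outcome": "No", "Type of shift work": "Rotating"},
    {"Outcome": "Yes", "Type of shift work": "On call"},
]

empty_before = find_empty_cells(records, "Outcome", "Type of shift work")

# Collapse the sparse category into a neighbouring one, then re-check.
merged = merge_categories(
    records,
    "Type of shift work",
    {"On call": "Rotating or on call", "Rotating": "Rotating or on call"},
)
empty_after = find_empty_cells(merged, "Outcome", "Type of shift work")
```

Here `empty_before` contains the single pair `("No", "On call")`, and `empty_after` is an empty list once the two categories have been collapsed. In practice the check would be repeated for each explanatory variable in the model.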
This page last updated 15 April 2013