We can also use the ‘summarize’ command to get more detailed information on our two main variables of interest: our outcome, BMI at age 42 (bmi42), and our predictor variable, ‘general ability’ at age 11 (n920). Note that ‘summarize’, like many other Stata commands, can be abbreviated to keep your command syntax concise. So instead of typing out the full ‘summarize’ command, we can use ‘sum’, which Stata will interpret in exactly the same way. Stata commands also often allow us to specify additional options to customise the output we get when we run the command. If we use the ‘detail’ option with the ‘sum’ command, for example, the Stata output will also include percentiles, the four smallest and largest values, the variance, skewness and kurtosis.
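As a minimal sketch, the commands described above might be typed as follows. This assumes the dataset containing bmi42 and n920 is already loaded in memory; the lines beginning with an asterisk are comments.

```stata
* Full command name with the detail option
summarize bmi42 n920, detail

* Abbreviated form, which Stata interprets in the same way
sum bmi42 n920, detail
```

Both lines should produce the same output, so in practice only one of them is needed.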
From the output, we can see that BMI at age 42 ranges from 14.74 to 51.72, with a mean of 25.86 and a median of 25.22 (the 50th percentile). General ability at age 11 ranges from 0 to 79, with a mean of 46.64 and a median of 48. The distribution of BMI at age 42 is not symmetrical: it is positively skewed (skewness = 1.13) and has heavier tails than a normal distribution (kurtosis = 5.24, compared with 3 for a normal distribution). We can examine this graphically using the ‘qnorm’ and ‘histogram’ commands, as shown in the plots below.
We will examine these in further detail when we investigate the regression diagnostics at the end of the general linear regression example.
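The plots mentioned above could be produced with commands along the following lines. This is a sketch only: it assumes the same dataset is in memory, and the ‘normal’ option on the histogram is an illustrative addition (it overlays a normal density curve) rather than something the original example necessarily used.

```stata
* Quantile-normal plot: points far from the diagonal indicate departure from normality
qnorm bmi42

* Histogram of BMI at age 42 with a normal density curve overlaid for comparison
histogram bmi42, normal
```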