### Introduction to Statistical Methods (MED5029)

Coursework 2020-2021 (Practical skills e-assessment)

Save the files to a convenient location on your PC and read them into R. The variables in the data set are ready for analysis and need no recoding, but you will need to label them appropriately and assign value labels where appropriate. You are expected to use both the R manual and the Introduction to Statistics manual to help you answer the questions.
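As a rough illustration of the reading and labelling step, the sketch below shows one possible approach; it is not the required method. It assumes the `haven` package is installed and that the data file is in the current working directory. The data frame name `fram`, the label wording and the made-up fallback rows are hypothetical and exist only so the example runs without the real file.

```r
# Sketch only: assumes the 'haven' package and the .dta file in the working directory.
data_file <- "ITS2021_REASSESSMENT_v12.dta"

if (requireNamespace("haven", quietly = TRUE) && file.exists(data_file)) {
  # Read the Stata file and drop any Stata value labels so columns are plain vectors
  fram <- as.data.frame(haven::zap_labels(haven::read_dta(data_file)))
} else {
  # Made-up rows (NOT the coursework data) so the labelling step can still be run
  fram <- data.frame(sex = c(1, 0, 0, 1), hypertension = c(0, 1, 0, 1))
}

# Assign value labels by converting the 0/1 codes to factors
fram$sex <- factor(fram$sex, levels = c(0, 1), labels = c("Female", "Male"))
fram$hypertension <- factor(fram$hypertension, levels = c(0, 1), labels = c("No", "Yes"))

# Attach descriptive variable labels as attributes (a common convention, assumed here)
attr(fram$sex, "label") <- "Sex of respondent"
attr(fram$hypertension, "label") <- "Has hypertension"

# Quick check that the codes and labels line up
table(fram$sex, fram$hypertension)
```

The same pattern can be repeated for the other coded variables listed under Data, using the codes given there.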

Please check well in advance that you can access the data for analysis in case you have problems with this near the deadline!

When analysing the data and writing your report you should:

- provide appropriate plots and/or tables to summarise the data
- state and justify any assumptions you make
- describe the analysis you have done

Context

The Framingham data set comes from a famous cross-sectional survey of respondents in Framingham, Massachusetts, who were asked in 1948 to participate in a study to help identify factors that predict coronary heart disease. Since the survey was first carried out in 1948, relatives of the original respondents have also contributed data to the study. The data set you have is a random sample from the original 1948 survey.

Data

The data are stored in the Stata data file “ITS2021_REASSESSMENT_v12.dta” as follows:

ID             Subject reference number (1001, 1002, …, 5240)

age         age at last birthday (years)

sex         1: male    0: female

4: postgraduate qualification    NA: missing

smoker     current smoker

cigsperday       number of cigarettes smoked per day    NA: missing

bpmeds         taking blood pressure medication

0: not on medication

stroke      has had a stroke?

1: Yes    0: No

hypertension        has hypertension?

1: Yes    0: No

diabetes         has diabetes?

totchol        cholesterol level (mg/dL)

NA: missing

sysbp      systolic blood pressure (mmHg)

bmi         Body mass index (kg/m²)

heartrate     Heart rate (beats per minute)

NA: missing

glucose             blood glucose (mg/dL)

NA: missing

1: At risk      0: No risk

Questions

1. Calculate the percentage of males and the percentage of females in the sample who have hypertension. Use an appropriate test and a 95% confidence interval to investigate whether the percentage of males with hypertension differs significantly from the percentage of females with hypertension.

3. For males and females separately, is there any evidence that blood glucose differs by hypertension diagnosis?

5. The literature on cardiovascular disease suggests that BMI can be predicted from diastolic blood pressure. Is there any evidence that this is the case within this data set?

7. In your opinion, are these predictions valid?

