Download the file CCCdatahw3.xlsx (or CCdatahw3v2.xlsx). This is a partially faked dataset derived from real medical records. https://drive.google.com/file/d/1zYrAeyNbB6GTDG7y7MpUvm_iQ9ciCIcE/view
A variety of variables are used including some demographic data and two variables on smoking and diabetes. You can see that the last two variables “smoking” and “diabetes” are simplified variables recording a “1” if any mention was made and a “0” if not. You will just be using the last two columns, “smoking” and “diabetes.” Please use this file to complete the following questions. Show your work. You will likely need to use Excel.
a) What is the size of the sample?
b) What percentage of the sample reported smoking?
c) What percentage of the sample reported having diabetes? What percentage of reported smokers also report having diabetes?
d) What percentage of people who report having diabetes also report smoking?
e) What percentage of nonsmokers also have diabetes?
f) Does smoking appear associated with diabetes? Explain your reasoning.
g) Does the data allow us to conclude smoking causes diabetes? Explain.
a) Sample size
The sample size can be calculated with Excel's "COUNT" function, which counts the cells in a range that contain numbers, so missing values are skipped. Applying COUNT to the ID variable therefore gives the sample size. In any empty cell, type =COUNT(A2:A1001). The function returns 1000, so the sample size is 1000.
b) Percentage of the sample that reported smoking
The "COUNTIF" function gives the number of people in the sample who reported smoking. Its inputs are the range (the data range of the smoking column) and the code for smoking, which is "1". In any empty cell, type =COUNTIF(H2:H1001,1). The function returns 127. Hence, the percentage is 127/1000*100 = 12.7%
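The COUNT and COUNTIF steps above can also be mirrored outside Excel. The following is a minimal Python sketch, not part of the original answer. The toy column and the helper names (count_non_missing, count_if, percentage) are hypothetical and only illustrate the logic; the last calculation reuses the counts reported above.

```python
# Hypothetical toy column standing in for the "smoking" column (H2:H1001).
# None marks a missing cell. These are NOT values from the real dataset.
toy_smoking = [1, 0, 0, None, 1, 0, 0, 0, 1, 0]


def count_non_missing(values):
    """Like Excel COUNT on a numeric column: count cells that hold a number."""
    return sum(
        1
        for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


def count_if(values, target):
    """Like Excel COUNTIF(range, target): count cells equal to target."""
    return sum(1 for v in values if v == target)


def percentage(part, whole):
    """Express part as a percentage of whole."""
    return part / whole * 100


n = count_non_missing(toy_smoking)   # the None cell is skipped
smokers = count_if(toy_smoking, 1)   # cells coded "1"
toy_pct = percentage(smokers, n)

# Same arithmetic with the counts reported in the answer above.
reported_pct = percentage(127, 1000)

print(f"toy column: {smokers}/{n} = {toy_pct:.1f}%")
print(f"reported counts: 127/1000 = {reported_pct:.1f}%")
```

The same two helpers would apply unchanged to the diabetes column, since it uses the same 0/1 coding.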
c)
Percentage of the sample that reported having diabetes
The "COUNTIF" function gives the number of people in the sample who reported diabetes. Its inputs are the range (the data range of the diabetes column) and the code for diabetes, which is "1". In any empty cell, type =COUNTIF(I2:I1001,1). The function returns 37. Hence, the percentage is 37/1000*100 = 3.7%
Number of people who report both diabetes and smoking
A new variable is created with the "AND" function to flag rows where smoking=1 and diabetes=1. In a new column, type =AND(H2=1,I2=1) and fill the same formula down for all other rows. The result is "TRUE" if diabetes=1 and smoking=1; otherwise it is "FALSE". Counting the "TRUE" values therefore gives the number of people who report both diabetes and smoking. Use COUNTIF on the new column, =COUNTIF(J2:J1001,"TRUE"). The result is 20, so 20 people reported both diabetes and smoking.
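The helper-column approach above (an AND flag per row, then a count of the TRUE values) can be sketched in Python as follows. The rows are hypothetical and are not taken from the real file; they only show how the flag and the count relate.

```python
# Hypothetical (smoking, diabetes) rows; NOT values from the real dataset.
toy_rows = [(1, 1), (1, 0), (0, 0), (0, 1), (1, 1), (0, 0)]

# Helper column, like =AND(H2=1, I2=1) filled down every row.
both_flags = [smoking == 1 and diabetes == 1 for smoking, diabetes in toy_rows]

# Like =COUNTIF(J2:J1001, "TRUE"): each True counts as 1 when summed.
both_count = sum(both_flags)

print(both_flags)
print("rows with smoking=1 and diabetes=1:", both_count)
```

Here only the first and fifth toy rows have both codes set to 1, so the count is 2.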
Percentage of reported smokers who also report having diabetes:
Number of reported smokers = 127.
Number who report both diabetes and smoking = 20.
Therefore, the percentage of reported smokers who also report having diabetes is 20/127 = 0.1575 = 15.75%
d) Percentage of people who report having diabetes who also report smoking
Number who report diabetes = 37.
Number who report both diabetes and smoking = 20.
Therefore, the percentage of people who report having diabetes who also report smoking is 20/37 = 0.5405 = 54.05%
e) Percentage of nonsmokers who also have diabetes
Number of nonsmokers = 1000 - 127 = 873
Number of nonsmokers with diabetes = Number with diabetes - Number with both smoking and diabetes = 37 - 20 = 17.
Therefore, the percentage of nonsmokers who also have diabetes is 17/873 = 0.0195 = 1.95%
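The counts used in parts c) to e) fit into a 2x2 table, which makes it easier to see which denominator belongs to each percentage. The sketch below is an illustration in Python rather than part of the original Excel workflow. It starts from the four counts reported above, and the variable names are chosen for this example only.

```python
# Counts reported in the answer above.
n_total = 1000     # sample size
n_smokers = 127    # smoking = 1
n_diabetes = 37    # diabetes = 1
n_both = 20        # smoking = 1 and diabetes = 1

# Fill in the remaining cells of the 2x2 table by subtraction.
n_nonsmokers = n_total - n_smokers
table = {
    ("smoker", "diabetes"): n_both,
    ("smoker", "no diabetes"): n_smokers - n_both,
    ("nonsmoker", "diabetes"): n_diabetes - n_both,
    ("nonsmoker", "no diabetes"): n_nonsmokers - (n_diabetes - n_both),
}

# Each percentage divides the same kind of numerator by a different group size.
pct_diabetes_among_smokers = 100 * table[("smoker", "diabetes")] / n_smokers
pct_smokers_among_diabetes = 100 * table[("smoker", "diabetes")] / n_diabetes
pct_diabetes_among_nonsmokers = 100 * table[("nonsmoker", "diabetes")] / n_nonsmokers

for cell, count in table.items():
    print(cell, count)
print(f"diabetes among smokers:    {pct_diabetes_among_smokers:.2f}%")
print(f"smokers among diabetes:    {pct_smokers_among_diabetes:.2f}%")
print(f"diabetes among nonsmokers: {pct_diabetes_among_nonsmokers:.2f}%")
```

The four cells sum back to the sample size of 1000, which is a quick check that no row was counted twice.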
f) Association of smoking with diabetes:
Tobacco use can increase blood sugar levels and lead to insulin resistance. The more you smoke, the greater your risk of diabetes. Hence, one can expect to find an association between smoking and diabetes.
g) Association of smoking with diabetes via data:
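To check the association in the data, the two percentages have to be compared.
i. Percentage of reported smokers who also report having diabetes = 15.75%
ii. Percentage of nonsmokers who also have diabetes = 1.95%
The difference between the reporting of diabetes in smokers and nonsmokers = 13.8 percentage points. Hence, it appears that smoking is associated with diabetes. This comparison shows only an association, however: the records are observational, so the data alone do not allow us to conclude that smoking causes diabetes.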
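The comparison of the two percentages can be reproduced from the reported counts. The Python sketch below is an illustration, not part of the original answer. In addition to the difference, it computes the ratio of the two percentages, a summary the answer itself does not use and which is included here only as another way to express the same gap.

```python
# Counts reported in the answer above.
n_total, n_smokers, n_diabetes, n_both = 1000, 127, 37, 20

n_nonsmokers = n_total - n_smokers           # 873
n_nonsmokers_diabetes = n_diabetes - n_both  # 17

# Proportion reporting diabetes within each smoking group.
rate_smokers = n_both / n_smokers
rate_nonsmokers = n_nonsmokers_diabetes / n_nonsmokers

# Difference in percentage points, as in the comparison above.
difference_points = (rate_smokers - rate_nonsmokers) * 100

# Ratio of the two proportions (an extra summary, not in the original answer).
ratio = rate_smokers / rate_nonsmokers

print(f"diabetes among smokers:    {rate_smokers * 100:.2f}%")
print(f"diabetes among nonsmokers: {rate_nonsmokers * 100:.2f}%")
print(f"difference: {difference_points:.1f} percentage points")
print(f"ratio: {ratio:.2f}")
```

Neither number says anything about the direction or cause of the relationship; both only describe how differently the two groups report diabetes.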