Question

What are the goals of data screening? How can you identify and remedy the following? Errors in data entry. Outliers. Missing data.

Homework Answers

Answer #1

Goals of data screening are as follows.

  • Accuracy of data entry- We have to cross-check whether the data were entered correctly or whether typing errors, collection errors, or similar mistakes were made.
  • Dealing with missing data- We have to identify the values that are missing from the collected data and then analyze whether their number and effect are significant. If they are not significant, we can proceed with further computations. Otherwise, we have to arrange to collect or estimate those values (all or some of them, as possible and as required) using different approaches.
  • Handling outliers- Through an overall view of the gathered data, we have to notice any outliers and, if possible, cross-check them. If cross-checking is not possible, we have to assess the effect of those outliers on the overall data and, if required, exclude them from further computations.
  • Test of assumptions- Assumptions made earlier, such as normality, linearity, uniformity, symmetry and others, are to be checked while data screening is performed.
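As one rough illustration of checking an assumption, the sketch below computes sample skewness, which is near zero for symmetric data and positive for right-skewed data. The function name and the example values are hypothetical, and skewness alone is not a full normality test.

```python
import statistics

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (population moments)."""
    mean = statistics.mean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical example data, not taken from any real study.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

skew_symmetric = skewness(symmetric)   # close to 0
skew_right = skewness(right_skewed)    # clearly positive
```

A value far from zero suggests that the symmetry or normality assumption deserves a closer look, for example with a histogram.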

Errors in data entry-

We have to cross-check the data after entering them; in this way errors in data entry can be avoided (or reduced). Further, on observing any outlying value, we have to take special care to cross-check whether it was entered correctly.
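One simple way to cross-check entries is a range check against the values each variable is allowed to take. The following is a minimal sketch; the variable names, ranges and rows are hypothetical and would come from the study's own codebook.

```python
# Hypothetical valid ranges per variable; adjust to the actual codebook.
VALID_RANGES = {"age": (18, 99), "score": (1, 5)}

def find_entry_errors(rows, valid_ranges):
    """Return (row_index, variable, value) for each out-of-range value."""
    errors = []
    for i, row in enumerate(rows):
        for var, (low, high) in valid_ranges.items():
            value = row.get(var)
            if value is None:
                continue  # missing values are screened separately
            if not (low <= value <= high):
                errors.append((i, var, value))
    return errors

# Hypothetical entered data with two suspicious values.
rows = [
    {"age": 25, "score": 4},
    {"age": 230, "score": 3},   # possibly a typo for 23
    {"age": 41, "score": 55},   # possibly a double keystroke for 5
]
errors = find_entry_errors(rows, VALID_RANGES)
```

Each flagged value should then be compared with the original record before it is corrected.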

Outliers-

Outliers in a set of data can be identified by simple inspection of the data values or through plots such as scatter plots, histograms and others. For each outlier we have to check whether

  • these occurred due to data entry error
  • these are cases which are not at all part of the population
  • these are the real cases which are practically different from others

For outliers we have to analyze their leverage, discrepancy and influence on the data set.
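Besides plots, a numeric rule can flag candidate outliers. The sketch below uses the common 1.5 × IQR fence rule, which is an assumption here and not a method named in the answer; it screens one variable at a time and does not measure leverage or influence. The data values are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]

# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 14, 12, 13, 95]
outliers = iqr_outliers(data)
```

A flagged value is only a candidate; it still has to be checked against the three possibilities listed above before it is corrected, dropped or kept.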

Missing data-

We have to note the missing data and check whether they are missing at random or not. Creating two groups, one with missing data and the other without, we have to perform a t-test to examine whether there is any difference between the groups. If the difference is significant, we have to proceed through any of the following processes.

  • Cases or variables related to the missing data may be deleted.
  • Missing values may be estimated during analysis. Replacements can be made using prior knowledge or by substituting the estimated mean (which does not change the mean but reduces the standard deviation).
  • Missing values may be estimated using a regression approach (though it is time-consuming).
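The group comparison and the mean-replacement option can be sketched as follows. The records are hypothetical, Welch's form of the t statistic is an assumption (the answer only says "t-test"), and only the statistic is computed; a p-value would need a t-distribution table or a statistics package.

```python
import math
import statistics

# Hypothetical records: (income, age); None marks a missing income.
records = [
    (42, 31), (55, 45), (None, 62), (38, 29), (None, 58),
    (61, 50), (47, 36), (None, 65), (52, 41), (44, 33),
]

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Step 1: compare age between cases with and without a recorded income.
age_missing = [age for income, age in records if income is None]
age_present = [age for income, age in records if income is not None]
t_stat = welch_t(age_missing, age_present)

# Step 2: mean substitution for the missing incomes.
observed = [income for income, _ in records if income is not None]
fill = statistics.mean(observed)
imputed = [fill if income is None else income for income, _ in records]

mean_before, mean_after = statistics.mean(observed), statistics.mean(imputed)
sd_before, sd_after = statistics.stdev(observed), statistics.stdev(imputed)
```

In this made-up data the cases with missing income are noticeably older, so the t statistic is large, and after mean substitution the mean is unchanged while the standard deviation shrinks, as noted in the list above.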

After reconstructing the data set, we have to perform the analysis again.
