Question

What are two important first steps in data analysis?
- Cleaning the data and exploring it
- Cleaning and scrubbing the data
- The box plot and the 5-number summary
- Correlation and regression

Homework Answers

Answer #1

"Scrubbing" is just another word for cleaning, so "cleaning and scrubbing the data" names one step twice. In any data analysis project, the common first steps are stating the problem, understanding the data, cleaning the data, and then exploring it.

Making plots such as line plots, bar charts, or box plots is a secondary step. The 5-number summary (minimum, first quartile, median, third quartile, maximum) is computed for each variable only after the data have been cleaned and a refined dataset is available. Correlation and regression likewise come later, once the data are clean and understood.
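
As a minimal sketch of the 5-number summary mentioned above, the Python snippet below computes the minimum, first quartile, median, third quartile, and maximum of a small list. The data values and the function name five_number_summary are hypothetical, and the "inclusive" quartile method is only one of several conventions; textbooks and software packages may give slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions can differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


# Hypothetical, already-cleaned data (not taken from the question).
hours = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(hours))  # (2, 4.0, 7.0, 10.0, 15)
```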

Therefore, cleaning the data and exploring it are the two important first steps in data analysis.
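
The two steps can be illustrated with a short Python sketch, assuming a single column of raw text values. The sample values and the helper names clean and explore are hypothetical; real cleaning rules depend on the dataset and the problem statement.

```python
import math
import statistics


def clean(raw_values):
    """Step 1: drop missing or non-numeric entries, convert the rest to float."""
    cleaned = []
    for item in raw_values:
        if item is None:
            continue  # missing value
        try:
            number = float(str(item).strip())
        except ValueError:
            continue  # not a number, e.g. "n/a" or an empty string
        if math.isfinite(number):
            cleaned.append(number)
    return cleaned


def explore(values):
    """Step 2: a first look at the cleaned data."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical raw column with stray spaces, a missing value, and junk text.
raw = ["3", " 5 ", None, "n/a", "8", "", "12"]
tidy = clean(raw)
print(tidy)           # [3.0, 5.0, 8.0, 12.0]
print(explore(tidy))  # {'count': 4, 'mean': 7.0, 'min': 3.0, 'max': 12.0}
```

Running explore on the raw list instead would fail on the missing and non-numeric entries, which is why cleaning comes first.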

Similar Questions
Below is the data for the optical densities observed during analysis of solutions with 5 different concentrations of API (OD1 and OD2 are two measurements for the same solution). Conc, mg/L OD1 OD2 5 0.126 0.098 10 0.208 0.194 25 0.515 0.492 50 0.996 1.201 100 1.960 2.090 Using Minitab, develop regression equation for the relationship of Mean OD (need to calculate it first!) versus concentration of API and make a plot showing the data and the regression line. Show...
This week you read about different ways of exploring data in order to display and describe that data. If you were presented with an unfamiliar dataset, what are the first exploratory steps you would take to get familiar with the dataset and decide how to display the data?
1. The following data set is the number of hours that a sample of college students spent studying for a test: 0, 0, 0, 3, 3, 5, 6, 6, 8, 10, 12, 15 a) Find the 5-number summary for this data: Low, Q1, Q2, Q3, High b) Use the 5-number summary to make a box-and-whisker plot for this set of data. Be sure to use a number line with an appropriate scale. c) Based on the box-and-whisker plot, is...
NOTE: you might be able to get all the necessary information from the Excel analysis output or the scatter plot. You don't need to do computations by applying the numeric formula. A clothing department wants to find out if there is a relationship (and what kind of relationship) between the price of an item and the time it takes a customer to decide whether to buy that item. ITEM PRICE (X) DECISION TIME IN MINUTES (Y) $17 9 $63...
For Exercises 1 through 7, do a complete regression analysis by performing the following steps. a. Draw the scatter plot. b. Compute the value of the correlation coefficient. c. Test the significance of the correlation coefficient at α = 0.01, using Table I or the P-value method. d. Determine the regression line equation if r is significant. e. Plot the regression line on the scatter plot, if appropriate. f. Predict y′ for a specific value of x, if appropriate. 7. Internet Use and Isolation A researcher...
For Exercises 1 through 7, do a complete regression analysis by performing the following steps. a. Draw the scatter plot. b. Compute the value of the correlation coefficient. c. Test the significance of the correlation coefficient at α = 0.01, using Table I or the P-value method. d. Determine the regression line equation if r is significant. e. Plot the regression line on the scatter plot, if appropriate. f. Predict y′ for a specific value of x, if appropriate. 5. Typing Speed and Word Processing A...
Decide by taking at least 13 data with simple correlation and regression analysis so that there is a relationship between the two variables. When X = 8 Y =? Guess
Decide by taking at least 12 data with simple correlation and regression analysis so that there is a relationship between the two variables. When X = 8 Y =? Guess
Is the randomness of a sample important to statistical data collection? Data analysis? Why or why not for both of the above? How about two real world examples?
What is a residual? Why are residuals important when performing a regression analysis?