Question

Answer the question below:

· What is the difference between data mining and data profiling?
· What should be done with suspected or missing data?

Homework Answers

Answer #1

Differences between data mining and data profiling:

Data mining is the process of analyzing an existing database and turning its contents into useful information.

Data profiling analyzes data that already exists and collects statistics about it. It helps assess data quality: it identifies incorrect data in the database so that it can be corrected when necessary.

Data mining, by contrast, examines the database in order to discover patterns in the data.
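The contrast can be illustrated with a small sketch. The toy `orders` data and the helper names `profile_column` and `frequent_pairs` are hypothetical, made up for this example; counting item pairs is only one very simplified stand-in for pattern discovery, not a full data mining method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each row is one order; None marks a missing value.
orders = [
    {"customer": "A", "amount": 20.0, "items": ["bread", "milk"]},
    {"customer": "B", "amount": None, "items": ["bread", "milk", "eggs"]},
    {"customer": "A", "amount": 35.5, "items": ["milk", "eggs"]},
    {"customer": "C", "amount": 12.0, "items": ["bread", "milk"]},
]


def profile_column(rows, column):
    """Data profiling: collect statistics that describe the data itself."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def frequent_pairs(rows, min_support):
    """Data mining (simplified): find item pairs that often occur together."""
    counts = Counter()
    for row in rows:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(set(row["items"])), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


amount_profile = profile_column(orders, "amount")
patterns = frequent_pairs(orders, min_support=3)
print(amount_profile)
print(patterns)
```

Here the profile reports on the quality of the `amount` column (one of four values is missing), while the pattern search reports something about customer behavior (bread and milk are bought together in three orders).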

Handling suspected or missing values:

If the data set is large, the rows containing such values can simply be left out of the analysis.

Alternatively, compute the median of the column and replace the missing values with it.

As a rule of thumb, if about 5% to 10% of the values in a column are missing, the affected rows can simply be dropped; if a larger share is missing, the missing values should be replaced with the median or mean of that column.
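The rule of thumb above could be sketched as follows. The function name `handle_missing`, the `drop_threshold` parameter, and the sample data are assumptions made for illustration; the sketch also assumes that any missing share up to 10% (including shares below 5%) is handled by dropping rows, and it uses the median rather than the mean.

```python
from statistics import median


def handle_missing(rows, column, drop_threshold=0.10):
    """Drop rows with a missing value when few are missing; otherwise
    fill the gaps with the column median. None marks a missing value."""
    if not rows:
        return []
    missing = sum(1 for row in rows if row[column] is None)
    fraction = missing / len(rows)

    if fraction <= drop_threshold:
        # Few values are missing: leave the affected rows out.
        return [dict(row) for row in rows if row[column] is not None]

    present = [row[column] for row in rows if row[column] is not None]
    if not present:
        raise ValueError(f"column {column!r} has no values to impute from")
    fill = median(present)
    # Many values are missing: keep every row and impute the median.
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical sample data: 10% missing in the first list, 40% in the second.
few_missing = [{"age": a} for a in [21, 34, None, 45, 29, 52, 38, 41, 27, 33]]
many_missing = [{"age": a} for a in [30, None, 50, None, 40]]

dropped = handle_missing(few_missing, "age")
filled = handle_missing(many_missing, "age")
print(len(dropped))
print([row["age"] for row in filled])
```

The median is used here because it is less sensitive to outliers than the mean; replacing `median` with `statistics.mean` gives the mean-based variant mentioned in the answer.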
