Question

Essay Questions - Write a 500 words document explaining and describing the question below.  You are required...

Essay Questions - Write a 500 words document explaining and describing the question below.  You are required to use APA format with at least 5 references.

  1. Explain and describe 5 data cleansing techniques involved in data transformation during a data visualization project.

Homework Answers

Answer #1

Data cleaning is the process of ensuring that your data is correct, consistent and useable.

Here are several key benefits that come out of the data cleaning process:

  1. It removes major errors and inconsistencies that are inevitable when multiple sources of data are getting pulled into one dataset.
  2. Using tools to cleanup data will make everyone more efficient since they’ll be able to quickly get what they need from the data.
  3. Fewer errors means happier customers and fewer frustrated employees.
  4. The ability to map the different functions and what your data is intended to do and where it is coming from your data.

Data cleansing depends on thorough and continuous data profiling to identify data quality issues that must be addressed.

Some of the data cleansing techniques involved in data transformation during a data visualization project are as follows:

  • Defining a data quality plan: Derived from the business case, the quality plan may also entail some conversation with business stakeholders to tease out answers to questions like “What are our data extraction standards,” “What opportunities do we have to automate the data pipeline,” “What data elements are key to downstream products and processes,” “Who is responsible for ensuring data quality,” and “How do we determine accuracy.”
  • Validating accuracy: One type of accuracy is taking steps to ensure data is correctly entered at the point of collection – for example, if a website has changed and the value is no longer there, or if the pricing of a product is only available when you put an item into a shopping cart because of a promotion.
  • Deduplicating: No source data set is perfect, and sometimes source systems send duplicate rows. The key here is to know the “natural key” of each record, meaning the field or fields that uniquely identify each row. If an inbound data set includes records having the same natural key, all but one of the rows could be removed.
  • Handling blank values: Are blank values represented as “NA,” “Null,” “-1,” or “TBD”? If so, deciding on a single value for consistency’s sake will help eliminate stakeholder confusion. A more advanced approach is imputing values. This means using populated cells in a column to make a reasonable guess at the missing values, such as finding the average of the populated cells and assigning that to the blank cells.
  • Reformatting values: If the source data’s date fields are in the MM-DD-YYYY format, and your target date fields are in the YYYY/MM/DD format, update the source date fields to match the target format.
  • Threshold checking: This is a more nuanced data cleansing approach. It includes comparing a current data set to historical values and record counts. For example, in the health care world, let’s say a monthly claims data source averages total allowed amounts of $2M and unique claim counts of 100K. If a subsequent data load arrives with a total allowed amount of $10M and 500K unique claims, those amounts exceed the normal expected threshold of variance, and should trigger additional scrutiny.

Hope this answers your questions, please leave a upvote if you find this helpful.

Know the answer?
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for?
Ask your own homework help question
Similar Questions
Answer the below question: Write in your own words minimum 500 words 1) Provide examples of...
Answer the below question: Write in your own words minimum 500 words 1) Provide examples of how SPC tools are used to root out process control issues. Note: Please include the References
Within the Discussion Board area, write 400-500 words that respond to the following questions with your...
Within the Discussion Board area, write 400-500 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Through improved financial strategies and data management, the supply chain of a company can improve their customer experience, business efficiencies and overall financial performance. By collecting data, improving customer intimacy, establishing financial strategies, and through the use of...
Using your experiences and the internet as your guides, write an essay (at least 500 words)...
Using your experiences and the internet as your guides, write an essay (at least 500 words) with an introduction and a conclusion in which you address the following topics. List all resources used in standard format at the end of your paper. 1. Identify a cultural group to which you belong. Explain why you consider it a culture by describing some of its characteristics and providing examples of the kinds of experiences and knowledge shared by its members. (4 points)...
please I want you to answer this question and most importantly, the answer is not already...
please I want you to answer this question and most importantly, the answer is not already available on this site or other site Course Learning Outcome: Apply Organizational behavior knowledge and skills to manage diversified culture in the organizational settings (Lo 2.2). Essay: Differences in Culture and Diversity at workplace Essay Write an essay about the differences in Culture and Diversity at workplace. Use examples, peer-reviewed journals to support your answer. This essay must be at least 1000-word in length....
1) Write a short paper (1 to 2 paragraphs) explaining what you believe the problem or...
1) Write a short paper (1 to 2 paragraphs) explaining what you believe the problem or need your product or service is addressing. Remember entrepreneurial opportunities are always grounded in a well defined problem. https://www.entrepreneur.com/article/237668 It would be an essay or paper explaning in ur own words 2)Procedures for this assignment Essay by answering the below question Interview at least 10 new potential customers (do not interview the same people) about the problem you think your product or service is...
Using short essay format, and in your own words, answer the following questions. 1. Describe the...
Using short essay format, and in your own words, answer the following questions. 1. Describe the structure of a monosaccharide and name the three monosaccharides important in nutrition. Name the three disaccharides commonly found in foods and their component monosaccharides. In what foods are these sugars found? 2. Describe the structure of polysaccharides and name the ones important in nutrition. How are starch and glycogen similar, and how do they differ? How do the fibers differ from other polysaccharides? 3....
Required information Skip to question [The following information applies to the questions displayed below.] Oslo Company...
Required information Skip to question [The following information applies to the questions displayed below.] Oslo Company prepared the following contribution format income statement based on a sales volume of 1,000 units (the relevant range of production is 500 units to 1,500 units): Sales $ 35,000 Variable expenses 21,000 Contribution margin 14,000 Fixed expenses 8,400 Net operating income $ 5,600 13. Using the degree of operating leverage, what is the estimated percent increase in net operating income of a 5% increase...
IDEAS PLEASE below are the requirements and what i need ideas maybe sources to write the...
IDEAS PLEASE below are the requirements and what i need ideas maybe sources to write the paper FASB Project "The mission of the Financial Accounting Standards Board is to establish and improve standards of financial accounting reporting for the guidance and education of the public, including issuers, auditors, and users of financial information." (http://www.fasb.org) Project Objective Describe the history, current status, and adoption implications of a Financial Accounting Standards Board ongoing project. Requirements The FASB has several ongoing projects that...
In about 300 words, write an essay analyzing the below case study using the SWOT framework....
In about 300 words, write an essay analyzing the below case study using the SWOT framework. Johnson & Johnson Consumer Products, INC. Johnson & Johnson is a household name in baby-care as well as medical products. Nearly every family in the United States has in its house at least one product made by this company. Founded in 1885, Johnson & Johnson is currently an international enterprise, with 170 affiliated companies in fifty-five countries. J&J enjoys a reputation for high-quality products...
Problem Set 1: Linear Regression Research Scenario:A human resources specialist wanted to know if conscientiousness is...
Problem Set 1: Linear Regression Research Scenario:A human resources specialist wanted to know if conscientiousness is a good predictor of job performance. Conscientiousness was evaluated using likert based questions from the open access International Personality Item Pool (IPIP) based on Goldberg’s Big Five (1992). Questions were summed (range 0 – 20), with higher values indicating more conscientiousness. Job performance was an average of several factors, ranging from 0 – 20, with higher scores indicating better job performance. He compiles the...