Analyzing data without planning or preparation is a bit like trying to build a car by getting some steel, glass, plastic, and rubber, and trying to put them together. Without first understanding, planning, and preparing, the manufacturing is likely to be a failure. Analytics is much the same—before you analyze, you must understand the data, plan your approach, and prepare the data set.
To complete this discussion board, do the following:
Some common pitfalls:
1) Not planning next steps:
Before you invest time in running an experiment or conducting an analysis of data, it’s important to understand how the results will impact your ongoing behavior. There will always be countless ways to slice your data, and the best way to avoid analysis paralysis is to focus on metrics that will drive you to act.
To determine if a metric is actionable, simply consider the possible results of your analysis and ask yourself how your strategy or behavior will change based on the results. If it won’t, maybe your time would be better spent elsewhere.
2) Correlation vs. causation:
A fundamental principle in statistics and data science is that correlation is not causation, meaning that just because two things appear to be related to each other doesn’t mean that one causes the other. This is apparently the most common mistake in time series analysis. There is usually a statement like “Correlation = 0.86”. Recall that a correlation coefficient is between +1 (a perfect positive linear relationship) and -1 (a perfect negative linear relationship), with zero meaning no linear relationship. 0.86 is a high value, indicating that the statistical relationship between the two time series is strong. A strong correlation alone, however, does not show that one series drives the other: two series that both trend over time can be highly correlated without any causal link between them.
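To illustrate the time series point, here is a minimal sketch using only the Python standard library. The two series, their slopes, and the noise level are made-up values chosen for illustration. They are generated independently, so any correlation between them reflects only the shared upward trend, not a causal link.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def diff(series):
    """Period-to-period changes, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200

# Hypothetical data: two series generated independently of each other,
# each an upward trend plus random noise.
series_a = [0.5 * t + rng.gauss(0, 5) for t in range(n)]
series_b = [0.3 * t + rng.gauss(0, 5) for t in range(n)]

r_levels = pearson(series_a, series_b)               # correlation of raw levels
r_changes = pearson(diff(series_a), diff(series_b))  # correlation of changes

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

With this setup the correlation of the raw levels should come out high while the correlation of the period-to-period changes stays near zero. Comparing the two is one simple check, not a full test for causation.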
3) Not cleaning and normalizing data before analysis:
Always assume the data you are working with is inaccurate at first, and clean it before you analyze it. Once you get familiar with it, you will start to “feel” when something is not quite right. Take a first glance using pivot tables or quick analytical tools to look for duplicate records or inconsistent spelling, and clean those up first. Failing to normalize the data is another concern that can hinder your analysis. In most cases, when you normalize data you eliminate the units of measurement, which makes it easier to compare data from different sources.
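The cleaning and normalizing steps described above can be sketched as follows. The records, field names, and the choice of z-scores as the normalization method are assumptions made for illustration. Other approaches, such as min-max scaling, may suit a given data set better.

```python
from statistics import mean, pstdev

# Hypothetical raw records; the second row duplicates the first,
# differing only in capitalization and trailing whitespace.
raw_records = [
    {"city": "New York", "height_cm": 170.0, "income_usd": 52000.0},
    {"city": "new york ", "height_cm": 170.0, "income_usd": 52000.0},
    {"city": "Boston", "height_cm": 182.0, "income_usd": 61000.0},
    {"city": "Chicago", "height_cm": 165.0, "income_usd": 48000.0},
]

def clean(records):
    """Standardize spelling of the city field and drop exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Collapse extra whitespace and use consistent capitalization.
        city = " ".join(rec["city"].split()).title()
        key = (city, rec["height_cm"], rec["income_usd"])
        if key not in seen:
            seen.add(key)
            cleaned.append({**rec, "city": city})
    return cleaned

def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1 (unit-free)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

records = clean(raw_records)
height_z = z_scores([r["height_cm"] for r in records])
income_z = z_scores([r["income_usd"] for r in records])

for rec, h, i in zip(records, height_z, income_z):
    print(f"{rec['city']:<10} height z={h:+.2f}  income z={i:+.2f}")
```

After normalization, height in centimeters and income in dollars are on the same unit-free scale, so a value of +1 means "one standard deviation above average" for either measure.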
Big Data = Big Trouble
Here we discuss the problem with statistics.
It may be helpful to consider some aspects of statistical thought which might lead many people to be distrustful of it. First of all, statistics requires the ability to consider things from a probabilistic perspective, employing quantitative technical concepts such as confidence, reliability, and significance. This is in contrast to the way non-mathematicians often cast problems, where logical, concrete, often dichotomous conceptualizations are the norm: right or wrong, large or small, this or that.
Additionally, many non-mathematicians hold quantitative data in a sort of awe. They have been led to believe that numbers are unquestionably correct. Consider the sort of math problems people are exposed to in secondary school, and even in introductory college math courses: there is a clearly defined method for finding the answer, and that answer is the only acceptable one. It comes, then, as a shock that different research studies can produce very different, often contradictory results. If the statistical methods used are really supposed to represent reality, how can it be that different studies produce different results? In order to resolve this paradox, many naive observers conclude that statistics must not really provide reliable indicators of reality after all. And, the logic goes, if statistics aren't right, they must be wrong. It is easy to see how even intelligent, well-educated people can become cynical if they don't understand the subtleties of statistical reasoning and analysis.
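One reason different studies can disagree without any of them being "wrong" is ordinary sampling variability. The sketch below simulates it under assumed conditions: the population mean and standard deviation, the sample size, and the number of studies are all hypothetical values, and the interval uses the normal approximation (1.96 standard errors).

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population and study design, chosen for illustration only.
TRUE_MEAN, TRUE_SD = 100.0, 15.0
SAMPLE_SIZE, NUM_STUDIES = 30, 5

def run_study():
    """Draw one random sample and return its mean and an approximate 95% CI."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    m = mean(sample)
    half_width = 1.96 * stdev(sample) / sqrt(SAMPLE_SIZE)
    return m, (m - half_width, m + half_width)

# Every "study" measures the same population with the same method.
studies = [run_study() for _ in range(NUM_STUDIES)]

for i, (m, (low, high)) in enumerate(studies, start=1):
    print(f"study {i}: mean={m:.1f}, approx. 95% CI=({low:.1f}, {high:.1f})")
```

Each simulated study reports a different estimate even though all of them sample the same population. Reading each result together with its interval, rather than as a single exact number, is what the probabilistic perspective asks for.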
Now, I'm not going to say much about this public relations crisis directly, but it does provide a motivation for examining the way we practice our trade. The best thing we can do, in the long run, is make sure we're using our tools properly, and that our conclusions are warranted. I will present some of the most frequent misuses and abuses of statistical methods, and how to avoid or remedy them. Of course, these issues will be familiar to most statisticians; however, they are the sorts of things that can get easily overlooked when the pressure is on to produce results and meet deadlines. If this workshop helps you to apply the basics of statistical reasoning to improve the quality of your product, it will have served its purpose.
Some keys: