Question

Looking for explanation of Quantitative analysis techniques (e.g. cluster analysis, descriptive and inferential statistics) with examples

Homework Answers

Answer #1

QUANTITATIVE ANALYSIS:

Quantitative analysis describes and interprets phenomena statistically and with numbers. It is the process of presenting and interpreting numerical data: the data collected about a phenomenon is expressed through numeric variables and statistics. Quantitative analysis includes both computational and statistical methods of analysis.

Commonly used quantitative analysis techniques include cluster analysis, descriptive statistics and inferential statistics.

CLUSTER ANALYSIS:

Clustering is the task of dividing a population or set of data points into a number of groups such that data points in the same group are more similar to each other than to data points in other groups. In simple words, the aim is to find groups with similar traits and assign them to clusters.

EXAMPLE:

Suppose you are the head of a rental store and wish to understand the preferences of your customers in order to scale up your business. Is it possible for you to look at the details of each customer and devise a unique business strategy for each one of them? Definitely not. What you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits and use a separate strategy for the customers in each of these 10 groups. This is what we call clustering.

TYPES OF CLUSTERING:

Clustering can be divided into two broad types:

Hard Clustering: In hard clustering, each data point belongs to exactly one cluster. For example, in the scenario above each customer is put into exactly one of the 10 groups.

Soft Clustering: In soft clustering, instead of putting each data point into a single cluster, each data point is assigned a probability or likelihood of belonging to each cluster. For example, in the scenario above each customer is assigned a probability of belonging to each of the 10 clusters of the store.
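
The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the customer values, the three cluster centres and the `scale` bandwidth are all made-up assumptions, and the centres are given rather than fitted.

```python
import math

# Hypothetical 1-D data: each customer's average monthly spend.
centers = [20.0, 60.0, 100.0]   # assumed cluster centres (not fitted here)
customers = [18.0, 45.0, 95.0]

def hard_assign(x, centers):
    """Hard clustering: return the index of the single nearest centre."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

def soft_assign(x, centers, scale=15.0):
    """Soft clustering: return one membership weight per cluster (sums to 1).

    Weights come from a Gaussian-shaped kernel of the distance to each
    centre; `scale` is an assumed bandwidth chosen only for illustration.
    """
    weights = [math.exp(-((x - c) ** 2) / (2 * scale ** 2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

for x in customers:
    probs = [round(p, 3) for p in soft_assign(x, centers)]
    print(x, "hard:", hard_assign(x, centers), "soft:", probs)
```

A customer who sits between two centres gets a single label from `hard_assign` but a split set of weights from `soft_assign`, which is exactly the information hard clustering throws away.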

TYPES OF CLUSTERING ALGORITHMS:

Different methodologies follow different rules for defining the ‘similarity’ among data points. More than 100 clustering algorithms are known, but only a few are widely used. Let’s look at them in detail:

(a) Connectivity models: These models are based on the notion that data points closer together in data space are more similar to each other than data points lying farther apart. They can follow two approaches.

In the first approach, every data point starts in its own cluster, and clusters are then merged step by step, closest pair first. (Agglomerative or bottom-up clustering method)

In the second approach, all data points start in a single cluster, which is then split step by step. (Divisive or top-down clustering method)

In both approaches the choice of distance function is subjective. These models are very easy to interpret but lack scalability for handling big datasets.

The hierarchical clustering algorithm and its variants are examples of these models.
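
The bottom-up approach can be sketched as follows. This is a minimal illustration for one-dimensional data using single linkage (distance between the closest members of two clusters); the linkage rule, the function name `agglomerative` and the data are assumptions made for the example, not something the answer prescribes.

```python
def agglomerative(points, n_clusters):
    """Bottom-up clustering of 1-D points with single linkage.

    Starts with one cluster per point and repeatedly merges the two
    clusters whose closest members are nearest to each other.
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into i
        del clusters[j]
    return [sorted(c) for c in clusters]

data = [1.0, 1.5, 2.0, 8.0, 8.4, 15.0]  # hypothetical values
print(agglomerative(data, 3))
```

Every merge step compares all pairs of clusters, which is why this family of methods becomes slow on large datasets.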

(b) Centroid models: These are iterative clustering algorithms in which the notion of similarity is derived from the closeness of a data point to the centroid of a cluster. The K-Means clustering algorithm is a popular algorithm in this category. In these models, the number of clusters has to be specified beforehand, which makes it important to have prior knowledge of the dataset. These models run iteratively and converge to a local optimum, which is not guaranteed to be the best possible solution.
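
A minimal sketch of the K-Means idea for one-dimensional data is shown below. The function name `kmeans`, the data and the starting centroids are assumptions chosen for illustration; real implementations usually pick the starting centroids at random and handle multi-dimensional data.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm for 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            k = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            groups[k].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its old centroid).
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if new == centroids:  # converged: nothing moved
            break
        centroids = new
    return centroids, groups

data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]  # hypothetical values
centroids, groups = kmeans(data, centroids=[1.0, 2.0])
print(centroids, groups)
```

Note that the number of clusters is fixed by the length of the starting `centroids` list, and that a different starting list can lead the same data to a different final answer.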

(c) Distribution models: These clustering models are based on how probable it is that all data points in a cluster belong to the same distribution (for example, the normal distribution, also called the Gaussian distribution). These models often suffer from overfitting.

A popular example of these models is the Expectation-Maximization (EM) algorithm, which is commonly used with multivariate normal distributions.
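
As a rough sketch of how Expectation-Maximization works, the code below fits a mixture of two one-dimensional normal distributions. It is a simplified stand-in for the multivariate case; the function names, the data, the starting guesses and the small floor on the standard deviation are all assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def em_two_gaussians(xs, mu, sigma, weight, n_iter=50):
    """EM for a mixture of two 1-D normal distributions.

    mu, sigma and weight are 2-element lists holding the starting guesses.
    """
    mu, sigma, weight = list(mu), list(sigma), list(weight)
    for _ in range(n_iter):
        # E-step: how responsible is each component for each point?
        resp = []
        for x in xs:
            like = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(like)
            resp.append([v / total for v in like])
        # M-step: re-estimate the parameters from the weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weight[k] = nk / len(xs)
    return mu, sigma, weight

xs = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]  # hypothetical values
mu, sigma, weight = em_two_gaussians(xs, mu=[0.0, 6.0], sigma=[1.0, 1.0],
                                     weight=[0.5, 0.5])
print(mu, sigma, weight)
```

The responsibilities computed in the E-step are soft cluster memberships. The floor on `sigma` hints at the overfitting problem: without it, a component can shrink onto a single point.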

(d) Density Models: These models search the data space for regions with different densities of data points. They isolate the dense regions and assign the data points within each region to the same cluster. Popular examples of density models are DBSCAN and OPTICS.
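
The density idea can be illustrated with a stripped-down version of DBSCAN for two-dimensional points. This is a teaching sketch, not a reference implementation: the parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a dense region) follow common usage, and the data is invented.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN for 2-D points. Returns one label per point; -1 is noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:  # not dense enough to start a cluster
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:  # border point reached from a dense region
                labels[j] = cluster
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is also dense: keep expanding
                queue.extend(nbrs)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),   # hypothetical dense group
          (10, 10), (10, 11), (11, 10),     # second dense group
          (5, 5)]                           # isolated point
print(dbscan(points, eps=1.5, min_pts=3))
```

Unlike centroid models, the number of clusters is not given in advance; it follows from `eps` and `min_pts`, and isolated points are reported as noise rather than forced into a cluster.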

REQUIREMENTS OF CLUSTERING:

  • Scalability
  • Ability to deal with different types of attributes
  • Discovery of clusters with arbitrary shape
  • Minimal requirements for domain knowledge to determine input parameters
  • Ability to deal with noise and outliers
  • Insensitivity to the order of input records
  • Ability to handle high dimensionality
  • Incorporation of user-specified constraints
  • Interpretability and usability

GOOD CLUSTERING METHOD:

A good clustering method will produce high quality clusters with

  • high intra-class similarity (points within the same cluster are alike)
  • low inter-class similarity (points in different clusters are unlike)

The quality of a clustering result depends on both the similarity measure used by the method and its implementation.

The quality of a clustering method is also measured by its ability to discover some or all of the hidden patterns.
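
One simple way to put numbers on these two criteria is to compare the average distance between points within the same cluster with the average distance between points in different clusters. The sketch below does this for one-dimensional data; the function names and both example groupings are assumptions, and other quality measures exist.

```python
from itertools import combinations, product

def mean_intra_distance(clusters):
    """Average distance between pairs of points that share a cluster."""
    dists = [abs(a - b) for c in clusters for a, b in combinations(c, 2)]
    return sum(dists) / len(dists)

def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    dists = [abs(a - b)
             for c1, c2 in combinations(clusters, 2)
             for a, b in product(c1, c2)]
    return sum(dists) / len(dists)

# Two hypothetical groupings of the same six values.
good = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
bad = [[1.0, 10.0, 3.0], [2.0, 11.0, 12.0]]

for name, grouping in (("good", good), ("bad", bad)):
    print(name,
          "intra:", round(mean_intra_distance(grouping), 2),
          "inter:", round(mean_inter_distance(grouping), 2))
```

A small intra-cluster distance corresponds to high intra-class similarity, and a large inter-cluster distance corresponds to low inter-class similarity, so the first grouping scores better on both.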

EXAMPLES:

1) Marketing: Help marketers discover distinct groups in their customer bases, and then use this knowledge to develop targeted marketing programs

2) Land use: Identification of areas of similar land use in an earth observation database

3) Insurance: Identifying groups of motor insurance policy holders with a high average claim cost

4) City-planning: Identifying groups of houses according to their house type, value, and geographical location

5) Earthquake studies: Observed earthquake epicenters should cluster along continental faults

DESCRIPTIVE STATISTICS:

Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data in a meaningful way such that, for example, patterns might emerge from the data. Descriptive statistics do not allow us to make conclusions beyond the data we have analysed or reach conclusions regarding any hypotheses we might have made. They are simply a way to describe our data.

There are two general types of statistics that are used to describe data:

Measures of central tendency: These measures describe the central position of a frequency distribution for a group of data. We can describe this central position using a number of statistics, including the mode, median and mean.

Measures of spread: These measures summarize a group of data by describing how spread out the scores are. The measures used to describe the spread of data include the range, quartiles, absolute deviation, variance and standard deviation.
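
Both types of measure can be computed with Python's standard `statistics` module. The scores below are made up for illustration, and the variance and standard deviation shown are the sample versions (dividing by n - 1).

```python
import statistics

scores = [4, 8, 6, 5, 3, 8, 9, 7, 8, 2]  # hypothetical sample

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of spread
data_range = max(scores) - min(scores)
quartiles = statistics.quantiles(scores, n=4)             # three cut points
mad = sum(abs(x - mean) for x in scores) / len(scores)    # mean absolute deviation
variance = statistics.variance(scores)                    # sample variance
std_dev = statistics.stdev(scores)                        # sample standard deviation

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "quartiles:", quartiles)
print("mean absolute deviation:", mad)
print("variance:", round(variance, 3), "std dev:", round(std_dev, 3))
```

If the data covers the whole group of interest rather than a sample, `statistics.pvariance` and `statistics.pstdev` give the population versions instead.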

A graphical representation of data is another method of descriptive statistics. When we use descriptive statistics, it is useful to summarize our group of data using a combination of tabulated description (i.e., tables) and graphical description (i.e., graphs and charts). Examples of graphical descriptions are histograms, bar graphs and pie charts.

This provides a quick method to make comparisons between different data sets and to spot the smallest and largest values and trends or changes over a period of time.

EXAMPLE:

Suppose we want to describe the test scores in a specific class of 30 students. We record all of the test scores and calculate the summary statistics and produce graphs.

The data displaying test scores of students is given to be:

75.34529, 73.1057, 81.27668, 76.54832, 80.33573, 66.98396, 86.81477, 76.22811, 74.31525, 94.8884, 67.77032, 83.61273, 73.68581, 79.04877, 81.98413, 66.21308, 68.74711, 81.38832, 85.73142, 76.68694, 76.6193, 78.08708, 86.12625, 87.3732, 83.0061, 91.79527, 96.52943, 72.14013, 79.55772, 73.57527.

[Figure: histogram of the test scores]

TABLE OF STATISTICS:

STATISTIC    CLASS VALUE
Mean         79.18
Range        66.21 – 96.53

These results indicate that the mean score of this class is 79.18. The scores range from 66.21 to 96.53, and the distribution is symmetrically centered around the mean. Collectively, this information gives us a pretty good picture of this specific class. There is no uncertainty surrounding these statistics because we gathered the scores for everyone in the class.
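
The figures in the table can be reproduced from the 30 scores listed above. The sketch below also prints a rough text histogram; the bin width of 5 starting at 65 is an assumption made here for display, not necessarily the binning used in the original chart.

```python
import statistics

scores = [
    75.34529, 73.1057, 81.27668, 76.54832, 80.33573, 66.98396,
    86.81477, 76.22811, 74.31525, 94.8884, 67.77032, 83.61273,
    73.68581, 79.04877, 81.98413, 66.21308, 68.74711, 81.38832,
    85.73142, 76.68694, 76.6193, 78.08708, 86.12625, 87.3732,
    83.0061, 91.79527, 96.52943, 72.14013, 79.55772, 73.57527,
]

mean = statistics.mean(scores)
low, high = min(scores), max(scores)
print(f"n = {len(scores)}, mean = {mean:.2f}, range = {low:.2f} to {high:.2f}")

# Text histogram: one star per score, bins of width 5 (assumed binning).
for start in range(65, 100, 5):
    count = sum(start <= s < start + 5 for s in scores)
    print(f"{start}-{start + 5}: {'*' * count}")
```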

INFERENTIAL STATISTICS:

Descriptive statistics describes data (for example, with a chart or graph), whereas inferential statistics allows you to make predictions (“inferences”) from that data. With inferential statistics, you take data from samples and make generalizations about a population.

EXAMPLE:

You stand in a mall and ask a sample of 100 people if they like shopping at malls. You could make a bar chart of the yes and no answers (that would be descriptive statistics), or you could use your research (and inferential statistics) to reason that, for example, around 75-80% of the population (all shoppers in all malls) like shopping at malls.
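
To make the step from sample to population concrete, here is a sketch that turns a sample count into a 95% interval for the population proportion. The count of 78 "yes" answers is hypothetical (the example does not give one), the interval is the simple normal-approximation ("Wald") interval, and it assumes the 100 people are a random sample of the population.

```python
import math
from statistics import NormalDist

n = 100    # sample size from the example
yes = 78   # hypothetical number of "yes" answers (not given in the text)

p_hat = yes / n                        # sample proportion (descriptive)
z = NormalDist().inv_cdf(0.975)        # about 1.96 for a 95% interval
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"sample proportion = {p_hat:.2f}")
print(f"95% interval for the population: {low:.3f} to {high:.3f}")
```

The sample proportion on its own is a descriptive statistic; the interval around it is the inferential part, because it expresses how uncertain we are about the whole population.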

Suppose you have some sample data about a potential new cancer drug. You could use descriptive statistics to describe your sample, including:

  • Sample mean
  • Sample standard deviation
  • Making a bar chart or boxplot
  • Describing the shape of the sample probability distribution

With inferential statistics, you take that sample data from a small number of people and try to determine whether the data can predict if the drug will work for everyone (i.e. the population). Inferential statistics uses statistical models to help you compare your sample data to other samples or to previous research. Most research uses models from the general linear model family, which includes Student’s t-tests, ANOVA (Analysis of Variance), regression analysis and various other models that assume straight-line (“linear”) relationships.

AREAS OF INFERENTIAL STATISTICS:

There are two main areas of inferential statistics:

1) Estimating parameters: This means taking a statistic from your sample data (for example, the sample mean) and using it to say something about a population parameter (for example, the population mean).

2) Hypothesis tests: This is where you use sample data to answer research questions. For example, you might be interested in knowing whether a new cancer drug is effective, or whether breakfast helps children perform better in school.
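
A hypothesis test can be sketched without any special library by using an exact permutation test. The example below uses the breakfast question with invented scores for two small groups of children; the data, the one-sided question ("do breakfast eaters score higher?") and the choice of a permutation test rather than a t-test are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

breakfast = [82, 75, 90, 85, 88]      # hypothetical test scores
no_breakfast = [70, 65, 80, 72, 78]   # hypothetical test scores

# Estimate: the observed difference between the two sample means.
observed = mean(breakfast) - mean(no_breakfast)

# Test: if breakfast made no difference, the group labels would be
# arbitrary. Try every way of splitting the 10 scores into two groups
# of 5 and count how often the difference is at least as large.
pooled = breakfast + no_breakfast
n = len(breakfast)
count = total = 0
for idx in combinations(range(len(pooled)), n):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if mean(group_a) - mean(group_b) >= observed:
        count += 1

p_value = count / total
print(f"observed difference = {observed}")
print(f"one-sided p-value = {count}/{total} = {p_value:.4f}")
```

A small p-value means a difference this large would rarely arise from relabelling alone, which counts as evidence that the effect exists beyond this particular sample.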

DIFFERENCES BETWEEN DESCRIPTIVE AND INFERENTIAL STATISTICS:

For descriptive statistics, we choose a group that we want to describe and then measure all subjects in that group. The statistical summary describes this group with complete certainty.

For inferential statistics, we need to define the population and then devise a sampling plan that produces a representative sample. The statistical results incorporate the uncertainty that is inherent in using a sample to understand an entire population.

A study using descriptive statistics is simpler to perform.

However, if you need evidence that an effect or relationship between variables exists in an entire population rather than only your sample, you need to use inferential statistics.
