When we create an array of values and calculate its standard deviation in R (using the sd() function) and in Python (using the std() function from the NumPy package), the results are different. Explain why.
This can be illustrated with an example.
R code pasted below for finding standard deviation
a<-c(10,5,30,20,80,76,38)
sd(a)
Output Screen
Python code pasted below for finding standard deviation
import numpy as np
a=np.array([10,5,30,20,80,76,38])
print(a.std())
Output Screen
The results differ because the two functions use different denominators when calculating the variance. R's sd() returns the sample standard deviation, which divides the sum of squared deviations by N-1, while NumPy's std() by default returns the population standard deviation, which divides by N (ddof=0). We can make the NumPy result equal to the R result by calling a.std(ddof=1), which tells NumPy to use N-1 as the denominator when calculating the variance.
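To make the role of the denominator concrete, here is a minimal sketch that computes both versions by hand using only the standard library. The helper name std_with_ddof is hypothetical (it is not part of R or NumPy); it only mimics what the ddof argument does, under the assumption that the divisor is N - ddof.

```python
import math

def std_with_ddof(values, ddof=0):
    """Hypothetical helper: standard deviation with divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the mean
    ss = sum((x - mean) ** 2 for x in values)
    return math.sqrt(ss / (n - ddof))

a = [10, 5, 30, 20, 80, 76, 38]

population_sd = std_with_ddof(a, ddof=0)  # divisor N, like NumPy's default
sample_sd = std_with_ddof(a, ddof=1)      # divisor N-1, like R's sd()

print(round(population_sd, 4))  # approximately 27.9336
print(round(sample_sd, 4))      # approximately 30.1717
```

For this array the sum of squared deviations is 5462, so the two results are sqrt(5462/7) and sqrt(5462/6). The gap between them shrinks as N grows.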
Python code pasted below for finding standard deviation with N-1 as the denominator
import numpy as np
a=np.array([10,5,30,20,80,76,38])
print(a.std(ddof=1))
Output Screen
So now the outputs of R's sd() and NumPy's std() are the same.
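As an optional cross-check that does not depend on NumPy, Python's built-in statistics module offers both conventions as separate functions: stdev() divides by N-1 and pstdev() divides by N. This is only a sketch to confirm the same distinction, not part of the original answer.

```python
import statistics

a = [10, 5, 30, 20, 80, 76, 38]

# Sample standard deviation (divisor N-1), same convention as R's sd()
sample_sd = statistics.stdev(a)

# Population standard deviation (divisor N), same convention as NumPy's default std()
population_sd = statistics.pstdev(a)

print(sample_sd)
print(population_sd)
```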