Numpy transformation to normal distribution

I have an array of data. I checked if it was normally distributed:

import sys
import scipy
from scipy import stats
from scipy.stats import mstats
from scipy.stats import normaltest

# Read one float per line from the file given on the command line
Data = []
for line in open(sys.argv[1]):
    line = line.strip()
    Data.append(float(line))

print(scipy.stats.normaltest(Data))

The output was: (36.444648754208075, 1.2193968690198398e-08)

Then, I wrote a small script to normalise the data:

import sys
import numpy as np

fileopen = open(sys.argv[1])
UntransformedArray = []
for line in fileopen:
    line = float(line.strip())
    UntransformedArray.append(line)

TransformedArray = (UntransformedArray - np.mean(UntransformedArray)/np.std(UntransformedArray))
NewList = TransformedArray.tolist()
for i in NewList:
    print(i)

And then I checked for normality again using the first script and the output was
(36.444648754209595, 1.2193968690189117e-08).

...the same as the previous score, and not normally distributed.

Is one of my scripts wrong?

I should also mention that the mean of my data is 0.056 and the values range from 0.014 to 0.171 (85 observations). I'm not sure whether the fact that the numbers are so small matters.

A sample of the untransformed and transformed data:

Untransformed:

Transformed data:

-2.13696814254

-2.11796814254

-2.14296814254

-2.12496814254

-2.15396814254

-2.15496814254

-2.14696814254

Edit 1:

When I edit the code slightly to account for the parentheses being in the wrong place:

TransformedMean = (UntransformedArray - np.mean(UntransformedArray))
TransformedArray = (TransformedMean/np.std(UntransformedArray))
NewList = TransformedArray.tolist()
for i in NewList:
    print(i)

The output I get is different:

Example:

-0.0385683544143

0.705333390576

-0.273484694937

0.431264326632

-0.704164652563

-0.743317375984

However, when I check for normality:
(36.444648754241328, 1.2193968689995659e-08)

It is still not normally distributed (and is still the exact same score as the other times)?

Edit 2:

I then tried a different method of normalising the data:

import sys
import scipy
from scipy import stats
from scipy.stats import boxcox

Data = [(float(line.strip())) for line in open(sys.argv[1])]
scipy.stats.boxcox(Data)

I get the error: TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'float'

Edit 3: Following a comment from a user, the problem was that I had not understood the difference between normalising values and normalising a distribution.

Edited code:

import sys
import numpy as np

fileopen = open(sys.argv[1])
UntransformedArray = []
for line in fileopen:
    line = float(line.strip())
    UntransformedArray.append(line)

# Log-transform the values to reduce the skew
List1 = np.log(UntransformedArray)
for i in List1:
    print(i)

Checking for normality:
(4.0435072214905938, 0.13242304287973003)

(This works in this case, but it depends on the skewness of the data.)

Edit 4: Or using a Box-Cox transformation:

import sys
import scipy
from scipy import stats
from scipy.stats import boxcox
import numpy as np

Data = []
for line in open(sys.argv[1]):
    line = line.strip()
    Data.append(float(line))

# boxcox returns (transformed values, fitted lambda); data[0] holds the transformed values
data = scipy.stats.boxcox(np.array(Data))
for i in data[0]:
    print(i)

Check for normality: (2.9085877478631956, 0.23356523218452238)

edited Nov 30 '15 at 15:25

asked Nov 30 '15 at 13:23

Tom

Don't you have a parenthesis problem in the TransformedArray calc? ( UntransformedArray - np.mean(UntransformedArray) ) /np.std(UntransformedArray)
– joao
Nov 30 '15 at 13:30

This is what I have: TransformedArray = (UntransformedArray - np.mean(UntransformedArray)/np.std(UntransformedArray)) and it seems to run without complaining? I don't get any error about parentheses.
– Tom
Nov 30 '15 at 13:42

Division (/) does not have the same precedence as subtraction (-): it binds more tightly. So you are computing mean/std first, and only then is the subtraction applied. I believe your parentheses are misplaced there.
– joao
Nov 30 '15 at 13:51
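
The precedence point can be checked with a small sketch. The values in `x` are made up for illustration (they are not the asker's file), and the names `wrong` and `right` are hypothetical:

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
x = np.array([0.055, 0.074, 0.049, 0.067, 0.038])

# Division binds tighter than subtraction, so this computes x - (mean / std):
# every element is shifted by the same constant and nothing is rescaled.
wrong = x - np.mean(x) / np.std(x)

# Parenthesising the subtraction gives the intended z-score.
right = (x - np.mean(x)) / np.std(x)

print(wrong)
print(right)
```

With the parentheses in place the result has mean 0 and standard deviation 1. Without them the values are only shifted, which would explain why the first transformed sample in the question all starts with -2.1.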

Thanks. I've changed the script slightly (see edit). Is it possibly something wrong with the checking for normality script? The reason I ask is that now I've given the checking for normality script two different lists, (for example, my original transformed output, where all the numbers start with -2.XXX, and in my edit, where the numbers are e.g. 0.43, -0.7 etc), and I still get the exact same output from checking for normality script?
– Tom
Nov 30 '15 at 14:21

Re. boxcox: Try scipy.stats.boxcox(np.array(Data)) (and add import numpy as np at the top of your script if you don't already have it). By the way, scipy.stats.boxcox(Data) works in newer versions of scipy. What version are you using? Run import scipy; print(scipy.__version__) to find out.
– Warren Weckesser
Nov 30 '15 at 15:16
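
A minimal sketch of the suggestion above, assuming synthetic, strictly positive data in place of the asker's file (Box-Cox requires positive values). The TypeError in the question comes from a plain Python list being raised to a float power; converting to an array first avoids that on SciPy versions that do not do the conversion themselves:

```python
import numpy as np
from scipy import stats

# Synthetic right-skewed, strictly positive data standing in for the real file.
rng = np.random.default_rng(0)
data = list(rng.lognormal(mean=-3.0, sigma=0.5, size=85))

# Convert the list to an array before calling boxcox. With no lmbda given,
# boxcox returns the transformed values and the fitted lambda.
transformed, lmbda = stats.boxcox(np.asarray(data))

print(lmbda)
print(stats.normaltest(transformed))
```

The fitted lambda is worth keeping: it is needed to apply the same transformation to new data or to invert it later.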

3 Answers

As expected, subtracting the mean and rescaling to unit variance does not change the shape of the distribution. normaltest correctly returns the same output in both cases, telling you that your data is not normally distributed.
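
This can be checked with a short sketch on synthetic skewed data (an assumption, used here in place of the asker's file). Skewness and kurtosis, which normaltest is built on, do not change under shifting and positive rescaling, so the statistic and p-value agree up to floating-point rounding:

```python
import numpy as np
from scipy import stats

# Synthetic skewed data, used only for illustration.
rng = np.random.default_rng(0)
x = rng.exponential(scale=0.05, size=85)

# Standardise: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()

stat_x, p_x = stats.normaltest(x)
stat_z, p_z = stats.normaltest(z)

print(stat_x, p_x)
print(stat_z, p_z)
```

The tiny differences in the last digits of the scores reported in the question are consistent with this kind of rounding.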

answered Nov 30 '15 at 14:19

thomas

I agree with Thomas. But to be more precise: You are standardizing the distribution of your array! This does not change the shape of the distribution! You might want to use the numpy.histogram() function to get an impression of the distributions!

I think you have fallen prey to the confusing double usage of 'normalization'. On the one hand, normalization is used to describe standardization of variables (getting variables on the same scale, which is what you did). On the other hand, normalization is used to describe attempts to change the shape of a probability distribution (scipy.stats.normaltest() is used to check the shape of such distributions). One easy strategy for making a distribution more normal is to use a log transformation. numpy.log() might do the trick here, but only if the original distribution is not too skewed.
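
As a rough sketch of both suggestions, assuming synthetic right-skewed data rather than the asker's file: numpy.histogram() gives a quick view of the shape, and numpy.log() changes that shape, which standardization does not.

```python
import numpy as np
from scipy import stats

# Synthetic right-skewed, strictly positive data (log needs x > 0).
rng = np.random.default_rng(1)
x = rng.lognormal(mean=-3.0, sigma=0.8, size=85)

# Bin counts give a text-only impression of the distribution's shape.
counts, edges = np.histogram(x, bins=10)
print(counts)

# The log transform pulls in the long right tail.
logged = np.log(x)

print(stats.skew(x), stats.skew(logged))
print(stats.normaltest(x).pvalue, stats.normaltest(logged).pvalue)
```

Whether the log is enough depends on the data; for real measurements it is worth comparing the histogram and the test result before and after.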

answered Nov 30 '15 at 15:10

Dominix

This was really useful thank you, particularly in the clarification of the understanding. I have made an edit with the updated code that I used.
– Tom
Nov 30 '15 at 15:16

glad it helped!
– Dominix
Nov 30 '15 at 20:52

I came across the same problem. My data was not normal, like yours, and I had to transform it to a normal distribution. To transform your data to normal you should use a normal score transform, for which there are different methods, as described here. You can also use these formulas. I have written Python code that changes your list of elements to a normal distribution, as follows:

from scipy.stats import rankdata, norm

X = [0.055, 0.074, 0.049, 0.067, 0.038, 0.037, 0.045, 0.041]

# Turn each value's rank into a quantile in (0, 1), then map it to the standard normal
newX = norm.ppf(rankdata(X)/(len(X) + 1))
print(newX)

output:

[ 0.4307273   1.22064035  0.1397103   0.76470967 -0.76470967 -1.22064035
 -0.1397103  -0.4307273 ]

A Q-Q plot shows that your new data follows a normal distribution after this transformation:

from scipy import stats
import matplotlib.pyplot as plt

ax4 = plt.subplot(111)
res = stats.probplot(newX, plot=plt)
plt.show()

edited Nov 21 at 21:53

answered Nov 21 at 21:31

Sara
