How can I calculate the probability for every row of a NumPy array at once?











I have a function for calculating the probability, shown below:

import math
import numpy as np

def multinormpdf(x, mu, var):  # probability density of a multivariate Gaussian distribution
    k = len(x)
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = math.sqrt(((2*math.pi)**k)*det)
    numerator = np.dot((x - mu).transpose(), inv)  # use the mu parameter; mean was undefined
    numerator = np.dot(numerator, (x - mu))
    numerator = math.exp(-0.5 * numerator)
    return numerator/denominator


I also have a mean vector, a covariance matrix, and a 2D NumPy array of test data:

mu = np.array([100, 105, 42])  # mean vector
var = np.array([[100, 124, 11],  # covariance matrix
                [124, 150, 44],
                [11, 44, 130]])

arr = np.array([[42, 234, 124],  # arr is a 43923794 x 3 matrix
                [123, 222, 112],
                [42, 213, 11],
                ...(so many values about 40,000,000 rows),
                [23, 55, 251]])


I have to calculate the probability for each row, so I used this code:

for i in arr:
    print(multinormpdf(i, mu, var))  # mu and var are already known


But it is very slow.

Is there a faster way to calculate the probabilities?
Or is there a way to calculate them for the whole test array at once, as a batch?
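For reference, multinormpdf appears to evaluate the standard multivariate normal density, where x is one row of arr, \mu is mu, \Sigma is var, and k = len(x). Strictly speaking the result is a density value rather than a probability:

```latex
p(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{k} \det\Sigma}}
  \exp\!\left( -\tfrac{1}{2}\, (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
```

The formula assumes \Sigma is symmetric positive definite, so that \det\Sigma > 0. The sample var above has determinant -140598, so math.sqrt would raise a ValueError for it; a valid covariance matrix is needed to reproduce the function's output.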










  • Where does normpdf come from?
    – Nils Werner
    Nov 21 at 21:54










  • @Nils Werner I think that is not important. But I updated code for normpdf
    – Yeong-Hwa Jin
    Nov 22 at 2:36












  • Can you give a complete example with input and output?
    – Nils Werner
    Nov 22 at 7:17










  • @Nils Werner I uploaded more explanation
    – Yeong-Hwa Jin
    Nov 22 at 12:04










  • Your code is not valid Python code, and does not work, even after fixing the syntax issues. Please post a proper MVCE!
    – Nils Werner
    Nov 22 at 12:09

















Tags: python, python-3.x, probability

asked Nov 21 at 16:28 by Yeong-Hwa Jin, edited Nov 29 at 14:05


























2 Answers
Accepted answer (2 votes):










You can vectorize your function easily:



import numpy as np

def fast_multinormpdf(x, mu, var):
    mu = np.asarray(mu)
    var = np.asarray(var)
    k = x.shape[-1]  # dimensionality: length of each row of x
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = np.sqrt(((2*np.pi)**k)*det)
    numerator = np.dot((x - mu), inv)
    # row-wise quadratic form (x - mu) @ inv @ (x - mu)
    numerator = np.sum((x - mu) * numerator, axis=-1)
    numerator = np.exp(-0.5 * numerator)
    return numerator/denominator


arr = np.array([[42, 234, 124],
                [123, 222, 112],
                [42, 213, 11],
                [42, 213, 11]])

mu = [0, 0, 1]
var = [[1, 100, 100],
       [100, 1, 100],
       [100, 100, 1]]

slow_out = np.array([multinormpdf(i, mu, var) for i in arr])
fast_out = fast_multinormpdf(arr, mu, var)

np.allclose(slow_out, fast_out) # True
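The step that makes the batch version work is the row-wise quadratic form: multiplying (x - mu) @ inv element-wise by (x - mu) and summing over the last axis gives the same number per row as the two chained np.dot calls in the loop version. A small sketch that checks this on made-up data; the inputs and the names d, q_loop, q_sum and q_einsum are illustrative assumptions, not taken from the question:

```python
import numpy as np

# Illustrative inputs (assumed, not from the question): 5 samples in 3 dimensions
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
mu = np.array([0.5, -1.0, 2.0])
cov = np.array([[2.0, 0.3, 0.1],
                [0.3, 1.5, 0.2],
                [0.1, 0.2, 1.0]])  # symmetric positive definite by construction
inv = np.linalg.inv(cov)
d = x - mu  # broadcasting subtracts mu from every row

# Per-row quadratic form d_i @ inv @ d_i, computed three equivalent ways
q_loop = np.array([row @ inv @ row for row in d])   # one row at a time
q_sum = np.sum((d @ inv) * d, axis=-1)              # the form used in fast_multinormpdf
q_einsum = np.einsum('ij,jk,ik->i', d, inv, d)      # same contraction written with einsum
```

All three arrays have one entry per row of x. The np.sum form and the einsum form avoid building the full n x n product that d @ inv @ d.T would create.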


fast_multinormpdf is roughly 800 times faster than your unvectorized function on 40,000 rows (2.12 s vs. 2.56 ms):



long_arr = np.tile(arr, (10000, 1))

%timeit np.array([multinormpdf(i, mu, var) for i in long_arr])
# 2.12 s ± 93.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fast_multinormpdf(long_arr, mu, var)
# 2.56 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
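With about 40,000,000 rows, each full-size float64 temporary such as x - mu takes roughly 1 GB (43923794 x 3 x 8 bytes), and the vectorized function creates several of them. If memory becomes the bottleneck, one option is to process the rows in blocks. A minimal sketch, where multinormpdf_chunked and chunk_rows are hypothetical names and the default block size is an arbitrary assumption:

```python
import numpy as np

def multinormpdf_chunked(x, mu, var, chunk_rows=1_000_000):
    """Evaluate the Gaussian density for every row of x, chunk_rows rows at a time."""
    x = np.asarray(x)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    k = x.shape[-1]
    inv = np.linalg.inv(var)
    # Normalization constant, computed once for all rows
    norm = np.sqrt(((2 * np.pi) ** k) * np.linalg.det(var))
    out = np.empty(x.shape[0], dtype=float)
    for start in range(0, x.shape[0], chunk_rows):
        stop = start + chunk_rows
        d = x[start:stop] - mu  # temporaries are only chunk-sized
        out[start:stop] = np.exp(-0.5 * np.sum((d @ inv) * d, axis=-1)) / norm
    return out
```

Usage would look like multinormpdf_chunked(arr, mu, var). The result should match fast_multinormpdf; only the peak memory changes.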





  • Thanks Nils! I appreciate your advice!
    – Yeong-Hwa Jin
    Nov 22 at 13:24
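One practical caveat with the direct formula: for rows far from the mean the density underflows to 0.0 in float64, and inverting the covariance explicitly can lose precision. A common alternative is to work with the log-density through a Cholesky factorization. The sketch below is a swapped-in technique, not the accepted answer's method; multinorm_logpdf is a hypothetical name, and it assumes x is 2D and var is symmetric positive definite (np.linalg.cholesky raises LinAlgError otherwise):

```python
import numpy as np

def multinorm_logpdf(x, mu, var):
    """Log-density of a multivariate Gaussian for every row of a 2D array x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    k = x.shape[-1]
    chol = np.linalg.cholesky(var)             # lower triangular, var = chol @ chol.T
    z = np.linalg.solve(chol, (x - mu).T)      # shape (k, n): whitened residuals
    maha = np.sum(z * z, axis=0)               # squared Mahalanobis distance per row
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))  # log(det(var)) from the factor
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + maha)
```

np.exp(multinorm_logpdf(arr, mu, var)) recovers the density wherever it is representable, while the log values stay finite even for extreme rows.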


















Answer (1 vote):













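You can try Numba. Just decorate your function with @numba.vectorize:

import numba

@numba.vectorize
def multinormpdf(x, mu, var):
    # ...
    return calculated_probability

new_arr = multinormpdf(arr, mu, var)

If your multinormpdf doesn't contain any unsupported functions, it can be accelerated. See the list of supported NumPy features here: https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

Note that @numba.vectorize compiles a function of scalar arguments into a NumPy ufunc, so it fits element-wise computations; for functions that take whole rows, numba.guvectorize is the corresponding decorator.

Moreover, you can use the experimental feature target='parallel' like this:

@numba.vectorize(target='parallel')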





  • My input for multinormpdf is already numpy array (like [42, 234, 124] or [123, 222, 112]), not scalar(So maybe function is like multinormpdf([51, 23 ,251], mu_vector, cov_matrix)). Can I use @numba.vectorize yet?
    – Yeong-Hwa Jin
    Nov 22 at 3:41





















Accepted answer by Nils Werner, answered Nov 21 at 21:58, edited Nov 22 at 9:16






















Numba answer by anch2150, answered Nov 21 at 20:28











