How can I calculate the probability for every row of a NumPy array at once?
I have a function for calculating the probability, shown below:
import math
import numpy as np

def multinormpdf(x, mu, var):  # probability density of a multivariate Gaussian distribution
    k = len(x)
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = math.sqrt(((2*math.pi)**k)*det)
    numerator = np.dot((x - mu).transpose(), inv)
    numerator = np.dot(numerator, (x - mu))
    numerator = math.exp(-0.5 * numerator)
    return numerator/denominator
and I have a mean vector, a covariance matrix, and a 2D NumPy array to test:
mu = np.array([100, 105, 42])     # mean vector
var = np.array([[100, 124, 11],   # covariance matrix
                [124, 150, 44],
                [11, 44, 130]])
arr = np.array([[42, 234, 124],   # arr is a 43923794 x 3 matrix
                [123, 222, 112],
                [42, 213, 11],
                # ... (many more rows, about 40,000,000 in total)
                [23, 55, 251]])
I have to calculate the probability for each row, so I used this code:
for i in arr:
    print(multinormpdf(i, mu, var))  # I already know the mean vector and covariance matrix
But it is very slow.
Is there a faster way to calculate the probabilities?
Or is there a way to calculate them for the whole test array at once, as a batch?
python python-3.x probability
edited Nov 29 at 14:05
asked Nov 21 at 16:28
Yeong-Hwa Jin
Where does normpdf come from?
– Nils Werner
Nov 21 at 21:54
@Nils Werner I think that is not important, but I updated the code for normpdf.
– Yeong-Hwa Jin
Nov 22 at 2:36
Can you give a complete example with input and output?
– Nils Werner
Nov 22 at 7:17
@Nils Werner I added more explanation.
– Yeong-Hwa Jin
Nov 22 at 12:04
Your code is not valid Python code, and does not work, even after fixing the syntax issues. Please post a proper MVCE!
– Nils Werner
Nov 22 at 12:09
2 Answers
accepted
You can vectorize your function easily:
import numpy as np

def fast_multinormpdf(x, mu, var):
    mu = np.asarray(mu)
    var = np.asarray(var)
    k = x.shape[-1]
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = np.sqrt(((2*np.pi)**k)*det)
    numerator = np.dot((x - mu), inv)
    numerator = np.sum((x - mu) * numerator, axis=-1)
    numerator = np.exp(-0.5 * numerator)
    return numerator/denominator

arr = np.array([[42, 234, 124],
                [123, 222, 112],
                [42, 213, 11],
                [42, 213, 11]])
mu = [0, 0, 1]
var = [[1, 100, 100],
       [100, 1, 100],
       [100, 100, 1]]
slow_out = np.array([multinormpdf(i, mu, var) for i in arr])
fast_out = fast_multinormpdf(arr, mu, var)
np.allclose(slow_out, fast_out)  # True
On 40,000 rows, fast_multinormpdf is roughly 800 times faster than your unvectorized function:
long_arr = np.tile(arr, (10000, 1))
%timeit np.array([multinormpdf(i, mu, var) for i in long_arr])
# 2.12 s ± 93.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fast_multinormpdf(long_arr, mu, var)
# 2.56 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
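With around 40,000,000 rows, as mentioned in the question, the temporaries created by expressions such as x - mu may become large. The following is only a sketch of one possible variant, not part of the answer above: it works on the log-density, uses np.linalg.slogdet and np.linalg.solve instead of an explicit determinant and inverse, and processes the rows in chunks. The names multinorm_logpdf, multinormpdf_chunked, and chunk_size are hypothetical and chosen for illustration.

```python
import numpy as np

def multinorm_logpdf(x, mu, cov):
    """Log-density of N(mu, cov) for each row of x; x has shape (n, k)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = x.shape[-1]
    # slogdet returns the sign and the log of |det|, which avoids overflow
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("the determinant of cov must be positive")
    diff = x - mu
    # Solve cov @ y = diff.T rather than forming the inverse explicitly
    sol = np.linalg.solve(cov, diff.T).T
    # Row-wise quadratic form (x - mu)^T cov^{-1} (x - mu)
    maha = np.einsum("ij,ij->i", diff, sol)
    return -0.5 * (k * np.log(2 * np.pi) + logdet + maha)

def multinormpdf_chunked(x, mu, cov, chunk_size=1_000_000):
    """Density for each row of x, computed chunk_size rows at a time."""
    x = np.atleast_2d(np.asarray(x))
    out = np.empty(x.shape[0])
    for start in range(0, x.shape[0], chunk_size):
        stop = start + chunk_size
        out[start:stop] = np.exp(multinorm_logpdf(x[start:stop], mu, cov))
    return out
```

Calling multinormpdf_chunked(arr, mu, var) should give the same values as fast_multinormpdf(arr, mu, var) whenever the determinant of var is positive. The chunk size is a trade-off between peak memory and per-call overhead, and the default here is an arbitrary assumption.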
edited Nov 22 at 9:16
answered Nov 21 at 21:58
Nils Werner
Thanks Nils! I appreciate your advice!
– Yeong-Hwa Jin
Nov 22 at 13:24
You can try Numba. Just decorate your function with @numba.vectorize.
@numba.vectorize
def multinormpdf(x, mu, var):
    # ...
    return calculated_probability

new_arr = multinormpdf(arr)
If your multinormpdf doesn't contain any unsupported functions, it can be accelerated. See here: https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
Moreover, you can use the experimental feature target='parallel' like this:
@numba.vectorize(target='parallel')
answered Nov 21 at 20:28
anch2150
My input for multinormpdf is already a NumPy array (like [42, 234, 124] or [123, 222, 112]), not a scalar (so the call looks like multinormpdf([51, 23, 251], mu_vector, cov_matrix)). Can I still use @numba.vectorize?
– Yeong-Hwa Jin
Nov 22 at 3:41
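On the question in the comment above: @numba.vectorize builds a NumPy ufunc from a function of scalar arguments, so it does not map directly onto a function that takes a whole row. One possible alternative, sketched here under assumptions, is to compile an explicit loop over the rows with numba.njit and to compute the inverse and the normalizing constant once in plain NumPy. The names _pdf_rows and multinormpdf_rows are hypothetical, and the fallback decorator only exists so that the sketch still runs (uncompiled and slow) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback: leave the function as plain Python if Numba is missing
        return func

@njit
def _pdf_rows(x, mu, inv, norm):
    # x: (n, k) float64, mu: (k,), inv: (k, k) inverse covariance, norm: float
    n, k = x.shape
    out = np.empty(n)
    for i in range(n):
        q = 0.0
        # Quadratic form (x_i - mu)^T inv (x_i - mu), written as scalar loops
        for a in range(k):
            for b in range(k):
                q += (x[i, a] - mu[a]) * inv[a, b] * (x[i, b] - mu[b])
        out[i] = np.exp(-0.5 * q) / norm
    return out

def multinormpdf_rows(x, mu, var):
    """Density of N(mu, var) for each row of the 2D array x."""
    x = np.ascontiguousarray(x, dtype=np.float64)
    mu = np.ascontiguousarray(mu, dtype=np.float64)
    var = np.asarray(var, dtype=np.float64)
    k = x.shape[1]
    # Computed once outside the compiled loop
    inv = np.ascontiguousarray(np.linalg.inv(var))
    norm = float(np.sqrt((2 * np.pi) ** k * np.linalg.det(var)))
    return _pdf_rows(x, mu, inv, norm)
```

Whether this beats a vectorized NumPy version depends on the machine and the data size, so it is worth timing both before committing to either.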