MLE for $\Sigma$ based on sequences of observations
Let $\mathbf{x}_1,\ldots,\mathbf{x}_n$ be a sequence of random vectors that are independent and identically distributed as $N_p(\mu_0,\Sigma)$, where $\mu_0$ is known.
(i) Show that the MLE for $\Sigma$ is
$$
\widehat{\Sigma}=\frac{1}{n}\sum\limits_{i=1}^n(\mathbf{x}_i-\mu_0)(\mathbf{x}_i-\mu_0)'
$$
(ii) Now let $\mathbf{y}_1,\ldots,\mathbf{y}_m$ be another sequence of random vectors that are independent and identically distributed as $N_p(\mu_1,\Sigma)$, where $\mu_1$ is known. Calculate the MLE for $\Sigma$ based on the combined sequences of observations $\{\mathbf{x}_1,\ldots,\mathbf{x}_n\}$ and $\{\mathbf{y}_1,\ldots,\mathbf{y}_m\}$. What happens to the MLE for $\Sigma$ if both $\mu_i$, $i=0,1$, are assumed to be unknown?
I have no questions about part (i); I have already shown that the MLE for $\Sigma$ is indeed $\widehat{\Sigma}$. My concerns are with part (ii). I don't quite understand in what way $\widehat{\Sigma}$ changes under this scenario, or how to represent it notationally. Also, when $\mu_i$ for $i=0,1$ are assumed to be unknown, do we simply replace $\mu_0$ and $\mu_1$ in the new MLE for $\Sigma$ by $\hat{\mu}_0$ and $\hat{\mu}_1$?
Any form of help is much appreciated.
statistics normal-distribution maximum-likelihood
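For readers who want to sanity-check the part (i) formula numerically, here is a minimal Python/NumPy sketch. The dimension, sample size, mean, and covariance below are illustrative assumptions used only to simulate data; they are not part of the exercise. It draws i.i.d. $N_p(\mu_0,\Sigma)$ vectors and evaluates $\widehat{\Sigma}=\frac{1}{n}\sum_{i=1}^n(\mathbf{x}_i-\mu_0)(\mathbf{x}_i-\mu_0)'$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: dimension p = 3, sample size n = 500, a known mean
# mu0, and a positive-definite Sigma_true used only to simulate the data.
p, n = 3, 500
mu0 = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(p, p))
Sigma_true = A @ A.T + p * np.eye(p)

X = rng.multivariate_normal(mu0, Sigma_true, size=n)  # rows are x_1, ..., x_n

# Part (i) MLE: (1/n) * sum_i (x_i - mu0)(x_i - mu0)'
D = X - mu0                        # n x p matrix of deviations from the known mean
Sigma_hat = (D.T @ D) / n

print(np.round(Sigma_hat, 2))
print(np.round(Sigma_true, 2))     # Sigma_hat should be close to this for large n
```

For large $n$, `Sigma_hat` should be close to the covariance matrix used to generate the data.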
I think you need to find the joint probability $$p(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m)$$ and extract the covariance matrix from that.
– BlackMath
Nov 26 at 3:31
Thus, would claiming $(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m) \sim N_p(n\mu_0+m\mu_1,(n+m)\Sigma)$ and then considering the MLE for $(n+m)\Sigma$ be the correct approach?
– Nelly
Nov 26 at 5:39
How did you find the covariance in part (i)? I would assume that you found the joint PDF $p(\mathbf{x}_1,\ldots,\mathbf{x}_n)$, which is a multivariate Gaussian distribution, and from there you found the covariance matrix. Here I think it's the same thing, but you need to find the joint PDF $p(\mathbf{x}_1,\ldots,\mathbf{x}_n,\mathbf{y}_1,\ldots,\mathbf{y}_m)$ instead. Since these are independent, this can be written as $$\prod_i p(\mathbf{x}_i)\prod_j p(\mathbf{y}_j)$$
– BlackMath
Nov 26 at 6:00
In part (i) I derived $\widehat{\Sigma}$ by using the maximum likelihood method of finding the log-likelihood, taking the partial derivative with respect to $\Sigma^{-1}$, and setting the equation equal to $0$. I didn't necessarily find $\Sigma$ as you're suggesting. Maybe I'm just not following what you're saying?
– Nelly
Nov 26 at 7:02
I'm describing the same thing, I think. You first need to find the joint PDF, find the log-likelihood of that function, and then do the differentiation. The joint PDF in the first case is $$\prod_i p(\mathbf{x}_i)$$ while in the second case it will be $$\prod_i p(\mathbf{x}_i)\prod_j p(\mathbf{y}_j)$$ where $$p(\mathbf{x}_i)=\det\left(2\pi\Sigma\right)^{-1/2}\exp\left(-\frac{1}{2}(\mathbf{x}_i-\mu_0)^T\Sigma^{-1}(\mathbf{x}_i-\mu_0)\right)$$ and $$p(\mathbf{y}_j)=\det\left(2\pi\Sigma\right)^{-1/2}\exp\left(-\frac{1}{2}(\mathbf{y}_j-\mu_1)^T\Sigma^{-1}(\mathbf{y}_j-\mu_1)\right)$$
– BlackMath
Nov 26 at 7:13
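To make the approach in these comments concrete, here is a minimal Python/SciPy sketch of the combined log-likelihood $\ell(\Sigma)=\sum_i\log p(\mathbf{x}_i)+\sum_j\log p(\mathbf{y}_j)$ described above. The simulated data, dimensions, and variable names are illustrative assumptions, not part of the exercise; the point is only that any candidate $\widehat{\Sigma}$ can be plugged in and compared, since the MLE is the positive-definite $\Sigma$ at which this function is largest.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(Sigma, X, Y, mu0, mu1):
    """Combined log-likelihood l(Sigma) = sum_i log p(x_i) + sum_j log p(y_j),
    where p(x_i) is the N_p(mu0, Sigma) density and p(y_j) the N_p(mu1, Sigma) density."""
    return (multivariate_normal.logpdf(X, mean=mu0, cov=Sigma).sum()
            + multivariate_normal.logpdf(Y, mean=mu1, cov=Sigma).sum())

# Illustrative simulated data (assumed setup, not from the exercise).
rng = np.random.default_rng(1)
p, n, m = 2, 200, 300
mu0 = np.array([0.0, 0.0])
mu1 = np.array([3.0, -1.0])
Sigma_true = np.array([[2.0, 0.6],
                       [0.6, 1.0]])
X = rng.multivariate_normal(mu0, Sigma_true, size=n)
Y = rng.multivariate_normal(mu1, Sigma_true, size=m)

# Candidate covariance matrices can now be compared directly: whichever
# positive-definite Sigma gives the largest value is the better candidate.
print(log_likelihood(Sigma_true, X, Y, mu0, mu1))
print(log_likelihood(np.eye(p), X, Y, mu0, mu1))
```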