What is an intuitive explanation for how the t-distribution, normal distribution, F-distribution, and Chi-square distribution relate to each other?



Could anyone explain this clearly with a sensible example?



I am a biologist and have been trying to understand this for nearly 10 years now. Every time, I use the statistical tests without a proper understanding of their basis. Textbooks do not address this question either; moreover, we do not specialize in math or statistics at university.










      probability normal-distribution chi-squared
















      asked Nov 24 at 14:04









      Kynda

          2 Answers
It is not totally clear to me precisely what you are looking for, but suppose $X_1,X_2,\ldots,X_n$ are i.i.d. normally distributed random variables with mean $\mu$ and variance $\sigma^2$,




• writing their average as $\bar{X}=\frac1n\sum\limits_{i=1}^{n} X_i$, then $\dfrac{\bar{X} -\mu}{\sigma/\sqrt{n}}$ has a standard normal distribution $N(0,1)$, which describes the distribution of the sample mean


• and $\sum\limits_{i=1}^{n} \left(\frac{X_i-\mu}{\sigma}\right)^2$ has a $\chi_n^2$-distribution, i.e. a chi-squared distribution with $n$ degrees of freedom, as the sum of the squares of $n$ independent standard normal random variables


• while estimating the unbiased sample variance as $S^2=\frac1{n-1}\sum\limits_{i=1}^{n} \left({X_i-\bar{X}}\right)^2$ you have $(n-1)\frac{S^2}{\sigma^2}$ having a $\chi_{n-1}^2$-distribution, i.e. a chi-squared distribution with $n-1$ degrees of freedom; one degree of freedom is lost because $\bar{X}$ is itself computed from the individual $X_i$


• and looking at the distribution of the sample mean you have $\dfrac{\bar{X} -\mu}{S/\sqrt{n}}$ having a Student $t$-distribution with $n-1$ degrees of freedom - not quite the same as the standard normal distribution in the first point, but close for large $n$; you can use this to test the hypothesis that the population mean is actually $\mu$ without knowing $\sigma^2$


• as a tool in comparing variances, if $Z_1 \sim \chi^2_{d_1}$ and independently $Z_2 \sim \chi^2_{d_2}$, i.e. have chi-squared distributions with $d_1$ and $d_2$ degrees of freedom, then $\frac{Z_1 / d_1}{Z_2 / d_2} \sim \mathrm{F}(d_1, d_2)$, i.e. has an $F$-distribution with parameters $d_1$ and $d_2$


• and in particular if $Y_1,Y_2,\ldots,Y_m$ are also i.i.d. normally distributed random variables, independent of the $X_i$, with a different mean $\mu_Y^{\,}$ but the same variance $\sigma^2$ as the earlier $X_i$, then using the third and fifth bullet points, $\dfrac{\frac{1}{n-1}\sum\limits_{i=1}^{n} \left({X_i-\bar{X}}\right)^2}{\frac{1}{m-1}\sum\limits_{j=1}^{m} \left({Y_j-\bar{Y}}\right)^2}\sim \mathrm{F}(n-1, m-1)$, i.e. the ratio of the two sample variances has an $F$-distribution with parameters $n-1$ and $m-1$, and you can use this as a test of the hypothesis that the variances are equal without knowing their value or the value of the means



You may not know that the $X_1,X_2,\ldots,X_n$ are in fact normally distributed, but the Central Limit Theorem suggests that for large $n$ and finite $\mu$ and $\sigma^2$ you should have $\bar{X}$ approximately normally distributed as in the first bullet point, which may turn out to be good enough for the other properties, though for $n$ too small it may not be.
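If it helps to see these facts numerically, here is a minimal simulation sketch using only the Python standard library. The sample sizes, mean, standard deviation, seed and function names are all made up for illustration. It relies on two standard facts: a chi-squared variable with $k$ degrees of freedom has mean $k$, and an $F(d_1, d_2)$ variable has mean $d_2/(d_2-2)$ when $d_2 > 2$.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divides by len(values) - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def simulate_means(n=6, m=12, mu=10.0, sigma=2.0, reps=20000, seed=0):
    """Average three statistics over many simulated Normal samples."""
    rng = random.Random(seed)
    chi_n = chi_n_minus_1 = f_ratio = 0.0
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        # Y sample: different mean (arbitrary shift of 3), same variance.
        ys = [rng.gauss(mu + 3.0, sigma) for _ in range(m)]
        chi_n += sum(((x - mu) / sigma) ** 2 for x in xs)
        chi_n_minus_1 += (n - 1) * sample_variance(xs) / sigma ** 2
        f_ratio += sample_variance(xs) / sample_variance(ys)
    return chi_n / reps, chi_n_minus_1 / reps, f_ratio / reps


chi_n_mean, chi_n1_mean, f_mean = simulate_means()
print("mean of sum of squared z-scores:", chi_n_mean)   # theory: n = 6
print("mean of (n-1) S^2 / sigma^2:    ", chi_n1_mean)  # theory: n - 1 = 5
print("mean of S_X^2 / S_Y^2:          ", f_mean)       # theory: (m-1)/(m-3) = 11/9
```

The simulated averages should land close to the theoretical values in the comments, though they will not match exactly because of sampling noise.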







                  answered Nov 24 at 16:58









                  Henry

                      The short answer is as follows:




                      • While probability studies the implications of assumed probability distributions, statistics assesses how well the data bear out these assumptions, by measuring something whose distribution is thereby predictable.

• The distributions you've asked about are important because you can construct statistical tests in which the null hypothesis implies that certain quantities, called test statistics, follow such distributions, exactly or approximately; a test statistic whose value is too "abnormal" motivates rejection of the null hypothesis.

                      • Given $n$ independent variables, each having a Normal distribution of mean $0$ and standard deviation $1$ (hereafter a standard Normal distribution), the sum of their squares has a chi-squared distribution with $n$ degrees of freedom.

• If $X,\,Y$ are independent variables, $X$ having a standard Normal distribution and $Y^2$ having a chi-squared distribution with $k$ degrees of freedom, then $X/\sqrt{Y^2/k}$ has a $t$-distribution with $k$ degrees of freedom.

• If you divide two independent chi-squared variables by their respective degrees of freedom, so that each has mean $1$, the ratio of these scaled variables has an $F$-distribution. Squaring a $t$-distributed variable gives one example: its numerator $X^2$ is chi-squared with $1$ degree of freedom, so the square is $F$-distributed with parameters $1$ and the number of degrees of freedom of $Y^2$.
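To make the last three bullet points concrete, here is a small sketch that builds chi-squared, $t$ and $F$ variables out of nothing but standard Normal draws. The helper names, the seed and the choice of 10 degrees of freedom are assumptions for illustration. It checks two known facts: a chi-squared variable with $k$ degrees of freedom has mean $k$, and both the square of a $t$ variable with $k$ degrees of freedom and an $F(1, k)$ variable have mean $k/(k-2)$.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is repeatable


def chi_squared(k):
    """Sum of squares of k independent standard Normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def student_t(k):
    """Standard Normal divided by sqrt(chi-squared / k), drawn independently."""
    return rng.gauss(0.0, 1.0) / math.sqrt(chi_squared(k) / k)


def f_ratio(d1, d2):
    """Ratio of two independent chi-squared draws, each divided by its degrees of freedom."""
    return (chi_squared(d1) / d1) / (chi_squared(d2) / d2)


reps = 20000
k = 10
chi_mean = sum(chi_squared(k) for _ in range(reps)) / reps
t_sq_mean = sum(student_t(k) ** 2 for _ in range(reps)) / reps
f_mean = sum(f_ratio(1, k) for _ in range(reps)) / reps

print("mean of chi-squared(10):", chi_mean)   # theory: 10
print("mean of t(10) squared:  ", t_sq_mean)  # theory: 10 / 8 = 1.25
print("mean of F(1, 10):       ", f_mean)     # theory: 10 / 8 = 1.25
```

Matching means do not prove that the two distributions are identical, but they are a quick sanity check that the constructions line up.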


                      Now for the long answer:



A Normal distribution is specified by its mean $\mu$ (which can be chosen arbitrarily) and its standard deviation $\sigma$ (which can be any positive number). If a random variable $X$ has such a distribution, we write $X\sim N(\mu,\,\sigma^2)$, where $\sigma^2$ is the variance. The number of standard deviations from $\mu$ to $X$ is a random variable in its own right, typically denoted $Z$, viz. $X=\mu+\sigma Z$. It turns out that $Z\sim N(0,\,1)$; we say $Z$ has a standard Normal distribution.



There are various scenarios in which random variables admit a Normal approximation. For example, the classical central limit theorem (CLT) states that the mean of a large number of independent samples from a finite-variance distribution has an approximately Normal distribution. We'll come back to that one. For another example, when you try to fit a model to data, you have noise terms $\epsilon$ viz. $y=f(x)+\epsilon$, and we can often justify the assumption $\epsilon\sim N(0,\,\sigma^2)$ for some $\sigma>0$. Let's say we have $n$ observations. If we divide all noise terms by $\sigma$, square the results and sum the squares, the result has a chi-squared distribution with $n$ degrees of freedom. This lets us quantify how surprising it is that the data deviate from expectations as much as they do, because with a distribution in mind we can obtain a $p$-value.
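As a rough illustration of that last step, the sketch below estimates such a $p$-value by simulation rather than from tables. The residuals are made-up numbers, $\sigma$ is assumed known and equal to $1$, and the function name is hypothetical. It simply counts how often a freshly simulated sum of $n$ squared standard Normal draws is at least as large as the observed one.

```python
import random


def chi_squared_p_value(residuals, sigma, reps=20000, seed=2):
    """Monte Carlo estimate of P(chi-squared_n >= observed statistic)."""
    rng = random.Random(seed)
    n = len(residuals)
    observed = sum((r / sigma) ** 2 for r in residuals)
    exceed = 0
    for _ in range(reps):
        simulated = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        if simulated >= observed:
            exceed += 1
    return observed, exceed / reps


# Made-up noise terms (data minus model prediction), for illustration only.
residuals = [0.5, -1.2, 0.3, 2.1, -0.7, 1.5, -1.9, 0.4]
observed, p_value = chi_squared_p_value(residuals, sigma=1.0)
print("observed statistic:", observed)
print("estimated p-value: ", p_value)
```

A small $p$-value would say the residuals are larger than Normal noise of that size would usually produce. In practice you would read the value off the chi-squared distribution directly; the simulation is only meant to show what the number means.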



It's time to come back to the CLT. If you know a distribution's mean $\mu$ and variance $\sigma^2$, the mean $\overline{X}$ of a large sample of size $n$ is a random variable with an approximately Normal distribution. In particular, $\frac{\overline{X}-\mu}{\sigma/\sqrt{n}}$ is approximately $N(0,\,1)$. But what makes you think you know the variance? You can estimate it from the sample, but then something funny happens. Because we've replaced a true parameter value with an estimate that is itself a random variable, the Normal approximation is no longer accurate, especially for small samples. In particular, if $\sigma$ is estimated by the sample standard deviation $S$, then for Normal data $\frac{\overline{X}-\mu}{S/\sqrt{n}}$ has a $t$ distribution with $n-1$ degrees of freedom, where $\mu$ is the mean assumed under the null hypothesis. As with the chi-squared distribution, the distribution's shape depends on its number of degrees of freedom.
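To see how much the degrees of freedom matter, here is a sketch that simulates the $t$ statistic for standard Normal data at a few sample sizes and counts how often it falls outside $\pm 1.96$, the cutoff that a standard Normal variable exceeds about 5% of the time. The sample sizes, seed and function names are arbitrary choices for illustration.

```python
import math
import random


def t_statistic(xs, mu):
    """(sample mean - mu) / (S / sqrt(n)), with S using the n - 1 denominator."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n))


def tail_fraction(n, cutoff=1.96, reps=20000, seed=3):
    """Fraction of simulated t statistics with absolute value above the cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(xs, 0.0)) > cutoff:
            hits += 1
    return hits / reps


fractions = {n: tail_fraction(n) for n in (3, 10, 50)}
for n, frac in fractions.items():
    print(f"n = {n:2d}: P(|t| > 1.96) is about {frac:.3f}")
```

The fraction should be well above 5% for $n = 3$ and shrink toward 5% as $n$ grows, which is the sense in which the $t$ distribution has heavier tails than the Normal and approaches it for large samples.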



I mentioned noise terms with Normal distributions. Their sample variance has a chi-squared distribution, up to scaling. Now say I wonder whether two variables have the same variance. Because the variance of a sample is a random variable, the ratio of two independent samples' variances is $F$-distributed, up to scaling, if the true variances are equal. This is the basis of the F-test of equality of variances.
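Here is a hedged sketch of that idea, again by simulation instead of $F$ tables. The two groups of measurements are invented, and the function names are hypothetical. Under the null hypothesis of equal variances the ratio of sample variances does not depend on the common variance or on the means, so the reference ratios can be simulated from standard Normal samples of the same sizes.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divides by len(values) - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def variance_ratio_p_value(xs, ys, reps=20000, seed=4):
    """One-sided Monte Carlo p-value for S_X^2 / S_Y^2 under equal variances."""
    rng = random.Random(seed)
    n, m = len(xs), len(ys)
    observed = sample_variance(xs) / sample_variance(ys)
    exceed = 0
    for _ in range(reps):
        sim_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sim_y = [rng.gauss(0.0, 1.0) for _ in range(m)]
        if sample_variance(sim_x) / sample_variance(sim_y) >= observed:
            exceed += 1
    return observed, exceed / reps


# Made-up measurements from two groups, for illustration only.
group_a = [4.1, 5.3, 6.8, 3.2, 7.5, 5.9]
group_b = [5.0, 5.4, 4.8, 5.6, 5.1, 4.9, 5.3]
observed, p_value = variance_ratio_p_value(group_a, group_b)
print("observed variance ratio:", observed)
print("estimated p-value:      ", p_value)
```

Here the first group is far more spread out than the second, so the estimated $p$-value comes out very small. Note that this test leans heavily on the data being Normal, so treat it as an illustration of where the $F$-distribution comes from rather than a recommendation.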






                      share|cite|improve this answer

























                        up vote
                        2
                        down vote













                        The short answer is as follows:




                        • While probability studies the implications of assumed probability distributions, statistics assesses how well the data bear out these assumptions, by measuring something whose distribution is thereby predictable.

                        • The distributions you've asked about are important because you can construct statistical tests where the null hypothesis would imply such distributions, approximately or otherwise, are those of quantities called test statistics, which if too "abnormal" in their value motivate rejection of the null hypothesis.

                        • Given $n$ independent variables, each having a Normal distribution of mean $0$ and standard deviation $1$ (hereafter a standard Normal distribution), the sum of their squares has a chi-squared distribution with $n$ degrees of freedom.

                        • If $X,,Y$ are independent variables, $X$ having a standard Normal distribution and $Y^2$ having a chi-squared distribution, $X/Y$ has a $t$-distribution.

                        • If you scale two independent chi-squared variables to each have standard deviation $1$, the ratio of these scaled variables has an $F$-distribution, so squaring a $t$-distributed-variable (in which $Y$ has $1$ degree of freedom, so that its standard deviation is $1$) obtains one example of an $F$-distributed variable.


                        Now for the long answer:



                        A Normal distribution is specified by its mean $mu$ (which can be chosen arbitrarily) and its standard deviation $sigma$ (which can be any positive number). If a random variable $X$ has such a distribution, we write $Xsim N(mu,,sigma^2)$, where $sigma^2$ is the variance. The number of standard deviations from $mu$ to $X$ is a random variable in its own right, typically denoted $Z$, viz. $X=mu+sigma Z$. It turns out that $Zsim N(0,,1)$; we say $Z$ has a Standard normal distribution.



                        There are various scenarios in which random variables admit a Normal approximation. For example, the classical central limit theorem (CLT) states that the mean of a large number of independent samples from a finite-variance distribution has an approximately Normal distribution. We'll come back to that one. For another example, when you try to fit a model to data, you have noise terms $epsilon$ viz. $y=f(x)+epsilon$, and we can often justify the assumption $epsilonsim N(0,,sigma^2)$ for some $sigma>0$. Let's say we have $n$ observations. If we divide all noise terms by $sigma$, square the results and sum the squares, the result has a chi-squared distribution with $n$ degrees of freedom. This lets us quantify how surprising it is that the data deviate from expectations as much as they do, because with a distribution in mind we can obtain a $p$-value.



                        It's time to come back to the CLT. If you knew a distribution's mean $mu$ and variance $sigma^2$, a large sample's mean $overline{X}$ is a random variable, with an approximately Normal distribution. In particular, $frac{overline{X}-mu}{sigma}approx N(0,,1)$. But what makes you think you know the mean and variance? You can estimate these parameters from an existing sample, but then something funny happens. Because we've replaced the true parameter values with estimates of them that are also random variables, it turns out that the Normal approximation no longer works. In particular, if $mu$ is estimated as $m$ and $sigma$ is estimated as $S$, $frac{overline{X}-m}{S}$ has a $t$ distribution. As with the chi squared distribution, the distribution's shape depend on its number of degrees of freedom.



                        I mentioned noise terms with Normal distributions. They result in a sample variance with a chi-squared distribution, up to scaling. Now say I wonder whether two variables have the same variance. Because the variance of a sample is a random variable, the ratio of two independent samples' variances is $F$-distributed, up to scaling (and with no scaling needed if the true variances are equal). This is the basis of the F-test of equality of variances.
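The variance-ratio idea can also be sketched by simulation. The sample sizes, `sigma`, the seed, and the helper `variance_ratio` below are illustrative assumptions. Both samples are drawn with the same true variance, so the ratio of sample variances should follow an $F$ distribution with $n_1-1$ and $n_2-1$ degrees of freedom, whose mean is the standard value $d_2/(d_2-2)$ with $d_2=n_2-1$.

```python
import random
import statistics


def variance_ratio(rng, n1, n2, sigma):
    """Ratio of sample variances of two independent Normal samples
    that share the same true standard deviation sigma."""
    a = [rng.gauss(0.0, sigma) for _ in range(n1)]
    b = [rng.gauss(0.0, sigma) for _ in range(n2)]
    return statistics.variance(a) / statistics.variance(b)


rng = random.Random(3)          # fixed seed so the run is repeatable
n1, n2, sigma = 6, 21, 4.0      # hypothetical values chosen only for illustration
ratios = [variance_ratio(rng, n1, n2, sigma) for _ in range(20_000)]

d2 = n2 - 1                     # denominator degrees of freedom
expected_mean = d2 / (d2 - 2)   # mean of an F distribution (needs d2 > 2)
ratio_mean = statistics.fmean(ratios)
print(f"simulated mean ratio = {ratio_mean:.3f}, theory = {expected_mean:.3f}")
```

In an actual F-test you would compare one observed ratio against this reference distribution; a ratio far out in either tail suggests the two variances differ.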









































































                          answered Nov 24 at 18:04









                          J.G.

                          19.8k21932





















































































