Interpretation of an ANOVA output.























Given a dataset with a numerical variable, "number of children", and a categorical variable, "standard of living", with $4$ levels, I used ANOVA to see whether the number of children is related to the standard of living.



But first, I computed the mean number of children for each standard of living ($1$ = low, $2$, $3$, $4$ = high):



$$\begin{array}{l|cccc}
\text{Standard of living} & 1 & 2 & 3 & 4 \\ \hline
\text{Mean number of children} & 3.25 & 3.30 & 3.28 & 3.42
\end{array}$$




After running the ANOVA command in R, the result was F value $= 0.05$
and Pr(>F)$= 0.985$.



Since the F value is low and the means are very close, does that mean that there is not a
significant relationship between the number of children in a couple
and their standard of living?




I did the same for the categorical variable wife's education, which has four levels ($1$ = low, $2$, $3$, $4$ = high), and the result was



$$\begin{array}{l|cccc}
\text{Wife's education} & 1 & 2 & 3 & 4 \\ \hline
\text{Mean number of children} & 4.42 & 3.51 & 3.23 & 2.83
\end{array}$$




Here the F value is $20.67$ and Pr(>F) is $4.06 \times 10^{-13}$. Does this mean there is a strong
relationship between the number of children in a couple and the wife's
education?



In this case, what is a "strong relationship"? And what values of $F$ are "high" or "low"?




Running a Tukey post hoc test:



              diff        lwr         upr     p adj
    2-1 -0.9120706 -1.4940399 -0.33010131 0.0003402
    3-1 -1.1869063 -1.7517540 -0.62205855 0.0000005
    4-1 -1.5891636 -2.1314528 -1.04687427 0.0000000
    3-2 -0.2748357 -0.7132637  0.16359229 0.3719251
    4-2 -0.6770930 -1.0860475 -0.26813844 0.0001286
    4-3 -0.4022573 -0.7864558 -0.01805873 0.0360224



There is not a significant difference in the number of children
between the wife's education level $3$ and level $2$. And there is a
significant difference between the wife's education level $4$ and
level $1$ etc.



But what is a "significant difference" in this context?
























































































      statistics














      edited yesterday









      BruceET











      asked 2 days ago









      Pinteco























          1 Answer













(1) Standard of living and number of children: The small F-statistic and the consequent large P-value mean that no significant differences among the four group means have been found. Thus there is no reason to do post hoc Tukey tests.



(2) Wife's education and number of children: The large F-statistic and the consequent small P-value mean that at least some of the group means differ significantly. You then use a post hoc Tukey procedure to see what can be determined about the pattern of differences.
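As a rough illustration of what makes an F-statistic "small" or "large", here is a minimal sketch of the one-way ANOVA F ratio (between-group mean square divided by within-group mean square). The two toy datasets and the helper name `one_way_f` are made up for illustration and are not the survey data from the question. When all group means are equal, F is expected to be near 1; how far above 1 it must be to count as significant depends on the degrees of freedom.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data (NOT the data from the question):
# four groups whose means are nearly equal ...
similar = [[3, 4, 2, 4], [4, 3, 3, 3], [2, 4, 4, 3], [3, 3, 4, 4]]
# ... and four groups whose means drift steadily downward.
spread = [[5, 4, 5, 4], [4, 3, 4, 3], [3, 3, 4, 3], [2, 3, 3, 2]]

f_similar, df_b, df_w = one_way_f(similar)
f_spread, _, _ = one_way_f(spread)
print(f"similar means: F = {f_similar:.3f} on ({df_b}, {df_w}) df")
print(f"spread means:  F = {f_spread:.3f} on ({df_b}, {df_w}) df")
```

The first dataset gives an F well below 1 and the second an F well above 1, which mirrors the contrast between the standard-of-living and wife's-education analyses.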



Roughly speaking, the number of children tends to decrease as the wife's education level increases. If we used ${4 \choose 2} = 6$ separate t-tests to check for differences, the results might be confusing. We could
do each t-test at the 5% level of significance, but we would have no idea what
risk we run of falsely declaring differences in some of the six comparisons. We say
that the 'family error rate' for the pattern of differences among the four educational levels is indeterminate; at worst it might be almost as high as $6(0.05) = 0.30$, or 30%.
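The arithmetic behind the 30% figure can be sketched as follows. The bound $6(0.05)$ holds whatever the dependence among the tests. The second number assumes the six tests are independent, which pairwise t-tests on shared groups are not, so it is only a reference point.

```python
from math import comb

alpha = 0.05          # significance level of each individual t-test
levels = 4            # number of education levels
m = comb(levels, 2)   # number of pairwise comparisons

# Upper bound on the family error rate, valid under any dependence
bonferroni_bound = m * alpha

# Family error rate IF the m tests were independent (an assumption
# that does not hold exactly for pairwise comparisons)
independent_rate = 1 - (1 - alpha) ** m

print(m, round(bonferroni_bound, 2), round(independent_rate, 4))
```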



          The Tukey test is somewhat more 'reluctant' to declare differences. The criterion for
          declaring an 'Honest Significant Difference' (HSD) is chosen in such a way as to
          keep the family error rate below 5%. Thus the difference $3.51 - 3.23 = 0.28$ between education levels 2 and 3 is not sufficiently large to be declared 'significant'.



By contrast, for example, the difference $3.23 - 2.83 = 0.40$ between education levels 3 and 4 is (barely) large enough to be declared significant. The much larger difference between education levels 1 and 2 is (more easily) declared significant. (If the sample sizes differ from level to level of the categorical variable, the value of HSD may differ from one comparison to another.)
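One way to read the Tukey table in the question: a pair is declared significantly different when its interval (`lwr`, `upr`) does not contain 0. The sketch below uses the numbers from the question and assumes the intervals are at the 95% family-wise level, which the question does not state explicitly.

```python
# Rows copied from the Tukey output in the question:
# pair -> (diff, lwr, upr, p_adj)
tukey = {
    "2-1": (-0.9120706, -1.4940399, -0.33010131, 0.0003402),
    "3-1": (-1.1869063, -1.7517540, -0.62205855, 0.0000005),
    "4-1": (-1.5891636, -2.1314528, -1.04687427, 0.0000000),
    "3-2": (-0.2748357, -0.7132637, 0.16359229, 0.3719251),
    "4-2": (-0.6770930, -1.0860475, -0.26813844, 0.0001286),
    "4-3": (-0.4022573, -0.7864558, -0.01805873, 0.0360224),
}

# A difference is declared significant when the interval excludes 0.
significant = [
    pair for pair, (diff, lwr, upr, p_adj) in tukey.items()
    if not (lwr <= 0.0 <= upr)
]
not_significant = [pair for pair in tukey if pair not in significant]

print("significant:    ", significant)
print("not significant:", not_significant)
```

For these rows the interval criterion agrees with comparing `p adj` to 0.05. The 4-3 interval only just excludes 0, which matches "barely" significant.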






























































                edited yesterday

























                answered yesterday









                BruceET































                     
