pandasql: count occurrences of pairs












-1















I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:



This is how the data looks like in Notebook



so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:



SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID


I know where my problem is but don't know how to solve it.










share|improve this question

























  • With that data, can you post your expected output?

    – FrankerZ
    Nov 23 '18 at 20:48











  • @FrankerZ She said the expected output is 2 for teams 1 and 29.

    – Barmar
    Nov 23 '18 at 20:50













  • Where is Team 29 the Home Team in your sample dataset?

    – Barmar
    Nov 23 '18 at 20:52













  • Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

    – Kristy.Z
    Nov 23 '18 at 20:53











  • There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

    – Barmar
    Nov 23 '18 at 20:54
















-1















I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:



This is how the data looks like in Notebook



so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:



SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID


I know where my problem is but don't know how to solve it.










share|improve this question

























  • With that data, can you post your expected output?

    – FrankerZ
    Nov 23 '18 at 20:48











  • @FrankerZ She said the expected output is 2 for teams 1 and 29.

    – Barmar
    Nov 23 '18 at 20:50













  • Where is Team 29 the Home Team in your sample dataset?

    – Barmar
    Nov 23 '18 at 20:52













  • Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

    – Kristy.Z
    Nov 23 '18 at 20:53











  • There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

    – Barmar
    Nov 23 '18 at 20:54














-1












-1








-1








I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:



This is how the data looks like in Notebook



so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:



SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID


I know where my problem is but don't know how to solve it.










share|improve this question
















I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:



This is how the data looks like in Notebook



so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:



SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID


I know where my problem is but don't know how to solve it.







sql sqldf pandasql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 26 '18 at 11:48









Salman A

177k66337426




177k66337426










asked Nov 23 '18 at 20:45









Kristy.ZKristy.Z

93




93













  • With that data, can you post your expected output?

    – FrankerZ
    Nov 23 '18 at 20:48











  • @FrankerZ She said the expected output is 2 for teams 1 and 29.

    – Barmar
    Nov 23 '18 at 20:50













  • Where is Team 29 the Home Team in your sample dataset?

    – Barmar
    Nov 23 '18 at 20:52













  • Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

    – Kristy.Z
    Nov 23 '18 at 20:53











  • There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

    – Barmar
    Nov 23 '18 at 20:54



















  • With that data, can you post your expected output?

    – FrankerZ
    Nov 23 '18 at 20:48











  • @FrankerZ She said the expected output is 2 for teams 1 and 29.

    – Barmar
    Nov 23 '18 at 20:50













  • Where is Team 29 the Home Team in your sample dataset?

    – Barmar
    Nov 23 '18 at 20:52













  • Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

    – Kristy.Z
    Nov 23 '18 at 20:53











  • There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

    – Barmar
    Nov 23 '18 at 20:54

















With that data, can you post your expected output?

– FrankerZ
Nov 23 '18 at 20:48





With that data, can you post your expected output?

– FrankerZ
Nov 23 '18 at 20:48













@FrankerZ She said the expected output is 2 for teams 1 and 29.

– Barmar
Nov 23 '18 at 20:50







@FrankerZ She said the expected output is 2 for teams 1 and 29.

– Barmar
Nov 23 '18 at 20:50















Where is Team 29 the Home Team in your sample dataset?

– Barmar
Nov 23 '18 at 20:52







Where is Team 29 the Home Team in your sample dataset?

– Barmar
Nov 23 '18 at 20:52















Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

– Kristy.Z
Nov 23 '18 at 20:53





Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!

– Kristy.Z
Nov 23 '18 at 20:53













There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

– Barmar
Nov 23 '18 at 20:54





There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.

– Barmar
Nov 23 '18 at 20:54












2 Answers
2






active

oldest

votes


















0














Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.



SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2





share|improve this answer































    0














    Revised Answer



    Suppose you have this data:



    Sample Data



    Then you can simply join list of all teams with list of all games:



    SELECT
    teams.TeamID,
    CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
    COUNT(*) AS GamesBetween
    FROM (
    SELECT HomeTeamID AS TeamID FROM games
    UNION
    SELECT AwayTeamID FROM games
    ) AS teams
    INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
    GROUP BY
    teams.TeamID,
    CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END


    To get results like this:



    Grouped by team 1 and team 2



    Or this (left as exercise):



    Grouped by team 1






    share|improve this answer





















    • 1





      You and I need to fight it out whether the higher or lower ID should be team 1 :)

      – Barmar
      Nov 23 '18 at 20:51











    • Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

      – Kristy.Z
      Nov 23 '18 at 21:01











    • @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

      – Madhur Bhaiya
      Nov 23 '18 at 21:12











    • @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

      – Kristy.Z
      Nov 23 '18 at 22:54











    • @Kristy.Z see revised answer.

      – Salman A
      Nov 26 '18 at 11:49











    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452755%2fpandasql-count-occurrences-of-pairs%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.



    SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
    FROM games
    GROUP BY team1, team2





    share|improve this answer




























      0














      Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.



      SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
      FROM games
      GROUP BY team1, team2





      share|improve this answer


























        0












        0








        0







        Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.



        SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
        FROM games
        GROUP BY team1, team2





        share|improve this answer













        Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.



        SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
        FROM games
        GROUP BY team1, team2






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 23 '18 at 20:50









        BarmarBarmar

        423k35244346




        423k35244346

























            0














            Revised Answer



            Suppose you have this data:



            Sample Data



            Then you can simply join list of all teams with list of all games:



            SELECT
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
            COUNT(*) AS GamesBetween
            FROM (
            SELECT HomeTeamID AS TeamID FROM games
            UNION
            SELECT AwayTeamID FROM games
            ) AS teams
            INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
            GROUP BY
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END


            To get results like this:



            Grouped by team 1 and team 2



            Or this (left as exercise):



            Grouped by team 1






            share|improve this answer





















            • 1





              You and I need to fight it out whether the higher or lower ID should be team 1 :)

              – Barmar
              Nov 23 '18 at 20:51











            • Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

              – Kristy.Z
              Nov 23 '18 at 21:01











            • @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

              – Madhur Bhaiya
              Nov 23 '18 at 21:12











            • @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

              – Kristy.Z
              Nov 23 '18 at 22:54











            • @Kristy.Z see revised answer.

              – Salman A
              Nov 26 '18 at 11:49
















            0














            Revised Answer



            Suppose you have this data:



            Sample Data



            Then you can simply join list of all teams with list of all games:



            SELECT
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
            COUNT(*) AS GamesBetween
            FROM (
            SELECT HomeTeamID AS TeamID FROM games
            UNION
            SELECT AwayTeamID FROM games
            ) AS teams
            INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
            GROUP BY
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END


            To get results like this:



            Grouped by team 1 and team 2



            Or this (left as exercise):



            Grouped by team 1






            share|improve this answer





















            • 1





              You and I need to fight it out whether the higher or lower ID should be team 1 :)

              – Barmar
              Nov 23 '18 at 20:51











            • Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

              – Kristy.Z
              Nov 23 '18 at 21:01











            • @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

              – Madhur Bhaiya
              Nov 23 '18 at 21:12











            • @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

              – Kristy.Z
              Nov 23 '18 at 22:54











            • @Kristy.Z see revised answer.

              – Salman A
              Nov 26 '18 at 11:49














            0












            0








            0







            Revised Answer



            Suppose you have this data:



            Sample Data



            Then you can simply join list of all teams with list of all games:



            SELECT
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
            COUNT(*) AS GamesBetween
            FROM (
            SELECT HomeTeamID AS TeamID FROM games
            UNION
            SELECT AwayTeamID FROM games
            ) AS teams
            INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
            GROUP BY
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END


            To get results like this:



            Grouped by team 1 and team 2



            Or this (left as exercise):



            Grouped by team 1






            share|improve this answer















            Revised Answer



            Suppose you have this data:



            Sample Data



            Then you can simply join list of all teams with list of all games:



            SELECT
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
            COUNT(*) AS GamesBetween
            FROM (
            SELECT HomeTeamID AS TeamID FROM games
            UNION
            SELECT AwayTeamID FROM games
            ) AS teams
            INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
            GROUP BY
            teams.TeamID,
            CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END


            To get results like this:



            Grouped by team 1 and team 2



            Or this (left as exercise):



            Grouped by team 1







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 24 '18 at 12:01

























            answered Nov 23 '18 at 20:50









            Salman ASalman A

            177k66337426




            177k66337426








            • 1





              You and I need to fight it out whether the higher or lower ID should be team 1 :)

              – Barmar
              Nov 23 '18 at 20:51











            • Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

              – Kristy.Z
              Nov 23 '18 at 21:01











            • @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

              – Madhur Bhaiya
              Nov 23 '18 at 21:12











            • @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

              – Kristy.Z
              Nov 23 '18 at 22:54











            • @Kristy.Z see revised answer.

              – Salman A
              Nov 26 '18 at 11:49














            • 1





              You and I need to fight it out whether the higher or lower ID should be team 1 :)

              – Barmar
              Nov 23 '18 at 20:51











            • Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

              – Kristy.Z
              Nov 23 '18 at 21:01











            • @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

              – Madhur Bhaiya
              Nov 23 '18 at 21:12











            • @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

              – Kristy.Z
              Nov 23 '18 at 22:54











            • @Kristy.Z see revised answer.

              – Salman A
              Nov 26 '18 at 11:49








            1




            1





            You and I need to fight it out whether the higher or lower ID should be team 1 :)

            – Barmar
            Nov 23 '18 at 20:51





            You and I need to fight it out whether the higher or lower ID should be team 1 :)

            – Barmar
            Nov 23 '18 at 20:51













            Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

            – Kristy.Z
            Nov 23 '18 at 21:01





            Hi, since the task is required in pandasql, it says no such function 'Least' exists :(

            – Kristy.Z
            Nov 23 '18 at 21:01













            @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

            – Madhur Bhaiya
            Nov 23 '18 at 21:12





            @Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the group by clause.

            – Madhur Bhaiya
            Nov 23 '18 at 21:12













            @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

            – Kristy.Z
            Nov 23 '18 at 22:54





            @SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'

            – Kristy.Z
            Nov 23 '18 at 22:54













            @Kristy.Z see revised answer.

            – Salman A
            Nov 26 '18 at 11:49





            @Kristy.Z see revised answer.

            – Salman A
            Nov 26 '18 at 11:49


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452755%2fpandasql-count-occurrences-of-pairs%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Berounka

            Sphinx de Gizeh

            Different font size/position of beamer's navigation symbols template's content depending on regular/plain...