pandasql: count occurrences of pairs
I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:
so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:
SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID
I know where my problem is but don't know how to solve it.
sql sqldf pandasql
add a comment |
I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:
so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:
SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID
I know where my problem is but don't know how to solve it.
sql sqldf pandasql
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54
add a comment |
I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:
so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:
SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID
I know where my problem is but don't know how to solve it.
sql sqldf pandasql
I was trying to count the number of matches that A and B have ever played,
the dataset looks like this:
so the number of matches team1 and team 29 have played is 2 as they each once were HomeTeam and AwayTeam, however, using my query I could only count it as one:
SELECT HomeTeamID, AwayTeamID, Count(*) AS num_matches
FROM games GROUP BY HomeTeamID, AwayTeamID
I know where my problem is but don't know how to solve it.
sql sqldf pandasql
sql sqldf pandasql
edited Nov 26 '18 at 11:48
Salman A
177k66337426
177k66337426
asked Nov 23 '18 at 20:45
Kristy.ZKristy.Z
93
93
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54
add a comment |
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54
add a comment |
2 Answers
2
active
oldest
votes
Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.
SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2
add a comment |
Revised Answer
Suppose you have this data:
Then you can simply join list of all teams with list of all games:
SELECT
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
COUNT(*) AS GamesBetween
FROM (
SELECT HomeTeamID AS TeamID FROM games
UNION
SELECT AwayTeamID FROM games
) AS teams
INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
GROUP BY
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END
To get results like this:
Or this (left as exercise):
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in thegroup by
clause.
– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452755%2fpandasql-count-occurrences-of-pairs%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.
SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2
add a comment |
Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.
SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2
add a comment |
Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.
SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2
Put the teams in a common order so the grouping isn't sensitive to whether they're home or away.
SELECT GREATEST(HomeTeamID, AwayTeamID) AS team1, LEAST(HomeTeamID, AwayTeamID) AS team2, COUNT(*) as num_matches
FROM games
GROUP BY team1, team2
answered Nov 23 '18 at 20:50
BarmarBarmar
423k35244346
423k35244346
add a comment |
add a comment |
Revised Answer
Suppose you have this data:
Then you can simply join list of all teams with list of all games:
SELECT
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
COUNT(*) AS GamesBetween
FROM (
SELECT HomeTeamID AS TeamID FROM games
UNION
SELECT AwayTeamID FROM games
) AS teams
INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
GROUP BY
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END
To get results like this:
Or this (left as exercise):
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in thegroup by
clause.
– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
add a comment |
Revised Answer
Suppose you have this data:
Then you can simply join list of all teams with list of all games:
SELECT
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
COUNT(*) AS GamesBetween
FROM (
SELECT HomeTeamID AS TeamID FROM games
UNION
SELECT AwayTeamID FROM games
) AS teams
INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
GROUP BY
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END
To get results like this:
Or this (left as exercise):
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in thegroup by
clause.
– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
add a comment |
Revised Answer
Suppose you have this data:
Then you can simply join list of all teams with list of all games:
SELECT
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
COUNT(*) AS GamesBetween
FROM (
SELECT HomeTeamID AS TeamID FROM games
UNION
SELECT AwayTeamID FROM games
) AS teams
INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
GROUP BY
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END
To get results like this:
Or this (left as exercise):
Revised Answer
Suppose you have this data:
Then you can simply join list of all teams with list of all games:
SELECT
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END AS OtherTeamID,
COUNT(*) AS GamesBetween
FROM (
SELECT HomeTeamID AS TeamID FROM games
UNION
SELECT AwayTeamID FROM games
) AS teams
INNER JOIN games ON teams.TeamID = games.HomeTeamID OR teams.TeamID = games.AwayTeamID
GROUP BY
teams.TeamID,
CASE WHEN teams.TeamID = games.HomeTeamID THEN games.AwayTeamID ELSE games.HomeTeamID END
To get results like this:
Or this (left as exercise):
edited Nov 24 '18 at 12:01
answered Nov 23 '18 at 20:50
Salman ASalman A
177k66337426
177k66337426
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in thegroup by
clause.
– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
add a comment |
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in thegroup by
clause.
– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
1
1
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
You and I need to fight it out whether the higher or lower ID should be team 1 :)
– Barmar
Nov 23 '18 at 20:51
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
Hi, since the task is required in pandasql, it says no such function 'Least' exists :(
– Kristy.Z
Nov 23 '18 at 21:01
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the
group by
clause.– Madhur Bhaiya
Nov 23 '18 at 21:12
@Leah.Z We dont need to recompute the expression used in the Select clause; we can directly refer to its alias in the
group by
clause.– Madhur Bhaiya
Nov 23 '18 at 21:12
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@SalmanA However, I also want to see team number 29 listed under 'Team 1' and the num_matches should also be 2 when the corresponding 'Team2' is team number 1. For this logic, we would only be able to see smaller team numbers in 'Team 1' and higher numbers in 'Team2'
– Kristy.Z
Nov 23 '18 at 22:54
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
@Kristy.Z see revised answer.
– Salman A
Nov 26 '18 at 11:49
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452755%2fpandasql-count-occurrences-of-pairs%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
With that data, can you post your expected output?
– FrankerZ
Nov 23 '18 at 20:48
@FrankerZ She said the expected output is 2 for teams 1 and 29.
– Barmar
Nov 23 '18 at 20:50
Where is Team 29 the Home Team in your sample dataset?
– Barmar
Nov 23 '18 at 20:52
Yeah sure, so the task is asked to write in sql languages but in python notebook, so it should return 3 columns, the focal team id, the opposing team ids that have play against the focal team, and the number of times that the two teams have played. I hope it makes sense!
– Kristy.Z
Nov 23 '18 at 20:53
There's nothing in the question about a focal team, and the query you tried doesn't match a specific team ID.
– Barmar
Nov 23 '18 at 20:54