Dropping dataframe rows based on values in another dataframe
I am working on the IPL dataset from Kaggle (https://www.kaggle.com/manasgarg/ipl). It has two .csv files that are linked by a primary key.
I want to drop the rows where the batting team lost the match.
df_deliv has the batting team
df_match has the winner of the match
I achieved this with the code below, but it is very slow because of the for loop.
import pandas as pd
import numpy as np
df_deliv = pd.read_csv("deliveries.csv")
df_match = pd.read_csv("matches.csv")
df_deliv = df_deliv[["match_id", "batting_team", "batsman", "batsman_runs"]]
df_deliv["winner"] = [df_match.loc[i-1]["winner"] for i in df_deliv["match_id"]] #makes it very slow
df_deliv.drop(df_deliv[df_deliv["batting_team"] != df_deliv["winner"]].index, inplace = True)
print(df_deliv)
Is there a way to do this in a single df.drop statement instead of the for loop?
python pandas
Please post a reproducible example. Why don't you join them and then just filter by the conditions you want instead of using drop?
– Antonio Manrique
Nov 21 at 17:44
You could probably join the two dataframes using merge(). Please post df_deliv.head() and df_match.head() so we can see the structure of the dataframes and offer a more complete solution.
– Gal Sivan
Nov 21 at 17:45
@AntonioManrique sir, I am very new to asking questions and to data science... please let me know what a reproducible example is.
– Yash Mishra
Nov 21 at 18:29
@YashMishra of course I can :) It basically means posting the code that allows us to reproduce your dataset and your error. Here is a better explanation: stackoverflow.com/questions/20109391/…
– Antonio Manrique
Nov 21 at 19:03
edited Nov 21 at 18:46
asked Nov 21 at 17:42
Yash Mishra
264
1 Answer
Instead of dropping, you can just filter the rows that you need. Something like this:
df_deliv = df_deliv[df_deliv['batting_team']==df_deliv['winner']]
answered Nov 21 at 17:52
Ronnie
518
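The filter in the answer only works once df_deliv already has a winner column, which is the slow step in the question. Following the merge() suggestion from the comments, a minimal sketch of the whole thing might look like the code below. It uses two tiny made-up frames instead of the real CSV files, and it assumes the key column in the matches table is called id; that name is a guess and should be checked against df_match.columns. The helper name keep_winning_deliveries is also only illustrative.

```python
import pandas as pd

# Tiny stand-ins for deliveries.csv and matches.csv (made-up data).
# ASSUMPTION: the key column in the matches table is named "id".
df_deliv = pd.DataFrame({
    "match_id": [1, 1, 2, 2],
    "batting_team": ["A", "B", "A", "C"],
    "batsman": ["p1", "p2", "p3", "p4"],
    "batsman_runs": [4, 1, 6, 0],
})
df_match = pd.DataFrame({"id": [1, 2], "winner": ["A", "C"]})


def keep_winning_deliveries(deliv, match):
    """Keep only the deliveries whose batting team won the match."""
    # Vectorised lookup of the winner for every delivery; this replaces
    # the per-row list comprehension from the question.
    merged = deliv.merge(
        match[["id", "winner"]],
        left_on="match_id",
        right_on="id",
        how="left",
    )
    # Boolean filter instead of drop(); rows with a missing winner
    # compare as False and are therefore left out as well.
    kept = merged[merged["batting_team"] == merged["winner"]]
    # The duplicated key column from the matches table is no longer needed.
    return kept.drop(columns="id").reset_index(drop=True)


result = keep_winning_deliveries(df_deliv, df_match)
print(result)
```

Joining on the key column rather than on row position (df_match.loc[i-1]) also avoids relying on the match ids being consecutive and starting at 1.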