Compare dataframe columns with conditions
I have 2 dataframes as below:
df1:
ID col1 col2
1 A1 B1
2 A2 B2
3 A3 B3
4 A4 B4
5 A5 B5
6 A6 B6
df2:
col1 col2
A1 B1
A2 O5
H3 B3
A4 B4
A5 66
A6 C6
Expected Result: I would like to generate a result df based on this condition: each value in col1 and col2 of df1 should exist among the values of the corresponding column (col1, col2) of df2.
Expected Result df:
ID col1 col2 Error
1 A1 B1 No mismatch with df2
2 A2 B2 col2 mismatch with df2
3 A3 B3 col1 mismatch with df2
4 A4 B4 No mismatch with df2
5 A5 B5 col2 mismatch with df2
6 A6 B6 col2 mismatch with df2
python pandas dataframe
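For anyone who wants to reproduce the question, the two frames shown above could be built roughly as follows. This is only a sketch of the sample data: the names df1 and df2 come from the question, while treating every cell as a string (including "66") is an assumption.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})

# Assumption: all values are strings, including "66".
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

print(df1)
print(df2)
```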
You do not have list in df2
– W-B
Nov 22 at 1:25
list is a column in df1, and its values list1 and list2 are just drop-down list names; the accepted values are given in columns list1 and list2 of df2. So the data in the "value" column of df1 should be checked, based on its list value, against the list1 and list2 values in df2.
– Osceria
Nov 22 at 9:54
Edited the Question
– Osceria
Nov 22 at 11:46
asked Nov 21 at 23:54 by Osceria, edited Nov 22 at 11:45
2 Answers
Accepted answer:
Create a helper DataFrame with a dictionary comprehension, comparing each column with isin:

    import pandas as pd

    # True where a df1 value does not appear anywhere in the same column of df2
    m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']})
    print (m)
        col1   col2
    0  False  False
    1  False   True
    2   True  False
    3  False  False
    4  False   True
    5  False   True
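One point worth spelling out: isin tests whether each df1 value appears anywhere in the corresponding df2 column, regardless of row position. The sketch below uses small made-up frames (left and right are hypothetical names, not from the post) to contrast that with a row-by-row comparison.

```python
import pandas as pd

# Hypothetical frames: same values in col1, but in a different row order.
left = pd.DataFrame({"col1": ["A1", "A2", "A3"]})
right = pd.DataFrame({"col1": ["A3", "A1", "A2"]})

# Membership test: is each left value present anywhere in right's column?
not_in_right = ~left["col1"].isin(right["col1"])

# Positional test: does each left value differ from the right value in the same row?
differs_by_row = left["col1"] != right["col1"]

print(not_in_right.tolist())    # every value exists somewhere in right
print(differs_by_row.tolist())  # but no row lines up position by position
```

If the rows of the two frames are meant to be paired one to one, the positional comparison may be the better fit; if df2 is a list of allowed values, isin is.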
Then use numpy.where with a mask built by any, which tests for at least one True per row, and dot, which uses matrix multiplication to collect the names of the mismatched columns:

    import numpy as np

    df1['Error'] = np.where(m.any(axis=1),
                            m.dot(m.columns + ', ').str.rstrip(', ') + ' mismatch with df2',
                            'No mismatch with df2')
    print (df1)
       ID col1 col2                   Error
    0   1   A1   B1    No mismatch with df2
    1   2   A2   B2  col2 mismatch with df2
    2   3   A3   B3  col1 mismatch with df2
    3   4   A4   B4    No mismatch with df2
    4   5   A5   B5  col2 mismatch with df2
    5   6   A6   B6  col2 mismatch with df2
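The dot step can look opaque, so here is a small sketch of what it does on a made-up mask (the name mask and its values are hypothetical). It also spells out the same result with plain Python for comparison.

```python
import pandas as pd

# Hypothetical boolean mask: True marks a mismatching cell.
mask = pd.DataFrame({
    "col1": [False, True, True],
    "col2": [False, False, True],
})

# Multiplying a bool by a string keeps the string for True and yields ''
# for False; dot then concatenates the per-column pieces for each row.
joined = mask.dot(mask.columns + ", ").str.rstrip(", ")

# The same result spelled out with plain Python, for comparison.
spelled_out = mask.apply(
    lambda row: ", ".join(col for col in mask.columns if row[col]), axis=1
)

print(joined.tolist())
print(spelled_out.tolist())
```

The apply version is slower on large frames but may be easier to read; both give an empty string for rows with no mismatch, which is why the answer wraps the result in numpy.where.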
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']}) - here col1 and col2 are hard-coded. When I try to pass the column names directly from the dataframe using df.cols, I get the error "ValueError: Must pass DataFrame with boolean values only". Any help with this?
– Osceria
Nov 22 at 14:37
code should work if I pass all the columns from the dataframe like this: lovcols = df2.columns; m = pd.DataFrame({c: ~dfCSDataset[c].isin(dfLOVRules[c]) for c in [lovcols]})
– Osceria
Nov 22 at 14:41
@Osceria - yes, you are right. You can also pass the columns to the dict comprehension like m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in df2.columns})
– jezrael
Nov 22 at 14:43
yeah, it works in this way too
– Osceria
Nov 22 at 15:07
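Following up on the comments above, here is a minimal sketch of a version that does not hard-code the column names. It iterates over the column labels themselves (not a list wrapping the Index), and it restricts the loop to columns present in both frames; that restriction is an added assumption, so that an extra column such as ID is skipped instead of raising a KeyError. The sample frames are abbreviated from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3],
                    "col1": ["A1", "A2", "A3"],
                    "col2": ["B1", "B2", "B3"]})
df2 = pd.DataFrame({"col1": ["A1", "A2", "H3"],
                    "col2": ["B1", "O5", "B3"]})

# Columns present in both frames; ID is left out because df2 lacks it.
shared = df1.columns.intersection(df2.columns)

# One boolean column per shared label: True where the df1 value is absent from df2.
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in shared})

df1["Error"] = np.where(
    m.any(axis=1),
    m.dot(m.columns + ", ").str.rstrip(", ") + " mismatch with df2",
    "No mismatch with df2",
)
print(df1)
```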
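Something like this should do the trick, though there may be an easier way.

    import pandas as pd

    # Compare df1 and df2 row by row on the columns they share
    # (iterating over df2.columns skips df1's ID column, which df2 lacks).
    diff = pd.concat([df1[col] == df2[col] for col in df2.columns], axis=1)

    def m(row):
        # Collect the names of the columns whose values differ in this row.
        mismatches = []
        for col in diff.columns:
            if not row[col]:
                mismatches.append(col)
        if mismatches == []:
            return 'No mismatch'
        return 'Mismatches: ' + ', '.join(mismatches)

    df1['Error'] = diff.apply(m, axis=1)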
When I try this, I get the error "ValueError: Can only compare identically-labeled Series objects"
– Osceria
Nov 22 at 10:23
Edited the Question
– Osceria
Nov 22 at 11:46
@Osceria do you get the same error with the following reproducible datasets?
    df1 = pd.DataFrame({'col1': ["A1", "A2", "A3", "A4", "A5", "A6"], 'col2': ["B1", "B2", "B3", "B4", "B5", "B6"]})
    df2 = pd.DataFrame({'col1': ["A1", "A2", "H3", "A4", "A5", "A6"], 'col2': ["B1", "O5", "B3", "B4", "66", "C6"]})
– leoburgy
Nov 22 at 12:01
It's because your df1 and df2 had different columns, right? I noticed you edited the question now; does it work with those dataframes?
– lieblos
Nov 22 at 12:47
If I run what I answered with the dataframes above, it seems like it works.
– lieblos
Nov 22 at 12:49
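The error quoted in this thread can be reproduced when the two Series being compared with == do not share the same index labels. The sketch below uses made-up data (a and b are hypothetical names); resetting the index is one possible way to get a purely positional comparison, and is an assumption about what the asker wanted rather than something stated in the post.

```python
import pandas as pd

# Hypothetical Series with equal length but different index labels.
a = pd.Series(["A1", "A2", "A3"], index=[0, 1, 2])
b = pd.Series(["A1", "A2", "H3"], index=[10, 11, 12])

try:
    a == b
    raised = False
except ValueError as exc:
    # pandas refuses to compare Series whose labels differ.
    raised = True
    print(exc)

# Dropping the labels on both sides makes the comparison purely positional.
equal_by_position = a.reset_index(drop=True) == b.reset_index(drop=True)
print(equal_by_position.tolist())
```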
The accepted answer was posted by jezrael (answered Nov 22 at 12:08).
The other answer was posted by lieblos (answered Nov 22 at 0:20).