Python: add values from one dataframe to another (with multiple conditions)
I have two dataframes df1 and df2 that kind of look like this:
Example:
x1 = [{'partner': "Afghanistan", 'trade_value':100, 'commodity': 1},
{'partner':"Zambia",'trade_value':110, 'commodity': 2},
{'partner': "Germany",'trade_value':120, 'commodity': 2},
{'partner': "Afghanistan",'trade_value':150, 'commodity': 2},
{'partner': "USA",'trade_value':1120, 'commodity': 5}];
df1 = pd.DataFrame(x1)
x2 = [{'country': "Afghanistan", 'commodity': 5, 'tariff': 3.5},
{'country': "Afghanistan", 'commodity': 3, 'tariff': 6.2},
{'country': "Afghanistan", 'commodity': 1, 'tariff': 9.9},
{'country': "Afghanistan", 'commodity': 2, 'tariff': 1.4},
{'country': "USA", 'commodity': 5, 'tariff': 4.3},
{'country': "Germany", 'commodity': 7, 'tariff': 6.5},
{'country': "Germany", 'commodity': 2, 'tariff': 8.8}];
df2 = pd.DataFrame(x2)
I want to add a new column to df1 called 'tariff' and assign each 'partner' and 'commodity' in df1 with its appropriate 'tariff' from df2.
Note: sometimes a 'partner' country in df1 is repeated due to multiple trades. Also not all Tariffs are available in df2 so I don't mind leaving a cell in df1 empty.
so far I am at this stage:
#Add new column
df1['tariff'] = 0;
for index, row in df1.iterrows():
for index, row2 in df2.iterrows():
if row['partner'] == row2['country']:
if row['commodity'] == row2['commodity']
#Dont know what to put here
If I use df1['tariff'].replace(row['tariff'],row2['tariff'],inplace=True); I am getting all the tariff columns filled with the tariff 9.9
The output of df1 should look like this:
| partner | trade_value | commodity | tariff |
|------------|-------------|-----------|--------|
| Afghanistan| 100 | 1 | 9.9 |
| Zambia | 110 | 2 | NaN |
| Germany | 120 | 2 | 8.8 |
| Afghanistan| 150 | 2 | 1.4 |
| USA | 1120 | 5 | 4.3 |
python pandas dataframe row multiple-columns
add a comment |
I have two dataframes df1 and df2 that kind of look like this:
Example:
x1 = [{'partner': "Afghanistan", 'trade_value':100, 'commodity': 1},
{'partner':"Zambia",'trade_value':110, 'commodity': 2},
{'partner': "Germany",'trade_value':120, 'commodity': 2},
{'partner': "Afghanistan",'trade_value':150, 'commodity': 2},
{'partner': "USA",'trade_value':1120, 'commodity': 5}];
df1 = pd.DataFrame(x1)
x2 = [{'country': "Afghanistan", 'commodity': 5, 'tariff': 3.5},
{'country': "Afghanistan", 'commodity': 3, 'tariff': 6.2},
{'country': "Afghanistan", 'commodity': 1, 'tariff': 9.9},
{'country': "Afghanistan", 'commodity': 2, 'tariff': 1.4},
{'country': "USA", 'commodity': 5, 'tariff': 4.3},
{'country': "Germany", 'commodity': 7, 'tariff': 6.5},
{'country': "Germany", 'commodity': 2, 'tariff': 8.8}];
df2 = pd.DataFrame(x2)
I want to add a new column to df1 called 'tariff' and assign each 'partner' and 'commodity' in df1 with its appropriate 'tariff' from df2.
Note: sometimes a 'partner' country in df1 is repeated due to multiple trades. Also not all Tariffs are available in df2 so I don't mind leaving a cell in df1 empty.
so far I am at this stage:
#Add new column
df1['tariff'] = 0;
for index, row in df1.iterrows():
for index, row2 in df2.iterrows():
if row['partner'] == row2['country']:
if row['commodity'] == row2['commodity']
#Dont know what to put here
If I use df1['tariff'].replace(row['tariff'],row2['tariff'],inplace=True); I am getting all the tariff columns filled with the tariff 9.9
The output of df1 should look like this:
| partner | trade_value | commodity | tariff |
|------------|-------------|-----------|--------|
| Afghanistan| 100 | 1 | 9.9 |
| Zambia | 110 | 2 | NaN |
| Germany | 120 | 2 | 8.8 |
| Afghanistan| 150 | 2 | 1.4 |
| USA | 1120 | 5 | 4.3 |
python pandas dataframe row multiple-columns
What is your expected output
– W-B
Nov 23 '18 at 15:52
@W-B I am expected to have an extra column fordf1calledtariff. The values under tariff should be according to the country, and the commodity code found indf2... so basically i have two conditions to add a tariff (country and commodity must match)
– Hassan Dbouk
Nov 23 '18 at 16:05
add a comment |
I have two dataframes df1 and df2 that kind of look like this:
Example:
x1 = [{'partner': "Afghanistan", 'trade_value':100, 'commodity': 1},
{'partner':"Zambia",'trade_value':110, 'commodity': 2},
{'partner': "Germany",'trade_value':120, 'commodity': 2},
{'partner': "Afghanistan",'trade_value':150, 'commodity': 2},
{'partner': "USA",'trade_value':1120, 'commodity': 5}];
df1 = pd.DataFrame(x1)
x2 = [{'country': "Afghanistan", 'commodity': 5, 'tariff': 3.5},
{'country': "Afghanistan", 'commodity': 3, 'tariff': 6.2},
{'country': "Afghanistan", 'commodity': 1, 'tariff': 9.9},
{'country': "Afghanistan", 'commodity': 2, 'tariff': 1.4},
{'country': "USA", 'commodity': 5, 'tariff': 4.3},
{'country': "Germany", 'commodity': 7, 'tariff': 6.5},
{'country': "Germany", 'commodity': 2, 'tariff': 8.8}];
df2 = pd.DataFrame(x2)
I want to add a new column to df1 called 'tariff' and assign each 'partner' and 'commodity' in df1 with its appropriate 'tariff' from df2.
Note: sometimes a 'partner' country in df1 is repeated due to multiple trades. Also not all Tariffs are available in df2 so I don't mind leaving a cell in df1 empty.
so far I am at this stage:
#Add new column
df1['tariff'] = 0;
for index, row in df1.iterrows():
for index, row2 in df2.iterrows():
if row['partner'] == row2['country']:
if row['commodity'] == row2['commodity']
#Dont know what to put here
If I use df1['tariff'].replace(row['tariff'],row2['tariff'],inplace=True); I am getting all the tariff columns filled with the tariff 9.9
The output of df1 should look like this:
| partner | trade_value | commodity | tariff |
|------------|-------------|-----------|--------|
| Afghanistan| 100 | 1 | 9.9 |
| Zambia | 110 | 2 | NaN |
| Germany | 120 | 2 | 8.8 |
| Afghanistan| 150 | 2 | 1.4 |
| USA | 1120 | 5 | 4.3 |
python pandas dataframe row multiple-columns
I have two dataframes df1 and df2 that kind of look like this:
Example:
x1 = [{'partner': "Afghanistan", 'trade_value':100, 'commodity': 1},
{'partner':"Zambia",'trade_value':110, 'commodity': 2},
{'partner': "Germany",'trade_value':120, 'commodity': 2},
{'partner': "Afghanistan",'trade_value':150, 'commodity': 2},
{'partner': "USA",'trade_value':1120, 'commodity': 5}];
df1 = pd.DataFrame(x1)
x2 = [{'country': "Afghanistan", 'commodity': 5, 'tariff': 3.5},
{'country': "Afghanistan", 'commodity': 3, 'tariff': 6.2},
{'country': "Afghanistan", 'commodity': 1, 'tariff': 9.9},
{'country': "Afghanistan", 'commodity': 2, 'tariff': 1.4},
{'country': "USA", 'commodity': 5, 'tariff': 4.3},
{'country': "Germany", 'commodity': 7, 'tariff': 6.5},
{'country': "Germany", 'commodity': 2, 'tariff': 8.8}];
df2 = pd.DataFrame(x2)
I want to add a new column to df1 called 'tariff' and assign each 'partner' and 'commodity' in df1 with its appropriate 'tariff' from df2.
Note: sometimes a 'partner' country in df1 is repeated due to multiple trades. Also not all Tariffs are available in df2 so I don't mind leaving a cell in df1 empty.
so far I am at this stage:
#Add new column
df1['tariff'] = 0;
for index, row in df1.iterrows():
for index, row2 in df2.iterrows():
if row['partner'] == row2['country']:
if row['commodity'] == row2['commodity']
#Dont know what to put here
If I use df1['tariff'].replace(row['tariff'],row2['tariff'],inplace=True); I am getting all the tariff columns filled with the tariff 9.9
The output of df1 should look like this:
| partner | trade_value | commodity | tariff |
|------------|-------------|-----------|--------|
| Afghanistan| 100 | 1 | 9.9 |
| Zambia | 110 | 2 | NaN |
| Germany | 120 | 2 | 8.8 |
| Afghanistan| 150 | 2 | 1.4 |
| USA | 1120 | 5 | 4.3 |
python pandas dataframe row multiple-columns
python pandas dataframe row multiple-columns
edited Nov 23 '18 at 16:15
Hassan Dbouk
asked Nov 23 '18 at 15:49
Hassan DboukHassan Dbouk
494
494
What is your expected output
– W-B
Nov 23 '18 at 15:52
@W-B I am expected to have an extra column fordf1calledtariff. The values under tariff should be according to the country, and the commodity code found indf2... so basically i have two conditions to add a tariff (country and commodity must match)
– Hassan Dbouk
Nov 23 '18 at 16:05
add a comment |
What is your expected output
– W-B
Nov 23 '18 at 15:52
@W-B I am expected to have an extra column fordf1calledtariff. The values under tariff should be according to the country, and the commodity code found indf2... so basically i have two conditions to add a tariff (country and commodity must match)
– Hassan Dbouk
Nov 23 '18 at 16:05
What is your expected output
– W-B
Nov 23 '18 at 15:52
What is your expected output
– W-B
Nov 23 '18 at 15:52
@W-B I am expected to have an extra column for
df1 called tariff. The values under tariff should be according to the country, and the commodity code found in df2... so basically i have two conditions to add a tariff (country and commodity must match)– Hassan Dbouk
Nov 23 '18 at 16:05
@W-B I am expected to have an extra column for
df1 called tariff. The values under tariff should be according to the country, and the commodity code found in df2... so basically i have two conditions to add a tariff (country and commodity must match)– Hassan Dbouk
Nov 23 '18 at 16:05
add a comment |
1 Answer
1
active
oldest
votes
merge
You can simply use merge to join the two dataframes on the overlapping columns:
pd.merge(left=df1, right=df2, how='left', left_on=['partner', 'commodity'],
right_on = ['country', 'commodity']).drop(['country'], axis = 1)
commodity partner trade_value tariff
0 1 Afghanistan 100 9.9
1 2 Zambia 110 NaN
2 2 Germany 120 8.8
3 2 Afghanistan 150 1.4
4 5 USA 1120 4.3
1
you should addhow='left'to keep country that has no tariff
– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries indf1
– yatu
Nov 23 '18 at 16:16
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53449608%2fpython-add-values-from-one-dataframe-to-another-with-multiple-conditions%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
merge
You can simply use merge to join the two dataframes on the overlapping columns:
pd.merge(left=df1, right=df2, how='left', left_on=['partner', 'commodity'],
right_on = ['country', 'commodity']).drop(['country'], axis = 1)
commodity partner trade_value tariff
0 1 Afghanistan 100 9.9
1 2 Zambia 110 NaN
2 2 Germany 120 8.8
3 2 Afghanistan 150 1.4
4 5 USA 1120 4.3
1
you should addhow='left'to keep country that has no tariff
– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries indf1
– yatu
Nov 23 '18 at 16:16
add a comment |
merge
You can simply use merge to join the two dataframes on the overlapping columns:
pd.merge(left=df1, right=df2, how='left', left_on=['partner', 'commodity'],
right_on = ['country', 'commodity']).drop(['country'], axis = 1)
commodity partner trade_value tariff
0 1 Afghanistan 100 9.9
1 2 Zambia 110 NaN
2 2 Germany 120 8.8
3 2 Afghanistan 150 1.4
4 5 USA 1120 4.3
1
you should addhow='left'to keep country that has no tariff
– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries indf1
– yatu
Nov 23 '18 at 16:16
add a comment |
merge
You can simply use merge to join the two dataframes on the overlapping columns:
pd.merge(left=df1, right=df2, how='left', left_on=['partner', 'commodity'],
right_on = ['country', 'commodity']).drop(['country'], axis = 1)
commodity partner trade_value tariff
0 1 Afghanistan 100 9.9
1 2 Zambia 110 NaN
2 2 Germany 120 8.8
3 2 Afghanistan 150 1.4
4 5 USA 1120 4.3
merge
You can simply use merge to join the two dataframes on the overlapping columns:
pd.merge(left=df1, right=df2, how='left', left_on=['partner', 'commodity'],
right_on = ['country', 'commodity']).drop(['country'], axis = 1)
commodity partner trade_value tariff
0 1 Afghanistan 100 9.9
1 2 Zambia 110 NaN
2 2 Germany 120 8.8
3 2 Afghanistan 150 1.4
4 5 USA 1120 4.3
edited Nov 23 '18 at 16:15
answered Nov 23 '18 at 16:08
yatuyatu
6,9351826
6,9351826
1
you should addhow='left'to keep country that has no tariff
– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries indf1
– yatu
Nov 23 '18 at 16:16
add a comment |
1
you should addhow='left'to keep country that has no tariff
– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries indf1
– yatu
Nov 23 '18 at 16:16
1
1
you should add
how='left' to keep country that has no tariff– Ricky Kim
Nov 23 '18 at 16:13
you should add
how='left' to keep country that has no tariff– Ricky Kim
Nov 23 '18 at 16:13
Thanks for pointing out @RickyKim, did not notice there where more countries in
df1– yatu
Nov 23 '18 at 16:16
Thanks for pointing out @RickyKim, did not notice there where more countries in
df1– yatu
Nov 23 '18 at 16:16
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53449608%2fpython-add-values-from-one-dataframe-to-another-with-multiple-conditions%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
What is your expected output
– W-B
Nov 23 '18 at 15:52
@W-B I am expected to have an extra column for
df1calledtariff. The values under tariff should be according to the country, and the commodity code found indf2... so basically i have two conditions to add a tariff (country and commodity must match)– Hassan Dbouk
Nov 23 '18 at 16:05