R returns NaN when it shouldn't [closed]
I would like to calculate transfers as a share of total income (transfers + salary). However, R returns NaN for some rows even though, mathematically, it should not.
My data is in a data frame that looks similar to this (in total I have 500,000 rows and 50 columns):
For the example data frame above, my code currently looks like this:
df$transfershare <- (rowSums(df[,c(2,4,6)])/rowSums(df[,c(2,4,6,3,5,7)]))*100
It is based on:
Transfershare = total transfer / (total transfer + total salary) * 100
Total transfer is the sum of transfer 2012-2014, and total salary is the sum of salary 2012-2014.
The problem is that before running this code my df has 0 missing values, but afterwards it suddenly has 3000. I have read that NaN normally occurs when something is divided by 0, but no observation should be divided by 0 in this data set. So I suspect I have done something wrong in the code?
r nan rowsum
closed as off-topic by dww, phiver, Rob, lagom, Jimi Nov 22 at 1:31
This question appears to be off-topic. The users who voted to close gave this specific reason:
- "This question was caused by a problem that can no longer be reproduced or a simple typographical error. While similar questions may be on-topic here, this one was resolved in a manner unlikely to help future readers. This can often be avoided by identifying and closely inspecting the shortest program necessary to reproduce the problem before posting." – dww, phiver, Rob, lagom, Jimi
If this question can be reworded to fit the rules in the help center, please edit the question.
At least in one row there seem to be zeros in the denominator. Do some checking of the data.
– Martin Schmelzer
Nov 21 at 15:48
I'd suggest running something like
df[is.nan(df$transfershare), ]
and looking at what you get.
– Lyngbakr
Nov 21 at 15:50
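The two comments above can be combined into one quick check. This is a minimal sketch on a made-up data frame (`toy` and its values are assumptions for illustration, not the asker's data): row 2 has neither transfers nor salary, so its share is 0/0, which R evaluates to NaN.

```r
# Hypothetical toy data; row 2 has zero transfers and zero salary.
toy <- data.frame(
  id           = c(1, 2, 3),
  Transfer2012 = c(200, 0, 0),
  Salary2012   = c(0, 0, 300)
)

total_transfer <- toy$Transfer2012
total_income   <- toy$Transfer2012 + toy$Salary2012

# 0/0 is NaN in R (a non-zero value divided by 0 would give Inf or -Inf instead).
toy$transfershare <- total_transfer / total_income * 100

# Count the rows whose denominator is zero, then inspect the NaN rows.
sum(total_income == 0)
toy[is.nan(toy$transfershare), ]
```

If the count of zero denominators matches the number of NaN values, the data rather than the code is the likely cause.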
edited Nov 21 at 15:55
asked Nov 21 at 15:42
rica
1 Answer
I'm not getting any NaN values with this example:
df <- data.frame(id = c(1,2,3,4), Transfer2012 = c(200,0,0,300), Salary2012 = c(0,300,0,200), Transfer2013 = c(200,250,200,300),
                 Salary2013 = c(0,0,0,0), Transfer2014 = c(200,0,0,200), Salary2014 = c(0,300,0,0))
  id Transfer2012 Salary2012 Transfer2013 Salary2013 Transfer2014 Salary2014
1  1          200          0          200          0          200          0
2  2            0        300          250          0            0        300
3  3            0          0          200          0            0          0
4  4          300        200          300          0          200          0
df$transfershare <- (rowSums(df[,c(2,4,6)])/rowSums(df[,c(2:7)]))*100
  id Transfer2012 Salary2012 Transfer2013 Salary2013 Transfer2014 Salary2014 transfershare
1  1          200          0          200          0          200          0     100.00000
2  2            0        300          250          0            0        300      29.41176
3  3            0          0          200          0            0          0     100.00000
4  4          300        200          300          0          200          0      80.00000
Have you confirmed that your variables are numeric?
str(df)
'data.frame': 4 obs. of 7 variables:
 $ id          : num 1 2 3 4
 $ Transfer2012: num 200 0 0 300
 $ Salary2012  : num 0 300 0 200
 $ Transfer2013: num 200 250 200 300
 $ Salary2013  : num 0 0 0 0
 $ Transfer2014: num 200 0 0 200
 $ Salary2014  : num 0 300 0 0
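The numeric check can also be done programmatically, which is easier to read than `str()` output on a 50-column data frame. The sketch below reuses the example data from this answer; selecting columns by the `Transfer`/`Salary` name prefix instead of by position is my own suggestion and assumes the real data follows the same naming.

```r
# Same example data as in the answer above.
df <- data.frame(id = c(1,2,3,4),
                 Transfer2012 = c(200,0,0,300), Salary2012 = c(0,300,0,200),
                 Transfer2013 = c(200,250,200,300), Salary2013 = c(0,0,0,0),
                 Transfer2014 = c(200,0,0,200), Salary2014 = c(0,300,0,0))

# TRUE for each numeric column; a FALSE would point to a factor or character column.
sapply(df, is.numeric)

# Pick columns by name prefix rather than by position (assumed naming scheme).
transfer_cols <- grep("^Transfer", names(df), value = TRUE)
salary_cols   <- grep("^Salary", names(df), value = TRUE)

total_transfer <- rowSums(df[, transfer_cols])
total_income   <- rowSums(df[, c(transfer_cols, salary_cols)])
df$transfershare <- total_transfer / total_income * 100
```

Name-based selection keeps working if columns are later reordered, which positional indices such as `c(2,4,6)` do not.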
answered Nov 21 at 15:59
Alexandra Thayer
Thank you for your help - I looked it through with the is.nan function and found out that there in fact are some 0/0, which does make sense.
– rica
Nov 21 at 16:07
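Since the NaN values turned out to be genuine 0/0 cases, one way to handle them is to treat those rows explicitly. The thread does not say what the share should be when total income is zero, so the sketch below makes an assumption: it returns NA for such rows. `transfer_share` is a hypothetical helper, not something posted in the thread.

```r
# Hypothetical helper: percentage of income that comes from transfers,
# with NA (instead of NaN) where total income is zero.
transfer_share <- function(transfer, salary) {
  total <- transfer + salary
  ifelse(total == 0, NA_real_, transfer / total * 100)
}

# Three made-up observations: only transfers, no income at all, mixed income.
res <- transfer_share(transfer = c(600, 0, 250), salary = c(0, 0, 600))
res
```

Whether NA, 0, or dropping the row is the right choice depends on how the share is used later, so this is a convention to decide on rather than a fix.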