Calculate differences between groups in R
For an example dataframe:
df1 <- structure(list(name = c("a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z"), amount = c(5.5, 5.4, 5.2, 5.3, 5.1,
5.1, 5, 5, 4.9, 4.5, 6, 5.9, 5.7, 5.4, 5.3, 5.1, 5.6, 5.4, 5.3,
5.6, 4.6, 4.2, 4.5, 4.2, 4, 3.8, 6, 5.8, 5.7, 5.6, 5.3, 5.6,
5.4, 5.5, 5.4, 5.1, 9, 8.8, 8.6, 8.4, 8.2, 8, 7.8, 7.6, 7.4,
7.2, 6, 5.75, 5.5, 5.25, 5, 4.75, 10, 8.9, 7.8, 6.7, 5.6, 4.5,
3.4, 2.3, 1.2, 0.1, 6, 5.8, 5.7, 5.6, 5.5, 5.5, 5.4, 5.6, 5.8,
5.1, 6, 5.5, 5.4, 5.3, 5.2, 5.1), decile = c(1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L), time = c(2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L)), .Names = c("name", "amount",
"decile", "time"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-78L), spec = structure(list(cols = structure(list(name = structure(list(), class = c("collector_character",
"collector")), amount = structure(list(), class = c("collector_double",
"collector")), decile = structure(list(), class = c("collector_integer",
"collector")), time = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("name", "amount", "decile", "time"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
I wish to calculate the mean result for deciles 1, 5 and 10 for each year (2016, 2017, etc.). I then wish to create a final table with the year in the first column, followed by the gap between the mean results for deciles 1 and 10 (i.e. decile 10 mean minus decile 1 mean) and the gradient between the mean results for deciles 5 and 10 (i.e. decile 10 mean minus decile 5 mean).
To illustrate, I have created a working example of the data for 2016. I list the values for deciles 1, 5 and 10 for 2016, then use these values to work out the gap and the gradient.
summary2016 <- structure(list(`2016` = c(NA_character_, NA_character_, NA_character_,
NA_character_), `1` = c("5", "10", "Gap", "Gradient"), `5.5` = c(5.1,
4.5, 1.4, 0.3), `6` = c(5.3, 5.6, NA, NA), `11.5` = c(10.4, 10.1,
NA, NA)), .Names = c("2016", "1", "5.5", "6", "11.5"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L), spec = structure(list(
cols = structure(list(`2016` = structure(list(), class = c("collector_character",
"collector")), `1` = structure(list(), class = c("collector_character",
"collector")), `5.5` = structure(list(), class = c("collector_double",
"collector")), `6` = structure(list(), class = c("collector_double",
"collector")), `11.5` = structure(list(), class = c("collector_double",
"collector"))), .Names = c("2016", "1", "5.5", "6", "11.5"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
Can this be done in one step, or would I need to break it down?
asked Nov 23 '18 at 15:44 by KT_1
1 Answer
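library(tidyverse)
df1 %>%
  filter(decile %in% c(1, 5, 10)) %>%
  group_by(time, decile) %>%
  summarise(mean = mean(amount)) %>%  # result is still grouped by time
  mutate(gap1 = mean - mean[1],       # mean[1] is decile 1 within each year
         gap5 = mean - mean[2])       # mean[2] is decile 5 within each year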
# A tibble: 9 x 5
# Groups: time [3]
# time decile mean gap1 gap5
# <int> <int> <dbl> <dbl> <dbl>
# 1 2016 1 5.75 0 0.55
# 2 2016 5 5.20 -0.55 0
# 3 2016 10 5.05 -0.7 -0.150
# 4 2017 1 6.4 0 0.775
# 5 2017 5 5.62 -0.775 0
# 6 2017 10 6.15 -0.25 0.525
# 7 2018 1 7.33 0 1.90
# 8 2018 5 5.43 -1.90 0
# 9 2018 10 2.60 -4.73 -2.83
The numbers differ from yours, so perhaps you are looking for some other kind of gap. Your example summary2016 also has a somewhat unusual structure; the solution above produces more than you asked for, but in a tidier format.
In particular, gap1 is mean(decile i) - mean(decile 1), where i = 1, 5, 10, while gap5 is mean(decile i) - mean(decile 5).
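If only the two differences from the question are wanted, with one row per year, something along these lines may be enough. This is a minimal sketch rather than a definitive solution: it rebuilds the amount, decile and time columns of df1 in compact form so that it runs on its own (the name column is left out because it is not used), and the column names gap and gradient are my own choice.

```r
library(dplyr)

# Compact rebuild of the question's df1 (same amount, decile and time values;
# the name column is omitted because the calculation does not need it).
df1 <- data.frame(
  amount = c(5.5, 5.4, 5.2, 5.3, 5.1, 5.1, 5, 5, 4.9, 4.5,
             6, 5.9, 5.7, 5.4, 5.3, 5.1, 5.6, 5.4, 5.3, 5.6,
             4.6, 4.2, 4.5, 4.2, 4, 3.8,
             6, 5.8, 5.7, 5.6, 5.3, 5.6, 5.4, 5.5, 5.4, 5.1,
             9, 8.8, 8.6, 8.4, 8.2, 8, 7.8, 7.6, 7.4, 7.2,
             6, 5.75, 5.5, 5.25, 5, 4.75,
             10, 8.9, 7.8, 6.7, 5.6, 4.5, 3.4, 2.3, 1.2, 0.1,
             6, 5.8, 5.7, 5.6, 5.5, 5.5, 5.4, 5.6, 5.8, 5.1,
             6, 5.5, 5.4, 5.3, 5.2, 5.1),
  decile = rep(c(1:10, 1:10, 1:6), times = 3),
  time   = rep(c(2016L, 2017L, 2018L), times = c(20, 32, 26))
)

# One row per year: subset amount by decile inside summarise(),
# then take the difference of the group means.
gaps <- df1 %>%
  group_by(time) %>%
  summarise(
    gap      = mean(amount[decile == 10]) - mean(amount[decile == 1]),
    gradient = mean(amount[decile == 10]) - mean(amount[decile == 5])
  )

print(gaps)
```

The values agree with the decile 10 rows of the longer table above: gap is that row's gap1 and gradient is its gap5 (for example, -0.7 and -0.15 for 2016). If a year had no rows for one of the deciles, the corresponding mean would be NaN, so that case would need separate handling.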
answered Nov 23 '18 at 16:09 by Julius Vainora