Calculate differences between groups in R

For an example dataframe:



df1 <- structure(list(name = c("a", "b", "c", "d", "e", "f", "g", "h", 
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z"), amount = c(5.5, 5.4, 5.2, 5.3, 5.1,
5.1, 5, 5, 4.9, 4.5, 6, 5.9, 5.7, 5.4, 5.3, 5.1, 5.6, 5.4, 5.3,
5.6, 4.6, 4.2, 4.5, 4.2, 4, 3.8, 6, 5.8, 5.7, 5.6, 5.3, 5.6,
5.4, 5.5, 5.4, 5.1, 9, 8.8, 8.6, 8.4, 8.2, 8, 7.8, 7.6, 7.4,
7.2, 6, 5.75, 5.5, 5.25, 5, 4.75, 10, 8.9, 7.8, 6.7, 5.6, 4.5,
3.4, 2.3, 1.2, 0.1, 6, 5.8, 5.7, 5.6, 5.5, 5.5, 5.4, 5.6, 5.8,
5.1, 6, 5.5, 5.4, 5.3, 5.2, 5.1), decile = c(1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L,
4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L), time = c(2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L)), .Names = c("name", "amount",
"decile", "time"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-78L), spec = structure(list(cols = structure(list(name = structure(list(), class = c("collector_character",
"collector")), amount = structure(list(), class = c("collector_double",
"collector")), decile = structure(list(), class = c("collector_integer",
"collector")), time = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("name", "amount", "decile", "time"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))


I wish to calculate the mean result for deciles 1, 5 and 10 for each year (2016, 2017, etc.). I then wish to create a final table with the year in the first column, followed by the gap between the mean results for deciles 1 and 10 (i.e. decile 10 mean minus decile 1 mean) and the gradient between the mean results for deciles 5 and 10 (i.e. decile 10 mean minus decile 5 mean).



To illustrate, I have created a working example of the data for 2016. I list the values for deciles 1, 5 and 10 for 2016, and then use these values to work out the gap and the gradient.



summary2016 <- structure(list(`2016` = c(NA_character_, NA_character_, NA_character_, 
NA_character_), `1` = c("5", "10", "Gap", "Gradient"), `5.5` = c(5.1,
4.5, 1.4, 0.3), `6` = c(5.3, 5.6, NA, NA), `11.5` = c(10.4, 10.1,
NA, NA)), .Names = c("2016", "1", "5.5", "6", "11.5"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L), spec = structure(list(
cols = structure(list(`2016` = structure(list(), class = c("collector_character",
"collector")), `1` = structure(list(), class = c("collector_character",
"collector")), `5.5` = structure(list(), class = c("collector_double",
"collector")), `6` = structure(list(), class = c("collector_double",
"collector")), `11.5` = structure(list(), class = c("collector_double",
"collector"))), .Names = c("2016", "1", "5.5", "6", "11.5"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))


Can this be done in one step, or would I need to break it down?










asked Nov 23 '18 at 15:44 by KT_1
























1 Answer














library(tidyverse)

df1 %>%
  filter(decile %in% c(1, 5, 10)) %>%   # keep only deciles 1, 5 and 10
  group_by(time, decile) %>%
  summarise(mean = mean(amount)) %>%    # mean amount per year and decile
  mutate(gap1 = mean - mean[1],         # difference from that year's decile 1 mean
         gap5 = mean - mean[2])         # difference from that year's decile 5 mean

# A tibble: 9 x 5
# Groups:   time [3]
#    time decile  mean   gap1   gap5
#   <int>  <int> <dbl>  <dbl>  <dbl>
# 1  2016      1  5.75  0      0.55
# 2  2016      5  5.20 -0.55   0
# 3  2016     10  5.05 -0.7   -0.150
# 4  2017      1  6.4   0      0.775
# 5  2017      5  5.62 -0.775  0
# 6  2017     10  6.15 -0.25   0.525
# 7  2018      1  7.33  0      1.90
# 8  2018      5  5.43 -1.90   0
# 9  2018     10  2.60 -4.73  -2.83


These numbers differ from yours, so perhaps you are looking for a different kind of gap. Your example summary2016 also has a somewhat unusual structure; the solution above produces more than you asked for, but in a tidier format.



In particular, gap1 is mean(decile i) - mean(decile 1) for i = 1, 5, 10, while gap5 is mean(decile i) - mean(decile 5). The decile 10 row of each year therefore holds the gap you describe (in gap1) and the gradient (in gap5).
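
If only one row per year is wanted (year, gap, gradient), the same idea can be sketched in base R. This is a minimal illustration rather than the answer's own method: it assumes that "gap" means the decile 10 mean minus the decile 1 mean and "gradient" means the decile 10 mean minus the decile 5 mean, as described in the question. The names `sub`, `means` and `result` are hypothetical, and `sub` is just the decile 1, 5 and 10 rows of `df1` retyped by hand so that the block runs on its own.

```r
# Hypothetical stand-in for df1: only the rows with decile 1, 5 or 10,
# copied from the question's data so the example is self-contained.
sub <- data.frame(
  time   = rep(c(2016L, 2017L, 2018L), times = c(6, 10, 8)),
  decile = c(1L, 1L, 5L, 5L, 10L, 10L,
             1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 10L, 10L,
             1L, 1L, 1L, 5L, 5L, 5L, 10L, 10L),
  amount = c(5.5, 6, 5.1, 5.3, 4.5, 5.6,
             4.6, 6, 9, 6, 4, 5.3, 8.2, 5, 5.1, 7.2,
             10, 6, 6, 5.6, 5.5, 5.2, 0.1, 5.1)
)

# Matrix of mean amounts: one row per year, one column per decile.
means <- with(sub, tapply(amount, list(time, decile), mean))

# One row per year: gap = decile 10 - decile 1, gradient = decile 10 - decile 5.
result <- data.frame(
  time     = as.integer(rownames(means)),
  gap      = unname(means[, "10"] - means[, "1"]),
  gradient = unname(means[, "10"] - means[, "5"])
)

print(result)
```

With the full `df1`, the same two steps should work after filtering with `df1[df1$decile %in% c(1, 5, 10), ]`. The resulting gap and gradient values correspond to the gap1 and gap5 entries in the decile 10 rows of the table above.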






answered Nov 23 '18 at 16:09 by Julius Vainora





























