add values of one group into another group in R












1















I have a question on how to add the value from a group to rest of the elements in the group then delete that row. for ex:



df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))


in the above example, my data is grouped by Year, Cluster, Seed and Day where seed=99 values need to be added to above rows based on (Year, Cluster and Day) group then delete this row. for ex: Row # 16, is part of (Year=1, Cluster=a,Day=1 and Seed=99) group and the value of Row #16 which is 55 should be added to Row #1 (5+55), Row # 6 (6+55) and Row # 11 (2+55) and row # 16 should be deleted. But when it comes to Row #21, which is in cluster=C with seed=99, should remain in the database as is as it cannot find any matching in year+cluster+day combination.



My actual data is of 1 million records with 10 years, 80 clusters, 500 days and 10+1 (1 to 10 and 99) seeds, so looking for so looking for an efficient solution.



     Year Cluster Seed Day value
1 1 a 1 1 60
2 1 a 1 2 68
3 1 a 1 3 78
4 1 a 1 4 90
5 1 a 1 5 107
6 1 a 2 1 61
7 1 a 2 2 73
8 1 a 2 3 86
9 1 a 2 4 91
10 1 a 2 5 104
11 1 a 3 1 57
12 1 a 3 2 67
13 1 a 3 3 79
14 1 a 3 4 96
15 1 a 3 5 105
16 1 c 99 1 10
17 2 b 1 1 60
18 2 b 1 2 68
19 2 b 1 3 78
20 2 b 1 4 90
21 2 b 1 5 107
22 2 b 2 1 61
23 2 b 2 2 73
24 2 b 2 3 86
25 2 b 2 4 91
26 2 b 2 5 104
27 2 b 3 1 57
28 2 b 3 2 67
29 2 b 3 3 79
30 2 b 3 4 96
31 2 b 3 5 105
32 2 d 99 1 10









share|improve this question

























  • please post expected output

    – Vivek Kalyanarangan
    Nov 23 '18 at 20:59
















1















I have a question on how to add the value from a group to rest of the elements in the group then delete that row. for ex:



df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))


in the above example, my data is grouped by Year, Cluster, Seed and Day where seed=99 values need to be added to above rows based on (Year, Cluster and Day) group then delete this row. for ex: Row # 16, is part of (Year=1, Cluster=a,Day=1 and Seed=99) group and the value of Row #16 which is 55 should be added to Row #1 (5+55), Row # 6 (6+55) and Row # 11 (2+55) and row # 16 should be deleted. But when it comes to Row #21, which is in cluster=C with seed=99, should remain in the database as is as it cannot find any matching in year+cluster+day combination.



My actual data is of 1 million records with 10 years, 80 clusters, 500 days and 10+1 (1 to 10 and 99) seeds, so looking for so looking for an efficient solution.



     Year Cluster Seed Day value
1 1 a 1 1 60
2 1 a 1 2 68
3 1 a 1 3 78
4 1 a 1 4 90
5 1 a 1 5 107
6 1 a 2 1 61
7 1 a 2 2 73
8 1 a 2 3 86
9 1 a 2 4 91
10 1 a 2 5 104
11 1 a 3 1 57
12 1 a 3 2 67
13 1 a 3 3 79
14 1 a 3 4 96
15 1 a 3 5 105
16 1 c 99 1 10
17 2 b 1 1 60
18 2 b 1 2 68
19 2 b 1 3 78
20 2 b 1 4 90
21 2 b 1 5 107
22 2 b 2 1 61
23 2 b 2 2 73
24 2 b 2 3 86
25 2 b 2 4 91
26 2 b 2 5 104
27 2 b 3 1 57
28 2 b 3 2 67
29 2 b 3 3 79
30 2 b 3 4 96
31 2 b 3 5 105
32 2 d 99 1 10









share|improve this question

























  • please post expected output

    – Vivek Kalyanarangan
    Nov 23 '18 at 20:59














1












1








1


1






I have a question on how to add the value from a group to rest of the elements in the group then delete that row. for ex:



df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))


in the above example, my data is grouped by Year, Cluster, Seed and Day where seed=99 values need to be added to above rows based on (Year, Cluster and Day) group then delete this row. for ex: Row # 16, is part of (Year=1, Cluster=a,Day=1 and Seed=99) group and the value of Row #16 which is 55 should be added to Row #1 (5+55), Row # 6 (6+55) and Row # 11 (2+55) and row # 16 should be deleted. But when it comes to Row #21, which is in cluster=C with seed=99, should remain in the database as is as it cannot find any matching in year+cluster+day combination.



My actual data is of 1 million records with 10 years, 80 clusters, 500 days and 10+1 (1 to 10 and 99) seeds, so looking for so looking for an efficient solution.



     Year Cluster Seed Day value
1 1 a 1 1 60
2 1 a 1 2 68
3 1 a 1 3 78
4 1 a 1 4 90
5 1 a 1 5 107
6 1 a 2 1 61
7 1 a 2 2 73
8 1 a 2 3 86
9 1 a 2 4 91
10 1 a 2 5 104
11 1 a 3 1 57
12 1 a 3 2 67
13 1 a 3 3 79
14 1 a 3 4 96
15 1 a 3 5 105
16 1 c 99 1 10
17 2 b 1 1 60
18 2 b 1 2 68
19 2 b 1 3 78
20 2 b 1 4 90
21 2 b 1 5 107
22 2 b 2 1 61
23 2 b 2 2 73
24 2 b 2 3 86
25 2 b 2 4 91
26 2 b 2 5 104
27 2 b 3 1 57
28 2 b 3 2 67
29 2 b 3 3 79
30 2 b 3 4 96
31 2 b 3 5 105
32 2 d 99 1 10









share|improve this question
















I have a question on how to add the value from a group to rest of the elements in the group then delete that row. for ex:



df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))


in the above example, my data is grouped by Year, Cluster, Seed and Day where seed=99 values need to be added to above rows based on (Year, Cluster and Day) group then delete this row. for ex: Row # 16, is part of (Year=1, Cluster=a,Day=1 and Seed=99) group and the value of Row #16 which is 55 should be added to Row #1 (5+55), Row # 6 (6+55) and Row # 11 (2+55) and row # 16 should be deleted. But when it comes to Row #21, which is in cluster=C with seed=99, should remain in the database as is as it cannot find any matching in year+cluster+day combination.



My actual data is of 1 million records with 10 years, 80 clusters, 500 days and 10+1 (1 to 10 and 99) seeds, so looking for so looking for an efficient solution.



     Year Cluster Seed Day value
1 1 a 1 1 60
2 1 a 1 2 68
3 1 a 1 3 78
4 1 a 1 4 90
5 1 a 1 5 107
6 1 a 2 1 61
7 1 a 2 2 73
8 1 a 2 3 86
9 1 a 2 4 91
10 1 a 2 5 104
11 1 a 3 1 57
12 1 a 3 2 67
13 1 a 3 3 79
14 1 a 3 4 96
15 1 a 3 5 105
16 1 c 99 1 10
17 2 b 1 1 60
18 2 b 1 2 68
19 2 b 1 3 78
20 2 b 1 4 90
21 2 b 1 5 107
22 2 b 2 1 61
23 2 b 2 2 73
24 2 b 2 3 86
25 2 b 2 4 91
26 2 b 2 5 104
27 2 b 3 1 57
28 2 b 3 2 67
29 2 b 3 3 79
30 2 b 3 4 96
31 2 b 3 5 105
32 2 d 99 1 10






r datatable dplyr tidyr zoo






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 23 '18 at 21:06







Ravindar G

















asked Nov 23 '18 at 20:41









Ravindar GRavindar G

103




103













  • please post expected output

    – Vivek Kalyanarangan
    Nov 23 '18 at 20:59



















  • please post expected output

    – Vivek Kalyanarangan
    Nov 23 '18 at 20:59

















please post expected output

– Vivek Kalyanarangan
Nov 23 '18 at 20:59





please post expected output

– Vivek Kalyanarangan
Nov 23 '18 at 20:59












2 Answers
2






active

oldest

votes


















0














Here's an approach using the tidyverse. If you're looking for speed with a million rows, a data.table solution will probably perform better.



library(tidyverse)

df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))

seeds <- df %>%
filter(Seed == 99)

matches <- df %>%
filter(Seed != 99) %>%
inner_join(select(seeds, -Seed), by = c("Year", "Cluster", "Day")) %>%
mutate(value = value.x + value.y) %>%
select(Year, Cluster, Seed, Day, value)

no_matches <- anti_join(seeds, matches, by = c("Year", "Cluster", "Day"))

bind_rows(matches, no_matches) %>%
arrange(Year, Cluster, Seed, Day)
#> Year Cluster Seed Day value
#> 1 1 a 1 1 60
#> 2 1 a 1 2 68
#> 3 1 a 1 3 78
#> 4 1 a 1 4 90
#> 5 1 a 1 5 107
#> 6 1 a 2 1 61
#> 7 1 a 2 2 73
#> 8 1 a 2 3 86
#> 9 1 a 2 4 91
#> 10 1 a 2 5 104
#> 11 1 a 3 1 57
#> 12 1 a 3 2 67
#> 13 1 a 3 3 79
#> 14 1 a 3 4 96
#> 15 1 a 3 5 105
#> 16 1 c 99 1 10
#> 17 2 b 1 1 60
#> 18 2 b 1 2 68
#> 19 2 b 1 3 78
#> 20 2 b 1 4 90
#> 21 2 b 1 5 107
#> 22 2 b 2 1 61
#> 23 2 b 2 2 73
#> 24 2 b 2 3 86
#> 25 2 b 2 4 91
#> 26 2 b 2 5 104
#> 27 2 b 3 1 57
#> 28 2 b 3 2 67
#> 29 2 b 3 3 79
#> 30 2 b 3 4 96
#> 31 2 b 3 5 105
#> 32 2 d 99 1 10


Created on 2018-11-23 by the reprex package (v0.2.1)






share|improve this answer































    0














    A data.table approach:



    library(data.table)

    df <- setDT(df)[, `:=` (value = ifelse(Seed != 99, value + value[Seed == 99], value),
    flag = Seed == 99 & .N == 1), by = .(Year, Cluster, Day)][!(Seed == 99 & flag == FALSE),][, "flag" := NULL]


    Output:



    df

    Year Cluster Seed Day value
    1: 1 a 1 1 60
    2: 1 a 1 2 68
    3: 1 a 1 3 78
    4: 1 a 1 4 90
    5: 1 a 1 5 107
    6: 1 a 2 1 61
    7: 1 a 2 2 73
    8: 1 a 2 3 86
    9: 1 a 2 4 91
    10: 1 a 2 5 104
    11: 1 a 3 1 57
    12: 1 a 3 2 67
    13: 1 a 3 3 79
    14: 1 a 3 4 96
    15: 1 a 3 5 105
    16: 1 c 99 1 10
    17: 2 b 1 1 60
    18: 2 b 1 2 68
    19: 2 b 1 3 78
    20: 2 b 1 4 90
    21: 2 b 1 5 107
    22: 2 b 2 1 61
    23: 2 b 2 2 73
    24: 2 b 2 3 86
    25: 2 b 2 4 91
    26: 2 b 2 5 104
    27: 2 b 3 1 57
    28: 2 b 3 2 67
    29: 2 b 3 3 79
    30: 2 b 3 4 96
    31: 2 b 3 5 105
    32: 2 d 99 1 10





    share|improve this answer

























      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452730%2fadd-values-of-one-group-into-another-group-in-r%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      0














      Here's an approach using the tidyverse. If you're looking for speed with a million rows, a data.table solution will probably perform better.



      library(tidyverse)

      df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
      Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
      Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
      Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
      value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))

      seeds <- df %>%
      filter(Seed == 99)

      matches <- df %>%
      filter(Seed != 99) %>%
      inner_join(select(seeds, -Seed), by = c("Year", "Cluster", "Day")) %>%
      mutate(value = value.x + value.y) %>%
      select(Year, Cluster, Seed, Day, value)

      no_matches <- anti_join(seeds, matches, by = c("Year", "Cluster", "Day"))

      bind_rows(matches, no_matches) %>%
      arrange(Year, Cluster, Seed, Day)
      #> Year Cluster Seed Day value
      #> 1 1 a 1 1 60
      #> 2 1 a 1 2 68
      #> 3 1 a 1 3 78
      #> 4 1 a 1 4 90
      #> 5 1 a 1 5 107
      #> 6 1 a 2 1 61
      #> 7 1 a 2 2 73
      #> 8 1 a 2 3 86
      #> 9 1 a 2 4 91
      #> 10 1 a 2 5 104
      #> 11 1 a 3 1 57
      #> 12 1 a 3 2 67
      #> 13 1 a 3 3 79
      #> 14 1 a 3 4 96
      #> 15 1 a 3 5 105
      #> 16 1 c 99 1 10
      #> 17 2 b 1 1 60
      #> 18 2 b 1 2 68
      #> 19 2 b 1 3 78
      #> 20 2 b 1 4 90
      #> 21 2 b 1 5 107
      #> 22 2 b 2 1 61
      #> 23 2 b 2 2 73
      #> 24 2 b 2 3 86
      #> 25 2 b 2 4 91
      #> 26 2 b 2 5 104
      #> 27 2 b 3 1 57
      #> 28 2 b 3 2 67
      #> 29 2 b 3 3 79
      #> 30 2 b 3 4 96
      #> 31 2 b 3 5 105
      #> 32 2 d 99 1 10


      Created on 2018-11-23 by the reprex package (v0.2.1)






      share|improve this answer




























        0














        Here's an approach using the tidyverse. If you're looking for speed with a million rows, a data.table solution will probably perform better.



        library(tidyverse)

        df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
        Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
        Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
        Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
        value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))

        seeds <- df %>%
        filter(Seed == 99)

        matches <- df %>%
        filter(Seed != 99) %>%
        inner_join(select(seeds, -Seed), by = c("Year", "Cluster", "Day")) %>%
        mutate(value = value.x + value.y) %>%
        select(Year, Cluster, Seed, Day, value)

        no_matches <- anti_join(seeds, matches, by = c("Year", "Cluster", "Day"))

        bind_rows(matches, no_matches) %>%
        arrange(Year, Cluster, Seed, Day)
        #> Year Cluster Seed Day value
        #> 1 1 a 1 1 60
        #> 2 1 a 1 2 68
        #> 3 1 a 1 3 78
        #> 4 1 a 1 4 90
        #> 5 1 a 1 5 107
        #> 6 1 a 2 1 61
        #> 7 1 a 2 2 73
        #> 8 1 a 2 3 86
        #> 9 1 a 2 4 91
        #> 10 1 a 2 5 104
        #> 11 1 a 3 1 57
        #> 12 1 a 3 2 67
        #> 13 1 a 3 3 79
        #> 14 1 a 3 4 96
        #> 15 1 a 3 5 105
        #> 16 1 c 99 1 10
        #> 17 2 b 1 1 60
        #> 18 2 b 1 2 68
        #> 19 2 b 1 3 78
        #> 20 2 b 1 4 90
        #> 21 2 b 1 5 107
        #> 22 2 b 2 1 61
        #> 23 2 b 2 2 73
        #> 24 2 b 2 3 86
        #> 25 2 b 2 4 91
        #> 26 2 b 2 5 104
        #> 27 2 b 3 1 57
        #> 28 2 b 3 2 67
        #> 29 2 b 3 3 79
        #> 30 2 b 3 4 96
        #> 31 2 b 3 5 105
        #> 32 2 d 99 1 10


        Created on 2018-11-23 by the reprex package (v0.2.1)






        share|improve this answer


























          0












          0








          0







          Here's an approach using the tidyverse. If you're looking for speed with a million rows, a data.table solution will probably perform better.



          library(tidyverse)

          df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
          Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
          Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
          Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
          value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))

          seeds <- df %>%
          filter(Seed == 99)

          matches <- df %>%
          filter(Seed != 99) %>%
          inner_join(select(seeds, -Seed), by = c("Year", "Cluster", "Day")) %>%
          mutate(value = value.x + value.y) %>%
          select(Year, Cluster, Seed, Day, value)

          no_matches <- anti_join(seeds, matches, by = c("Year", "Cluster", "Day"))

          bind_rows(matches, no_matches) %>%
          arrange(Year, Cluster, Seed, Day)
          #> Year Cluster Seed Day value
          #> 1 1 a 1 1 60
          #> 2 1 a 1 2 68
          #> 3 1 a 1 3 78
          #> 4 1 a 1 4 90
          #> 5 1 a 1 5 107
          #> 6 1 a 2 1 61
          #> 7 1 a 2 2 73
          #> 8 1 a 2 3 86
          #> 9 1 a 2 4 91
          #> 10 1 a 2 5 104
          #> 11 1 a 3 1 57
          #> 12 1 a 3 2 67
          #> 13 1 a 3 3 79
          #> 14 1 a 3 4 96
          #> 15 1 a 3 5 105
          #> 16 1 c 99 1 10
          #> 17 2 b 1 1 60
          #> 18 2 b 1 2 68
          #> 19 2 b 1 3 78
          #> 20 2 b 1 4 90
          #> 21 2 b 1 5 107
          #> 22 2 b 2 1 61
          #> 23 2 b 2 2 73
          #> 24 2 b 2 3 86
          #> 25 2 b 2 4 91
          #> 26 2 b 2 5 104
          #> 27 2 b 3 1 57
          #> 28 2 b 3 2 67
          #> 29 2 b 3 3 79
          #> 30 2 b 3 4 96
          #> 31 2 b 3 5 105
          #> 32 2 d 99 1 10


          Created on 2018-11-23 by the reprex package (v0.2.1)






          share|improve this answer













          Here's an approach using the tidyverse. If you're looking for speed with a million rows, a data.table solution will probably perform better.



          library(tidyverse)

          df <- data.frame(Year=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
          Cluster=c("a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","c","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","b","d"),
          Seed=c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,99,99,99,99,99,99),
          Day=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1),
          value=c(5,2,1,2,8,6,7,9,3,5,2,1,2,8,6,55,66,77,88,99,10))

          seeds <- df %>%
          filter(Seed == 99)

          matches <- df %>%
          filter(Seed != 99) %>%
          inner_join(select(seeds, -Seed), by = c("Year", "Cluster", "Day")) %>%
          mutate(value = value.x + value.y) %>%
          select(Year, Cluster, Seed, Day, value)

          no_matches <- anti_join(seeds, matches, by = c("Year", "Cluster", "Day"))

          bind_rows(matches, no_matches) %>%
          arrange(Year, Cluster, Seed, Day)
          #> Year Cluster Seed Day value
          #> 1 1 a 1 1 60
          #> 2 1 a 1 2 68
          #> 3 1 a 1 3 78
          #> 4 1 a 1 4 90
          #> 5 1 a 1 5 107
          #> 6 1 a 2 1 61
          #> 7 1 a 2 2 73
          #> 8 1 a 2 3 86
          #> 9 1 a 2 4 91
          #> 10 1 a 2 5 104
          #> 11 1 a 3 1 57
          #> 12 1 a 3 2 67
          #> 13 1 a 3 3 79
          #> 14 1 a 3 4 96
          #> 15 1 a 3 5 105
          #> 16 1 c 99 1 10
          #> 17 2 b 1 1 60
          #> 18 2 b 1 2 68
          #> 19 2 b 1 3 78
          #> 20 2 b 1 4 90
          #> 21 2 b 1 5 107
          #> 22 2 b 2 1 61
          #> 23 2 b 2 2 73
          #> 24 2 b 2 3 86
          #> 25 2 b 2 4 91
          #> 26 2 b 2 5 104
          #> 27 2 b 3 1 57
          #> 28 2 b 3 2 67
          #> 29 2 b 3 3 79
          #> 30 2 b 3 4 96
          #> 31 2 b 3 5 105
          #> 32 2 d 99 1 10


          Created on 2018-11-23 by the reprex package (v0.2.1)







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 23 '18 at 21:13









          Jake KauppJake Kaupp

          5,50221428




          5,50221428

























              0














              A data.table approach:



              library(data.table)

              df <- setDT(df)[, `:=` (value = ifelse(Seed != 99, value + value[Seed == 99], value),
              flag = Seed == 99 & .N == 1), by = .(Year, Cluster, Day)][!(Seed == 99 & flag == FALSE),][, "flag" := NULL]


              Output:



              df

              Year Cluster Seed Day value
              1: 1 a 1 1 60
              2: 1 a 1 2 68
              3: 1 a 1 3 78
              4: 1 a 1 4 90
              5: 1 a 1 5 107
              6: 1 a 2 1 61
              7: 1 a 2 2 73
              8: 1 a 2 3 86
              9: 1 a 2 4 91
              10: 1 a 2 5 104
              11: 1 a 3 1 57
              12: 1 a 3 2 67
              13: 1 a 3 3 79
              14: 1 a 3 4 96
              15: 1 a 3 5 105
              16: 1 c 99 1 10
              17: 2 b 1 1 60
              18: 2 b 1 2 68
              19: 2 b 1 3 78
              20: 2 b 1 4 90
              21: 2 b 1 5 107
              22: 2 b 2 1 61
              23: 2 b 2 2 73
              24: 2 b 2 3 86
              25: 2 b 2 4 91
              26: 2 b 2 5 104
              27: 2 b 3 1 57
              28: 2 b 3 2 67
              29: 2 b 3 3 79
              30: 2 b 3 4 96
              31: 2 b 3 5 105
              32: 2 d 99 1 10





              share|improve this answer






























                0














                A data.table approach:



                library(data.table)

                df <- setDT(df)[, `:=` (value = ifelse(Seed != 99, value + value[Seed == 99], value),
                flag = Seed == 99 & .N == 1), by = .(Year, Cluster, Day)][!(Seed == 99 & flag == FALSE),][, "flag" := NULL]


                Output:



                df

                Year Cluster Seed Day value
                1: 1 a 1 1 60
                2: 1 a 1 2 68
                3: 1 a 1 3 78
                4: 1 a 1 4 90
                5: 1 a 1 5 107
                6: 1 a 2 1 61
                7: 1 a 2 2 73
                8: 1 a 2 3 86
                9: 1 a 2 4 91
                10: 1 a 2 5 104
                11: 1 a 3 1 57
                12: 1 a 3 2 67
                13: 1 a 3 3 79
                14: 1 a 3 4 96
                15: 1 a 3 5 105
                16: 1 c 99 1 10
                17: 2 b 1 1 60
                18: 2 b 1 2 68
                19: 2 b 1 3 78
                20: 2 b 1 4 90
                21: 2 b 1 5 107
                22: 2 b 2 1 61
                23: 2 b 2 2 73
                24: 2 b 2 3 86
                25: 2 b 2 4 91
                26: 2 b 2 5 104
                27: 2 b 3 1 57
                28: 2 b 3 2 67
                29: 2 b 3 3 79
                30: 2 b 3 4 96
                31: 2 b 3 5 105
                32: 2 d 99 1 10





                share|improve this answer




























                  0












                  0








                  0







                  A data.table approach:



                  library(data.table)

                  df <- setDT(df)[, `:=` (value = ifelse(Seed != 99, value + value[Seed == 99], value),
                  flag = Seed == 99 & .N == 1), by = .(Year, Cluster, Day)][!(Seed == 99 & flag == FALSE),][, "flag" := NULL]


                  Output:



                  df

                  Year Cluster Seed Day value
                  1: 1 a 1 1 60
                  2: 1 a 1 2 68
                  3: 1 a 1 3 78
                  4: 1 a 1 4 90
                  5: 1 a 1 5 107
                  6: 1 a 2 1 61
                  7: 1 a 2 2 73
                  8: 1 a 2 3 86
                  9: 1 a 2 4 91
                  10: 1 a 2 5 104
                  11: 1 a 3 1 57
                  12: 1 a 3 2 67
                  13: 1 a 3 3 79
                  14: 1 a 3 4 96
                  15: 1 a 3 5 105
                  16: 1 c 99 1 10
                  17: 2 b 1 1 60
                  18: 2 b 1 2 68
                  19: 2 b 1 3 78
                  20: 2 b 1 4 90
                  21: 2 b 1 5 107
                  22: 2 b 2 1 61
                  23: 2 b 2 2 73
                  24: 2 b 2 3 86
                  25: 2 b 2 4 91
                  26: 2 b 2 5 104
                  27: 2 b 3 1 57
                  28: 2 b 3 2 67
                  29: 2 b 3 3 79
                  30: 2 b 3 4 96
                  31: 2 b 3 5 105
                  32: 2 d 99 1 10





                  share|improve this answer















                  A data.table approach:



                  library(data.table)

                  df <- setDT(df)[, `:=` (value = ifelse(Seed != 99, value + value[Seed == 99], value),
                  flag = Seed == 99 & .N == 1), by = .(Year, Cluster, Day)][!(Seed == 99 & flag == FALSE),][, "flag" := NULL]


                  Output:



                  df

                  Year Cluster Seed Day value
                  1: 1 a 1 1 60
                  2: 1 a 1 2 68
                  3: 1 a 1 3 78
                  4: 1 a 1 4 90
                  5: 1 a 1 5 107
                  6: 1 a 2 1 61
                  7: 1 a 2 2 73
                  8: 1 a 2 3 86
                  9: 1 a 2 4 91
                  10: 1 a 2 5 104
                  11: 1 a 3 1 57
                  12: 1 a 3 2 67
                  13: 1 a 3 3 79
                  14: 1 a 3 4 96
                  15: 1 a 3 5 105
                  16: 1 c 99 1 10
                  17: 2 b 1 1 60
                  18: 2 b 1 2 68
                  19: 2 b 1 3 78
                  20: 2 b 1 4 90
                  21: 2 b 1 5 107
                  22: 2 b 2 1 61
                  23: 2 b 2 2 73
                  24: 2 b 2 3 86
                  25: 2 b 2 4 91
                  26: 2 b 2 5 104
                  27: 2 b 3 1 57
                  28: 2 b 3 2 67
                  29: 2 b 3 3 79
                  30: 2 b 3 4 96
                  31: 2 b 3 5 105
                  32: 2 d 99 1 10






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Nov 23 '18 at 21:29

























                  answered Nov 23 '18 at 21:00









                  arg0nautarg0naut

                  2,239314




                  2,239314






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53452730%2fadd-values-of-one-group-into-another-group-in-r%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Berounka

                      Different font size/position of beamer's navigation symbols template's content depending on regular/plain...

                      Sphinx de Gizeh