Average neighbours inside a vector

My data:

data <- c(1,5,11,15,24,31,32,65)

There are two neighbours: 31 and 32. I wish to remove them and keep only their mean value (i.e. 31.5), so that data becomes:

data <- c(1,5,11,15,24,31.5,65)

It seems simple, but I wish to do it automatically, and sometimes with vectors containing more neighbours. For instance:

data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140)

edited Dec 10 '18 at 11:47

asked Dec 10 '18 at 11:40

Loulou


Is this only about pairs of consecutive numbers or also about longer runs, e.g. 31, 32, 33, 34?

– Klaus Gütter
Dec 10 '18 at 12:08

It could also be longer runs (like 99, 100, 101 in data_2).

– Loulou
Dec 10 '18 at 12:11


Maybe use the cumsum(...diff(... idiom to create groups, like tapply(data, cumsum(c(1L, diff(data) > 1)), mean)

– Henrik
Dec 10 '18 at 12:26

Is your data sorted?

– Konrad Rudolph
Dec 10 '18 at 13:55

Yes, always in increasing order.

– Loulou
Dec 10 '18 at 14:57
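
The cumsum(...diff(... idiom from the comments is easier to follow with its intermediate values spelled out. A minimal illustration on data from the question (the variable names gaps, starts_run, grp and runs are only for this sketch):

```r
data <- c(1, 5, 11, 15, 24, 31, 32, 65)

gaps <- diff(data)               # 4 6 4 9 7 1 33
starts_run <- c(TRUE, gaps > 1)  # TRUE wherever a new run begins
grp <- cumsum(starts_run)        # 1 2 3 4 5 6 6 7

# Each list element is one run; 31 and 32 share group 6.
runs <- split(data, grp)
lengths(runs)
```

Any per-group function (mean, in this question) can then be applied to grp, which is what the tapply() call in the comment does.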

r vector difference neighbours


4 Answers

Here is another idea that creates a group id via cumsum(c(TRUE, diff(a) > 1)), where 1 is the gap threshold: a new group starts whenever the difference to the previous element exceeds it.

DATA:

a <- c(1,5,11,15,24,31,32,65)
a_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

#our group variable
grp <- cumsum(c(TRUE, diff(a) > 1))

#keep only the elements whose group has length 1 (i.e. with no neighbour)
i1 <- a[!ave(a, grp, FUN = function(i) length(i) > 1)]

#find the mean of the groups with more than 1 element (NaN for the others)
i2 <- unname(tapply(a, grp, function(i) mean(i[length(i) > 1])))

#concatenate the two (dropping the NaNs from i2) to get the final result
c(i1, i2[!is.na(i2)])
#[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5

Note that the means are appended at the end, so the result is not in increasing order; wrap it in sort() if the order matters.

You can also wrap it in a function. I left the gap as a parameter so you can adjust it:

get_vec <- function(x, gap) {
    grp <- cumsum(c(TRUE, diff(x) > gap))
    i1 <- x[!ave(x, grp, FUN = function(i) length(i) > 1)]
    i2 <- unname(tapply(x, grp, function(i) mean(i[length(i) > 1])))
    return(c(i1, i2[!is.na(i2)]))
}

get_vec(a, 1)
#[1]  1.0  5.0 11.0 15.0 24.0 65.0 31.5

get_vec(a_2, 1)
#[1]   1.0   5.0  11.0  15.0  24.0  65.0 140.0  31.5 100.0
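
If the averaged values should stay in their original positions rather than being appended at the end, a single tapply() over the same group id may be enough, because a group of length 1 is its own mean. A minimal sketch, assuming sorted numeric input; merge_runs is a hypothetical name, not part of the code above:

```r
# Hypothetical helper: one mean per run of values whose consecutive
# differences do not exceed `gap`. Assumes x is sorted increasingly.
merge_runs <- function(x, gap = 1) {
  if (length(x) == 0) return(x)           # tapply() needs at least one value
  grp <- cumsum(c(TRUE, diff(x) > gap))   # same group id as above
  as.vector(tapply(x, grp, mean))         # drop names and dim
}

merge_runs(c(1, 5, 11, 15, 24, 31, 32, 65))
#[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0
merge_runs(c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140))
#[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0
```

The order is preserved because tapply() returns its results in the order of the sorted group ids, and the ids increase along the vector.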

edited Dec 10 '18 at 13:50

answered Dec 10 '18 at 12:30

Sotos


Here is my solution, which uses run-length encoding to identify groups:

foo <- function(x) {
  y <- x - seq_along(x) #normalize to zero differences in groups
  ind <- rle(y) #run-length encoding
  ind$values <- ind$lengths != 1 #to find groups
  ind$values[ind$values] <- cumsum(ind$values[ind$values]) #group ids
  ind <- inverse.rle(ind)
  xnew <- x
  xnew[ind != 0] <- ave(x, ind, FUN = mean)[ind != 0] #calculate means
  xnew[!(duplicated(ind) & ind != 0)] #remove duplicates from groups
}

foo(data)
#[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0

foo(data_2)
#[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0

data_3 <- c(1, 2, 4, 1, 2)
foo(data_3)
#[1] 1.5 4.0 1.5
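
To see why subtracting seq_along(x) works: inside a run of consecutive integers both the value and the index grow by 1, so their difference stays constant, and rle() then reports the run as one block. A small illustration on data_2 from the question (y and r are names used only here):

```r
data_2 <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

y <- data_2 - seq_along(data_2)
y
#[1]   0   3   8  11  19  25  25  57  90  90  90 128

r <- rle(y)
r$lengths
#[1] 1 1 1 1 1 2 1 3 1
```

Because rle() compares values for exact equality, this trick is tied to steps of exactly 1; a different gap threshold needs the diff()-based grouping instead.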

I assume that you don't need an extremely efficient solution. If you do, I'd recommend a simple C++ for loop in Rcpp.
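
Such a loop is a single pass that keeps a running sum and count and emits a mean whenever a run ends. It is sketched here in plain R rather than C++, so it shows the logic and not the speed. merge_runs_loop is a hypothetical name, and the sketch assumes sorted input, so unlike foo() it is not meant for vectors such as data_3:

```r
# Hypothetical single-pass version; assumes x is sorted increasingly.
merge_runs_loop <- function(x, gap = 1) {
  n <- length(x)
  if (n == 0) return(numeric(0))
  out <- numeric(n)   # upper bound on the result length
  k <- 0              # number of runs emitted so far
  run_sum <- x[1]
  run_len <- 1
  for (i in seq_len(n)[-1]) {
    if (x[i] - x[i - 1] > gap) {
      # the current run has ended: store its mean, start a new run
      k <- k + 1
      out[k] <- run_sum / run_len
      run_sum <- x[i]
      run_len <- 1
    } else {
      # still inside the same run
      run_sum <- run_sum + x[i]
      run_len <- run_len + 1
    }
  }
  # flush the last run
  k <- k + 1
  out[k] <- run_sum / run_len
  out[seq_len(k)]
}

merge_runs_loop(c(1, 5, 11, 15, 24, 31, 32, 65))
#[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0
```

The body maps almost line for line onto a C++ loop over a numeric vector if the speed is ever needed.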

edited Dec 10 '18 at 12:19

answered Dec 10 '18 at 12:05

Roland


I have a data.table based solution; the same could be translated into dplyr, I guess:

library(data.table)
df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
df[,neighbours := ifelse(c(0,diff(data2)) == 1,1,0)]
df[,neighbours := as.numeric(neighbours == 1 | shift(neighbours, type = "lead", fill = 0) == 1)]
df[,neigh_seq := rleid(neighbours)]

unique(df[,ifelse(neighbours == 1,mean(data2),data2),by = neigh_seq])

   neigh_seq    V1
1:         1   1.0
2:         1   5.0
3:         1  11.0
4:         1  15.0
5:         1  24.0
6:         2  31.5
7:         3  65.0
8:         4 100.0
9:         5 140.0

What it does:
The first line sets neighbours to 1 if the difference with the preceding number is 1:

    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          0
 7:    32          1
 8:    65          0
 9:    99          0
10:   100          1
11:   101          1
12:   140          0

I want the neighbours variable to be 1 for all neighbours, so I also need to flag the first element of each run, i.e. every row whose following row is flagged:

df[,neighbours := as.numeric(neighbours == 1 | shift(neighbours, type = "lead", fill = 0) == 1)]

    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          1
 7:    32          1
 8:    65          0
 9:    99          1
10:   100          1
11:   101          1
12:   140          0

Then I just group on the changes in the neighbours value, and set the value to the group mean if they are neighbours:

df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]

    rleid    V1
 1:     1   1.0
 2:     1   5.0
 3:     1  11.0
 4:     1  15.0
 5:     1  24.0
 6:     2  31.5
 7:     2  31.5
 8:     3  65.0
 9:     4 100.0
10:     4 100.0
11:     4 100.0
12:     5 140.0

and take the unique values. And voilà.

Note that this relies on two runs being separated by at least one non-neighbour: in c(31, 32, 34, 35) every row ends up flagged, so both runs would be averaged together.
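
The flagging steps can also be followed without data.table. A base R sketch of the same idea, assuming sorted input; after_prev, in_run and stretch are illustrative names:

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

# Step 1: TRUE where the element is exactly 1 above its predecessor.
after_prev <- c(FALSE, diff(x) == 1)

# Step 2: also flag the element in front of each flagged one,
# so that every member of a run is marked.
in_run <- after_prev | c(after_prev[-1], FALSE)

# Step 3: number the stretches of equal flags (the role rleid() plays).
stretch <- cumsum(c(TRUE, diff(in_run) != 0))

# Replace flagged elements by their stretch mean, then keep one value per run.
res <- ifelse(in_run, ave(x, stretch), x)
res <- res[!(in_run & duplicated(stretch))]
res
#[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0
```

Dropping values by position with duplicated(stretch) avoids relying on unique(), which would also collapse unrelated elements that happen to share a value.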

edited Dec 10 '18 at 12:30

answered Dec 10 '18 at 12:24

denis


This is a dplyr version, which also uses cumsum(c(1,diff(x)!=1)) as the grouping variable:

library(dplyr)

data_2 %>% data.frame(x = .) %>%
  group_by(id = cumsum(c(1,diff(x)!=1))) %>%
  summarise(res = mean(x)) %>%
  select(res)

# A tibble: 9 x 1
    res
  <dbl>
1   1.0
2   5.0
3  11.0
4  15.0
5  24.0
6  31.5
7  65.0
8 100.0
9 140.0
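
The condition here is diff(x) != 1, whereas the other answers use diff(x) > 1. For strictly increasing integers the two agree, but they differ once the vector contains repeated values, because a difference of 0 starts a new group under != 1 and not under > 1. A small check on a made-up vector (x, grp_ne and grp_gt are illustrative names):

```r
x <- c(1, 5, 5, 6, 10)             # made-up input with a repeated value

grp_ne <- cumsum(c(TRUE, diff(x) != 1))
grp_gt <- cumsum(c(TRUE, diff(x) > 1))

grp_ne
#[1] 1 2 3 3 4
grp_gt
#[1] 1 2 2 2 3

as.vector(tapply(x, grp_ne, mean))  # 1.0 5.0 5.5 10.0
as.vector(tapply(x, grp_gt, mean))  # 1, 16/3, 10
```

Which behaviour is wanted depends on whether equal values should count as neighbours; the question's data has no repeats, so either condition gives the same result there.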

answered Dec 10 '18 at 19:06

Lamia
