Probability of finding a $k$-subset in a $d$-dimensional random binary array of length $n$ with at least...

For a somwehat simple version of the problem, imagine a group of ten people ($n=10$), for which we observe two binary attributes ($d= 2$). One of the attributes tells us if any given person wears glasses ($d_1$) the other if any given person wears a hat ($d_2$). We know that in total $5$ people wear hats and $6$ people wear glasses, but we don't know who wears what (random binary array). What is the probability of finding a subset of size $4$ ($k= 4$) such that at least three people in the subset wear a hat ($s_1 = 3$) AND at least four people wear glasses ($s_2 = 4$)? It's somewhat clear that there are realisations where wearing a hat and wearing glasses are mutually exclusive, such that those who have a hat don't have glasses, and realisations where having glasses and having a hat go together. Thus, for any realisation of the random binary array there might or might not be a subset of size $k$ with the above properties. I can't figure, though, how to calculate the probabilities.

The more general definition of the problem is this:

What is the probability of finding a $k$-subset in a $d$-dimensional random binary array of length $n$, such that this subset has at least $s_1$ ones in dimension $1$, $s_2$ ones in dimension $2$, ..., $s_d$ ones in dimension $d$. The total number of ones per dimension ($ts_1$ to $ts_d$) are fixed.

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger $d$, $k$ and $n$ when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

add a comment |

The more general definition of the problem is this:

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger $d$, $k$ and $n$ when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

add a comment |

The more general definition of the problem is this:

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger $d$, $k$ and $n$ when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

The more general definition of the problem is this:

Is there a name to this problem? I tried to google, but I really did not know what to look for, since the problem definition is quite unwieldy to begin with.

Is the problem tractable analytically? If yes does the solution scale to larger $d$, $k$ and $n$ when solved algorithmically?

Your help is highly appreciated! Thanks a lot in advance!

combinatorics random-variables

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

edited Dec 8 '18 at 7:18

Brahadeesh

6,19742361

asked Dec 7 '18 at 23:56

derpetermann

asked Dec 7 '18 at 23:56

derpetermann

asked Dec 7 '18 at 23:56

derpetermann

add a comment |

1 Answer
1

active

oldest

votes

$newcommand{prob}[1]{text{Pr}mathopen{}left[#1right]}$
$newcommand{ksubsets}{binom{[N]}{k}}$

I am a colleague of @derpetermann and wanted to share a attempt to solving this, in the hope that it clarifies things and from there on it may be easier for others to solve it. So this is the way I approached the problem...

Among a population of $N$ individuals, every individual $n in [N]$ (I use the short notation [N] for ${1,2,..,N}$) is described by absence or presence of $D$ features:

$$ X_{n,d} :=
begin{cases}
1 & text{Feature $d$ present in individual $n$} \
0 & text{o/w}
end{cases}
$$

We know the total number of occurrences of each feature, given by

$$C_d := sum_{n in [N]} X_{n,d}$$

Otherwise, for a fixed $d$, the $X_{n,d}$ are equally distributed. Now we want to know: What is the probability of finding a subset $S$ of $k$ individuals, such that every feature $d$ is present in at least $c_d$ individuals in $S$ (for given $c_d in [k]$). I.e. how can we compute (or) estimate the following probability?

$$prob{bigcup_{S in binom{[N]}{k}} bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big) }$$

Note: While the $X_{cdot,d}$ for different features are independent, the union is not over independent (nor disjoint) events. And that is where we got stuck. We can rewrite the inner part (the probability that a fixed $k$-subset $S$ satisfies our condition):

$$Y_S := bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big)$$
$$prob{Y_S} = prod_{d in [D]} frac{sum_{c=c_d}^{min(k, C_d)} binom{k}{c} binom{N-k}{C_d - c}}{binom{N}{C_d}}$$

but have no idea of how to compute the union $prob{bigcup_{S in binom{[N]}{k}} Y_S}$. Even a useful bound upper bound on this probability would be highly appreciated.

answered Dec 20 '18 at 16:40

Nico

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3030523%2fprobability-of-finding-a-k-subset-in-a-d-dimensional-random-binary-array-of%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

$newcommand{prob}[1]{text{Pr}mathopen{}left[#1right]}$
$newcommand{ksubsets}{binom{[N]}{k}}$

Among a population of $N$ individuals, every individual $n in [N]$ (I use the short notation [N] for ${1,2,..,N}$) is described by absence or presence of $D$ features:

$$ X_{n,d} :=
begin{cases}
1 & text{Feature $d$ present in individual $n$} \
0 & text{o/w}
end{cases}
$$

We know the total number of occurrences of each feature, given by

$$C_d := sum_{n in [N]} X_{n,d}$$

$$prob{bigcup_{S in binom{[N]}{k}} bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big) }$$

$$Y_S := bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big)$$
$$prob{Y_S} = prod_{d in [D]} frac{sum_{c=c_d}^{min(k, C_d)} binom{k}{c} binom{N-k}{C_d - c}}{binom{N}{C_d}}$$

but have no idea of how to compute the union $prob{bigcup_{S in binom{[N]}{k}} Y_S}$. Even a useful bound upper bound on this probability would be highly appreciated.

answered Dec 20 '18 at 16:40

Nico

add a comment |

$newcommand{prob}[1]{text{Pr}mathopen{}left[#1right]}$
$newcommand{ksubsets}{binom{[N]}{k}}$

Among a population of $N$ individuals, every individual $n in [N]$ (I use the short notation [N] for ${1,2,..,N}$) is described by absence or presence of $D$ features:

$$ X_{n,d} :=
begin{cases}
1 & text{Feature $d$ present in individual $n$} \
0 & text{o/w}
end{cases}
$$

We know the total number of occurrences of each feature, given by

$$C_d := sum_{n in [N]} X_{n,d}$$

$$prob{bigcup_{S in binom{[N]}{k}} bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big) }$$

$$Y_S := bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big)$$
$$prob{Y_S} = prod_{d in [D]} frac{sum_{c=c_d}^{min(k, C_d)} binom{k}{c} binom{N-k}{C_d - c}}{binom{N}{C_d}}$$

but have no idea of how to compute the union $prob{bigcup_{S in binom{[N]}{k}} Y_S}$. Even a useful bound upper bound on this probability would be highly appreciated.

answered Dec 20 '18 at 16:40

Nico

add a comment |

$newcommand{prob}[1]{text{Pr}mathopen{}left[#1right]}$
$newcommand{ksubsets}{binom{[N]}{k}}$

Among a population of $N$ individuals, every individual $n in [N]$ (I use the short notation [N] for ${1,2,..,N}$) is described by absence or presence of $D$ features:

$$ X_{n,d} :=
begin{cases}
1 & text{Feature $d$ present in individual $n$} \
0 & text{o/w}
end{cases}
$$

We know the total number of occurrences of each feature, given by

$$C_d := sum_{n in [N]} X_{n,d}$$

$$prob{bigcup_{S in binom{[N]}{k}} bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big) }$$

$$Y_S := bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big)$$
$$prob{Y_S} = prod_{d in [D]} frac{sum_{c=c_d}^{min(k, C_d)} binom{k}{c} binom{N-k}{C_d - c}}{binom{N}{C_d}}$$

but have no idea of how to compute the union $prob{bigcup_{S in binom{[N]}{k}} Y_S}$. Even a useful bound upper bound on this probability would be highly appreciated.

answered Dec 20 '18 at 16:40

Nico

$newcommand{prob}[1]{text{Pr}mathopen{}left[#1right]}$
$newcommand{ksubsets}{binom{[N]}{k}}$

Among a population of $N$ individuals, every individual $n in [N]$ (I use the short notation [N] for ${1,2,..,N}$) is described by absence or presence of $D$ features:

$$ X_{n,d} :=
begin{cases}
1 & text{Feature $d$ present in individual $n$} \
0 & text{o/w}
end{cases}
$$

We know the total number of occurrences of each feature, given by

$$C_d := sum_{n in [N]} X_{n,d}$$

$$prob{bigcup_{S in binom{[N]}{k}} bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big) }$$

$$Y_S := bigcap_{d in [D]} Big( sum_{n in S} X_{n,d} geq c_d Big)$$
$$prob{Y_S} = prod_{d in [D]} frac{sum_{c=c_d}^{min(k, C_d)} binom{k}{c} binom{N-k}{C_d - c}}{binom{N}{C_d}}$$

but have no idea of how to compute the union $prob{bigcup_{S in binom{[N]}{k}} Y_S}$. Even a useful bound upper bound on this probability would be highly appreciated.

answered Dec 20 '18 at 16:40

Nico

answered Dec 20 '18 at 16:40

Nico

answered Dec 20 '18 at 16:40

Nico

answered Dec 20 '18 at 16:40

Nico

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Mathematics Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Htykuut