
Commit

making requested adjustments
connor-pph committed Jan 31, 2025
1 parent fbc0baa commit f4a7ed1
Showing 1 changed file with 5 additions and 13 deletions.
18 changes: 5 additions & 13 deletions docs/notebooks/negative_binomial.ipynb
@@ -18,36 +18,28 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The negative binomial distribution is flexible with multiple possible formulations. For example, it can model the number of *trials* or *failures* in a sequence of independent Bernoulli trials with probability of success (or failure) $p$ until the $k$-th \"success\". If we want to model the number of trials until the $k$-th success, we can use the following definition:\n",
+"The negative binomial distribution is flexible with multiple possible formulations. For example, it can model the number of *trials* or *failures* in a sequence of independent Bernoulli trials with probability of success (or failure) $p$ until the $k$-th \"success\". If we want to model the number of trials until the $k$-th success, the probability mass function (pmf) results:\n",
 "\n",
 "$$\n",
-"Y \\sim \\text{NB}(k, p)\n",
+"p(y | k, p)= \\binom{y - 1}{y-k}(1 -p)^{y - k}p^k\n",
 "$$\n",
 "\n",
 "where $0 \\le p \\le 1$ is the probability of success in each Bernoulli trial, $k > 0$, usually integer, $y \\in \\{k, k + 1, \\cdots\\}$ and $Y$ is the number of trials until the $k$-th success.\n",
 "\n",
-"The probability mass function (pmf) is \n",
-"\n",
-"$$\n",
-"p(y | k, p)= \\binom{y - 1}{y-k}(1 -p)^{y - k}p^k\n",
-"$$\n",
-"\n",
 "In this case, since we are modeling the number of *trials* until the $k$-th success, $y$ starts at $k$ and can be any integer greater than or equal to $k$. If instead we want to model the number of *failures* until the $k$-th success, we can use the same definition but $Y$ represents failures and starts at $0$ and there's a slightly different pmf:\n",
 "\n",
 "$$\n",
 "p(y | k, p)= \\binom{y + k - 1}{k-1}(1 -p)^{y}p^k\n",
 "$$\n",
 "\n",
-"In this case, $y$ starts at $0$ and can be any integer greater than or equal to $0$. When modeling failures, $y$ starts at 0, when modeling trials, $y$ starts at $k$.\n",
-"\n",
-"\n"
+"In this case, $y$ starts at $0$ and can be any integer greater than or equal to $0$. When modeling failures, $y$ starts at 0, when modeling trials, $y$ starts at $k$."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"These are not the only ways of defining the negative binomial distribution, there are plenty of options! One of the most interesting, and the one you see in [PyMC3](https://docs.pymc.io/api/distributions/discrete.html#pymc3.distributions.discrete.NegativeBinomial), the library we use in Bambi for the backend, is as a continuous mixture. The negative binomial distribution describes a Poisson random variable whose rate is also a random variable (not a fixed constant!) following a gamma distribution. Or in other words, conditional on a gamma-distributed variable $\\mu$, the variable $Y$ has a Poisson distribution with mean $\\mu$.\n",
+"These are not the only ways of defining the negative binomial distribution, there are plenty of options! One of the most interesting, and the one you see in [PyMC](https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.NegativeBinomial.html), the library we use in Bambi for the backend, is as a continuous mixture. The negative binomial distribution describes a Poisson random variable whose rate is also a random variable (not a fixed constant!) following a gamma distribution. Or in other words, conditional on a gamma-distributed variable $\\mu$, the variable $Y$ has a Poisson distribution with mean $\\mu$.\n",
 "\n",
 "Under this alternative definition, the pmf is\n",
 "\n",
@@ -96,7 +88,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Scipy uses the number of *failures* until $k$ successes definition, therefore $y$ starts at 0. In the following plot, we have the probability of observing $y$ failures before we see $k=3$ successes. "
+"SciPy uses the number of *failures* until $k$ successes definition, therefore $y$ starts at 0. In the following plot, we have the probability of observing $y$ failures before we see $k=3$ successes. "
 ]
 },
 {
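
A quick sanity check of the two pmfs touched by this diff, as a minimal sketch rather than anything taken from the notebook. The function names `pmf_trials` and `pmf_failures` and the values `k = 3`, `p = 0.5` are assumptions chosen for illustration. It checks that the trials form evaluated at `y + k` matches the failures form evaluated at `y`, and that the failures form sums to roughly 1 over its support.

```python
from math import comb


def pmf_trials(y, k, p):
    """P(Y = y) when Y is the number of trials until the k-th success (y >= k)."""
    return comb(y - 1, y - k) * (1 - p) ** (y - k) * p ** k


def pmf_failures(y, k, p):
    """P(Y = y) when Y is the number of failures before the k-th success (y >= 0)."""
    return comb(y + k - 1, k - 1) * (1 - p) ** y * p ** k


# Illustrative values only (hypothetical, not from the commit).
k, p = 3, 0.5

# y failures before the k-th success is the same event as y + k trials in total.
for y in range(10):
    assert abs(pmf_trials(y + k, k, p) - pmf_failures(y, k, p)) < 1e-12

# The failures pmf should sum to (approximately) 1; the tail beyond 200 is negligible here.
total = sum(pmf_failures(y, k, p) for y in range(200))
print(f"sum of failures pmf over y = 0..199: {total:.6f}")
```

The shift by `k` is the only difference between the two formulations, which is why the final sentence of the edited cell stresses where `y` starts in each case.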
