diff --git a/docs/notebooks/negative_binomial.ipynb b/docs/notebooks/negative_binomial.ipynb
index d0aaff1e..d5dbcdfa 100644
--- a/docs/notebooks/negative_binomial.ipynb
+++ b/docs/notebooks/negative_binomial.ipynb
@@ -18,36 +18,28 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The negative binomial distribution is flexible with multiple possible formulations. For example, it can model the number of *trials* or *failures* in a sequence of independent Bernoulli trials with probability of success (or failure) $p$ until the $k$-th \"success\". If we want to model the number of trials until the $k$-th success, we can use the following definition:\n",
+    "The negative binomial distribution is flexible with multiple possible formulations. For example, it can model the number of *trials* or *failures* in a sequence of independent Bernoulli trials with probability of success (or failure) $p$ until the $k$-th \"success\". If we want to model the number of trials until the $k$-th success, the probability mass function (pmf) is:\n",
     "\n",
     "$$\n",
-    "Y \\sim \\text{NB}(k, p)\n",
+    "p(y | k, p)= \\binom{y - 1}{y-k}(1 -p)^{y - k}p^k\n",
     "$$\n",
     "\n",
     "where $0 \\le p \\le 1$ is the probability of success in each Bernoulli trial, $k > 0$, usually integer, $y \\in \\{k, k + 1, \\cdots\\}$ and $Y$ is the number of trials until the $k$-th success.\n",
     "\n",
-    "The probability mass function (pmf) is \n",
-    "\n",
-    "$$\n",
-    "p(y | k, p)= \\binom{y - 1}{y-k}(1 -p)^{y - k}p^k\n",
-    "$$\n",
-    "\n",
     "In this case, since we are modeling the number of *trials* until the $k$-th success, $y$ starts at $k$ and can be any integer greater than or equal to $k$. If instead we want to model the number of *failures* until the $k$-th success, we can use the same definition but $Y$ represents failures and starts at $0$ and there's a slightly different pmf:\n",
     "\n",
     "$$\n",
     "p(y | k, p)= \\binom{y + k - 1}{k-1}(1 -p)^{y}p^k\n",
     "$$\n",
     "\n",
-    "In this case, $y$ starts at $0$ and can be any integer greater than or equal to $0$. When modeling failures, $y$ starts at 0, when modeling trials, $y$ starts at $k$.\n",
-    "\n",
-    "\n"
+    "In this case, $y$ starts at $0$ and can be any integer greater than or equal to $0$. When modeling failures, $y$ starts at $0$; when modeling trials, $y$ starts at $k$."
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "These are not the only ways of defining the negative binomial distribution, there are plenty of options! One of the most interesting, and the one you see in [PyMC3](https://docs.pymc.io/api/distributions/discrete.html#pymc3.distributions.discrete.NegativeBinomial), the library we use in Bambi for the backend, is as a continuous mixture. The negative binomial distribution describes a Poisson random variable whose rate is also a random variable (not a fixed constant!) following a gamma distribution. Or in other words, conditional on a gamma-distributed variable $\\mu$, the variable $Y$ has a Poisson distribution with mean $\\mu$.\n",
+    "These are not the only ways of defining the negative binomial distribution; there are plenty of options! One of the most interesting, and the one you see in [PyMC](https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.NegativeBinomial.html), the library we use in Bambi for the backend, is as a continuous mixture. The negative binomial distribution describes a Poisson random variable whose rate is also a random variable (not a fixed constant!) following a gamma distribution. Or in other words, conditional on a gamma-distributed variable $\\mu$, the variable $Y$ has a Poisson distribution with mean $\\mu$.\n",
     "\n",
     "Under this alternative definition, the pmf is\n",
     "\n",
@@ -96,7 +88,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Scipy uses the number of *failures* until $k$ successes definition, therefore $y$ starts at 0. In the following plot, we have the probability of observing $y$ failures before we see $k=3$ successes. "
+    "SciPy uses the number of *failures* until $k$ successes definition; therefore, $y$ starts at $0$. The following plot shows the probability of observing $y$ failures before we see $k=3$ successes."
    ]
   },
   {
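
As a quick sanity check of the two parameterizations touched by this diff (SciPy's `nbinom` counting *failures* until the $k$-th success, and the gamma-Poisson mixture), here is a minimal sketch; it is not part of the notebook, and the values of `k`, `p`, and the number of simulated draws are arbitrary illustrative choices.

```python
# Sketch: compare the closed-form "failures until the k-th success" pmf with
# scipy.stats.nbinom, and with draws from the gamma-Poisson mixture.
import numpy as np
from scipy import stats
from scipy.special import comb

k, p = 3, 0.4      # illustrative values only
y = np.arange(15)  # number of failures, starting at 0

# Closed-form pmf: C(y + k - 1, k - 1) * (1 - p)^y * p^k
pmf_closed = comb(y + k - 1, k - 1) * (1 - p) ** y * p ** k

# SciPy's nbinom uses the same "failures" parameterization
np.testing.assert_allclose(stats.nbinom.pmf(y, k, p), pmf_closed)

# Gamma-Poisson mixture: mu ~ Gamma(shape=k, scale=(1 - p) / p), Y | mu ~ Poisson(mu)
rng = np.random.default_rng(1234)
mu = rng.gamma(shape=k, scale=(1 - p) / p, size=200_000)
samples = rng.poisson(mu)
empirical = np.bincount(samples, minlength=y.size)[: y.size] / samples.size
print(np.abs(empirical - pmf_closed).max())  # close to zero (Monte Carlo error only)
```

With shape $k$ and scale $(1 - p)/p$, the gamma prior on the Poisson rate has mean $k(1 - p)/p$, which is exactly the expected number of failures under $\text{NB}(k, p)$, so the simulated draws should reproduce the closed-form pmf up to sampling noise.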