Gallery: Add Censored and Truncated
aloctavodia committed Jul 7, 2024
1 parent f7cc892 commit 55e0a2c
Showing 2 changed files with 193 additions and 0 deletions.
97 changes: 97 additions & 0 deletions docs/examples/censored_distribution.md
@@ -0,0 +1,97 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---
# Censored Distribution

This is not a distribution per se, but a modifier of univariate distributions.

A censored distribution arises when observations outside a certain range are only partially known: instead of the true value, we only learn that it lies at or beyond a bound. For instance, in a study measuring the impact of a drug on mortality rates, it may be known only that an individual's age at death is at least 75 years. Such a situation could occur if the individual withdrew from the study at age 75, or if the individual is still alive at age 75. Censoring can also happen when a value falls outside the range of a measuring instrument. For example, if a bathroom scale only measures up to 140 kg and a 160 kg person is weighed, the observer would only know that the individual's weight is at least 140 kg.
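The bathroom-scale example can be sketched in a few lines of plain Python. This is only an illustration that uses the standard library; the helper name `censor` and the choice of a standard normal with bounds -1 and 1 are assumptions made for the sketch, not part of PreliZ.

```python
import random

def censor(value, lower, upper):
    """Record `value` as-is inside [lower, upper]; otherwise record the nearest bound."""
    return min(max(value, lower), upper)

# A 160 kg person on a scale that tops out at 140 kg.
print(censor(160, lower=0, upper=140))  # 140

# Censoring draws from a standard normal at -1 and 1 (hypothetical setup).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [censor(rng.gauss(0, 1), -1, 1) for _ in range(10_000)]

# Every draw is kept, but draws below -1 pile up exactly at -1.
frac_lower = sum(s == -1 for s in samples) / len(samples)
print(len(samples), frac_lower)
```

Note that no observation is lost: the sample size is unchanged, and the bounds accumulate the probability mass that fell outside the range.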


## Probability Density Function (PDF):

```{code-cell}
---
tags: [remove-input]
mystnb:
  image:
    alt: Censored Distribution PDF
---
import arviz as az
from preliz import Normal, Censored
az.style.use('arviz-doc')
Censored(Normal(0, 1), -1, 1).plot_pdf(support=(-4, 4))
Normal(0, 1).plot_pdf(alpha=0.5)
```

## Cumulative Distribution Function (CDF):

```{code-cell}
---
tags: [remove-input]
mystnb:
  image:
    alt: Censored Distribution CDF
---
Censored(Normal(0, 1), -1, 1).plot_cdf(support=(-4, 4))
Normal(0, 1).plot_cdf(alpha=0.5)
```


## Key properties and parameters:


**Probability Density Function (PDF):**

Given a base distribution with cumulative distribution function (CDF) and probability density/mass function (PDF), the PDF of a Censored distribution is:

$$
\begin{cases}
0 & \text{for } x < \text{lower}, \\
\text{CDF}(\text{lower}) & \text{for } x = \text{lower}, \\
\text{PDF}(x) & \text{for } \text{lower} < x < \text{upper}, \\
1-\text{CDF}(\text{upper}) & \text{for } x = \text{upper}, \\
0 & \text{for } x > \text{upper},
\end{cases}
$$

where `lower` and `upper` are the lower and upper bounds of the censored distribution, respectively.
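As a rough illustration, the piecewise definition above can be written directly as a function. The sketch below assumes a standard normal base distribution and uses `statistics.NormalDist` from the Python standard library; the function name `censored_pdf` is hypothetical and this is not how PreliZ implements `Censored`.

```python
from statistics import NormalDist

def censored_pdf(x, base, lower, upper):
    """Density of `base` censored to [lower, upper], with point masses at the bounds."""
    if x < lower or x > upper:
        return 0.0
    if x == lower:
        # All the mass of the base distribution at or below `lower` collapses here.
        return base.cdf(lower)
    if x == upper:
        # All the mass of the base distribution at or above `upper` collapses here.
        return 1.0 - base.cdf(upper)
    return base.pdf(x)

base = NormalDist(0, 1)
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, censored_pdf(x, base, -1.0, 1.0))
```

The values returned at the two bounds are probabilities (point masses), while the values strictly inside the interval are densities of the base distribution.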

**Cumulative Distribution Function (CDF):**

The CDF of a Censored distribution is:


$$
\begin{cases}
0 & \text{for } x < \text{lower}, \\
\text{CDF}(x) & \text{for } \text{lower} \leq x < \text{upper}, \\
1 & \text{for } x \geq \text{upper},
\end{cases}
$$

where `lower` and `upper` are the lower and upper bounds of the censored distribution, respectively.
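A minimal sketch of this CDF, again assuming a standard normal base and `statistics.NormalDist`. The function name `censored_cdf` is hypothetical. At the boundary points the sketch assumes the usual right-continuous convention: the value at `lower` is the base CDF at `lower`, and the value at `upper` is 1.

```python
from statistics import NormalDist

def censored_cdf(x, base, lower, upper):
    """CDF of `base` censored to [lower, upper] (right-continuous at the bounds)."""
    if x < lower:
        return 0.0
    if x >= upper:
        return 1.0
    # Between the bounds the censored CDF matches the base CDF.
    return base.cdf(x)

base = NormalDist(0, 1)
for x in (-1.5, -1.0, 0.0, 1.0, 3.0):
    print(x, censored_cdf(x, base, -1.0, 1.0))
```

The jump from 0 to the base CDF at `lower`, and the jump up to 1 at `upper`, correspond to the point masses in the PDF.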


```{seealso}
:class: seealso
**Related Distributions:**
- [Truncated](truncated_distribution.md) - In a truncated distribution, values outside the range are discarded and never recorded, while in a censored distribution, they are recorded as the nearest bound.
```

## References

- Wikipedia - [Censored distribution](https://en.wikipedia.org/wiki/Censoring_(statistics))
96 changes: 96 additions & 0 deletions docs/examples/truncated_distribution.md
@@ -0,0 +1,96 @@
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---
# Truncated Distribution

This is not a distribution per se, but a modifier of univariate distributions.

Truncated distributions arise in cases where the ability to record, or even to know about, occurrences is limited to values which lie above or below a given threshold or within a specified range. For example, if the dates of birth of children in a school are examined, these would typically be subject to truncation relative to those of all children in the area given that the school accepts only children in a given age range on a specific date. There would be no information about how many children in the locality had dates of birth before or after the school's cutoff dates if only a direct approach to the school were used to obtain information.
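Truncation can be mimicked by simply discarding every draw that falls outside the range. The sketch below uses only the standard library; the helper name `truncate` is hypothetical, and the shape/scale values are an assumed translation of a Gamma with mean 2 and standard deviation 1.

```python
import random

def truncate(values, lower, upper):
    """Keep only values inside [lower, upper]; everything else is dropped."""
    return [v for v in values if lower <= v <= upper]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Gamma with mean 2 and standard deviation 1, written as shape/scale:
# shape = mu**2 / sigma**2 = 4, scale = sigma**2 / mu = 0.5 (assumed mapping).
draws = [rng.gammavariate(4.0, 0.5) for _ in range(10_000)]
kept = truncate(draws, 1, 4.5)

print(len(draws), len(kept))  # fewer values survive than were drawn
```

Unlike censoring, the sample shrinks: the observer has no record that the discarded values ever existed.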

## Probability Density Function (PDF):

```{code-cell}
---
tags: [remove-input]
mystnb:
  image:
    alt: Truncated Distribution PDF
---
import arviz as az
from preliz import Gamma, Truncated
az.style.use('arviz-doc')
Truncated(Gamma(mu=2, sigma=1), 1, 4.5).plot_pdf()
Gamma(mu=2, sigma=1).plot_pdf()
```

## Cumulative Distribution Function (CDF):

```{code-cell}
---
tags: [remove-input]
mystnb:
  image:
    alt: Truncated Distribution CDF
---
Truncated(Gamma(mu=2, sigma=1), 1, 4.5).plot_cdf()
Gamma(mu=2, sigma=1).plot_cdf()
```


## Key properties and parameters:


**Probability Density Function (PDF):**

Given a base distribution with cumulative distribution function (CDF) and probability density/mass function (PDF), the PDF of a Truncated distribution is:

$$
\begin{cases}
0 & \text{for } x < \text{lower}, \\
\frac{\text{PDF}(x)}{\text{CDF}(\text{upper}) - \text{CDF}(\text{lower})} & \text{for } \text{lower} \leq x \leq \text{upper}, \\
0 & \text{for } x > \text{upper},
\end{cases}
$$

where `lower` and `upper` are the lower and upper bounds of the truncated distribution, respectively.
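The renormalization can be checked numerically. The sketch below assumes a standard normal base (chosen because `statistics.NormalDist` ships with Python, rather than the Gamma used in the plots) and a hypothetical function name `truncated_pdf`; it is an illustration, not the PreliZ implementation.

```python
from statistics import NormalDist

def truncated_pdf(x, base, lower, upper):
    """Density of `base` restricted to [lower, upper] and rescaled to integrate to 1."""
    if x < lower or x > upper:
        return 0.0
    # Probability mass the base distribution places inside the bounds.
    mass = base.cdf(upper) - base.cdf(lower)
    return base.pdf(x) / mass

base = NormalDist(0, 1)
lower, upper = -1.0, 1.0

# Midpoint Riemann sum over [lower, upper] to confirm the density integrates to about 1.
n = 10_000
width = (upper - lower) / n
area = sum(
    truncated_pdf(lower + (i + 0.5) * width, base, lower, upper) for i in range(n)
) * width
print(area)
```

Dividing by the mass inside the bounds is what makes the truncated density taller than the base density everywhere within the range.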

**Cumulative Distribution Function (CDF):**

The CDF of a Truncated distribution is:


$$
\begin{cases}
0 & \text{for } x < \text{lower}, \\
\frac{\text{CDF}(x) - \text{CDF}(\text{lower})}{\text{CDF}(\text{upper}) - \text{CDF}(\text{lower})} & \text{for } \text{lower} \leq x \leq \text{upper}, \\
1 & \text{for } x > \text{upper},
\end{cases}
$$

where `lower` and `upper` are the lower and upper bounds of the truncated distribution, respectively.
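A minimal sketch of this CDF, assuming a standard normal base and `statistics.NormalDist`; the function name `truncated_cdf` is hypothetical.

```python
from statistics import NormalDist

def truncated_cdf(x, base, lower, upper):
    """CDF of `base` truncated to [lower, upper]."""
    if x < lower:
        return 0.0
    if x > upper:
        return 1.0
    # Shift by the mass below `lower`, then rescale by the mass inside the bounds.
    mass = base.cdf(upper) - base.cdf(lower)
    return (base.cdf(x) - base.cdf(lower)) / mass

base = NormalDist(0, 1)
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, truncated_cdf(x, base, -1.0, 1.0))
```

In contrast to the censored case, this CDF has no jumps for a continuous base: it rises smoothly from 0 at `lower` to 1 at `upper`.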


```{seealso}
:class: seealso
**Related Distributions:**
- [Censored](censored_distribution.md) - In a censored distribution, values outside the range are recorded as the nearest bound, while in a truncated distribution, they are discarded and never recorded.
- [TruncatedNormal](truncated_normal_distribution.md) - A truncated normal distribution is a normal distribution that has been restricted to a specific range.
```

## References

- Wikipedia - [Truncated distribution](https://en.wikipedia.org/wiki/Truncated_distribution)
