
Should Poisson[0,0] be handled specially? #293

Closed · kratsg opened this issue Oct 4, 2018 · 10 comments · Fixed by #1657

Labels: bug, follow up, help wanted, question, research

Comments

@kratsg (Contributor) commented Oct 4, 2018

Question

Filing on behalf of @wiso for now, since we need an issue to move and better focus the discussion. The initial PR #277 added a feature to use scipy.special.xlogy, which was then reverted in #280 because we started seeing the backends diverge in their calculations around the special lambda=0 result.

That said, the Poisson distribution is not well-defined for lambda=0, so it's not clear whether we should define Poisson[0,0] at all. @wiso's discussion from #277 (comment) follows from this point on.
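
For context on why xlogy changes the lambda=0 behavior: the log-space pmf in the numpy backend (seen in the traceback below) evaluates n * log(lam) as 0 * -inf = nan when n = lam = 0, while xlogy(0, 0) is defined to be 0. A minimal sketch assuming SciPy, not the actual pyhf implementation:

import numpy as np
from scipy.special import gammaln, xlogy

def poisson_pmf_naive(n, lam):
    # n * np.log(lam) evaluates to 0 * -inf = nan when n = lam = 0
    return np.exp(n * np.log(lam) - lam - gammaln(n + 1.0))

def poisson_pmf_xlogy(n, lam):
    # xlogy(0, 0) is defined to be 0, so Poisson[0, 0] = exp(0) = 1
    return np.exp(xlogy(n, lam) - lam - gammaln(n + 1.0))

print(poisson_pmf_naive(0.0, 0.0))  # nan (plus RuntimeWarnings)
print(poisson_pmf_xlogy(0.0, 0.0))  # 1.0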


By the way, if I understood correctly, the Poisson implementation in the other backends is a Gaussian approximation, e.g.

https://github.com/diana-hep/pyhf/blob/cbb899fc134b7d156550dfdc9130154f264a5188/pyhf/tensor/tensorflow_backend.py#L198-L200

So I don't think you want to implement such a check:

In [2]: pyhf.tensorlib.poisson(3, 2)
Out[2]: 0.18044704431548356

In [3]: pyhf.set_backend(pyhf.tensor.mxnet_backend())

In [4]: pyhf.tensorlib.poisson(3, 2)
Out[4]: 

[0.21969564]
<NDArray 1 @cpu(0)>

In [5]: pyhf.set_backend(pyhf.tensor.pytorch_backend())

In [6]: pyhf.tensorlib.poisson(3, 2)
Out[6]: tensor([0.2197])
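
The difference between the exact and approximate values above can be reproduced directly; a minimal check assuming SciPy, where the normal approximation uses mean $\lambda$ and standard deviation $\sqrt{\lambda}$:

import numpy as np
from scipy.stats import norm, poisson

# Exact Poisson pmf, as in the numpy backend
print(poisson.pmf(3, 2))  # 0.18044704431548356

# Normal approximation, matching the mxnet/pytorch outputs above
print(norm.pdf(3, loc=2, scale=np.sqrt(2)))  # ~0.21969564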

I don't know how the nan-vs-1 problem affects the optimization (you should probably reason from that angle rather than from the mathematical point of view, e.g. the Poisson not being defined for lam=0, in the same way as you are already using a continuous form of the Poissonian). By the way, if we observe n=0 then the formula becomes exp(-lam), whose limit is 1 for lam->0, which is also its maximum. This is also true for the (discrete) Poisson distribution.
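
Explicitly, for $n = 0$:

$$P(0 \mid \lambda) = \frac{\lambda^0 e^{-\lambda}}{0!} = e^{-\lambda} \xrightarrow{\lambda \to 0} 1$$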

Here is a simple case where having a number for Poisson[0, 0] may be useful.

from pyhf import get_backend
from pyhf.utils import loglambdav
import pyhf.simplemodels

_, optimizer = get_backend()

# The background is exactly zero, so its uncertainty should not matter here
pdf = pyhf.simplemodels.hepdata_like(signal_data=[1], bkg_data=[0.],
                                     bkg_uncerts=[10])

init_pars = pdf.config.suggested_init()
par_bounds = pdf.config.suggested_bounds()

NOBS = 10
print(optimizer.unconstrained_bestfit(loglambdav, [NOBS] + pdf.config.auxdata,
                                      pdf, init_pars, par_bounds)[0])

With this branch, the result is 10 as expected. Also, if I set NOBS=0 I get 0 as a result, without any RuntimeWarning.

With the master branch (just pulled), for any of the numbers I tried (10 or 0):

/home/turra/pyhf/pyhf/tensor/numpy_backend.py:186: RuntimeWarning: divide by zero encountered in log
  return np.exp(n * np.log(lam) - lam - gammaln(n + 1.))
/home/turra/pyhf/pyhf/tensor/numpy_backend.py:186: RuntimeWarning: invalid value encountered in multiply
  return np.exp(n * np.log(lam) - lam - gammaln(n + 1.))
     fun: nan
     jac: array([nan, nan])
 message: 'Iteration limit exceeded'
    nfev: 1404
     nit: 101
    njev: 101
  status: 9
 success: False
       x: array([nan, nan])
Traceback (most recent call last):
  File "mytest.py", line 15, in <module>
    pdf, init_pars, par_bounds)[0])
  File "/home/turra/pyhf/pyhf/optimize/opt_scipy.py", line 14, in unconstrained_bestfit
    assert result.success
AssertionError

It works only if I use a non-zero background.

@kratsg added the bug, help wanted, question, follow up, and research labels on Oct 4, 2018
@andrewfowlie commented May 12, 2020

I think this is a good idea. The $\lambda \to 0$ limit of a Poisson seems to me well-defined and intuitive:

$$P(n \mid \lambda = 0) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{otherwise} \end{cases}$$

The mean and variance are $E(n) = Var(n) = \lambda = 0$ as expected.

The Poisson is implemented this way (requiring only a non-negative rate parameter) in R, which should give some confidence that it isn't an obviously silly thing to do.

It appears in tests that scipy behaves this way too, although it's not clearly documented.
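
For example (this matches the cross-backend check later in this thread):

>>> from scipy.stats import poisson
>>> poisson.pmf(0, 0)
1.0
>>> poisson.pmf(1, 0)
0.0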

@lukasheinrich (Contributor) commented

My first hunch is that we would delegate this to however the underlying probability libraries handle it, e.g. what do tfp.Poisson and torch.distributions.Poisson do?

I could also see us using a (configurable) cutoff / clamping. E.g. through the interpolations used in HistFactory we might in any case get into situations where the rate becomes negative. Should we in that case clamp the rate with something à la

np.where(rate <= 0, cutoff, rate)

where cutoff is 1e-16 or similar? Do you care if it's Pois(n|lam = 0) vs Pois(n|lam = 1e-16), or do you see any issues with this approach?
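
A minimal sketch of the clamping idea, assuming NumPy (the clamp_rate helper is hypothetical, for illustration only):

import numpy as np

def clamp_rate(rate, cutoff=1e-16):
    # Replace non-positive rates with the cutoff so that log(rate)
    # stays finite during optimization
    rate = np.asarray(rate, dtype=float)
    return np.where(rate <= 0, cutoff, rate)

print(clamp_rate([0.0, -0.5, 2.0]))  # [1.e-16 1.e-16 2.e+00]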

@andrewfowlie commented

I don't see any practical problem. For the sake of argument, if you have a histogram with n bins and in each bin you observed 0 events, I would be able to produce a difference between the correct behaviour for $\lambda \to 0$ and $\lambda$ clipped at 1e-16 by increasing n. It will however take $n \sim 10^{16}$ bins before the effect becomes noticeable 😄
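
Spelled out, with cutoff $\epsilon = 10^{-16}$ and $n$ zero-observation bins, the shift in the log-likelihood is

$$\Delta \ln L = \sum_{i=1}^{n} \left[ \ln P(0 \mid 0) - \ln P(0 \mid \epsilon) \right] = \sum_{i=1}^{n} \left[ 0 - (-\epsilon) \right] = n \epsilon,$$

which only becomes $\mathcal{O}(1)$ once $n \sim 10^{16}$.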

Minor thing: wouldn't np.where(rate <= cutoff, cutoff, rate) be better, though?

@lukasheinrich (Contributor) commented

yes it would be better :)

@matthewfeickert (Member) commented

> I don't see any practical problem.

> e.g. what do tfp.Poisson, torch.distributions.Poisson do?

Consider the following example for PyTorch:

>>> import torch
>>> from torch.distributions import Poisson
>>>
>>> rate_lambda = torch.tensor([0.0])
>>> observed_k = torch.tensor([0.0, 1.0])
>>>
>>> degenerate_pmf = Poisson(rate_lambda)
>>> degenerate_pmf.log_prob(observed_k)
tensor([nan, -inf])
>>> torch.exp(degenerate_pmf.log_prob(observed_k))
tensor([nan, 0.])

and TensorFlow:

>>> import tensorflow as tf
>>> import tensorflow_probability as tfp
>>>
>>> rate_lambda = tf.convert_to_tensor([0.0])
>>> observed_k = tf.convert_to_tensor([0.0, 1.0])
>>>
>>> degenerate_pmf = tfp.distributions.Poisson(rate_lambda)
>>> degenerate_pmf.log_prob(observed_k)
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ nan, -inf], dtype=float32)>
>>> degenerate_pmf.prob(observed_k)
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([nan,  0.], dtype=float32)>

This behavior obviously should be avoided.

> I could also see us using a (configurable) cut off / clamping.

While this is fine from a numerical perspective, there is also the question of whether this is a good idea to implement at all from a physics perspective.

If you "observe" zero Monte Carlo events in a bin, this does not mean that the rate parameter is actually 0 — this gives us \hat{λ} not λ. If you had 100 times as many Monte Carlo events the number could be different. If the rate parameter is actually zero in the model then nothing physical is being modeled.

I would think that the most responsible thing to do is to loudly warn the user that their model is unphysical and suggest that they rebin or reconsider the model. One possibility is to implement a global flag so that this warning behavior happens by default, but if a user passes the flag then the zero bins are masked.
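
A hypothetical sketch of what that could look like, assuming NumPy (the function name, flag, and behavior are illustrative, not pyhf API):

import warnings
import numpy as np

def check_zero_rate_bins(rates, mask_zero_bins=False):
    # Warn loudly about unphysical zero-rate bins; mask them only on explicit opt-in
    rates = np.asarray(rates, dtype=float)
    zero_bins = rates == 0.0
    if zero_bins.any():
        warnings.warn(
            f"Model predicts zero rate in bins {np.flatnonzero(zero_bins).tolist()}; "
            "consider rebinning or revising the model.",
            RuntimeWarning,
        )
        if not mask_zero_bins:
            raise ValueError(
                "Zero-rate bins encountered; rebin the model or opt in to masking."
            )
    # Boolean mask of bins to keep in the likelihood
    return ~zero_bins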

@andrewfowlie commented

When lambda is an MLE estimate of a rate parameter from an MC simulation, why does an MLE estimate of 0 for that rate parameter indicate that the model is unphysical?

@alexander-held (Member) commented

It could be good to catch models that predict zero yields in a bin with an easier-to-understand error message than

File "[...]/pyhf/src/pyhf/modifiers/staterror.py", line 115, in finalize
    assert len(sigmas[sigmas > 0]) == pdfconfig.param_set(mod).n_parameters
AssertionError

@alexander-held (Member) commented

From a practical perspective, what do you think about adding a warning at runtime when using models with 0 yield predicted in some bins, and then letting the different backends implement the behavior in any way they like? I assume that usually when such a model is encountered, the model should be changed anyway. If not, the user can take care to pick the backend implementing the behavior they are looking for.

The cutoff suggestion also sounds good to me, ideally accompanied by a warning message.

@matthewfeickert (Member) commented Oct 21, 2021

torch v1.10.0 now also allows a zero rate, from pytorch/pytorch#61511:

>>> import pyhf
>>> from pyhf import tensorlib
>>> import torch
>>> torch.__version__
'1.10.0+cu102'
>>> pyhf.set_backend("pytorch")
>>> tensorlib.poisson(tensorlib.astensor([0, 0, 1, 1]), tensorlib.astensor([0, 1, 0, 1]))
tensor([1.0000, 0.3679, 0.0000, 0.3679])

so the backends might be deciding this for us. I'll need to check the other behaviours.

> From a practical perspective, what do you think about adding a warning at runtime when using models with 0 yield predicted in some bins, and then letting the different backends implement the behavior in any way they like?

This sounds pretty reasonable.

@matthewfeickert (Member) commented

I've checked, and all the backends now agree on this: a zero rate is allowed in the default public API across all of them.

# degenerate_pmf.py
print("\nSciPy:")
import numpy as np
from scipy.stats import poisson

rate_lambda = np.asarray([0.0])
observed_k = np.asarray([0.0, 1.0])

print(f"# degenerate pmf log prob: {np.log(poisson.pmf(observed_k, rate_lambda))}")
print(f"# degenerate pmf prob: {poisson.pmf(observed_k, rate_lambda)}")

print("\nJAX:")
import jax.numpy as jnp
import jax.scipy.stats.poisson as jax_poisson

rate_lambda = jnp.asarray([0.0])
observed_k = jnp.asarray([0.0, 1.0])

print(f"# degenerate pmf log prob: {jnp.log(jax_poisson.pmf(observed_k, rate_lambda))}")
print(f"# degenerate pmf prob: {jax_poisson.pmf(observed_k, rate_lambda)}")

print("\nPyTorch:")
import torch
from torch.distributions import Poisson

rate_lambda = torch.tensor([0.0])
observed_k = torch.tensor([0.0, 1.0])

degenerate_pmf = Poisson(rate_lambda)
print(f"# degenerate pmf log prob: {degenerate_pmf.log_prob(observed_k)}")
print(f"# degenerate pmf prob: {torch.exp(degenerate_pmf.log_prob(observed_k))}")

print("\nTensorFlow:")
import tensorflow as tf
import tensorflow_probability as tfp

rate_lambda = tf.convert_to_tensor([0.0])
observed_k = tf.convert_to_tensor([0.0, 1.0])

degenerate_pmf = tfp.distributions.Poisson(rate_lambda)
print(f"# degenerate pmf log prob: {degenerate_pmf.log_prob(observed_k)}")
print(f"# degenerate pmf prob: {degenerate_pmf.prob(observed_k)}")

$ python degenerate_pmf.py

SciPy:
/home/feickert/Code/GitHub/pyhf/degenerate_pmf.py:8: RuntimeWarning: divide by zero encountered in log
  print(f"# degenerate pmf log prob: {np.log(poisson.pmf(observed_k, rate_lambda))}")
# degenerate pmf log prob: [  0. -inf]
# degenerate pmf prob: [1. 0.]

JAX:
# degenerate pmf log prob: [-4.768373e-07          -inf]
# degenerate pmf prob: [0.9999995 0.       ]

PyTorch:
# degenerate pmf log prob: tensor([0., -inf])
# degenerate pmf prob: tensor([1., 0.])

TensorFlow:
# degenerate pmf log prob: [  0. -inf]
# degenerate pmf prob: [1. 0.]

So it probably makes sense to implement this default behavior. We should check what ROOT does as well, and then consider the cutoff behavior.
