Adding Bonferroni correction #4

Hi,

I have used this repo to build our team's current drift solution.

![image](https://user-images.githubusercontent.com/14976422/190618058-14ef5bda-e4ac-4a49-9725-62b6a4e230df.png)


I am historizing the dataset using an Azure ML tabular dataset, passing it the path of my drift loadings in blob storage. This lets me query the data over time and plot how the p-values vary.
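To make the "query over time" step concrete, here is a minimal sketch in plain Python. It assumes the historized data can be read back as rows of (run date, feature name, p-value). The record layout, the sample values, and the helper name `p_value_series` are all hypothetical and are not part of this repo or of Azure ML.

```python
from collections import defaultdict
from datetime import date

# Hypothetical historized records: (run date, feature name, p-value).
records = [
    (date(2022, 9, 2), "age", 0.030),
    (date(2022, 9, 1), "age", 0.410),
    (date(2022, 9, 1), "income", 0.520),
    (date(2022, 9, 2), "income", 0.200),
]


def p_value_series(rows):
    """Group rows into one chronologically ordered p-value series per feature."""
    series = defaultdict(list)
    # Tuples sort by run date first, so each series comes out in time order.
    for run_date, feature, p_val in sorted(rows):
        series[feature].append((run_date, p_val))
    return dict(series)


series = p_value_series(records)
for feature, points in series.items():
    print(feature, points)
```

Each series can then be handed to whatever plotting tool is in use, one line per feature.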

However, to decide whether the overall dataset has drifted, it is not enough to look at a single feature: when many features are tested, some p-values will fall below the threshold by chance alone.

I am using a Bonferroni correction (Bland JM, Altman DG. Multiple significance tests: the Bonferroni method. BMJ 1995;310(6973):170).

This is the same approach as the implementation in Seldon's Alibi Detect:

```python
# TODO: return both feature-level and batch-level drift predictions by default
# values below p-value threshold are drift
if drift_type == 'feature':
    drift_pred = (p_vals < self.p_val).astype(int)
elif drift_type == 'batch' and self.correction == 'bonferroni':
    threshold = self.p_val / self.n_features
    drift_pred = int((p_vals < threshold).any())  # type: ignore[assignment]
elif drift_type == 'batch' and self.correction == 'fdr':
    drift_pred, threshold = fdr(p_vals, q_val=self.p_val)  # type: ignore[assignment]
else:
    raise ValueError('`drift_type` needs to be either `feature` or `batch`.')
```
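For the example in this repo, the batch-level Bonferroni rule could look roughly like the sketch below. It is a dependency-free illustration, not the library's code: the function name `bonferroni_batch_drift`, the feature names, and the p-values are made up, and `alpha` plays the role of `self.p_val` in the excerpt above.

```python
from typing import Dict, List, Tuple


def bonferroni_batch_drift(
    p_vals: Dict[str, float], alpha: float = 0.05
) -> Tuple[bool, float, List[str]]:
    """Return (drift flag, corrected threshold, names of features below it)."""
    if not p_vals:
        raise ValueError("p_vals must contain at least one feature")
    # Bonferroni: divide the significance level by the number of tests.
    threshold = alpha / len(p_vals)
    flagged = sorted(name for name, p in p_vals.items() if p < threshold)
    # The batch is flagged as drifted if any feature passes the corrected threshold.
    return bool(flagged), threshold, flagged


# Hypothetical per-feature p-values from one drift run.
p_vals = {"age": 0.030, "income": 0.200, "tenure": 0.004}
drift, threshold, flagged = bonferroni_batch_drift(p_vals, alpha=0.05)
print(drift, round(threshold, 4), flagged)  # True 0.0167 ['tenure']
```

Note that `age` (p = 0.030) would count as drift at the uncorrected 0.05 level, but it does not pass the corrected threshold of 0.05 / 3.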

This might be worth adding to the example!

