Issue with PP-plot and different distributions #64

Open
erykml opened this issue Apr 9, 2019 · 1 comment

erykml commented Apr 9, 2019

  • Python version: Python 3.6.8
  • numpy version: 1.14.3
  • matplotlib version: 2.0.2
  • mpl-probscale version: 0.2.3
  • Operating System: MacOS Mojave 10.14.3

Description

I modified the examples from the documentation to create two PP-plots: one using the standard normal distribution as the theoretical distribution, the other using N(100, 5). Both plots look exactly the same (this is not true for QQ-plots). Am I missing something?

What I Did

import warnings
warnings.simplefilter('ignore')

import numpy
from matplotlib import pyplot
import seaborn
from scipy import stats
import probscale
clear_bkgd = {'axes.facecolor':'none', 'figure.facecolor':'none'}
seaborn.set(style='ticks', context='talk', color_codes=True, rc=clear_bkgd)

# load up some example data from the seaborn package
tips = seaborn.load_dataset("tips")

%matplotlib inline
%config InlineBackend.figure_format ='retina'

common_opts = dict(
    plottype='pp',
    probax='x',
    datascale='log',
    datalabel='Total Bill (USD)',
    scatter_kws=dict(marker='+', linestyle='none', mew=1)
)

norm = stats.norm(100, 5)

fig, (ax1, ax2) = pyplot.subplots(figsize=(10, 6), ncols=2, sharex=True)
fig = probscale.probplot(tips['total_bill'], ax=ax1, dist=norm,
                         problabel='N(100, 5) Probabilities', **common_opts)

fig = probscale.probplot(tips['total_bill'], ax=ax2, dist=None,
                         problabel='Standard Normal Probabilities', **common_opts)

seaborn.despine()

phobson (Member) commented Apr 13, 2019

Thanks for the precisely described and demonstrated issue.

I think this stems from a gap in the documentation. A PP plot doesn't actually use the theoretical distribution. It displays the percentiles (plotting positions) of the data, which are unaffected by any inferred or provided distribution.
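To illustrate the point, here is a minimal standalone sketch, not the library's actual code. It assumes a generic plotting-position rule of the form (i - alpha) / (n + 1 - alpha - beta); the helper names `plotting_positions`, `pp_coordinates`, and `qq_coordinates` are hypothetical, the sample values are made up, and `statistics.NormalDist` stands in for `stats.norm` so the snippet needs nothing beyond the standard library.

```python
from statistics import NormalDist


def plotting_positions(data, alpha=0.4, beta=0.4):
    """Return (positions, sorted data) for a generic plotting-position rule.

    alpha and beta are assumed defaults for illustration only.
    """
    ordered = sorted(data)
    n = len(ordered)
    positions = [(i - alpha) / (n + 1 - alpha - beta) for i in range(1, n + 1)]
    return positions, ordered


def pp_coordinates(data, dist):
    """PP-style axis values: percentiles of the data. `dist` is never used."""
    positions, ordered = plotting_positions(data)
    return [100 * p for p in positions], ordered


def qq_coordinates(data, dist):
    """QQ-style axis values: theoretical quantiles, which do depend on `dist`."""
    positions, ordered = plotting_positions(data)
    return [dist.inv_cdf(p) for p in positions], ordered


# Made-up sample values, not the seaborn tips data.
sample = [3.07, 10.34, 16.99, 21.01, 23.68, 24.59, 50.81]

std_normal = NormalDist(0, 1)
shifted = NormalDist(100, 5)

pp_std, _ = pp_coordinates(sample, std_normal)
pp_shifted, _ = pp_coordinates(sample, shifted)
qq_std, _ = qq_coordinates(sample, std_normal)
qq_shifted, _ = qq_coordinates(sample, shifted)

print("PP axes identical:", pp_std == pp_shifted)
print("QQ axes identical:", qq_std == qq_shifted)
```

Under these assumptions the percentile axis comes out the same for both distributions, while the quantile axis is shifted and rescaled by the N(100, 5) parameters. That matches the behaviour reported above: the two PP-plots coincide, and the QQ-plots do not.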
