Evaluating Nuisance Model Performance: #300
Replies: 1 comment 1 reply
Thanks @bindugupta, Regarding 1.: I would recommend evaluating the log loss or similar measures in particular and comparing them to a simple average (similar to $R^2$-type measures). Regarding 2.: In my personal opinion, these three tests will almost never fail and should be considered "sanity checks", nothing more. Using different data subsets is a nice way to see whether the estimates vary much more than the estimated standard deviation suggests, but it serves a similar purpose as measures of out-of-sample performance of the nuisance estimates. None of these tests addresses identification issues.
Thanks @bindugupta,
Regarding 1.:
Generally, the nuisance estimates are used to adjust for confounding. E.g., if treatment assignment is strongly influenced by your covariates $X$, then the propensity score should reflect that.
As the classifiers (for `ml_m` and `ml_g`) are used to fit conditional expectations, I would recommend evaluating the log loss or similar measures in particular and comparing them to a simple average (similar to $R^2$-type measures). If the treatment probability is quite low, the model may be able to adjust for confounding but still classify all units as 0 (not treated), e.g. if all predicted probabilities are smaller than 0.5.
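To make the comparison with a simple average concrete, here is a minimal sketch using only NumPy and scikit-learn (not DoubleML itself). The simulated data, the logistic regression learner, and all variable names are assumptions chosen for illustration. It computes out-of-sample predicted probabilities with cross-fitting, compares their log loss to that of a constant prediction, and shows how rarely a 0.5 threshold would predict "treated" when the treatment probability is low.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_val_predict

# Simulated data (assumption): treatment is rare but depends clearly on X[:, 0]
rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=(n, 3))
p_true = 1.0 / (1.0 + np.exp(-(-3.0 + 1.0 * X[:, 0])))
d = rng.binomial(1, p_true)

# Out-of-sample propensity predictions via 5-fold cross-fitting
m_hat = cross_val_predict(
    LogisticRegression(), X, d, cv=5, method="predict_proba"
)[:, 1]

# Baseline: predict the overall treatment share for every unit
# (computed on the full sample here for simplicity)
baseline = np.full(n, d.mean())

ll_model = log_loss(d, m_hat)
ll_base = log_loss(d, baseline)

# R^2-like measure: relative improvement in log loss over the simple average
pseudo_r2 = 1.0 - ll_model / ll_base

# Share of units that a 0.5 threshold would classify as treated
share_pred_treated = np.mean(m_hat > 0.5)

print(f"log loss (model):    {ll_model:.4f}")
print(f"log loss (baseline): {ll_base:.4f}")
print(f"pseudo R^2:          {pseudo_r2:.4f}")
print(f"share predicted treated at 0.5: {share_pred_treated:.4f}")
```

In a setup like this, the log loss can improve noticeably over the baseline even though almost no unit is classified as treated, which is why accuracy-type metrics are less informative here than probability-based ones.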
If the mod…