error "invalid survival times for this distribution" for all interval censored data #62

Open
tdhock opened this issue Apr 5, 2019 · 2 comments


tdhock commented Apr 5, 2019

Hi, I am using flexsurv with the following code:

library(flexsurv)
library(penaltyLearning)
library(survival)
data(neuroblastomaProcessed, package="penaltyLearning")
X.mat <- neuroblastomaProcessed$feature.mat[, c("log.n", "log.hall")]
y.mat <- neuroblastomaProcessed$target.mat
train.df <- data.frame(X.mat, y.mat)
fit.survival <- survival::survreg(
  Surv(min.L, max.L, type="interval2") ~ log.n + log.hall,
  train.df, dist="gaussian")
fit.survival
fit.flex <- flexsurv::flexsurvreg(
  Surv(exp(min.L), exp(max.L), type="interval2") ~ log.n + log.hall,
  data=train.df,
  dist="lnorm")

I was expecting that flexsurvreg would estimate the same model as survival::survreg. Instead, I got an error on my system:

> fit.survival <- survival::survreg(
+   Surv(min.L, max.L, type="interval2") ~ log.n + log.hall,
+   train.df, dist="gaussian")
> fit.survival
Call:
survival::survreg(formula = Surv(min.L, max.L, type = "interval2") ~ 
    log.n + log.hall, data = train.df, dist = "gaussian")

Coefficients:
(Intercept)       log.n    log.hall 
 -2.5470812   0.9339951   1.0142676 

Scale= 0.5408448 

Loglik(model)= -199.3   Loglik(intercept only)= -547.4
	Chisq= 696.08 on 2 degrees of freedom, p= <2e-16 
n= 3418 
> fit.flex <- flexsurv::flexsurvreg(
+   Surv(exp(min.L), exp(max.L), type="interval2") ~ log.n + log.hall,
+   data=train.df,
+   dist="lnorm")
Error in (function (formula, data, weights, subset, na.action, dist = "weibull",  : 
  Invalid survival times for this distribution
> 
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] flexsurv_1.1.1             icenReg_2.0.9             
[3] coda_0.19-2                Rcpp_1.0.0                
[5] survival_2.42-3            penaltyLearning_2018.09.04
[7] data.table_1.11.8          namedCapture_2019.02.25   

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2 pillar_1.3.0       compiler_3.5.1     plyr_1.8.4        
 [5] bindr_0.1.1        iterators_1.0.11   tools_3.5.1        magic_1.5-9       
 [9] tibble_1.4.2       gtable_0.2.0       lattice_0.20-35    pkgconfig_2.0.2   
[13] rlang_0.3.0.1      Matrix_1.2-14      foreach_1.5.1      mvtnorm_1.0-8     
[17] bindrcpp_0.2.2     dplyr_0.7.8        grid_3.5.1         tidyselect_0.2.5  
[21] mstate_0.2.11      deSolve_1.22       glue_1.3.0         R6_2.3.0          
[25] tidyr_0.8.2        ggplot2_3.1.0      purrr_0.2.5        magrittr_1.5      
[29] scales_1.0.0       codetools_0.2-15   splines_3.5.1      assertthat_0.2.0  
[33] abind_1.4-7        colorspace_1.4-0   quadprog_1.5-5     geometry_0.3-6    
[37] muhaz_1.2.6.1      lazyeval_0.2.1     munsell_0.5.0      crayon_1.3.4      
> 

Is this because interval-censored data are not supported? All of the outputs in these data are interval-, left-, or right-censored; there are no uncensored outputs.

If interval-censored data are supported, is this a bug? Are there any known fixes or workarounds?

@chjackson
Owner

The error happens when flexsurvreg tries to find initial values for the fit. It does this by calling survreg(..., dist="lognormal") on the natural-scale survival times, and that call produces the "Invalid survival times" error you see. I'm not sure whether survreg is supposed to work on data that are interval censored from 0 to Inf.
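A small base-R check of how a lower limit of exactly 0 can arise in this setup, and why a log-scale fit might reject it. The values below are made up for illustration, and the link to the survreg error is an assumption that has not been confirmed in this thread.

```r
# Log-scale interval limits shaped like those in the report: -Inf marks an
# open lower limit and Inf an open upper limit (illustrative values only).
min.L <- c(-Inf, -1.2, 0.5)
max.L <- c(0.3, Inf, 1.7)

# Moving to the natural time scale turns -Inf into exactly 0.
lower <- exp(min.L)
upper <- exp(max.L)

# A lognormal fit works with log times, so a lower limit of 0 maps back to
# -Inf, which is not a finite value.
log.lower <- log(lower)
bounds <- data.frame(lower, upper, log.lower, finite = is.finite(log.lower))
print(bounds)
```

Only the first row has a non-finite log lower limit, which matches the "interval censored from 0 to Inf" description above.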

You can work around this by supplying initial values: flexsurvreg(..., inits=c(1,1,0,0),...) works for me on this example.
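Written out in full against the example from the original report, the suggested workaround might look like the sketch below. Reading the four values as meanlog, sdlog, and one coefficient per covariate is an assumption based on the parameter order described in the flexsurvreg documentation. The fit is only attempted when both flexsurv and penaltyLearning are installed.

```r
library(survival)

# Assumed order for dist = "lnorm": meanlog, sdlog, then the coefficients
# for log.n and log.hall. The values are the ones suggested in the thread.
inits <- c(1, 1, 0, 0)

if (requireNamespace("flexsurv", quietly = TRUE) &&
    requireNamespace("penaltyLearning", quietly = TRUE)) {
  # Rebuild the training data exactly as in the original report.
  data(neuroblastomaProcessed, package = "penaltyLearning")
  X.mat <- neuroblastomaProcessed$feature.mat[, c("log.n", "log.hall")]
  y.mat <- neuroblastomaProcessed$target.mat
  train.df <- data.frame(X.mat, y.mat)

  # Supplying inits skips the automatic search for starting values,
  # which is the step that raised the error.
  fit.flex <- flexsurv::flexsurvreg(
    Surv(exp(min.L), exp(max.L), type = "interval2") ~ log.n + log.hall,
    data = train.df,
    dist = "lnorm",
    inits = inits)
  print(fit.flex)
}
```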

I should probably work around this in flexsurv too, perhaps by calling survreg(..., dist="gaussian") on the log times as you did.
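The same idea can be tried by hand: fit the Gaussian model on the log scale and pass its estimates as starting values. This is a minimal sketch on simulated data, not the eventual flexsurv fix. The names sim.df and fit.log are hypothetical, and the mapping of intercept and scale to meanlog and sdlog assumes the parameter order described in the flexsurvreg documentation.

```r
library(survival)

# Simulate log-scale limits with a mix of left-open, right-open, and
# finite intervals (illustrative data only).
set.seed(1)
n <- 200
x <- rnorm(n)
log.t <- 0.5 + 1.0 * x + rnorm(n, sd = 0.5)
kind <- rep(c("left", "right", "interval"), length.out = n)
min.L <- ifelse(kind == "left", -Inf, log.t - 0.3)
max.L <- ifelse(kind == "right", Inf, log.t + 0.3)
sim.df <- data.frame(x, min.L, max.L)

# Gaussian fit on the log scale, as in the original report.
fit.log <- survreg(Surv(min.L, max.L, type = "interval2") ~ x,
                   data = sim.df, dist = "gaussian")

# Assumed mapping: intercept -> meanlog, scale -> sdlog, slopes -> covariates.
inits <- c(unname(coef(fit.log)[1]), fit.log$scale, unname(coef(fit.log)[-1]))

if (requireNamespace("flexsurv", quietly = TRUE)) {
  # Lognormal fit on the natural scale, started from the Gaussian estimates.
  fit.flex <- flexsurv::flexsurvreg(
    Surv(exp(min.L), exp(max.L), type = "interval2") ~ x,
    data = sim.df, dist = "lnorm", inits = inits)
  print(fit.flex)
}
```

If the two models really are equivalent, as the original report expected, the lognormal fit should stay close to these starting values.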

@tdhock
Author

tdhock commented Apr 5, 2019 via email
