Error writing to connection - Linux/Ubuntu parallelization issue #17

Open
julian-wittische opened this issue May 24, 2021 · 13 comments

@julian-wittische

We are having an error on several brand new Ubuntu servers with everything installed and updated, when we run the code with parallelization. This is the error we get:

Error in serialize(data, node$con, xdr = FALSE) : ignoring SIGPIPE signal
Error in serialize(data, node$con, xdr = FALSE) :
error writing to connection

Here is a reproducible example code, copied from the tutorial, which triggers the issue with our particular setup:

write.dir <- "path/to/results/"  # placeholder: please fill in an existing directory
library(ResistanceGA)
data(resistance_surfaces)
data(samples)
sample.locales <- SpatialPoints(samples[, c(2, 3)])
r.stack <- stack(resistance_surfaces$categorical, resistance_surfaces$continuous, resistance_surfaces$feature)
GA.inputs <- GA.prep(ASCII.dir = r.stack, Results.dir = write.dir, method = "LL", max.cat = 500, max.cont = 500, seed = 555, parallel = 4)
gdist.inputs <- gdist.prep(length(sample.locales), samples = sample.locales, method = 'commuteDistance')
PARM <- c(1, 250, 75, 1, 3.5, 150, 1, 350)
Resist <- Combine_Surfaces(PARM = PARM, gdist.inputs = gdist.inputs, GA.inputs = GA.inputs, out = NULL, rescale = TRUE)
gdist.response <- Run_gdistance(gdist.inputs = gdist.inputs, r = Resist)
gdist.inputs <- gdist.prep(n.Pops = length(sample.locales), samples = sample.locales, response = as.vector(gdist.response), method = 'commuteDistance')
Multi.Surface_optim <- MS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs)

Session info:

R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ResistanceGA_4.1-0.46 raster_3.4-10 sp_1.4-5

loaded via a namespace (and not attached):
 [1] jsonlite_1.7.2 splines_4.0.5 foreach_1.5.1
 [4] gtools_3.8.2 shiny_1.6.0 expm_0.999-6
 [7] stats4_4.0.5 spatstat.geom_2.1-0 LearnBayes_2.15.1
[10] pillar_1.6.1 lattice_0.20-44 glue_1.4.2
[13] digest_0.6.27 promises_1.2.0.1 polyclip_1.10-0
[16] minqa_1.2.4 colorspace_2.0-1 MuMIn_1.43.17
[19] htmltools_0.5.1.1 httpuv_1.6.1 Matrix_1.3-3
[22] plyr_1.8.6 spatstat.sparse_2.0-0 JuliaCall_0.17.4
[25] pkgconfig_2.0.3 gmodels_2.18.1 purrr_0.3.4
[28] xtable_1.8-4 spatstat.core_2.1-2 scales_1.1.1
[31] gdata_2.18.0 tensor_1.5 XR_0.7.2
[34] later_1.2.0 spatstat.utils_2.1-0 lme4_1.1-27
[37] proxy_0.4-25 tibble_3.1.2 mgcv_1.8-35
[40] generics_0.1.0 ggplot2_3.3.3 ellipsis_0.3.2
[43] XRJulia_0.9.0 cli_2.5.0 magrittr_2.0.1
[46] crayon_1.4.1 mime_0.10 deldir_0.2-10
[49] fansi_0.4.2 doParallel_1.0.16 nlme_3.1-152
[52] MASS_7.3-54 class_7.3-19 tools_4.0.5
[55] lifecycle_1.0.0 munsell_0.5.0 e1071_1.7-6
[58] gdistance_1.3-6 akima_0.6-2.1 compiler_4.0.5
[61] rlang_0.4.11 units_0.7-1 classInt_0.4-3
[64] grid_4.0.5 nloptr_1.2.2.2 iterators_1.0.13
[67] goftest_1.2-2 igraph_1.2.6 miniUI_0.1.1.1
[70] boot_1.3-28 GA_3.2.1 gtable_0.3.0
[73] codetools_0.2-18 abind_1.4-5 DBI_1.1.1
[76] R6_2.5.0 knitr_1.33 dplyr_1.0.6
[79] fastmap_1.1.0 utf8_1.2.1 ggExtra_0.9
[82] spdep_1.1-7 KernSmooth_2.23-20 spatstat.data_2.1-0
[85] parallel_4.0.5 Rcpp_1.0.6 vctrs_0.3.8
[88] sf_0.9-8 rpart_4.1-15 coda_0.19-4
[91] spData_0.3.8 tidyselect_1.1.1 xfun_0.23

We have tried reinstalling everything with different versions, to no avail. Both servers have a very large amount of RAM. A simple parallelization with doParallel works:

library(doParallel)  
getPrimeNumbers <- function(n) {  
   n <- as.integer(n)
   if(n > 1e6) stop("n too large")
   primes <- rep(TRUE, n)
   primes[1] <- FALSE
   last.prime <- 2L
   for(i in last.prime:floor(sqrt(n)))
   {
      primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
      last.prime <- last.prime + min(which(primes[(last.prime+1):n]))
   }
   which(primes)
}
no_cores <- detectCores() - 1  
registerDoParallel(cores=no_cores)  
cl <- makeCluster(no_cores, type="FORK")  
result <- parLapply(cl, 10:10000, getPrimeNumbers)  
stopCluster(cl)
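
The check above only exercises a FORK cluster. A minimal sketch, not from the original post, that runs the same trivial job on both FORK and PSOCK clusters may help narrow down which cluster type fails on a given machine. The helper names `check_cluster` and `square` are hypothetical, and which cluster type ResistanceGA uses internally is not confirmed here.

```r
library(parallel)

# Trivial task, defined at top level so it carries no extra environment.
square <- function(i) i^2

# Hypothetical helper: start a cluster of the given type, run a tiny job,
# and report whether it returned the expected values.
check_cluster <- function(type, workers = 2L) {
  cl <- tryCatch(makeCluster(workers, type = type), error = function(e) NULL)
  if (is.null(cl)) return(FALSE)
  on.exit(stopCluster(cl), add = TRUE)
  res <- tryCatch(parLapply(cl, 1:4, square), error = function(e) NULL)
  identical(unlist(res), c(1, 4, 9, 16))
}

# FORK clusters exist only on Unix-alikes, so skip that type elsewhere.
types <- if (.Platform$OS.type == "unix") c("FORK", "PSOCK") else "PSOCK"
results <- vapply(types, check_cluster, logical(1))
print(results)
```

If one type reports FALSE while the other reports TRUE, the problem is probably tied to how that cluster type starts or talks to its workers rather than to memory.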
@wpeterman
Owner

wpeterman commented May 25, 2021 via email

@julian-wittische
Author

I did store an appropriate path in the write.dir object; I simply removed it from the example in case someone wanted to run it. The script works on Windows.

@EveTC

EveTC commented May 29, 2021

Hello - I am receiving the same error, also using a server with Ubuntu.

Did you find a fix for this @julian-wittische? Thanks.
I have googled the general error, and some people seem to think it is caused by the amount of memory used. However, I am running it on a server with many CPUs, so I don't think this should be a problem.

@julian-wittische
Author

Hello,

Unfortunately, we have been unable to solve this issue. Memory is not the issue in our case, because the error occurs even with only 2 CPUs and a huge amount of RAM.

@wpeterman
Owner

wpeterman commented Jun 1, 2021 via email

@EveTC

EveTC commented Jun 7, 2021

Yes, everything writes to the correct, specified directory when I run a small example without the parallel setting.

@EveTC

EveTC commented Jun 18, 2021

Since I cannot seem to solve the parallel issue, is it possible to run all the rasters for SS_optim separately (i.e., on separate cores) and then concatenate the results for the pseudo-bootstrapping method? I am running it for a big area, so I am trying to find any way to speed up the process.

Or can we use doParallel around the function itself somehow? Sorry I am very new to using parallel in R etc.

Thank you

@wpeterman
Owner

wpeterman commented Jun 18, 2021 via email

@EveTC

EveTC commented Jun 21, 2021

Ok thanks @wpeterman. I have found this chain (luca-scr/GA#50 - see answer to the issue at the end) on the GA GitHub (@julian-wittische) which may help us debug the issue?

I have never used parallel in R but once I set

cl <- makePSOCKcluster(8) # I defined cl with this command
registerDoParallel(cl)

I no longer get the previous error, but I do not receive the normal iteration output, so I am unsure whether it is working.
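
One way to check whether a registered PSOCK backend is actually doing the work, independent of ResistanceGA, is to ask foreach which backend it sees and where the tasks ran. This is a minimal sketch under the assumption that foreach and doParallel are installed. It uses 2 workers only to keep the check light.

```r
library(parallel)
library(doParallel)  # also attaches foreach and iterators

cl <- makePSOCKcluster(2)
registerDoParallel(cl)

# Which backend does foreach see, and how many workers does it report?
backend <- getDoParName()
n_workers <- getDoParWorkers()

# Each task returns the ID of the process it ran in.
pids <- foreach(i = 1:8, .combine = c) %dopar% Sys.getpid()

stopCluster(cl)
registerDoSEQ()  # drop the now-stale parallel backend

cat("backend:", backend, "| workers:", n_workers, "\n")
cat("tasks ran outside the main R process:", !(Sys.getpid() %in% pids), "\n")
```

If the tasks report process IDs other than the main session's, the workers are being used, even when the usual per-iteration output is not printed to the console.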

I am going to play with this today and let you know how it goes, but if you have any success with this way forward - please let me know.

@EveTC

EveTC commented Jun 21, 2021

I believe it now works for me. I ran the code below:

library(parallel)
library(doParallel)

cl <- makePSOCKcluster(32)
registerDoParallel(cl)

# Set variables for ResistanceGA
GA.inputs_All <- GA.prep(method="AIC", ASCII.dir=raster, Results.dir = write.dir, min.cat=1, seed=111, parallel=cl)
# Inputs for resistance method
gdist.inputs <- gdist.prep(length(sample.sp), samples=sample.sp, response= lower(fst), method='costDistance')

# Export info to cluster
clusterExport(cl=cl,varlist=c("GA.inputs_All","gdist.inputs","raster","sample.sp","fst")) # list everything you pass to GA.prep and gdist.prep
clusterEvalQ(cl=cl, .libPaths("/R")) # set path to where your R library is
clusterCall(cl=cl, library, package = "ResistanceGA", character.only = TRUE)
 
# Run SS_optim
run1_SSoptim <- SS_optim(gdist.inputs = gdist.inputs, GA.inputs = GA.inputs_All, diagnostic_plots=FALSE)

# Stop cluster once it has finished
stopCluster(cl)
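
The pattern above (create a cluster, export objects, run, stop) can be wrapped so that the cluster is stopped even if the run fails, and so that the exports are checked before the long job starts. The following is a minimal sketch using only the parallel package. `run_on_cluster`, `all_exist`, `add_offset`, and `offsets` are hypothetical names for illustration and are not part of ResistanceGA.

```r
library(parallel)

# Runs on a worker: TRUE if every named object exists in its global environment.
all_exist <- function(nms) {
  all(vapply(nms, function(nm) exists(nm, envir = globalenv()), logical(1)))
}

# Hypothetical wrapper: start a PSOCK cluster, export objects, verify the
# exports on every worker, run the task, and always stop the cluster.
run_on_cluster <- function(n_workers, export, task, envir = parent.frame()) {
  cl <- makePSOCKcluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)  # runs on success and on error
  clusterExport(cl, varlist = export, envir = envir)
  visible <- clusterCall(cl, all_exist, export)
  stopifnot(all(unlist(visible)))
  task(cl)
}

# Toy stand-in for the real job: the worker function relies on an exported object.
offsets <- c(10, 20, 30)
add_offset <- function(i) offsets[i] + i

result <- run_on_cluster(2, export = "offsets", task = function(cl) {
  unlist(parLapply(cl, 1:3, add_offset))
})
print(result)
```

In real use, the task would be the SS_optim call and the export list would name the objects it needs, as in the code above.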

@cmu002

cmu002 commented Apr 18, 2022

Has this issue been officially resolved? I'm running into the same output errors when I run my code on an Ubuntu EC2 instance.

Error in unserialize(socklist[[n]]) : error reading from connection
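
An "error reading from connection" on the master often means that a worker process stopped or dropped its connection, and by default worker output is discarded. A minimal sketch, assuming a local PSOCK cluster, that redirects worker output to a log file through the `outfile` argument of `makePSOCKcluster` so that worker-side messages can be inspected afterwards:

```r
library(parallel)

# Send each worker's stdout and stderr to a log file instead of discarding it.
log_file <- tempfile(fileext = ".log")
cl <- makePSOCKcluster(2, outfile = log_file)

# Anything the workers print or signal as a message is appended to log_file.
invisible(clusterEvalQ(cl, message("worker ", Sys.getpid(), " is alive")))

stopCluster(cl)

# The log may only be complete once the workers have exited.
worker_log <- readLines(log_file)
print(worker_log)
```

Passing `outfile = ""` instead leaves worker output unredirected, which can be enough for local workers in an interactive session.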

@wpeterman
Owner

This was an idiosyncratic error that I could never recreate on clusters or computers I had access to. If you're receiving an error when running ResistanceGA with Julia, try following the suggestions from Julian and Eve above.

@vladimirovav

I also have a problem running ResistanceGA on Linux/Ubuntu, and I know others who have had problems on this operating system, whereas on Windows it works perfectly well. I get this error:

Error happens in Julia. InitError: could not load library "/home/v_vl/.julia/artifacts/4a6fe8c6eda7f19d80afb858eb9f7cbe312f1453/lib/libnetcdf.so"
/usr/lib/x86_64-linux-gnu/libcurl.so: version `CURL_4' not found (required by /home/v_vl/.julia/artifacts/4a6fe8c6eda7f19d80afb858eb9f7cbe312f1453/lib/libnetcdf.so)

Did anyone figure out if maybe there is a version of Ubuntu and Julia that would make ResistanceGA work well?
