Replies: 7 comments
-
A couple of things you can quickly check to verify the RAPIDS Accelerator plugin is active. In the driver startup logs, you should see the following log:
You can also visit the SQL tab in the Spark UI to look at the visual description of the query plan. Search for nodes that have You can also set the config
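As a rough illustration of the plan check, the sketch below scans a saved copy of the physical plan text for operator names that start with `Gpu`. The function name `has_gpu_nodes` and the idea of saving the plan to a file are assumptions made for this example, and the `Gpu` prefix is taken from how the nodes are referred to later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Spark or the plugin.
# $1: path to a text file holding a physical plan (for example, text copied
#     from the SQL tab of the Spark UI).
# Exit status 0 if any operator name starting with "Gpu" appears, 1 otherwise.
has_gpu_nodes() {
  # Match "Gpu" followed by an uppercase letter, at line start or after a
  # character that cannot be part of an identifier, so a name that merely
  # contains "Gpu" in the middle does not count.
  grep -Eq '(^|[^A-Za-z0-9_])Gpu[A-Z][A-Za-z0-9]*' "$1"
}
```

Something like `has_gpu_nodes plan.txt && echo "GPU operators present"` would then report whether any part of the query was replaced, where `plan.txt` is whatever file the plan text was saved to.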
-
Also be aware that most of the ML operations you are looking at use RDDs to do the processing, and that is something we don't currently accelerate. The only thing we might accelerate in the examples you pointed out is reading the input data, and only if it is in a format we support, like Parquet, ORC, or CSV.
-
Yes, my input data file is CSV and is about 1 GB. I changed the submission case: But it doesn't seem to work; I found no Gpu nodes in the SQL tab of the Spark UI.
-
Do you see the startup log message showing the plugin is initializing? Also run with
-
No, I cannot find a message in the startup log showing that the plugin is initializing, even if I run with
-
GPU scheduling is built into Spark. You don't need this plugin to schedule a job with a GPU, so seeing GPU resources in the UI does not tell you whether the accelerator was installed successfully.
-
Thank you, I have solved the problem.
-
What is your question?
Hello. I used IntelliJ IDEA to package LogisticRegressionSummaryExample.scala (from $SPARK_HOME/examples/src/main/scala/org/apache/spark/examples/ml; I also tried DataFrameExample and GradientBoostedTreeRegressorExample), submitted it to Spark in standalone mode, and added the configuration parameters for using the GPU. In the master web UI (http://master:8080) I can see that the GPU is used, and the history server web UI also shows the GPU resource being requested by the application. Strangely, when I run nvidia-smi in a terminal on the node, the GPU does not seem to be used. I then removed the GPU configuration parameters so that only the CPU was used, and the running time of the application was about the same.
I changed the input data file; the data size is about 1 GB.
The cudf-0.14-cuda10-2.jar and rapids-4-spark_2.12-0.1.0.jar files are located in /opt/sparkRapidsPlugin/.
In /etc/profile:
#Spark3.0-GPU Accelerator
export SPARK_RAPIDS_DIR=/opt/sparkRapidsPlugin
export SPARK_CUDF_JAR=${SPARK_RAPIDS_DIR}/cudf-0.14-cuda10-2.jar
export SPARK_RAPIDS_PLUGIN_JAR=${SPARK_RAPIDS_DIR}/rapids-4-spark_2.12-0.1.0.jar
The following is my submission case:
./bin/spark-submit --classs org.hik.LogisticRegressionSummaryExampl
--master spark://master:7077
--executor-cores 2
--conf spark.task.cpus 1
--conf spark.executor.resource.gpu.amount=4
--conf spark.executor.resource.gpu.discoveryScript=./examples/src/main/scripts/getGpusResources.sh
--conf spark.task.resource.gpu.amount=1
--conf spark.plugins=com.nvidia.spark.SQLPlugin
--jars ${SPARK_CUDF_JAR}, ${SPARK_RAPIDS_PLUGIN_JAR}, LR.jar
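For comparison, here is a sketch of how the submission above might be written so that spark-submit parses it as intended. It is not the confirmed fix, since the thread does not say what resolved the issue. It assumes three things: `--conf` entries need the `key=value` form, `--jars` takes a single comma-separated list with no spaces, and the application jar (`LR.jar`) is passed as the final positional argument rather than inside `--jars`. The class name `org.hik.LogisticRegressionSummaryExample` is also an assumption based on the example file named earlier in the post.

```shell
#!/usr/bin/env bash
# Sketch only. Paths fall back to the values from /etc/profile in the post.
SPARK_RAPIDS_DIR="${SPARK_RAPIDS_DIR:-/opt/sparkRapidsPlugin}"
SPARK_CUDF_JAR="${SPARK_CUDF_JAR:-${SPARK_RAPIDS_DIR}/cudf-0.14-cuda10-2.jar}"
SPARK_RAPIDS_PLUGIN_JAR="${SPARK_RAPIDS_PLUGIN_JAR:-${SPARK_RAPIDS_DIR}/rapids-4-spark_2.12-0.1.0.jar}"

# --jars expects one comma-separated value; a space after a comma would split
# the list into separate command-line arguments.
JARS="${SPARK_CUDF_JAR},${SPARK_RAPIDS_PLUGIN_JAR}"

SUBMIT_ARGS=(
  --class org.hik.LogisticRegressionSummaryExample   # assumed class name
  --master spark://master:7077
  --executor-cores 2
  --conf spark.task.cpus=1
  --conf spark.executor.resource.gpu.amount=4
  --conf spark.executor.resource.gpu.discoveryScript=./examples/src/main/scripts/getGpusResources.sh
  --conf spark.task.resource.gpu.amount=1
  --conf spark.plugins=com.nvidia.spark.SQLPlugin
  --jars "${JARS}"
  LR.jar                                             # application jar goes last
)

# Print the command for review; remove the leading `echo` to actually submit.
echo ./bin/spark-submit "${SUBMIT_ARGS[@]}"
```

Building the arguments in an array keeps each option on its own line without backslash continuations and makes it easy to inspect the exact command before running it.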
We can see from the master web UI that the GPU is being used [http://master:8080]:
[Image: master8080]
The history server web UI also shows the GPU being used [http://master:18080]:
[Image: historyServerGPU]
However, nvidia-smi shows that the GPU is not being used:
[Image: terminalGPU]
Actually I suspect that the GPU is indeed not called.
I spent 2 days and still did not solve this problem, which made me very confused.
Would you help me solve it? Thank you very much.