
Commit c6d7aa3

doc: update documentation about using sagemaker-pyspark with EMR (#116)
1 parent abd6b6a commit c6d7aa3

File tree

1 file changed: +11 -13 lines changed

1 file changed

+11
-13
lines changed

sagemaker-pyspark-sdk/README.rst

Lines changed: 11 additions & 13 deletions
@@ -161,7 +161,7 @@ Training and Hosting an XGBoost model using SageMaker PySpark
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 A XGBoostSageMakerEstimator runs a training job using the Amazon SageMaker XGBoost algorithm upon
-invocation of fit(), returning a SageMakerModel.
+invocation of fit(), returning a SageMakerModel.
 
 .. code-block:: python
 
@@ -173,16 +173,16 @@ invocation of fit(), returning a SageMakerModel.
     # there is no need to do this in code.
     conf = (SparkConf()
             .set("spark.driver.extraClassPath", ":".join(classpath_jars())))
-    SparkContext(conf=conf)
+    SparkContext.getOrCreate(conf=conf)
 
     iam_role = "arn:aws:iam:0123456789012:role/MySageMakerRole"
 
     region = "us-east-1"
-    training_data = spark.read.format("libsvm").option("numFeatures", "784")
-        .load("s3a://sagemaker-sample-data-{}/spark/mnist/train/".format(region))
+    training_data = (spark.read.format("libsvm").option("numFeatures", "784")
+        .load("s3a://sagemaker-sample-data-{}/spark/mnist/train/".format(region)))
 
-    test_data = spark.read.format("libsvm").option("numFeatures", "784")
-        .load("s3a://sagemaker-sample-data-{}/spark/mnist/train/".format(region))
+    test_data = (spark.read.format("libsvm").option("numFeatures", "784")
+        .load("s3a://sagemaker-sample-data-{}/spark/mnist/train/".format(region)))
 
     xgboost_estimator = XGBoostSageMakerEstimator(
         trainingInstanceType="ml.m4.xlarge",
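
The two fixes in this hunk are worth spelling out: ``SparkContext.getOrCreate(conf=conf)`` reuses an already-running context instead of raising, and the added parentheses are what make the multi-line reader chain valid Python (without them the indented ``.load(...)`` line fails to parse). A minimal sketch of the corrected setup follows; the imports and the ``SparkSession`` creation are filled in as assumptions, since the hunk references ``spark`` without showing them:

.. code-block:: python

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SparkSession
    from sagemaker_pyspark import classpath_jars

    # Put the sagemaker-pyspark jars on the driver classpath.
    conf = (SparkConf()
            .set("spark.driver.extraClassPath", ":".join(classpath_jars())))

    # getOrCreate() reuses a SparkContext that is already running (as in a
    # notebook) instead of raising because one exists.
    SparkContext.getOrCreate(conf=conf)
    spark = SparkSession.builder.getOrCreate()  # assumed; not shown in the hunk

    region = "us-east-1"

    # The wrapping parentheses are what make the two-line reader chain
    # valid Python; without them the indented .load(...) fails to parse.
    training_data = (spark.read.format("libsvm").option("numFeatures", "784")
                     .load("s3a://sagemaker-sample-data-{}/spark/mnist/train/".format(region)))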
@@ -265,7 +265,7 @@ Create a bootstrap script to install sagemaker_pyspark in your new EMR cluster:
     #!/bin/bash
 
     sudo pip install sagemaker_pyspark
-    sudo /usr/bin/pip-3.4 install sagemaker_pyspark
+    sudo pip3 install sagemaker_pyspark
 
 
 Upload this script to an S3 bucket:
@@ -274,7 +274,7 @@ Upload this script to an S3 bucket:
 
     $ aws s3 cp bootstrap.sh s3://your-bucket/prefix/
 
-In the AWS Console launch a new EMR Spark Cluster, set s3://your-bucket/prefix/bootstrap.sh as the
+In the AWS Console launch a new EMR Spark Cluster, set s3://your-bucket/prefix/bootstrap.sh as the
 bootstrap script. Make sure to:
 
 - Run the Cluster in the same VPC as your SageMaker Notebook Instance.
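
Once the cluster has bootstrapped, a quick check that the ``pip3`` install above took effect is to import the package on the master node. This is a hedged sketch; ``classpath_jars`` is the same helper the README's Spark configuration uses:

.. code-block:: python

    # Run in a Python 3 shell on the EMR master node after bootstrap.
    # If bootstrap.sh ran, both the pip and pip3 environments have the package.
    import sagemaker_pyspark

    # A few of the bundled jars that spark.driver.extraClassPath points at.
    print("\n".join(sagemaker_pyspark.classpath_jars()[:3]))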
@@ -315,15 +315,13 @@ Configure your SageMaker Notebook instance to connect to the cluster
 
 Open a terminal session in your notebook: new->terminal
 
-Copy the default `sparkmagic config <https://github
-.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/example_config.json>`__
+Copy the default `sparkmagic config <https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/example_config.json>`__
 
 You can download it in your terminal using:
 
 .. code-block:: sh
 
-    $ wget https://raw.githubusercontent
-    .com/jupyter-incubator/sparkmagic/master/sparkmagic/example_config.json
+    $ wget https://raw.githubusercontent.com/jupyter-incubator/sparkmagic/master/sparkmagic/example_config.json
 
 In the ``kernel_python_credentials`` section, replace the ``url`` with
 ``http://your-cluster-private-dns-name:8998``.
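
For readers who prefer to script this edit rather than open the JSON by hand, a hedged sketch using only the standard library: the ``kernel_python_credentials`` and ``url`` keys come from the hunk above, the DNS name stays a placeholder, and writing straight to ``~/.sparkmagic/config.json`` folds in the copy step shown in the next hunk:

.. code-block:: python

    import json
    import os

    # Load the example config downloaded with wget.
    with open("example_config.json") as f:
        config = json.load(f)

    # Point the PySpark kernel at Livy on the cluster (placeholder DNS name).
    config["kernel_python_credentials"]["url"] = \
        "http://your-cluster-private-dns-name:8998"

    # Write it where sparkmagic looks for overrides.
    os.makedirs(os.path.expanduser("~/.sparkmagic"), exist_ok=True)
    with open(os.path.expanduser("~/.sparkmagic/config.json"), "w") as f:
        json.dump(config, f, indent=2)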
@@ -335,7 +333,7 @@ Override the default spark magic config
     $ cp example_config.json ~/.sparkmagic/config.json
 
 
-Launch a notebook using either the ``pyspark2`` or ``pyspark3`` Kernel. As soon as you try to run
+Launch a notebook using ``Sparkmagic (Pyspark)`` Kernel. As soon as you try to run
 any code block, the notebook will connect to your spark cluster and get a ``SparkSession`` for you.
 
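As a smoke test of the new kernel instructions, the first cell can be anything small; sparkmagic starts the remote session and binds ``spark`` before running it (a sketch, not part of the diff):

.. code-block:: python

    # First cell in a Sparkmagic (PySpark) notebook: executing it makes the
    # notebook connect to the cluster, and `spark` is already bound for you.
    print(spark.version)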