[Documentation]Instructions on how to take your application to production #345
cc: @bamurtaugh |
docs/take-to-prod.md
@@ -0,0 +1,108 @@ | |||
Taking your Spark .Net Application to Production |
Do we need to call it ".NET for Apache Spark" or ".NET for Spark" application instead (since we steer away from calling it Spark.NET publicly)? Also, I think ".NET" should be all caps for consistency.
Sure, I will change it to `.NET for Apache Spark` for now and keep `.NET` in all caps. Thanks!
docs/take-to-prod.md
This how-to provides general instructions on how to take your .NET for Apache Spark application to production.
In this documentation, we will summarize the most commonly asked scenarios when running Spark .Net Application.
And you will also learn how to package your application and submit your application with [spark-submit](https://spark.apache.org/docs/latest/submitting-applications.html) and [Apache Livy](https://livy.incubator.apache.org/).
- [How to take your application to production when you have single dependency](#how-to-take-your-application-to-production-when-you-have-single-dependency) |
- [How to take your application to production when you have single dependency](#how-to-take-your-application-to-production-when-you-have-single-dependency) | |
- [How to take your application to production when you have a single dependency](#how-to-take-your-application-to-production-when-you-have-a-single-dependency) |
Not sure if we can change the phrasing here and still have it be precise, but "a single dependency" might sound a little cleaner.
Alternatively, could we make these headings either more concise or more precise? That is, either remove the "How to take your application to production" part, since that phrase is already in the article title, or add a phrase that states more specifically what it means to take an app to production (does it just mean running `spark-submit`, so we could say something like "Deploy app with a single dependency"?).
- [How to take your application to production when you have single dependency](#how-to-take-your-application-to-production-when-you-have-single-dependency) | |
- [Single dependency](#single-dependency) |
- [How to take your application to production when you have single dependency](#how-to-take-your-application-to-production-when-you-have-single-dependency) | |
- [How to deploy your application when you have a single dependency](#how-to-deploy-your-application-when-you-have-a-single-dependency) |
Thanks for your suggestion! I would prefer the second one which I think is concise and precise.
docs/take-to-prod.md
#### 2. Using Apache Livy | ||
- Please see below as an example of running your app with Apache Livy in Scenario 3 and Scenario 5. | ||
And you should use `"files": ["adl://<cluster name>.azuredatalakestore.net/<some dir>/nugetLibrary.dll"]` in Scenario 4. |
And you should use `"files": ["adl://<cluster name>.azuredatalakestore.net/<some dir>/nugetLibrary.dll"]` in Scenario 4. | |
Additionally, you should use `"files": ["adl://<cluster name>.azuredatalakestore.net/<some dir>/nugetLibrary.dll"]` in Scenario 4. |
I have made the changes to resolve all the comments (except a few that need some input). Thanks so much @bamurtaugh for your comments and feedback!
#### Scenario 4. SparkSession code references a function from a NuGet package that has been installed in the csproj
This would be the use case when `SparkSession` code references a function from a NuGet package in the same project (e.g. mySparkApp.csproj).
#### Scenario 5. SparkSession code references a function from a DLL on the user's machine
This would be the use case when `SparkSession` code references business logic (UDFs) on the user's machine (e.g. `SparkSession` code in the mySparkApp.csproj and businessLogic.dll on a different machine).
Why does businessLogic.dll have to come from a different machine?
```json
{
    "file": "adl://<cluster name>.azuredatalakestore.net/<some dir>/microsoft-spark-<spark_majorversion.spark_minorversion.x>-<spark_dotnet_version>.jar",
    "className": "org.apache.spark.deploy.dotnet.DotnetRunner",
    "files": ["adl://<cluster name>.azuredatalakestore.net/<some dir>/businessLogic.dll"],
    "args": ["dotnet", "adl://<cluster name>.azuredatalakestore.net/<some dir>/mySparkApp.dll", "<app arg 1>", "<app arg 2>", "...", "<app arg n>"]
}
```
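For reference, here is a minimal Python sketch of how a request body like the one above might be assembled and sent. It assumes the batch is submitted with a POST to Livy's `/batches` endpoint. The helper names `build_livy_batch` and `submit_batch` and the `livy_url` argument are hypothetical and not part of this PR, and the `adl://` placeholders are copied unchanged from the diff.

```python
import json
import urllib.request

# Runner class taken from the request body in the diff.
DOTNET_RUNNER = "org.apache.spark.deploy.dotnet.DotnetRunner"


def build_livy_batch(jar_uri, app_dll_uri, extra_files=(), app_args=()):
    """Build the body of a Livy batch request (hypothetical helper)."""
    return {
        "file": jar_uri,                       # microsoft-spark jar
        "className": DOTNET_RUNNER,
        "files": list(extra_files),            # extra DLLs shipped with the job
        "args": ["dotnet", app_dll_uri, *app_args],
    }


def submit_batch(livy_url, payload):
    """POST the payload to <livy_url>/batches and return the parsed reply.

    livy_url is an assumption: pass the address of your own Livy server.
    """
    request = urllib.request.Request(
        livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


base = "adl://<cluster name>.azuredatalakestore.net/<some dir>"
payload = build_livy_batch(
    jar_uri=base + "/microsoft-spark-<spark_majorversion.spark_minorversion.x>-<spark_dotnet_version>.jar",
    app_dll_uri=base + "/mySparkApp.dll",
    extra_files=[base + "/businessLogic.dll"],
    app_args=["<app arg 1>", "<app arg 2>"],
)
# Serializing with json.dumps avoids hand-written quoting mistakes.
print(json.dumps(payload, indent=4))
```

Building the body as a dictionary and serializing it keeps the quoting consistent; `submit_batch` is only defined here, not called, since it needs a reachable Livy server.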
Should just provide the `zip` example.
Closing this one and opening a new PR, #349, to move forward from there. Thanks!
Customers have been asking how they can run .NET for Apache Spark applications in different scenarios. This PR gathers the most commonly asked scenarios and provides general instructions on how customers can package their applications and submit jobs in each of them.