Supported application languages and versions
Analytics Engine Powered by Apache Spark supports Spark applications written in Python, R, and Scala.
The following table lists the template IDs for the supported languages and Spark versions. The + marker indicates that Spark 2.4 was removed in Cloud Pak for Data 4.0.7; you can use Spark 2.4 only if you are on a Cloud Pak for Data version earlier than 4.0.7.
| Spark version/language | Template ID |
|---|---|
| Spark 3.0 / Python 3.9, 3.8, or 3.7 | spark-3.0.0-jaas-v2-cp4d-template |
| Spark 2.4 / Python 3.8 or 3.7 + | spark-2.4.0-jaas-v2-cp4d-template |
| Spark 3.0 / Scala 2.12 | spark-3.0.0-jaas-v2-cp4d-template |
| Spark 2.4 / Scala 2.11 + | spark-2.4.0-jaas-v2-cp4d-template |
| Spark 3.0 / R 3.6 | spark-3.0.0-jaas-v2-cp4d-template |
| Spark 2.4 / R 3.6 + | spark-2.4.0-jaas-v2-cp4d-template |
The following examples show sample payloads for submitting a Spark job in each supported language and Spark version. Insert the appropriate template ID for the language and Spark version that you need. A sketch of how to post one of these payloads follows the examples.
Payload for submitting a Spark job with Python 3.9:

```json
{
  "template_id": "<template_id>",
  "application_details": {
    "application": "/myapp/customApps/example.py",
    "application_arguments": ["<your_application_arguments>"],
    "conf": {
      "spark.app.name": "MyJob",
      "spark.eventLog.enabled": "true"
    },
    "env": {
      "RUNTIME_PYTHON_ENV": "python39",
      "PYTHONPATH": "/myapp/pippackages:/home/spark/space/assets/data_asset:/home/spark/user_home/python-3:/cc-home/_global_/python-3:/home/spark/shared/user-libs/python:/home/spark/shared/conda/envs/python/lib/python/site-packages:/opt/ibm/conda/miniconda/lib/python/site-packages:/opt/ibm/third-party/libs/python3:/opt/ibm/image-libs/python3:/opt/ibm/image-libs/spark2/metaindexmanager.jar:/opt/ibm/image-libs/spark2/stmetaindexplugin.jar:/opt/ibm/spark/python:/opt/ibm/spark/python/lib/py4j-0.10.7-src.zip"
    }
  },
  "volumes": [
    {
      "name": "appvol",
      "mount_path": "/myapp",
      "source_sub_path": ""
    }
  ]
}
```
Payload for submitting a Spark job with Python 3.8:

```json
{
  "template_id": "<template_id>",
  "application_details": {
    "application": "/myapp/customApps/example.py",
    "application_arguments": ["<your_application_arguments>"],
    "conf": {
      "spark.app.name": "MyJob",
      "spark.eventLog.enabled": "true"
    },
    "env": {
      "RUNTIME_PYTHON_ENV": "python38",
      "PYTHONPATH": "/myapp/pippackages:/home/spark/space/assets/data_asset:/home/spark/user_home/python-3:/cc-home/_global_/python-3:/home/spark/shared/user-libs/python:/home/spark/shared/conda/envs/python/lib/python/site-packages:/opt/ibm/conda/miniconda/lib/python/site-packages:/opt/ibm/third-party/libs/python3:/opt/ibm/image-libs/python3:/opt/ibm/image-libs/spark2/metaindexmanager.jar:/opt/ibm/image-libs/spark2/stmetaindexplugin.jar:/opt/ibm/spark/python:/opt/ibm/spark/python/lib/py4j-0.10.7-src.zip"
    }
  },
  "volumes": [
    {
      "name": "appvol",
      "mount_path": "/myapp",
      "source_sub_path": ""
    }
  ]
}
```
Payload for submitting a Spark job with Python 3.7:

```json
{
  "template_id": "<template_id>",
  "application_details": {
    "application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
    "application_arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
    "conf": {
      "spark.app.name": "MyJob",
      "spark.eventLog.enabled": "true"
    },
    "env": {
      "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
    },
    "driver-memory": "4G",
    "driver-cores": 1,
    "executor-memory": "4G",
    "executor-cores": 1,
    "num-executors": 1
  }
}
```
Payload for submitting a Spark Scala job:

```json
{
  "template_id": "<template_id>",
  "application_details": {
    "application": "/opt/ibm/spark/examples/jars/spark-examples*.jar",
    "application_arguments": ["1"],
    "class": "org.apache.spark.examples.SparkPi",
    "conf": {
      "spark.app.name": "MyJob",
      "spark.eventLog.enabled": "true"
    },
    "env": {
      "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
    },
    "driver-memory": "4G",
    "driver-cores": 1,
    "executor-memory": "4G",
    "executor-cores": 1,
    "num-executors": 1
  }
}
```
Payload for submitting an R 3.6 Spark job (the `class` parameter applies only to Java and Scala applications and is omitted here):

```json
{
  "template_id": "<template_id>",
  "application_details": {
    "application": "/opt/ibm/spark/examples/src/main/r/dataframe.R",
    "conf": {
      "spark.app.name": "MyJob",
      "spark.eventLog.enabled": "true"
    },
    "env": {
      "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
    },
    "driver-memory": "4G",
    "driver-cores": 1,
    "executor-memory": "4G",
    "executor-cores": 1,
    "num-executors": 1
  }
}
```
Parent topic: Submitting Spark jobs