Issues/1951 update aws spot documentation (#4310)

* Update Preemptibility documentation
* Add example of --defaultPreemptible to preemptability section
* Replace preemptable with preemptible
* Add compatibilty for spelling preemptible preemptable
* Remove note in job.py referring to preemptable
* Change Preempability to Preemptibility
* Update documentation, add support for preemptable
* add backwards compatibility for preemptable keyword

---------

Co-authored-by: Adam Novak
Co-authored-by: Lon Blauvelt
DataBiosphere · Feb 2, 2023 · eb21245
1 parent 02b4aad
Showing 29 changed files with 558 additions and 535 deletions.
diff --git a/docs/running/cliOptions.rst b/docs/running/cliOptions.rst
@@ -248,9 +248,9 @@ autoscaled cluster, as well as parameters to control the level of provisioning.
   --nodeTypes NODETYPES
                         Specifies a list of comma-separated node types, each of which is
                         composed of slash-separated instance types, and an optional spot
-                        bid set off by a colon, making the node type preemptable. Instance
+                        bid set off by a colon, making the node type preemptible. Instance
                         types may appear in multiple node types, and the same node type
-                        may appear as both preemptable and non-preemptable.
+                        may appear as both preemptible and non-preemptible.
 
                         Valid argument specifying two node types:
                             c5.4xlarge/c5a.4xlarge:0.42,t2.large
@@ -288,16 +288,16 @@ autoscaled cluster, as well as parameters to control the level of provisioning.
   --scaleInterval SCALEINTERVAL
                         The interval (seconds) between assessing if the scale of
                         the cluster needs to change. (Default: 60)
-  --preemptableCompensation PREEMPTABLECOMPENSATION
+  --preemptibleCompensation PREEMPTIBLECOMPENSATION
                         The preference of the autoscaler to replace
-                        preemptable nodes with non-preemptable nodes, when
-                        preemptable nodes cannot be started for some reason.
+                        preemptible nodes with non-preemptible nodes, when
+                        preemptible nodes cannot be started for some reason.
                         Defaults to 0.0. This value must be between 0.0 and
                         1.0, inclusive. A value of 0.0 disables such
                         compensation, a value of 0.5 compensates two missing
-                        preemptable nodes with a non-preemptable one. A value
+                        preemptible nodes with a non-preemptible one. A value
                         of 1.0 replaces every missing pre-emptable node with a
-                        non-preemptable one.
+                        non-preemptible one.
   --nodeStorage NODESTORAGE
                         Specify the size of the root volume of worker nodes
                         when they are launched in gigabytes. You may want to
@@ -321,10 +321,10 @@ keeping this limited we can avoid nodes occupied with services causing deadlocks
   --maxServiceJobs MAXSERVICEJOBS
                         The maximum number of service jobs that can be run
                         concurrently, excluding service jobs running on
-                        preemptable nodes. default=9223372036854775807
-  --maxPreemptableServiceJobs MAXPREEMPTABLESERVICEJOBS
+                        preemptible nodes. default=9223372036854775807
+  --maxPreemptibleServiceJobs MAXPREEMPTIBLESERVICEJOBS
                         The maximum number of service jobs that can run
-                        concurrently on preemptable nodes.
+                        concurrently on preemptible nodes.
                         default=9223372036854775807
   --deadlockWait DEADLOCKWAIT
                         Time, in seconds, to tolerate the workflow running only
@@ -371,8 +371,8 @@ from the batch system.
                         type and a count are used, they must be separated by a
                         colon. If multiple types of accelerators are used, the
                         specifications are separated by commas. Default is [].
-  --defaultPreemptable BOOL
-                        Make all jobs able to run on preemptable (spot) nodes
+  --defaultPreemptible BOOL
+                        Make all jobs able to run on preemptible (spot) nodes
                         by default.
   --maxCores INT        The maximum number of CPU cores to request from the
                         batch system at any one time. Standard suffixes like
@@ -391,8 +391,8 @@ systems have issues!).
   --retryCount RETRYCOUNT
                         Number of times to retry a failing job before giving
                         up and labeling job failed. default=1
-  --enableUnlimitedPreemptableRetries
-                        If set, preemptable failures (or any failure due to an
+  --enableUnlimitedPreemptibleRetries
+                        If set, preemptible failures (or any failure due to an
                         instance getting unexpectedly terminated) will not count
                         towards job failures and -\\-retryCount.
   --doubleMem           If set, batch jobs which die due to reaching memory
@@ -514,8 +514,8 @@ to run both simultaneously. To cope with this situation Toil attempts to
 schedule services and accessors intelligently, however to avoid a deadlock
 with workflows running service jobs it is advisable to use the following parameters:
 
-* ``--maxServiceJobs``: The maximum number of service jobs that can be run concurrently, excluding service jobs running on preemptable nodes.
-* ``--maxPreemptableServiceJobs``: The maximum number of service jobs that can run concurrently on preemptable nodes.
+* ``--maxServiceJobs``: The maximum number of service jobs that can be run concurrently, excluding service jobs running on preemptible nodes.
+* ``--maxPreemptibleServiceJobs``: The maximum number of service jobs that can run concurrently on preemptible nodes.
 
 Specifying these parameters so that at a maximum cluster size there will be
 sufficient resources to run accessors in addition to services will ensure that
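
Note on the ``--nodeTypes`` help text above: the format it describes (comma-separated node types, slash-separated instance types, optional colon-delimited spot bid) can be illustrated with a small standalone parser. This is only a sketch of the documented syntax; ``parse_node_types_sketch`` is a hypothetical helper written for this note, not Toil's own parsing code, and it ignores any validation Toil may perform.

```python
from typing import List, Optional, Set, Tuple

# One node type: a set of interchangeable instance types plus an optional spot bid.
NodeType = Tuple[Set[str], Optional[float]]


def parse_node_types_sketch(spec: str) -> List[NodeType]:
    """Parse a --nodeTypes-style string into (instance types, spot bid) pairs.

    Hypothetical helper for illustration only; not Toil's parser.
    """
    parsed: List[NodeType] = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            raise ValueError("empty node type in " + repr(spec))
        # An optional spot bid follows a colon and makes the node type preemptible.
        types_part, sep, bid_part = entry.partition(":")
        bid = float(bid_part) if sep else None
        # Slash-separated instance types belong to the same node type.
        instance_types = set(types_part.split("/"))
        if "" in instance_types:
            raise ValueError("empty instance type in " + repr(entry))
        parsed.append((instance_types, bid))
    return parsed


# The valid argument quoted in the help text: one preemptible and one
# non-preemptible node type.
example = parse_node_types_sketch("c5.4xlarge/c5a.4xlarge:0.42,t2.large")
for instance_types, bid in example:
    kind = "preemptible" if bid is not None else "non-preemptible"
    print(sorted(instance_types), bid, kind)
```

A node type with a bid is treated as preemptible and one without as non-preemptible, matching the wording of the help text.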

diff --git a/docs/running/cloud/amazon.rst b/docs/running/cloud/amazon.rst
@@ -324,32 +324,40 @@ For more information on other autoscaling (and other) options have a look at :re
     Some important caveats about starting a toil run through an ssh session are
     explained in the :ref:`sshCluster` section.
 
-Preemptability
+Preemptibility
 ^^^^^^^^^^^^^^
 
-Toil can run on a heterogeneous cluster of both preemptable and non-preemptable nodes. Being preemptable node simply
+Toil can run on a heterogeneous cluster of both preemptible and non-preemptible nodes. Being a preemptible node simply
 means that the node may be shut down at any time, while jobs are running. These jobs can then be restarted later
 somewhere else.
 
-A node type can be specified as preemptable by adding a `spot bid`_ to its entry in the list of node types provided with
-the ``--nodeTypes`` flag. If spot instance prices rise above your bid, the preemptable node whill be shut down.
+A node type can be specified as preemptible by adding a `spot bid`_ to its entry in the list of node types provided with
+the ``--nodeTypes`` flag. If spot instance prices rise above your bid, the preemptible node will be shut down.
 
-While individual jobs can each explicitly specify whether or not they should be run on preemptable nodes
-via the boolean ``preemptable`` resource requirement, the ``--defaultPreemptable`` flag will allow jobs without a
-``preemptable`` requirement to run on preemptable machines.
+Individual jobs can explicitly specify whether they should be run on preemptible nodes via the boolean ``preemptible``
+resource requirement. If this is not specified, the job will not run on preemptible nodes, even if preemptible nodes
+are available, unless the ``--defaultPreemptible`` flag is set. The ``--defaultPreemptible`` flag will allow
+jobs without a ``preemptible`` requirement to run on preemptible machines. For example::
 
-.. admonition:: Specify Preemptability Carefully
+    $ python /root/sort.py aws:us-west-2:<my-jobstore-name> \
+          --provisioner aws \
+          --nodeTypes c3.4xlarge:2.00 \
+          --maxNodes 2 \
+          --batchSystem mesos \
+          --defaultPreemptible
+
+.. admonition:: Specify Preemptibility Carefully
 
     Ensure that your choices for ``--nodeTypes`` and ``--maxNodes <>`` make
     sense for your workflow and won't cause it to hang. You should make sure the
     provisioner is able to create nodes large enough to run the largest job
-    in the workflow, and that non-preemptable node types are allowed if there are
-    non-preemptable jobs in the workflow.
+    in the workflow, and that non-preemptible node types are allowed if there are
+    non-preemptible jobs in the workflow.
 
-Finally, the ``--preemptableCompensation`` flag can be used to handle cases where preemptable nodes may not be
+Finally, the ``--preemptibleCompensation`` flag can be used to handle cases where preemptible nodes may not be
 available but are required for your workflow. With this flag enabled, the autoscaler will attempt to compensate
-for a shortage of preemptable nodes of a certain type by creating non-preemptable nodes of that type, if
-non-preemptable nodes of that type were specified in ``--nodeTypes``.
+for a shortage of preemptible nodes of a certain type by creating non-preemptible nodes of that type, if
+non-preemptible nodes of that type were specified in ``--nodeTypes``.
 
 .. _spot bid: https://aws.amazon.com/ec2/spot/pricing/
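
Note on the documentation change above: the commit message mentions backwards compatibility for the ``preemptable`` keyword, and the text describes a per-job boolean requirement with a workflow-wide default. One way such a lookup could behave is sketched below. The function ``resolve_preemptible``, the dictionary layout, and the deprecation warning are all assumptions made for this note; the real handling lives in Toil's job code and may differ.

```python
import warnings
from typing import Any, Dict


def resolve_preemptible(requirements: Dict[str, Any], default: bool = False) -> bool:
    """Return a job's preemptibility, accepting both spellings of the key.

    Hypothetical sketch: `default` plays the role of --defaultPreemptible.
    """
    new = requirements.get("preemptible")
    old = requirements.get("preemptable")  # legacy spelling
    if new is not None and old is not None and bool(new) != bool(old):
        raise ValueError("'preemptible' and 'preemptable' disagree")
    if new is not None:
        return bool(new)
    if old is not None:
        # Assumed behavior: accept the old spelling but nudge callers to migrate.
        warnings.warn(
            "'preemptable' is a legacy spelling; prefer 'preemptible'",
            DeprecationWarning,
            stacklevel=2,
        )
        return bool(old)
    # No explicit requirement: fall back to the workflow-wide default.
    return default


print(resolve_preemptible({"preemptible": True}))
print(resolve_preemptible({}, default=True))
```

An explicit requirement always wins over the default, which mirrors the documented rule that jobs without a ``preemptible`` requirement only land on preemptible nodes when the default allows it.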
 

diff --git a/src/toil/batchSystems/abstractBatchSystem.py b/src/toil/batchSystems/abstractBatchSystem.py
@@ -46,7 +46,7 @@
 class BatchJobExitReason(enum.IntEnum):
     FINISHED: int = 1  # Successfully finished.
     FAILED: int = 2  # Job finished, but failed.
-    LOST: int = 3  # Preemptable failure (job's executing host went away).
+    LOST: int = 3  # Preemptible failure (job's executing host went away).
     KILLED: int = 4  # Job killed before finishing.
     ERROR: int = 5  # Internal error.
     MEMLIMIT: int = 6  # Job hit batch system imposed memory limit
@@ -476,12 +476,12 @@ class AbstractScalableBatchSystem(AbstractBatchSystem):
     """
 
     @abstractmethod
-    def getNodes(self, preemptable: Optional[bool] = None, timeout: int = 600) -> Dict[str, NodeInfo]:
+    def getNodes(self, preemptible: Optional[bool] = None, timeout: int = 600) -> Dict[str, NodeInfo]:
         """
-        Returns a dictionary mapping node identifiers of preemptable or non-preemptable nodes to
+        Returns a dictionary mapping node identifiers of preemptible or non-preemptible nodes to
         NodeInfo objects, one for each node.
 
-        :param preemptable: If True (False) only (non-)preemptable nodes will be returned.
+        :param preemptible: If True (False) only (non-)preemptible nodes will be returned.
                If None, all nodes will be returned.
         """
         raise NotImplementedError()
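
Note on ``getNodes`` above: the docstring describes a three-state filter (``True``, ``False``, or ``None`` for all nodes). A toy in-memory version of that filter is sketched below. ``ToyNode`` and ``filter_nodes`` are hypothetical names for this note; ``ToyNode`` is not Toil's ``NodeInfo``, and a real batch system would query its scheduler rather than a dictionary.

```python
from typing import Dict, NamedTuple, Optional


class ToyNode(NamedTuple):
    """Hypothetical stand-in for a node record; not Toil's NodeInfo."""
    cores: int
    preemptible: bool


def filter_nodes(
    nodes: Dict[str, ToyNode], preemptible: Optional[bool] = None
) -> Dict[str, ToyNode]:
    """Mimic the documented filter: None returns every node, while
    True or False returns only nodes with matching preemptibility."""
    if preemptible is None:
        return dict(nodes)
    return {
        address: node
        for address, node in nodes.items()
        if node.preemptible == preemptible
    }


cluster = {
    "10.0.0.1": ToyNode(cores=16, preemptible=True),
    "10.0.0.2": ToyNode(cores=16, preemptible=False),
    "10.0.0.3": ToyNode(cores=2, preemptible=True),
}
print(sorted(filter_nodes(cluster, preemptible=True)))
print(sorted(filter_nodes(cluster, preemptible=False)))
print(len(filter_nodes(cluster)))
```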

diff --git a/src/toil/batchSystems/kubernetes.py b/src/toil/batchSystems/kubernetes.py
@@ -538,37 +538,37 @@ def __init__(self) -> None:
             Taints which are allowed to be present (with these values).
             """
 
-        def set_preemptable(self, preemptable: bool) -> None:
+        def set_preemptible(self, preemptible: bool) -> None:
             """
             Add constraints for a job being preemptible or not.
 
-            Preemptable jobs will be able to run on preemptable or non-preemptable
-            nodes, and will prefer preemptable nodes if available.
+            Preemptible jobs will be able to run on preemptible or non-preemptible
+            nodes, and will prefer preemptible nodes if available.
 
-            Non-preemptable jobs will not be allowed to run on nodes that are
-            marked as preemptable.
+            Non-preemptible jobs will not be allowed to run on nodes that are
+            marked as preemptible.
 
             Understands the labeling scheme used by EKS, and the taint scheme used
             by GCE. The Toil-managed Kubernetes setup will mimic at least one of
             these.
             """
 
-            # We consider nodes preemptable if they have any of these label or taint values.
+            # We consider nodes preemptible if they have any of these label or taint values.
             # We tolerate all effects of specified taints.
             # Amazon just uses a label, while Google
             # <https://cloud.google.com/kubernetes-engine/docs/how-to/preemptible-vms>
             # uses a label and a taint.
-            PREEMPTABLE_SCHEMES = {'labels': [('eks.amazonaws.com/capacityType', ['SPOT']),
+            PREEMPTIBLE_SCHEMES = {'labels': [('eks.amazonaws.com/capacityType', ['SPOT']),
                                               ('cloud.google.com/gke-preemptible', ['true'])],
                                    'taints': [('cloud.google.com/gke-preemptible', ['true'])]}
 
-            if preemptable:
-                # We want to seek preemptable labels and tolerate preemptable taints.
-                self.desired_labels += PREEMPTABLE_SCHEMES['labels']
-                self.tolerated_taints += PREEMPTABLE_SCHEMES['taints']
+            if preemptible:
+                # We want to seek preemptible labels and tolerate preemptible taints.
+                self.desired_labels += PREEMPTIBLE_SCHEMES['labels']
+                self.tolerated_taints += PREEMPTIBLE_SCHEMES['taints']
             else:
-                # We want to prohibit preemptable labels
-                self.prohibited_labels += PREEMPTABLE_SCHEMES['labels']
+                # We want to prohibit preemptible labels
+                self.prohibited_labels += PREEMPTIBLE_SCHEMES['labels']
 
 
         def apply(self, pod_spec: V1PodSpec) -> None:
@@ -695,7 +695,7 @@ def _create_pod_spec(
 
         # Also start on the placement constraints
         placement = KubernetesBatchSystem.Placement()
-        placement.set_preemptable(job_desc.preemptable)
+        placement.set_preemptible(job_desc.preemptible)
 
         for accelerator in job_desc.accelerators:
             # Add in requirements for accelerators (GPUs).
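
Note on ``set_preemptible`` above: the effect of the renamed method can be seen without Kubernetes by tracking the label and taint lists in plain Python. The sketch below copies ``PREEMPTIBLE_SCHEMES`` from the diff; the class ``PlacementSketch`` and its ``allows`` helper are hypothetical additions for this note, they only check prohibited labels, and they do not model how ``apply`` turns these lists into pod affinity rules or tolerations.

```python
from typing import Dict, List, Tuple

Scheme = Tuple[str, List[str]]

# Label and taint values copied from the diff above.
PREEMPTIBLE_SCHEMES: Dict[str, List[Scheme]] = {
    "labels": [
        ("eks.amazonaws.com/capacityType", ["SPOT"]),
        ("cloud.google.com/gke-preemptible", ["true"]),
    ],
    "taints": [("cloud.google.com/gke-preemptible", ["true"])],
}


class PlacementSketch:
    """Simplified, hypothetical stand-in for the Placement class in the diff."""

    def __init__(self) -> None:
        self.desired_labels: List[Scheme] = []
        self.prohibited_labels: List[Scheme] = []
        self.tolerated_taints: List[Scheme] = []

    def set_preemptible(self, preemptible: bool) -> None:
        if preemptible:
            # Prefer preemptible nodes and tolerate their taints.
            self.desired_labels += PREEMPTIBLE_SCHEMES["labels"]
            self.tolerated_taints += PREEMPTIBLE_SCHEMES["taints"]
        else:
            # Keep non-preemptible jobs off nodes labeled as preemptible.
            self.prohibited_labels += PREEMPTIBLE_SCHEMES["labels"]

    def allows(self, node_labels: Dict[str, str]) -> bool:
        """Hypothetical check: True unless a prohibited label value is present."""
        return not any(
            node_labels.get(key) in values
            for key, values in self.prohibited_labels
        )


spot_node = {"eks.amazonaws.com/capacityType": "SPOT"}
on_demand_node = {"eks.amazonaws.com/capacityType": "ON_DEMAND"}

strict = PlacementSketch()
strict.set_preemptible(False)
print(strict.allows(spot_node), strict.allows(on_demand_node))

flexible = PlacementSketch()
flexible.set_preemptible(True)
print(flexible.allows(spot_node), flexible.allows(on_demand_node))
```

The asymmetry matches the docstring: preemptible jobs may run anywhere but prefer preemptible nodes, while non-preemptible jobs are excluded from nodes marked as preemptible.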