Simplify s3 functions: create_s3_bucket. #4566

Open: wants to merge 62 commits into master
Changes from 9 commits
Commits (62)
062da74
Simplify create s3 bucket.
DailyDreaming Aug 16, 2023
1142681
Remove old function.
DailyDreaming Aug 16, 2023
25ab299
Resolve one more mk bucket location.
DailyDreaming Aug 16, 2023
5bf34bc
Simplify tagging.
DailyDreaming Aug 17, 2023
cd47aa1
Typing.
DailyDreaming Aug 17, 2023
b39d561
Typing.
DailyDreaming Aug 17, 2023
5428f5e
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
adamnovak Aug 17, 2023
5b25a35
Simplify s3 functions: delete_s3_bucket. (#4567)
DailyDreaming Aug 31, 2023
7646a5d
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
adamnovak Aug 31, 2023
095ba36
Rebase.
DailyDreaming Apr 30, 2024
29a2ecc
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
8870d67
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
db585b7
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 1, 2024
2bb9092
Missing import.
DailyDreaming May 1, 2024
cb9e474
Another missing import.
DailyDreaming May 2, 2024
0e08474
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming May 2, 2024
3ca63ed
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 2, 2024
3e21094
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
7f4f0d6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
a4a395e
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
0a950bf
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
03941e9
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 6, 2024
3323e08
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 7, 2024
9341962
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 9, 2024
c0eea1b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
59b4b8a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
52be927
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 14, 2024
2242ace
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 17, 2024
8c3a876
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 17, 2024
a9c62db
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 18, 2024
f779127
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 20, 2024
d13fc62
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 20, 2024
cc6ab3c
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
fbdc80a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
392064b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 21, 2024
74d8ee8
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 23, 2024
004f139
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] May 24, 2024
455da2a
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming Jun 6, 2024
53dcd42
Remove cruft.
DailyDreaming Jun 6, 2024
f6876ad
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 6, 2024
797624d
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
b468005
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
04c47e2
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 10, 2024
77303d6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
0c48203
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
8e35f61
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 11, 2024
56b6ad6
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 19, 2024
ed97974
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 19, 2024
90d54a8
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 24, 2024
21a221e
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Jun 25, 2024
39f9c4c
Rebase from master.
DailyDreaming Jun 28, 2024
21bbf5e
Patch test_utils.py.
DailyDreaming Jun 28, 2024
8b3c897
Merge branch 'master' into issues/4088-mv-s3-functions-mk-bucket
DailyDreaming Aug 19, 2024
6be49c0
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 19, 2024
85558eb
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 19, 2024
5ac40e0
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 21, 2024
b9efc4a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
cc6be4b
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
e89e6ac
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 22, 2024
9a3f53a
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 27, 2024
68954a2
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Aug 27, 2024
3aaf340
Merge master into issues/4088-mv-s3-functions-mk-bucket
github-actions[bot] Sep 3, 2024
3 changes: 2 additions & 1 deletion contrib/admin/cleanup_aws_resources.py
@@ -22,7 +22,8 @@

 from src.toil.lib import aws
 from src.toil.lib.aws import session
-from src.toil.lib.aws.utils import delete_iam_role, delete_iam_instance_profile, delete_s3_bucket, delete_sdb_domain
+from src.toil.lib.aws.utils import delete_iam_role, delete_iam_instance_profile, delete_sdb_domain
+from src.toil.lib.aws.s3 import delete_s3_bucket
 from src.toil.lib.generatedEC2Lists import regionDict

 # put us-west-2 first as our default test region; that way anything with a universal region shows there
2 changes: 1 addition & 1 deletion src/toil/common.py
@@ -69,7 +69,7 @@
 QueueSizeMessage,
 gen_message_bus_path)
 from toil.fileStores import FileID
-from toil.lib.aws import zone_to_region, build_tag_dict_from_env
+from toil.lib.aws import zone_to_region
 from toil.lib.compatibility import deprecated
 from toil.lib.conversions import bytes2human, human2bytes
 from toil.lib.io import try_path
164 changes: 54 additions & 110 deletions src/toil/jobStores/aws/jobStore.py
@@ -13,16 +13,13 @@
# limitations under the License.
import hashlib
import itertools
import json
import logging
import os
import pickle
import re
import reprlib
import stat
import time
import urllib.error
import urllib.request
import uuid
from contextlib import contextmanager
from io import BytesIO
@@ -33,9 +30,9 @@
import boto.sdb
from boto.exception import SDBResponseError
from botocore.exceptions import ClientError
from mypy_boto3_s3.service_resource import Bucket

import toil.lib.encryption as encryption
from toil.lib.aws import build_tag_dict_from_env
from toil.fileStores import FileID
from toil.jobStores.abstractJobStore import (AbstractJobStore,
ConcurrentFileModificationException,
@@ -57,10 +54,8 @@
ReadableTransformingPipe,
WritablePipe)
from toil.lib.aws.session import establish_boto3_session
from toil.lib.aws.utils import (create_s3_bucket,
enable_public_objects,
flatten_tags,
get_bucket_region,
from toil.lib.aws.s3 import create_s3_bucket, delete_s3_bucket
from toil.lib.aws.utils import (get_bucket_region,
get_object_for_url,
list_objects_for_url,
retry_s3,
@@ -721,85 +716,58 @@ def bucket_retry_predicate(error):
return False

bucketExisted = True
for attempt in retry_s3(predicate=bucket_retry_predicate):
with attempt:
try:
# the head_bucket() call makes sure that the bucket exists and the user can access it
self.s3_client.head_bucket(Bucket=bucket_name)

bucket = self.s3_resource.Bucket(bucket_name)
except ClientError as e:
error_http_status = get_error_status(e)
if error_http_status == 404:
bucketExisted = False
logger.debug("Bucket '%s' does not exist.", bucket_name)
if create:
bucket = create_s3_bucket(
self.s3_resource, bucket_name, self.region
)
# Wait until the bucket exists before checking the region and adding tags
bucket.wait_until_exists()

# It is possible for create_bucket to return but
# for an immediate request for the bucket region to
# produce an S3ResponseError with code
# NoSuchBucket. We let that kick us back up to the
# main retry loop.
assert (
get_bucket_region(bucket_name) == self.region
), f"bucket_name: {bucket_name}, {get_bucket_region(bucket_name)} != {self.region}"

tags = build_tag_dict_from_env()

if tags:
flat_tags = flatten_tags(tags)
bucket_tagging = self.s3_resource.BucketTagging(bucket_name)
bucket_tagging.put(Tagging={'TagSet': flat_tags})

# Configure bucket so that we can make objects in
# it public, which was the historical default.
enable_public_objects(bucket_name)
elif block:
raise
else:
return None
elif error_http_status == 301:
# This is raised if the user attempts to get a bucket in a region outside
# the specified one, if the specified one is not `us-east-1`. The us-east-1
# server allows a user to use buckets from any region.
raise BucketLocationConflictException(get_bucket_region(bucket_name))
else:
raise
else:
bucketRegion = get_bucket_region(bucket_name)
if bucketRegion != self.region:
raise BucketLocationConflictException(bucketRegion)

if versioning and not bucketExisted:
# only call this method on bucket creation
bucket.Versioning().enable()
# Now wait until versioning is actually on. Some uploads
# would come back with no versions; maybe they were
# happening too fast and this setting isn't sufficiently
# consistent?
time.sleep(1)
while not self._getBucketVersioning(bucket_name):
logger.warning(f"Waiting for versioning activation on bucket '{bucket_name}'...")
time.sleep(1)
elif check_versioning_consistency:
# now test for versioning consistency
# we should never see any of these errors since 'versioning' should always be true
bucket_versioning = self._getBucketVersioning(bucket_name)
if bucket_versioning != versioning:
assert False, 'Cannot modify versioning on existing bucket'
elif bucket_versioning is None:
assert False, 'Cannot use a bucket with versioning suspended'
if bucketExisted:
logger.debug(f"Using pre-existing job store bucket '{bucket_name}'.")
try:
# the head_bucket() call makes sure that the bucket exists and the user can access it
self.s3_client.head_bucket(Bucket=bucket_name)
bucket = self.s3_resource.Bucket(bucket_name)
except ClientError as e:
error_http_status = get_error_status(e)
if error_http_status == 404:
bucketExisted = False
logger.debug("Bucket '%s' does not exist.", bucket_name)
if create:
bucket = create_s3_bucket(self.s3_resource, bucket_name, self.region)
elif block:
raise
else:
logger.debug(f"Created new job store bucket '{bucket_name}' with versioning state {versioning}.")
return None
elif error_http_status == 301:
# This is raised if the user attempts to get a bucket in a region outside
# the specified one, if the specified one is not `us-east-1`. The us-east-1
# server allows a user to use buckets from any region.
raise BucketLocationConflictException(get_bucket_region(bucket_name))
else:
raise
else:
bucketRegion = get_bucket_region(bucket_name)
if bucketRegion != self.region:
raise BucketLocationConflictException(bucketRegion)

if versioning and not bucketExisted:
# only call this method on bucket creation
bucket.Versioning().enable()
# Now wait until versioning is actually on. Some uploads
# would come back with no versions; maybe they were
# happening too fast and this setting isn't sufficiently
# consistent?
time.sleep(1)
while not self._getBucketVersioning(bucket_name):
logger.warning(f"Waiting for versioning activation on bucket '{bucket_name}'...")
time.sleep(1)
elif check_versioning_consistency:
# now test for versioning consistency
# we should never see any of these errors since 'versioning' should always be true
bucket_versioning = self._getBucketVersioning(bucket_name)
if bucket_versioning != versioning:
assert False, 'Cannot modify versioning on existing bucket'
elif bucket_versioning is None:
assert False, 'Cannot use a bucket with versioning suspended'
if bucketExisted:
logger.debug(f"Using pre-existing job store bucket '{bucket_name}'.")
else:
logger.debug(f"Created new job store bucket '{bucket_name}' with versioning state {versioning}.")

return bucket
return bucket
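
Note on the `_bindBucket` change above: once the retry loop is dropped, the control flow appears to reduce to a decision on the HTTP status that `head_bucket()` fails with, plus a region check when it succeeds. The sketch below restates that decision as a pure function so each branch can be checked in isolation. `BucketAction` and `decide_bucket_action` are hypothetical names introduced here for illustration; they are not part of this PR or of Toil, and this is a reading of the diff rather than the PR's implementation.

```python
from enum import Enum
from typing import Optional


class BucketAction(Enum):
    """Hypothetical labels for the branches visible in the diff."""
    USE_EXISTING = "use_existing"
    CREATE = "create"
    RAISE = "raise"
    RETURN_NONE = "return_none"
    LOCATION_CONFLICT = "location_conflict"


def decide_bucket_action(
    http_status: Optional[int],
    create: bool,
    block: bool,
    bucket_region: Optional[str] = None,
    expected_region: Optional[str] = None,
) -> BucketAction:
    """Map the outcome of head_bucket() to a branch.

    http_status is None when head_bucket() succeeded; otherwise it is the
    HTTP status carried by the ClientError (assumed to be extracted by the
    caller, as get_error_status() does in the diff).
    """
    if http_status is None:
        # Bucket exists and is accessible: it must live in the expected region.
        if bucket_region != expected_region:
            return BucketAction.LOCATION_CONFLICT
        return BucketAction.USE_EXISTING
    if http_status == 404:
        # Bucket is missing: create it, re-raise, or give up quietly.
        if create:
            return BucketAction.CREATE
        return BucketAction.RAISE if block else BucketAction.RETURN_NONE
    if http_status == 301:
        # Bucket exists, but in a region other than the one being addressed.
        return BucketAction.LOCATION_CONFLICT
    # Any other error (for example 403) is re-raised unchanged.
    return BucketAction.RAISE
```

A caller would translate `LOCATION_CONFLICT` into `BucketLocationConflictException` and `CREATE` into the `create_s3_bucket(...)` call; keeping the decision separate from the AWS calls makes it testable without network access.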

def _bindDomain(self, domain_name, create=False, block=True):
"""
@@ -1618,7 +1586,7 @@ def destroy(self):
# TODO: Add other failure cases to be ignored here.
self._registered = None
if self.filesBucket is not None:
self._delete_bucket(self.filesBucket)
delete_s3_bucket(s3_resource=s3_boto3_resource, bucket_name=self.filesBucket)
DailyDreaming marked this conversation as resolved.
self.filesBucket = None
for name in 'filesDomain', 'jobsDomain':
domain = getattr(self, name)
@@ -1636,30 +1604,6 @@ def _delete_domain(self, domain):
if not no_such_sdb_domain(e):
raise

@staticmethod
def _delete_bucket(bucket):
"""
:param bucket: S3.Bucket
"""
for attempt in retry_s3():
with attempt:
try:
uploads = s3_boto3_client.list_multipart_uploads(Bucket=bucket.name).get('Uploads')
if uploads:
for u in uploads:
s3_boto3_client.abort_multipart_upload(Bucket=bucket.name,
Key=u["Key"],
UploadId=u["UploadId"])

bucket.objects.all().delete()
bucket.object_versions.delete()
bucket.delete()
except s3_boto3_client.exceptions.NoSuchBucket:
pass
except ClientError as e:
if get_error_status(e) != 404:
raise


aRepr = reprlib.Repr()
aRepr.maxstring = 38 # so UUIDs don't get truncated (36 for UUID plus 2 for quotes)
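
Note on the removed `_delete_bucket`: it encodes an ordering that any replacement probably needs to keep. In-progress multipart uploads are aborted first, then objects, then object versions, and only then the bucket itself, with a 404 treated as "already gone". The sketch below restates that order. It is duck-typed against the boto3 calls that appear in the removed code, so it runs without boto3 installed. `delete_bucket_sketch` and `http_status_of` are hypothetical names; whether `delete_s3_bucket` in `toil.lib.aws.s3` behaves this way is not shown in this diff. The sketch also leaves out the `retry_s3()` loop and does not page through `list_multipart_uploads`, which returns at most one page of results per call.

```python
from typing import Any, Optional


def http_status_of(error: Exception) -> Optional[int]:
    """Best-effort HTTP status from a botocore-style error's `response` dict."""
    response = getattr(error, "response", None) or {}
    return response.get("ResponseMetadata", {}).get("HTTPStatusCode")


def delete_bucket_sketch(s3_client: Any, bucket: Any) -> bool:
    """Empty and delete `bucket`; return False if it was already gone.

    `s3_client` is assumed to look like a boto3 S3 client and `bucket`
    like a boto3 S3 Bucket resource.
    """
    try:
        # 1. Abort unfinished multipart uploads (first page only in this sketch).
        listing = s3_client.list_multipart_uploads(Bucket=bucket.name)
        for upload in listing.get("Uploads") or []:
            s3_client.abort_multipart_upload(
                Bucket=bucket.name,
                Key=upload["Key"],
                UploadId=upload["UploadId"],
            )
        # 2. Delete current objects, then every stored version.
        bucket.objects.all().delete()
        bucket.object_versions.delete()
        # 3. Only an empty bucket can be deleted.
        bucket.delete()
        return True
    except Exception as error:
        # A missing bucket surfaces as HTTP 404; anything else is re-raised.
        if http_status_of(error) == 404:
            return False
        raise
```

Returning a boolean instead of silently passing is a choice made for this sketch so a caller can tell "deleted" from "was not there"; the removed method returned nothing in both cases.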
26 changes: 11 additions & 15 deletions src/toil/lib/aws/__init__.py
@@ -168,29 +168,25 @@ def file_begins_with(path, prefix):
except (URLError, socket.timeout, HTTPException):
return False


def running_on_ecs() -> bool:
"""
Return True if we are currently running on Amazon ECS, and false otherwise.
"""
# We only care about relatively current ECS
return 'ECS_CONTAINER_METADATA_URI_V4' in os.environ

def build_tag_dict_from_env(environment: MutableMapping[str, str] = os.environ) -> Dict[str, str]:
tags = dict()
owner_tag = environment.get('TOIL_OWNER_TAG')

def tags_from_env() -> Dict[str, str]:
try:
tags = json.loads(os.environ.get('TOIL_AWS_TAGS', '{}'))
except json.decoder.JSONDecodeError:
logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
exit(1)

# TODO: Remove TOIL_OWNER_TAG and only use TOIL_AWS_TAGS .
DailyDreaming marked this conversation as resolved.
owner_tag = os.environ.get('TOIL_OWNER_TAG')
if owner_tag:
tags.update({'Owner': owner_tag})

user_tags = environment.get('TOIL_AWS_TAGS')
if user_tags:
try:
json_user_tags = json.loads(user_tags)
if isinstance(json_user_tags, dict):
tags.update(json.loads(user_tags))
else:
logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
exit(1)
except json.decoder.JSONDecodeError:
logger.error('TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}')
exit(1)
return tags
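
Note on the tag helper above: the two versions differ in two ways that are easy to miss. The older `build_tag_dict_from_env` rejects a `TOIL_AWS_TAGS` value that is valid JSON but not an object, while `tags_from_env` does not check this, so a value such as `[1]` would reach `tags.update(...)` as a list and fail there. The order of precedence also flips: in `tags_from_env` the `TOIL_OWNER_TAG` value overwrites an `Owner` key from `TOIL_AWS_TAGS`, whereas before the JSON tags were applied last. The sketch below combines the shorter structure with the type check. `tags_from_mapping` and `to_tag_set` are hypothetical names, and raising `ValueError` instead of calling `exit(1)` is an assumption made so the function can be tested; neither is what the PR does.

```python
import json
import os
from typing import Dict, List, Mapping


def tags_from_mapping(environment: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a tag dict from TOIL_AWS_TAGS and TOIL_OWNER_TAG.

    Raises ValueError when TOIL_AWS_TAGS is not a JSON object.
    """
    message = 'TOIL_AWS_TAGS must be in JSON format: {"key" : "value", ...}'
    raw = environment.get("TOIL_AWS_TAGS") or "{}"
    try:
        tags = json.loads(raw)
    except json.JSONDecodeError as error:
        raise ValueError(message) from error
    if not isinstance(tags, dict):
        # Valid JSON that is not an object (a list, a number, ...) is rejected too.
        raise ValueError(message)

    # As in tags_from_env, the owner tag is applied last and therefore wins.
    owner_tag = environment.get("TOIL_OWNER_TAG")
    if owner_tag:
        tags["Owner"] = owner_tag
    return tags


def to_tag_set(tags: Mapping[str, str]) -> List[Dict[str, str]]:
    """Convert a tag dict to the list-of-pairs shape used by S3's TagSet."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]
```

Passing the environment as a parameter, as the older function did, keeps the helper testable without mutating `os.environ`.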