-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
file service: make status aware of fetch files #424
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slint might want to double check this!
@@ -22,7 +22,8 @@ def fetch_file(service_id, record_id, file_key): | |||
file_record = service.read_file_metadata(system_identity, record_id, file_key) | |||
source_url = file_record.data["uri"] | |||
# download file | |||
with requests.get(source_url, stream=True) as response: | |||
# verify=True for self signed certificates by default | |||
with requests.get(source_url, stream=True, allow_redirects=True) as response: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need the allow_redirects
here? what was the problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more of making sure we don't have issues in case a remote file fetch URL relies on redirects (e.g. imagine CERNBox link redirecting to the new version). It doesn't affect URLs that directly serve the file content, but makes sure we don't store the 3xx response as the file content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one small clarification on the comment and potential follow-up
@@ -22,7 +22,8 @@ def fetch_file(service_id, record_id, file_key): | |||
file_record = service.read_file_metadata(system_identity, record_id, file_key) | |||
source_url = file_record.data["uri"] | |||
# download file | |||
with requests.get(source_url, stream=True) as response: | |||
# verify=True for self signed certificates by default | |||
with requests.get(source_url, stream=True, allow_redirects=True) as response: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more of making sure we don't have issues in case a remote file fetch URL relies on redirects (e.g. imagine CERNBox link redirecting to the new version). It doesn't affect URLs that directly serve the file content, but makes sure we don't store the 3xx response as the file content.
@@ -22,7 +22,8 @@ def fetch_file(service_id, record_id, file_key): | |||
file_record = service.read_file_metadata(system_identity, record_id, file_key) | |||
source_url = file_record.data["uri"] | |||
# download file | |||
with requests.get(source_url, stream=True) as response: | |||
# verify=True for self signed certificates by default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I'm not sure I understand this one. If verify=True
, that means that certificates (including self-signed) will be attempted to be verified, which might fail. I'm not suggesting to add verify=False
, but maybe we can add a TODO (or open an issue) to make all these extra parameters configurable via e.g. RECORDS_RESOURCES_FILES_FETCH_REQUESTS_ARGS = { ... }
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, by default self signed will fail. Opened an issue #425.
Sample response with new status: