Release/0.1.17
Contents
New Features
- Allow `Datafile` to be used as a context manager for changes to local datafiles
- Allow `Datafile.from_cloud` to be used as a context manager for changes to cloud datafiles
- Allow `Datafile` to remember where in the cloud it came from
- Add the following methods to `Datafile`:
  - `get_cloud_metadata`
  - `update_cloud_metadata`
  - `clear_from_file_cache`
  - `_get_cloud_location`
  - `_store_cloud_location`
  - `_check_for_attribute_conflict`
- Avoid re-uploading `Datafile` file or metadata if they haven't changed
- Raise error if implicit cloud location is missing from `Datafile`
- Add `GoogleCloudStorageClient.update_metadata` method
- Allow option to not update cloud metadata in `Datafile` cloud methods
- Allow tags to contain capitals and forward slashes (but not start or end in a forward slash)
- Allow `datetime` and posix timestamps for `Datafile.timestamp`
- Add `Datafile.posix_timestamp` property
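
The tag rule in the list above (capitals and forward slashes allowed, but no leading or trailing forward slash) can be sketched with a regular expression. This is only an illustration: the function name `is_valid_tag` and the rest of the allowed character set (digits, hyphens, colons) are assumptions, not the library's actual validation code.

```python
import re

# Hypothetical pattern: the first and last characters may not be a forward
# slash; characters in between may be. The character set other than capitals
# and slashes is an assumption made for this sketch.
TAG_PATTERN = re.compile(r"[A-Za-z0-9:\-]([A-Za-z0-9:/\-]*[A-Za-z0-9:\-])?")


def is_valid_tag(tag):
    """Return True if the tag satisfies the sketched rule."""
    return TAG_PATTERN.fullmatch(tag) is not None


print(is_valid_tag("Sensor/Raw-2"))  # capitals and an inner slash
print(is_valid_tag("/sensor"))       # leading slash
print(is_valid_tag("sensor/"))       # trailing slash
```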
Breaking changes
- Close #148: remove `hash_value` from `Datafile` GCS metadata
- When hashing `Datafile`s, only hash represented file (i.e. stop hashing metadata)
- When hashing `Dataset`s and `Manifest`s, only hash the files contained (i.e. stop hashing metadata)
- Make hash of `Hashable` instance with `_ATTRIBUTES_TO_HASH=None` the empty string hash value `"AAAAAA=="`
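
The value `"AAAAAA=="` is consistent with a base64-encoded CRC32C checksum of empty input, which is the form Google Cloud Storage uses for its `crc32c` metadata. The sketch below reproduces it with a small pure-Python CRC32C. It assumes that this is how the hash is derived and is not the library's own implementation; the names `crc32c` and `gcs_style_hash` are hypothetical.

```python
import base64


def crc32c(data):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def gcs_style_hash(data):
    """Base64-encode the 4-byte big-endian CRC32C of the data."""
    return base64.b64encode(crc32c(data).to_bytes(4, "big")).decode()


print(gcs_style_hash(b""))  # AAAAAA==
```

The checksum of empty input is zero, so the encoded value is four zero bytes in base64.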
Minor improvements
- Simplify output of `GoogleCloudStorageClient.get_metadata`
- Make `Hashable` instances re-calculate their `hash_value` every time unless an `immutable_hash_value` is explicitly provided (e.g. for cloud datafiles where you don't have the file locally to hash)
- Add private `Identifiable._set_id` method
- Close #147: pull metadata gathering for `Datafile` into method
- Get `datetime` objects directly from GCS blob instead of parsing string serialisations
- Add `time` utils module
- Add hash preparation function to `Hashable` for `datetime` instances
- Use the empty string hash value for `Datafile` if GCS `crc32c` metadata isn't present
- Stop serialising hash value of `Manifest`, `Dataset`, and `Datafile`
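
The re-calculation behaviour described above for `Hashable` can be pictured roughly as follows. `HashableSketch` is a hypothetical stand-in, not the real class, and SHA-256 from the standard library is swapped in for whatever hash the library actually uses.

```python
import hashlib


class HashableSketch:
    """Hypothetical stand-in showing the recompute-unless-fixed behaviour."""

    def __init__(self, content, immutable_hash_value=None):
        self.content = content
        self._immutable_hash_value = immutable_hash_value

    @property
    def hash_value(self):
        # An explicitly provided hash wins, e.g. when the file is only in
        # the cloud and there is nothing local to hash.
        if self._immutable_hash_value is not None:
            return self._immutable_hash_value
        # Otherwise the hash is recalculated on every access, so it always
        # reflects the current content.
        return hashlib.sha256(self.content).hexdigest()


local = HashableSketch(b"first")
before = local.hash_value
local.content = b"second"
print(before != local.hash_value)  # the hash follows the content

remote = HashableSketch(b"", immutable_hash_value="AAAAAA==")
print(remote.hash_value)  # the provided value is returned unchanged
```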
Fixes
- Close #146: Stop serialising GCS metadata as JSON. This avoids strings in the metadata appearing in two sets of quotation marks on Google Cloud Storage. This is a breaking change for any files already persisted with JSON-encoded metadata.
- Remove ability to set custom hash value via `kwargs` when using `Datafile.from_cloud`
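
The double-quotation problem mentioned for #146 is easy to reproduce with the standard `json` module: JSON-encoding a string adds a literal pair of quotes, which then become part of the stored value. The metadata value below is made up for illustration.

```python
import json

value = "temperature"  # made-up metadata value

# JSON-encoding a plain string wraps it in literal quotation marks, so the
# stored text is longer than the original by two characters.
encoded = json.dumps(value)

print(encoded)       # "temperature"
print(len(value))    # 11
print(len(encoded))  # 13
```

Values persisted this way need decoding with `json.loads` to recover the original string, which is why the change is breaking for files already stored with JSON-encoded metadata.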
Testing
- Factor out cloud datafile creation in datafile tests
Quality Checklist
- New features are fully tested (No matter how much Coverage Karma you have)