Release/0.1.17

github-actions released this 07 May 18:05
8924d88

Contents

New Features

  • Allow Datafile to be used as a context manager for changes to local datafiles
  • Allow Datafile.from_cloud to be used as a context manager for changes to cloud datafiles
  • Allow Datafile to remember where in the cloud it came from
  • Add the following methods to Datafile:
    • get_cloud_metadata
    • update_cloud_metadata
    • clear_from_file_cache
    • _get_cloud_location
    • _store_cloud_location
    • _check_for_attribute_conflict
  • Avoid re-uploading Datafile file or metadata if they haven't changed
  • Raise an error if the implicit cloud location is missing from a Datafile
  • Add GoogleCloudStorageClient.update_metadata method
  • Add an option not to update cloud metadata in Datafile cloud methods
  • Allow tags to contain capitals and forward slashes (but not start or end in a forward slash)
  • Allow datetime and posix timestamps for Datafile.timestamp
  • Add Datafile.posix_timestamp property
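
The tag and timestamp rules above can be sketched roughly as follows. This is an illustration only, not the library's implementation: `is_valid_tag`, `to_posix_timestamp`, the exact set of permitted tag characters, and the treatment of naive datetimes as UTC are all assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Assumed character set: letters (either case), digits, hyphens, underscores
# and forward slashes. Only the rules about capitals and slashes come from the
# release notes; the rest is hypothetical.
_TAG_PATTERN = re.compile(r"^[A-Za-z0-9_-]([A-Za-z0-9_/-]*[A-Za-z0-9_-])?$")


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag neither starts nor ends with a forward slash."""
    return bool(_TAG_PATTERN.match(tag))


def to_posix_timestamp(value) -> float:
    """Accept either a datetime or a posix timestamp and return a posix timestamp."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: treat naive datetimes as UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value.timestamp()
    return float(value)


print(is_valid_tag("Sensor/North"))  # True
print(is_valid_tag("/sensor"))       # False
print(to_posix_timestamp(datetime(1970, 1, 1, 0, 0, 10, tzinfo=timezone.utc)))  # 10.0
```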

Breaking changes

  • Close #148: remove hash_value from Datafile GCS metadata
  • When hashing Datafiles, hash only the file they represent (i.e. stop hashing metadata)
  • When hashing Datasets and Manifests, hash only the files they contain (i.e. stop hashing metadata)
  • Make the hash of a Hashable instance with _ATTRIBUTES_TO_HASH=None the empty-string hash value, "AAAAAA=="
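
To see where "AAAAAA==" comes from, here is a minimal sketch that assumes the hash is a CRC32C checksum encoded as base64 in big-endian byte order, which is the form Google Cloud Storage uses for its crc32c metadata. The helper names `crc32c` and `gcs_style_hash` are hypothetical, and the bitwise CRC is written out only to keep the example dependency-free.

```python
import base64


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def gcs_style_hash(data: bytes) -> str:
    """Encode the checksum as base64 over its 4 big-endian bytes."""
    return base64.b64encode(crc32c(data).to_bytes(4, "big")).decode()


# The CRC32C of empty input is 0, i.e. four zero bytes, which base64-encodes
# to "AAAAAA==".
print(gcs_style_hash(b""))  # AAAAAA==
```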

Minor improvements

  • Simplify output of GoogleCloudStorageClient.get_metadata
  • Make Hashable instances re-calculate their hash_value every time unless an immutable_hash_value is explicitly provided (e.g. for cloud datafiles where you don't have the file locally to hash)
  • Add private Identifiable._set_id method
  • Close #147: pull metadata gathering for Datafile into its own method
  • Get datetime objects directly from GCS blob instead of parsing string serialisations
  • Add time utils module
  • Add hash preparation function to Hashable for datetime instances
  • Use the empty string hash value for Datafile if GCS crc32c metadata isn't present
  • Stop serialising hash value of Manifest, Dataset, and Datafile
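
The recalculation behaviour described above might look something like the sketch below. `HashableSketch` is a hypothetical stand-in rather than the library's `Hashable` class, and SHA-256 is used purely for convenience instead of whatever hash the library computes.

```python
import hashlib


class HashableSketch:
    """Hypothetical illustration of a hash that is recalculated on every access."""

    def __init__(self, content: bytes, immutable_hash_value=None):
        self.content = content
        self._immutable_hash_value = immutable_hash_value

    @property
    def hash_value(self) -> str:
        # An explicitly provided hash always wins, e.g. for a cloud file whose
        # content is not available locally to hash.
        if self._immutable_hash_value is not None:
            return self._immutable_hash_value
        # Otherwise recompute from the current content each time.
        return hashlib.sha256(self.content).hexdigest()


local = HashableSketch(b"first version")
before = local.hash_value
local.content = b"second version"
print(local.hash_value != before)  # True

cloud = HashableSketch(b"", immutable_hash_value="abc123")
print(cloud.hash_value)  # abc123
```

Recomputing on access trades a little work for never returning a stale hash after the underlying content changes.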

Fixes

  • Close #146: Stop serialising GCS metadata as JSON. This avoids strings in the metadata appearing in two sets of quotation marks on Google Cloud Storage. This is a breaking change for any files already persisted with JSON-encoded metadata.
  • Remove ability to set custom hash value via kwargs when using Datafile.from_cloud
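
The double quotation marks mentioned in #146 are easy to reproduce with the standard library. The snippet below is a sketch only, with a made-up metadata dictionary; it does not show how the library builds its metadata.

```python
import json

# Hypothetical metadata values for illustration.
metadata = {"cluster": "north", "sequence": 3}

# JSON-encoding each value wraps strings in an extra pair of quotation marks,
# so the stored string already contains quotes before it is displayed.
json_encoded = {key: json.dumps(value) for key, value in metadata.items()}
print(json_encoded["cluster"])  # "north"

# Storing plain string values avoids the extra quotes.
plain = {key: str(value) for key, value in metadata.items()}
print(plain["cluster"])  # north
```

Values written in the JSON-encoded form would need `json.loads` to recover the original string, which is why files persisted that way are affected by this change.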

Testing

  • Factor out cloud datafile creation in datafile tests

Quality Checklist

  • New features are fully tested (No matter how much Coverage Karma you have)