Add ObjectStore::append #3791

Merged
merged 1 commit into from
Mar 2, 2023
37 changes: 35 additions & 2 deletions object_store/src/lib.rs
@@ -278,15 +278,21 @@ pub type MultipartId = String;
/// Universal API to multiple object store services.
#[async_trait]
pub trait ObjectStore: std::fmt::Display + Send + Sync + Debug + 'static {
/// Save the provided bytes to the specified location.
/// Save the provided bytes to the specified location
///
/// The operation is guaranteed to be atomic, it will either successfully
/// write the entirety of `bytes` to `location`, or fail. No clients
/// should be able to observe a partially written object
async fn put(&self, location: &Path, bytes: Bytes) -> Result<()>;

/// Get a multi-part upload that allows writing data in chunks
///
/// Most cloud-based uploads will buffer and upload parts in parallel.
///
/// To complete the upload, [AsyncWrite::poll_shutdown] must be called
/// to completion.
/// to completion. This operation is guaranteed to be atomic, it will either
/// make all the written data available at `location`, or fail. No clients
/// should be able to observe a partially written object
///
/// For some object stores (S3, GCS, and local in particular), if the
/// writer fails or panics, you must call [ObjectStore::abort_multipart]
@@ -306,6 +312,33 @@ pub trait ObjectStore: std::fmt::Display + Send + Sync + Debug + 'static {
multipart_id: &MultipartId,
) -> Result<()>;

/// Returns an [`AsyncWrite`] that can be used to append to the object at `location`
///
/// A new object will be created if it doesn't already exist, otherwise it will be
/// opened, with subsequent writes appended to the end.
///
/// This operation cannot be supported by all stores, most use-cases should prefer
/// [`ObjectStore::put`] and [`ObjectStore::put_multipart`] for better portability
/// and stronger guarantees
///
/// This API is not guaranteed to be atomic, in particular
///
/// * On error, `location` may contain partial data
/// * Concurrent calls to [`ObjectStore::list`] may return partially written objects
/// * Concurrent calls to [`ObjectStore::get`] may return partially written data
/// * Concurrent calls to [`ObjectStore::put`] may result in data loss / corruption
/// * Concurrent calls to [`ObjectStore::append`] may result in data loss / corruption
///
/// Additionally some stores, such as Azure, may only support appending to objects created
Contributor review comment: 👍 the docstrings make a lot of sense -- thank you for spelling it out so clearly

/// with [`ObjectStore::append`], and not with [`ObjectStore::put`], [`ObjectStore::copy`], or
/// [`ObjectStore::put_multipart`]
async fn append(
    &self,
    _location: &Path,
) -> Result<Box<dyn AsyncWrite + Unpin + Send>> {
    Err(Error::NotImplemented)
}

/// Return the bytes that are stored at the specified location.
async fn get(&self, location: &Path) -> Result<GetResult>;
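
The append semantics documented above (create the object if it is missing, otherwise open it and write at the end, with a `NotImplemented` default for stores that cannot append) can be illustrated with a minimal sketch. This is not the crate's implementation: it is synchronous, uses only the standard library, and `AppendStore`, `StoreError`, `NoAppendStore`, `LocalDirStore`, and `demo_append` are hypothetical names introduced purely for illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch (not the crate's `Error`).
#[derive(Debug)]
enum StoreError {
    NotImplemented,
    Io(io::Error),
}

/// Simplified, synchronous stand-in for the trait shown in the diff.
trait AppendStore {
    /// Mirrors the default in the diff: stores without append support
    /// report `NotImplemented` unless they override this method.
    fn append(&self, _location: &Path) -> Result<Box<dyn Write>, StoreError> {
        Err(StoreError::NotImplemented)
    }
}

/// A store that keeps the default, standing in for a backend that cannot append.
struct NoAppendStore;
impl AppendStore for NoAppendStore {}

/// A store rooted at a local directory (assumption: plain files on disk).
struct LocalDirStore {
    root: PathBuf,
}

impl AppendStore for LocalDirStore {
    fn append(&self, location: &Path) -> Result<Box<dyn Write>, StoreError> {
        // Create the file if it is missing; otherwise open it so that
        // every write lands at the end of the existing contents.
        let file = OpenOptions::new()
            .create(true)
            .append(true)
            .open(self.root.join(location))
            .map_err(StoreError::Io)?;
        Ok(Box::new(file))
    }
}

/// Appends two chunks through separate writers and returns the file contents.
fn demo_append(root: &Path) -> Result<String, StoreError> {
    let store = LocalDirStore { root: root.to_path_buf() };
    let location = Path::new("append_demo.log");
    // Remove any leftover file so the demo is repeatable.
    let _ = fs::remove_file(root.join(location));

    let mut writer = store.append(location)?;
    writer.write_all(b"first\n").map_err(StoreError::Io)?;
    drop(writer);

    // A second writer continues where the first one stopped.
    let mut writer = store.append(location)?;
    writer.write_all(b"second\n").map_err(StoreError::Io)?;
    drop(writer);

    let contents = fs::read_to_string(root.join(location)).map_err(StoreError::Io)?;
    let _ = fs::remove_file(root.join(location));
    Ok(contents)
}

fn main() {
    let contents = demo_append(&std::env::temp_dir()).expect("append demo failed");
    print!("{contents}");
    // The store that does not override `append` falls back to the default.
    assert!(matches!(
        NoAppendStore.append(Path::new("x")),
        Err(StoreError::NotImplemented)
    ));
}
```

Note that a reader opening the file between the two writes would see only the first chunk, which is the kind of partially written state the docstring warns about and the reason `put` and `put_multipart` are recommended for most use cases.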
