Skip to content

Commit

Permalink
Merge pull request #394 from TileDB-Inc/ttd/history-file
Browse files Browse the repository at this point in the history
Update README and add HISTORY.md file
  • Loading branch information
Jake Bolewski authored Feb 20, 2018
2 parents 85f6c6e + 8e8a9fc commit 4d1a205
Show file tree
Hide file tree
Showing 2 changed files with 78 additions and 15 deletions.
67 changes: 67 additions & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
# TileDB v1.2.0 Release Notes
The 1.2.0 release of TileDB includes many new features, improvements in stability and performance, and two new language interfaces (Python and C++). There are also several breaking changes in the C API and on-disk format, documented below.

**Important Note**: due to several improvements and changes in the underlying array storage mechanism, you will need to recreate any existing arrays in order to use them with TileDB v1.2.0.

## New features
* Windows support. TileDB is now fully supported on Windows systems (64-bit Windows 7 and newer).
* Python API. We are very excited to announce the initial release of a Python API for TileDB. The Python API makes TileDB accessible to a much broader audience, allowing seamless integration with existing Python libraries such as NumPy, Pandas and the scientific Python ecosystem.
* C++ API. We've included a C++ API, which allows TileDB to integrate into modern C++ applications without having to write code towards the C API. The C++ API is more concise and provides additional compile time type safety.
* S3 object store support. You can now easily store, query, and manipulate your TileDB arrays on S3 API compatibile object stores, including Amazon's AWS S3 service.
* Virtual filesystem interface. The TileDB API now exposes a virtual filesystem (or VFS) interface, allowing you to perform tasks such as file creation, deletion, reads, and appends without worrying about whether your files are stored on S3, HDFS, a POSIX or Windows filesystem, etc.
* Key-value store. TileDB now provides a key-value (meta) data storage abstraction. Its implementation is built upon TileDB's sparse arrays and inherits all the properties of TileDB sparse arrays.


## Improvements
* Homebrew formula added for easier installation on macOS. Homebrew is now the perferred method for installing TileDB and its dependencies on macOS.
* Docker images updated to include stable/unstable/dev options, and easy configuration of additional components (e.g. S3 support).
* Tile cache implemented, which will greatly speed up repeated queries on overlapping regions of the same array.
* Ability to pass runtime configuration arguments to TileDB/VFS backends.
* Unnamed (or "anonymous") dimensions are now supported; having a single anonymous attribute is also supported.
* Concurrency bugfixes for several compressors.
* Correctness issue fixed in double-delta compressor for some datatypes.
* Better build behavior on systems with older GCC or CMake versions.
* Several memory leaks and overruns fixed with help of sanitizers.
* Many improved error condition checks and messages for easier debugging.
* Many other small bugs and API inconsistencies fixed.

## C API additions
* `tiledb_config_*`: Types and functions related to the new configuration object and functionality.
* `tiledb_config_iter_*`: Iteration functionality for retieving parameters/values from the new configuration object.
* `tiledb_ctx_get_config()`: Function to get a configuration from a context.
* `tiledb_filesystem_t`: Filesystem type enum.
* `tiledb_ctx_is_supported_fs()`: Function to check for support for a given filesystem backend.
* `tiledb_error_t`, `tiledb_error_message()` and `tiledb_error_free()`: Type and functions for TileDB error messages.
* `tiledb_ctx_get_last_error()`: Function to get last error from context.
* `tiledb_domain_get_rank()`: Function to retrieve number of dimensions in a domain.
* `tiledb_domain_get_dimension_from_index()` and `tiledb_domain_get_dimension_from_name()`: Replaces dimension iterators.
* `tiledb_dimension_{create,free,get_name,get_type,get_domain,get_tile_extent}()`: Functions related to creation and manipulation of `tiledb_dimension_t` objects.
* `tiledb_array_schema_set_coords_compressor()`: Function to set the coordinates compressor.
* `tiledb_array_schema_set_offsets_compressor()`: Function to set the offsets compressor.
* `tiledb_array_schema_get_attribute_{num,from_index,from_name}()`: Replaces attribute iterators.
* `tiledb_query_create()`: Replaced many arguments with new `tiledb_query_set_*()` setter functions.
* `tiledb_array_get_non_empty_domain()`: Function to retrieve the non-empty domain from an array.
* `tiledb_array_compute_max_read_buffer_sizes()`: Function to compute an upper bound on the buffer sizes required for a read query.
* `tiledb_object_ls()`: Function to visit the children of a path.
* `tiledb_uri_to_path()`: Function to convert a file:// URI to a platform-native path.
* `TILEDB_MAX_PATH` and `tiledb_max_path()`: The maximum length for tiledb resource paths.
* `tiledb_kv_*`: Types and functions related to the new key-value store functionality.
* `tiledb_vfs_*`: Types and functions related to the new virtual filesystem (VFS) functionality.

## Breaking changes

### C API
* Rename `tiledb_array_metadata_t` -> `tiledb_array_schema_t`, and associated `tiledb_array_metadata_*` functions to `tiledb_array_schema_*`.
* Remove `tiledb_attribute_iter_t`.
* Remove `tiledb_dimension_iter_t`.
* Rename `tiledb_delete()`, `tiledb_move()`, `tiledb_walk()` to `tiledb_object_{delete,move,walk}()`.
* `tiledb_ctx_create`: Config argument added.
* `tiledb_domain_create`: Datatype argument removed.
* `tiledb_domain_add_dimension`: Name, domain and tile extent arguments replaced with single `tiledb_dimension_t` argument.
* `tiledb_query_create()`: Replaced many arguments with new `tiledb_query_set_*()` setter functions.
* `tiledb_array_create()`: Added array URI argument.
* `tiledb_*_free()`: All free functions now take a pointer to the object pointer instead of simply object pointer.
* The include files are now installed into a `tiledb` folder. The correct path is now `#include <tiledb/tiledb.h>` (or `#include <tiledb/tiledb>` for the C++ API).

### Resource Management
* Support for moving resources across previous VFS backends (local fs <-> HDFS) has been removed. A more generic implementation for this functionality with improved performance is planned for the next version of TileDB.
26 changes: 11 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
[![logo](https://api-reference.tiledb.io/_static/tileDB_uppercase_600_112.png)](https://tiledb.io)
<a href="https://tiledb.io"><img src="https://api-reference.tiledb.io/_static/tileDB_uppercase_600_112.png" alt="TileDB logo" width="400"></a>

[![Travis](https://travis-ci.org/TileDB-Inc/TileDB.svg?branch=dev)](https://travis-ci.org/TileDB-Inc/TileDB)
[![Appveyor](https://ci.appveyor.com/api/projects/status/kb9h9r5xulk6j53c/branch/dev?svg=true)](https://ci.appveyor.com/project/StavrosPapadopoulos/tiledb/branch/dev)
Expand All @@ -9,26 +9,22 @@ TileDB allows you to manage the massive dense and sparse multi-dimensional array

## What is TileDB?

TileDB is an efficient multi-dimensional array management system which introduces a novel on-disk format that can effectively store dense and sparse array data with support for fast updates and reads. It features excellent compression and an efficient parallel I/O system with high scalability.
TileDB is an efficient multi-dimensional array management system which introduces a novel on-disk format that can effectively store dense and sparse array data with support for fast updates and reads. It features excellent compression, an efficient parallel I/O system with high scalability, and a Python API.

TileDB is open-sourced under the permissive MIT License.
TileDB works on Linux, macOS, and Windows, and is open-sourced under the permissive MIT License.

Learn more and see examples and tutorials in the [official documentation](https://docs.tiledb.io).

## Features

* **Novel Format**.
TileDB introduces a novel multi-dimensional array format that effectively handles both dense and sparse data with fast updates.
* **Parallelism**.
TileDB is process-/thread-safe and supports asynchronous I/O, making it easy to write efficient parallel computational engines on top.
* **Performance**.
TileDB is developed in C and C++ with low-level optimizations, targeting superb performance.
* **Integration**.
TileDB exposes a C API that makes it easy to integrate with higher-level languages.
* **Compression**.
TileDB offers a variety of compression mechanisms, such as Gzip, Bzip2, LZ4, ZStandard, Blosc, double-delta and run-length encoding.

[More features](https://docs.tiledb.io/docs/features)
* **Novel Format.** TileDB introduces a novel multi-dimensional array format that effectively handles both dense and sparse data with fast updates. Contrary to other popular systems (e.g., HDF5) that are optimized mostly for dense arrays, TileDB is optimized for both dense and sparse arrays, exposing a unified array API. In addition, TileDB's concept of immutable, append-only fragments allows for efficient updates.
* **Compression.** Experience fast slicing and dicing of your arrays while achieving high compression ratios with TileDB's tile-based approach. TileDB can compress array data with a growing number of compressors, such as GZIP, BZIP2, LZ4, ZStandard, Blosc, double-delta and run-length encoding.
* **Parallelism.** Build powerful parallel analytics on top of the TileDB array storage manager (e.g., using OpenMP or MPI), leveraging TileDB's thread-/process-safety and asynchronous writes and reads.
* **Portability.** TileDB works on Linux, macOS and Windows, offering easy installation packages, binaries and Docker containerization. Integrate TileDB with the tools of your favorite platform to manage massive multi-dimensional array data.
* **Language Bindings.** Enable your Python and NumPy data science applications to work with immense amounts of data, beyond what can be stored in main memory. TileDB is built in C and C++ for performance and provides a Python API for interoperability and ease of use.
* **Multiple Backends.** Transparently store your arrays across multiple backends such as HDFS or S3-compliant object stores (like AWS S3, minio, or Ceph). TileDB's API is the same regardless of where the array is stored.
* **Key-value Store.** Store any persistent metadata with TileDB's key-value storage functionality. A TileDB key-value store is implemented as a TileDB sparse array and inherits all its benefits (such as compression, parallelism, and multiple backend support).
* **Virtual Filesystem.** Add general file management and IO to your applications for any supported storage backend using TileDB's unified "virtual filesystem" (VFS) API.

## Installation

Expand Down

0 comments on commit 4d1a205

Please sign in to comment.