
v8.1.0: Updated types and many Ops improvements

@adrianeboyd released this 08 Jul 13:54 · 17846c4

✨ New features and improvements

  • Added support for mypy 0.950 and pydantic v1.9.0, added bound types throughout layers and ops (#599).
  • Made all NumpyOps CPU kernels generic (#627).
  • Made all custom CUDA kernels generic (#603).
  • Added bounds checks for NumpyOps (#618).
  • Fixed out-of-bounds writes in NumpyOps and CupyOps (#664).
  • Reduced unnecessary zero-init allocations (#632).
  • Fixed reductions when applied to zero-length sequences (#637).
  • Added NumpyOps.cblas to get a table of C BLAS functions (#643, #700).
  • Improved type-casting in NumpyOps.asarray (#656).
  • Simplified CupyOps.asarray (#661).
  • Fixed Model.copy() for layers used more than once (#659).
  • Fixed potential race in Shim (#677).
  • xp2tensorflow and xp2torch now convert numpy arrays via DLPack when possible (#686); see the sketch after this list.
  • Improved speed of HashEmbed by avoiding large temporary arrays (#696).
  • Added Ops.reduce_last and Ops.reduce_first (#710); see the sketch after this list.
  • Numerous test suite improvements.
  • Experimental: Added support for Metal Performance Shaders with PyTorch nightlies (#685).
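
The new reductions pick out the first or last row of each sequence in a concatenated batch. A minimal sketch, assuming Ops.reduce_first and Ops.reduce_last each return a (values, indices) pair in which the indices are kept for the backward pass; check the Ops API reference for the exact return values:

    import numpy
    from thinc.api import NumpyOps

    ops = NumpyOps()
    # Two sequences of lengths 2 and 3, concatenated into one array of rows.
    X = ops.asarray2f(numpy.arange(10, dtype="float32").reshape(5, 2))
    lengths = ops.asarray1i([2, 3])

    firsts, starts_ends = ops.reduce_first(X, lengths)  # rows 0 and 2
    lasts, last_indices = ops.reduce_last(X, lengths)   # rows 1 and 4

The DLPack change requires no changes to user code, but for illustration, a minimal sketch of converting a Thinc-allocated array to a PyTorch tensor (assumes PyTorch is installed; whether a copy is actually avoided depends on your numpy/cupy version):

    from thinc.api import NumpyOps, xp2torch

    ops = NumpyOps()
    X = ops.alloc2f(2, 3)
    X_torch = xp2torch(X)  # converts via DLPack when possible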

🔴 Bug fixes

  • Fixed the label smoothing threshold for to_categorical (issue #707); see the sketch below.
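
For context, label smoothing in to_categorical moves part of the probability mass from the gold label onto the other classes, and the fix corrects the upper bound on accepted label_smoothing values. A minimal sketch:

    import numpy
    from thinc.api import to_categorical

    labels = numpy.asarray([0, 1, 2], dtype="i")
    # One-hot targets with 10% of the probability mass spread over the
    # non-gold classes.
    Y = to_categorical(labels, n_classes=3, label_smoothing=0.1)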

⚠️ Backwards incompatibilities

  • In most cases the typing updates allow many casts and type: ignore comments to be removed, but downstream type annotations may need minor modifications to work with the updated mypy and pydantic versions.

  • get_array_module now returns None for non-numpy/cupy array input, rather than defaulting to the numpy module. For example:
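
    A minimal sketch of the new behavior (treating a plain Python list as non-array input is an assumption here):

    import numpy
    from thinc.api import get_array_module

    xp = get_array_module(numpy.zeros((2, 2)))  # numpy module, as before
    xp = get_array_module([1.0, 2.0])           # now None; previously numpy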

  • The prefer_gpu and require_gpu functions no longer set the default PyTorch torch.Tensor type to torch.cuda.FloatTensor. This means that wrapped PyTorch models cannot assume that Tensors are allocated on a CUDA GPU after calling these functions. For example:

    # Before Thinc v8.1.0, this tensor would be allocated on the GPU after
    # calling {prefer,require}_gpu. Now it is allocated as a CPU tensor by
    # default. Here `max_seq_len` is an integer and `input` is the model's
    # input tensor.
    token_mask = torch.arange(max_seq_len)

    # To ensure correct placement, specify the device on which the tensor
    # should be allocated:
    token_mask = torch.arange(max_seq_len, device=input.device)
    

    This change brings Thinc's behavior in line with how device memory allocation is normally handled in PyTorch.
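
    If downstream code relied on the old behavior, it can be restored explicitly after activating the GPU. A hedged sketch, assuming a CUDA-enabled PyTorch build (this is a migration aid, not a pattern the release recommends):

    import torch
    from thinc.api import prefer_gpu

    if prefer_gpu():
        # Restore the pre-v8.1.0 default: newly created tensors are
        # allocated on the GPU.
        torch.set_default_tensor_type("torch.cuda.FloatTensor")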

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @koaning, @richardpaulhudson, @shadeMe, @svlandeg