6.2.3 #14703
DevinTDHa
announced in
Announcement
📢 Spark NLP 6.2.3: Further Improvements for NerDL
Spark NLP 6.2.3 introduces targeted improvements to training performance and stability of NerDLApproach and bug fixes for CamemBertForTokenClassification.
NerDLApproach now uses new internal data-loading behavior, improving training speed and preventing out-of-memory errors.
🔥 Highlights
Enhanced NerDLApproach training performance through threaded data loading and optimized partitioning.
🚀 New Features & Enhancements
NerDLApproach Training Optimizations
Significant performance improvements for training of NerDLApproach:
Threaded Data Loading: When the memory optimizer is enabled (setEnableMemoryOptimizer(true)), data can now be pre-fetched through a threaded data loader. Prefetching is disabled by default and can be tuned through a dedicated parameter. Tuning this parameter (for example, to 20 batches) can reduce training time by about 10%.
Optimized Partitioning Strategy: When the memory optimizer is enabled (setEnableMemoryOptimizer(true)), NerDLApproach now applies optimized dataframe partitioning by default, improving parallelization efficiency during training and preventing out-of-memory errors. For manual tuning of the input data frames, this behavior can be disabled.
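To illustrate the general idea behind threaded prefetching, here is a minimal, standalone Python sketch. It is not Spark NLP's implementation and does not use the Spark NLP API. The names `prefetch`, `max_prefetch`, and `make_batches` are hypothetical and exist only for this example. A background thread loads batches into a bounded queue while the consumer (standing in for the training loop) reads from it, so loading and training can overlap.

```python
import queue
import threading
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Unique marker the producer puts on the queue when it has no more batches.
_SENTINEL = object()


def prefetch(batches: Iterable[T], max_prefetch: int) -> Iterator[T]:
    """Yield items from `batches`, loading up to `max_prefetch` ahead in a thread.

    Hypothetical helper for illustration only; errors raised while loading
    are not propagated to the consumer in this simplified sketch.
    """
    if max_prefetch < 1:
        raise ValueError("max_prefetch must be at least 1")

    buffer: queue.Queue = queue.Queue(maxsize=max_prefetch)

    def producer() -> None:
        try:
            for batch in batches:
                # Blocks while the buffer is full, which bounds memory use.
                buffer.put(batch)
        finally:
            # Always signal the end so the consumer never waits forever.
            buffer.put(_SENTINEL)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is _SENTINEL:
            return
        yield item


def make_batches(count: int, size: int) -> Iterator[List[int]]:
    """Generate `count` toy batches of `size` consecutive integers."""
    for i in range(count):
        yield list(range(i * size, (i + 1) * size))


# Consume five toy batches with up to 20 batches buffered ahead.
result = list(prefetch(make_batches(5, 2), max_prefetch=20))
```

The bounded queue is the key design choice in this sketch: a larger buffer hides more loading latency, while the bound keeps the number of batches held in memory fixed.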
🐛 Bug Fixes
❤️ Community Support
💻 Installation
Python
Spark Packages
CPU
GPU
Apple Silicon
AArch64
Maven
FAT JARs
What's Changed
Full Changelog: 6.2.2...6.2.3
This discussion was created from the release 6.2.3.