Releases: JohnSnowLabs/johnsnowlabs
John Snow Labs 4.3.1 Library Release
- Hotfix for a bug that caused pip install to fail because of dependency conflicts from NLU
John Snow Labs 4.3.0 Library Release
- Bump Enterprise NLP and Open Source NLP to 4.3.0
- Generic Log Reg and Generic SVM available for the finance, legal and medical modules
- Hubert, Swin Transformer, Zero-Shot NER and CamemBERT for QA for the nlp module
John Snow Labs 4.2.9 Library Release
- New TextSplitter annotator for Finance & Legal, which is just a use-case focused alias for SentenceDetector (see the sketch below)
- Fix bug with the NLP module not properly refreshing attached classes after running nlp.install()
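Since TextSplitter is just an alias, a minimal sketch of the equivalence, assuming it is exposed through the legal/finance modules like other annotators (the exact exposure is an assumption, not confirmed API):

from johnsnowlabs import *

# Both lines are sketched as constructing the same underlying sentence-splitting annotator;
# SentenceDetector's setInputCols/setOutputCol API is standard Spark NLP
splitter = legal.TextSplitter().setInputCols(['document']).setOutputCol('sentence')
detector = nlp.SentenceDetector().setInputCols(['document']).setOutputCol('sentence')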
John Snow Labs 4.2.8 Library Release
- DocMapper, DocMapperApproach, DocObfuscator, DocMlClassifier, Resolution2Chunk, ContextParserModel for finance, legal and healthcare
- Upgrade Enterprise NLP to 4.2.8
- Upgrade Open Source NLP to 4.2.8
- Upgrade Visual NLP to 4.3.0
- Better error messages when importing modules fails
- Improved various error messages
- Fix bug causing a dependency on nbformat
- Fix bugs with paths being handled incorrectly on Windows
John Snow Labs Library 4.2.5 Release
- Bump Visual NLP to 4.2.4
- Bump Enterprise NLP to 4.2.5
- Add training_log_parser
John Snow Labs Library 4.2.4 Release
- Bump nlp-enterprise version
- Fix bad import mapping for some legal annotators
- Fix bug with setting license environment variables on Databricks
John Snow Labs Library 4.2.3 Release
- Version bumps
- Docstring updates
- New NLP annotators from Enterprise 4.2.1
John Snow Labs 4.2.2 - Spark NLP version Bump to 4.2.1
We are glad to announce that johnsnowlabs 4.2.2 has been released!
Changes:
- Version bump: Spark NLP to 4.2.1
- Fix minor bug with type conversion during standard PyPI install
John Snow Labs 4.2.1 Release - No more restarts required after installing licensed libs
We are pleased to announce that version 4.2.1 of the johnsnowlabs library has been released!
It comes with one crucial improvement:
No more notebook restarts required after running jsl.install()
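A minimal sketch of the improved flow, assuming a license is discoverable via one of the usual authorization flows:

from johnsnowlabs import *

jsl.install()        # installs licensed libraries into the currently running Python process
spark = jsl.start()  # usable right away, no notebook restart required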
John Snow Labs Library 4.2.0 Release
We are announcing with incredible excitement the release of the John Snow Labs 4.2.0 Library!
It introduces
- New Enterprise Syntax to easily access any feature of any JSL-Library.
- Highly configurable automatic installers with various authorization flows and installation targets, like 1-click OAuth, 1-line Databricks installs, 1-line creation of a new enterprise-compatible venv, and extended offline support.
- Easily run a Python function, raw Python code snippet, Python script or Python module in a Databricks cluster in 1 line of code, creating a cluster on the fly if one is missing.
- Smart license/jar/wheel caching: never type your license twice on the same machine when starting up a SparkSession or re-installing licensed libs!
- Various safety mechanisms added and footguns removed, to reduce injuries :)
Introducing the new Enterprise Syntax for working with all John Snow Labs libraries.
It bundles every relevant function and class you might ever need when working with JSL libraries into 1 simple import line.
from johnsnowlabs import *
This single import gets you through all of the certification notebooks, with the exception of a few third-party libraries.
The following modules become available (see the links to the existing products and Usage & Overview for more details on the import structure):
- nlp.MyAnno() and nlp.my_function() for every one of Spark NLP's Python functions/classes/modules
- ocr.MyAnno() and ocr.my_function() for every one of Spark OCR's Python functions/classes/modules
- legal.MyAnno() and legal.my_function() for every one of Spark for Legal's Python functions/classes/modules
- finance.MyAnno() and finance.my_function() for every one of Spark for Finance's Python functions/classes/modules
- medical.MyAnno() and medical.my_function() for every one of Spark for Medical's Python functions/classes/modules
- viz.MyVisualizer() for every one of Spark NLP-Display's classes
- jsl.load() and jsl.viz() from NLU
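As a quick illustration, a minimal sketch of the unified syntax (the specific annotators shown are illustrative, not exhaustive):

from johnsnowlabs import *

spark = jsl.start()

# Spark NLP annotators via the nlp module
document_assembler = nlp.DocumentAssembler().setInputCol('text').setOutputCol('document')
tokenizer = nlp.Tokenizer().setInputCols(['document']).setOutputCol('token')

# NLU one-liners via jsl.load()
df = jsl.load('sentiment').predict('That was easy!')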
New Powerful Installation and Spark Session Start
The John Snow Labs library aims to make installing licensed libraries and starting a SparkSession as easy as possible.
Installation Docs & Launch a Spark Session Docs
jsl.install()
- Authorization flows (prove you have a license):
  - Auto-detect environment variables
  - Auto-detect license files in the current working directory
  - Auto-detect cached license information stored in ~/.johnsnowlabs from previous runs
  - Auto-inject local browser-based OAuth
  - Auto-inject Colab-button-based OAuth
  - Manual variable definition
  - Manual JSON path
  - Access token
- Installation targets (where to install to?):
  - The currently running Python process
  - A Python environment which is not the currently running process
  - A provided venv
  - A venv freshly created by the johnsnowlabs library
  - Air-gapped machines, via an easily copy-pastable zip file with all jars/wheels/licenses
  - Databricks
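A minimal sketch of two of these flows, using only parameters that appear elsewhere in these notes:

from johnsnowlabs import *

# Zero-argument install: auto-detects a license from environment variables, the
# current working directory or the ~/.johnsnowlabs cache, and otherwise falls
# back to a browser/Colab OAuth flow
jsl.install()

# Point to a license file explicitly instead
jsl.install(json_license_path='path/to/my/license.json')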
jsl.start()
- After having run jsl.install() you can just run jsl.start(). It remembers the license that was used to install and has all jars pre-downloaded. Additionally, it prints very helpful logs when launching a session, telling you which jars were loaded and their versions. You can even load a new license during jsl.start(), which supports all of the previously mentioned authorization flows.
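A minimal sketch, assuming jsl.install() has already run on this machine and that jsl.start() returns the launched SparkSession:

from johnsnowlabs import *

spark = jsl.start()  # re-uses the cached license and pre-downloaded jars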
License Management
List all of your usable JSL licenses with jsl.list_remote_licenses(), and your locally cached licenses with jsl.list_local_licenses().
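For example (the comments restate the behavior described above):

from johnsnowlabs import *

jsl.list_remote_licenses()  # all of your usable JSL licenses
jsl.list_local_licenses()   # licenses cached locally from previous runs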
Databricks Utils
Easily submit any task to a Databricks cluster in various formats; see the Utils for Databricks docs.
Run a raw Python code string in a cluster, creating the cluster on the fly if one is missing.
from johnsnowlabs import *
script = """
import nlu
print(nlu.load('sentiment').predict('That was easy!'))"""
# my_license, my_host and my_token are placeholders for your license path and Databricks credentials
cluster_id = jsl.install(json_license_path=my_license, databricks_host=my_host, databricks_token=my_token)
jsl.run_in_databricks(script,
databricks_cluster_id=cluster_id,
databricks_host=my_host,
databricks_token=my_token,
run_name='Python Code String Example')
Run a Python Function in a Cluster.
def my_function():
    import nlu
    medical_text = """A 28-year-old female with a history of gestational
    diabetes presented with a one-week history of polyuria ,
    polydipsia , poor appetite , and vomiting ."""
    df = nlu.load('en.med_ner.diseases').predict(medical_text)
    for c in df.columns:
        print(df[c])

# my_function will run on the Databricks cluster
jsl.run_in_databricks(my_function,
                      databricks_cluster_id=cluster_id,
                      databricks_host=my_host,
                      databricks_token=my_token,
                      run_name='Function test')
Run a Python Script in a Cluster.
jsl.run_in_databricks('path/to/my/script.py',
databricks_cluster_id=cluster_id,
databricks_host=my_host,
databricks_token=my_token,
                      run_name='Script test')
Run a Python Module in a Cluster
import johnsnowlabs.auto_install.health_checks.nlp_test as nlp_test
jsl.run_in_databricks(nlp_test,
databricks_cluster_id=cluster_id,
databricks_host=my_host,
databricks_token=my_token,
run_name='nlp_test')
Testing Utils
You can use the John Snow Labs library to automatically test 10,000+ models and 100+ notebooks in 1 line of code on a machine as small as a single Google Colab instance, and to generate very handy error reports of potentially broken models, notebooks, or Models Hub markdown snippets.
Automatically test notebooks and Models Hub markdown via URL, file path and many more options!
Workshop Notebook Testing Utils
See Utils for Testing Notebooks docs
from johnsnowlabs.utils.notebooks import test_ipynb
# Test a local notebook file
test_ipynb('path/to/local/notebook.ipynb')
# Test a notebook via URL
test_ipynb('https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Healthcare/5.Spark_OCR.ipynb')
# Test a folder of notebooks and generate a report file which captures all stderr/stdout
test_ipynb('my/notebook/folder')
# Test an array of URLs/paths to notebooks
test_ipynb(['https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Healthcare/5.Spark_OCR.ipynb', 'path/to/local/notebook.ipynb'])
# Run ALL notebooks in the Certification Folder
test_result = test_ipynb('WORKSHOP')
# Only run Finance notebooks
test_result = test_ipynb('WORKSHOP-FIN')
# Only run Legal notebooks
test_result = test_ipynb('WORKSHOP-LEG')
# Only run Medical notebooks
test_result = test_ipynb('WORKSHOP-MED')
# only run Open Source notebooks
test_result = test_ipynb('WORKSHOP-OS')
Modelshub Testing Utils
See Utils for Testing Models & Modelshub Markdown Snippets Docs
from johnsnowlabs.utils.modelhub_markdown import test_markdown
# Test a Local Markdown file with a Python Snippet
test_markdown('path/to/my/file.md')
# Test a Modelshub Python Markdown Snippet via URL
test_markdown('https://nlp.johnsnowlabs.com/2022/08/31/legpipe_deid_en.html')
# Test a folder of Markdown Snippets and generate a Report file, which captures all stderr/stdout
test_markdown('my/markdown/folder')
# Test an array of URLs/paths to markdown files
test_markdown(['legpipe_deid_en.html', 'path/to/local/markdown_snippet.md'])