Predictions support in tasks.json. (#121)
* Predictions support in tasks.json.

* Examples with predictions included.

* NER example.

* Some.

* Completions back.

* Fixed examples.

* Copy predictions on front.

* Some.

* Connect your running models for prediction prelabeling, active learning and retraining (#127)

* WIP: ml backend connection

* add missed module

* add dependencies

* working ml backend

* fix predictions array

* add predict api

* copy prediction button

* add machine learning integration readme

* make the copy prediction button functional
convert files to new format

* correct tasks with predictions

* Fixes.

* move project,ml_backend,ml_api to models.py, add comments

* fix get_schema

* Fixes.

* Fixes for ml_backend is None.

* Remove redundant print.

* update docs

* Fixes.

* Fix.

* docs

* ls for teams

* Buttons added

* Some.

* Slack image added.

* modify logger

* fix logging levels, null train jobs

* add train job restore, log formatters
makseq authored and niklub committed Nov 28, 2019
1 parent ef107ec commit 7d26e77
Showing 43 changed files with 2,705 additions and 1,568 deletions.
41 changes: 41 additions & 0 deletions README.md
@@ -56,6 +56,43 @@ Check [documentation](https://labelstud.io/guide/backend.html) about backend + f
docker run -p 8200:8200 -t -i heartexlabs/label-studio -c config.json -l ../examples/chatbot_analysis/config.xml -i ../examples/chatbot_analysis/tasks.json -o output
```

### Machine learning integration

You can connect your favorite machine learning framework to Label Studio using the [Heartex SDK](https://github.com/heartexlabs/pyheartex).

This lets you:
- use model predictions for prelabeling
- update (retrain) your model as new annotations arrive
- perform labeling in active learning mode
- instantly create a running, production-ready prediction service

Here is a quick tutorial showing how to do that with a simple image classification example:

1. Clone pyheartex, and start serving:
```bash
git clone https://github.com/heartexlabs/pyheartex.git
cd pyheartex/examples/docker
docker-compose up -d
```
2. Specify the running server in your backend config (`backend/config.json`):
```json
"ml_backend": {
  "url": "http://localhost:9090",
  "model_name": "my_super_model"
}
```
3. Launch Label Studio with [image classification config](examples/image_classification/config.xml):
```bash
python server.py -l ../examples/image_classification/config.xml
```

Once you're satisfied with the prelabeling results, you can immediately send prediction requests via the REST API:
```bash
curl -X POST -H 'Content-Type: application/json' -d '{"image_url": "https://go.heartex.net/static/samples/kittens.jpg"}' http://localhost:8200/predict
```
Feel free to play around with any other models & frameworks apart from image classifiers! (See instructions [here](https://github.com/heartexlabs/pyheartex#advanced-usage).)
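For reference, the curl call above can also be issued from Python. This is a minimal sketch using only the standard library; it assumes Label Studio is listening on `localhost:8200` as in the Docker example, and the helper names `build_predict_request` and `predict` are illustrative, not part of Label Studio or pyheartex.

```python
import json
import urllib.request

# Same endpoint as the curl example; adjust host/port to your setup.
PREDICT_URL = "http://localhost:8200/predict"


def build_predict_request(task_data, url=PREDICT_URL):
    """Build a POST request carrying the task data as a JSON body."""
    body = json.dumps(task_data).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(task_data, url=PREDICT_URL, timeout=10):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_predict_request(task_data, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict({"image_url": "https://go.heartex.net/static/samples/kittens.jpg"})` would send the same payload as the curl command.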
## Changelog
Detailed changes for each release are documented in the [release notes](https://github.com/heartexlabs/label-studio/releases).
@@ -73,6 +110,10 @@ Please make sure to read the
- [Contributing Guideline](/CONTRIBUTING.md)
- [Code Of Conduct](/CODE_OF_CONDUCT.md)
## Label Studio for Teams, Startups, and Enterprises
Label Studio for Teams is our enterprise edition (cloud & on-prem), which includes a data manager, high-quality baseline models, active learning, collaborator support, and more. Please visit the [website](https://www.heartex.ai/) to learn more.
## License
This software is licensed under the [Apache 2.0 LICENSE](/LICENSE) © [Heartex](https://www.heartex.net/).
8 changes: 7 additions & 1 deletion backend/config.json
@@ -13,5 +13,11 @@
   "editor": {
     "build_path": "../build/static",
     "debug": false
-  }
+  },
+
+  "!ml_backend": {
+    "url": "http://localhost:9090",
+    "model_name": "my_super_model"
+  },
+  "sampling": "uniform"
 }
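Note on this hunk: the new key is spelled `"!ml_backend"`, with a leading `!`, while `reload_config()` in `backend/server.py` reads `c.get('ml_backend')`. Assuming `load_config` returns the parsed JSON as a plain dict without rewriting keys (its body is not part of this diff), the backend stays disconnected until the `!` is removed. A small sketch of that lookup; `ml_backend_params` and `enable_ml_backend` are hypothetical helpers for illustration only.

```python
import json

# Trimmed copy of the keys shown in this hunk.
CONFIG_TEXT = """
{
  "editor": {"build_path": "../build/static", "debug": false},
  "!ml_backend": {"url": "http://localhost:9090", "model_name": "my_super_model"},
  "sampling": "uniform"
}
"""


def ml_backend_params(config):
    """Mirror the lookup in reload_config(): only the exact key counts."""
    return config.get("ml_backend")


def enable_ml_backend(config):
    """Return a copy of the config with the '!' prefix dropped from the key."""
    enabled = dict(config)
    if "!ml_backend" in enabled:
        enabled["ml_backend"] = enabled.pop("!ml_backend")
    return enabled


config = json.loads(CONFIG_TEXT)
print(ml_backend_params(config))  # None, because the key is "!ml_backend"
print(ml_backend_params(enable_ml_backend(config)))
```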
12 changes: 9 additions & 3 deletions backend/logger.json
@@ -1,10 +1,16 @@
 {
   "version": 1,
+  "formatters": {
+    "standard": {
+      "format": "[%(asctime)s] [%(name)s] [%(levelname)s] %(message)s"
+    }
+  },
   "handlers": {
     "console": {
       "class": "logging.StreamHandler",
       "level": "DEBUG",
-      "stream": "ext://sys.stdout"
+      "stream": "ext://sys.stdout",
+      "formatter": "standard"
     }
   },
   "loggers": {
@@ -17,9 +23,9 @@
     }
   },
   "root": {
-    "level": "DEBUG",
+    "level": "ERROR",
     "handlers": [
       "console"
     ]
   }
-}
+}
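The effect of these two hunks can be checked with `logging.config.dictConfig`. The sketch below mirrors the formatter, handler and root sections shown above; the `loggers` section is elided in the diff, so it is left out here, and how Label Studio itself loads `logger.json` is not shown in this commit. The logger name `demo` and the function `capture_demo_output` are made up for the example.

```python
import contextlib
import io
import logging
import logging.config

LOGGING = {
    "version": 1,
    "formatters": {
        "standard": {
            "format": "[%(asctime)s] [%(name)s] [%(levelname)s] %(message)s"
        }
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "stream": "ext://sys.stdout",
            "formatter": "standard",
        }
    },
    "root": {"level": "ERROR", "handlers": ["console"]},
}


def capture_demo_output():
    """Configure logging and return what reaches stdout."""
    buffer = io.StringIO()
    # ext://sys.stdout is resolved when dictConfig runs, so redirect first.
    with contextlib.redirect_stdout(buffer):
        logging.config.dictConfig(LOGGING)
        demo_log = logging.getLogger("demo")
        demo_log.debug("dropped: below the root ERROR threshold")
        demo_log.error("kept")
    return buffer.getvalue()


output = capture_demo_output()
print(output, end="")
```

Only the `ERROR` record is printed, in the bracketed `[time] [name] [level] message` layout; loggers without their own level inherit the root threshold, which is now `ERROR` instead of `DEBUG`.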
1 change: 1 addition & 0 deletions backend/requirements.txt
@@ -7,3 +7,4 @@ appdirs==1.4.3
 mixpanel==4.4.0
 pandas==0.24.0
 Pillow==6.2.0
+attrs==19.1.0
63 changes: 54 additions & 9 deletions backend/server.py
@@ -5,37 +5,56 @@
 import flask
 import json # it MUST be included after flask!
 import utils.db as db
+import logging

+from copy import deepcopy
 from inspect import currentframe, getframeinfo
 from flask import request, jsonify, make_response, Response
 from utils.misc import (
     exception_treatment, log_config, log, config_line_stripped, load_config
 )
 from utils.analytics import Analytics
+from utils.models import DEFAULT_PROJECT_ID, Project, MLBackend
+
+logger = logging.getLogger(__name__)


 app = flask.Flask(__name__, static_url_path='')
 app.secret_key = 'A0Zrdqwf1AQWj12ajkhgFN]dddd/,?RfDWQQT'


 # init
 c = None
 # load editor config from XML
 label_config_line = None
 # analytics
 analytics = None
+# machine learning backend
+ml_backend = None
+# project object with lazy initialization
+project = None


 def reload_config():
     global c
     global label_config_line
     global analytics
+    global ml_backend
+    global project
     c = load_config()
     label_config_line = config_line_stripped(open(c['label_config']).read())
     if analytics is None:
         analytics = Analytics(label_config_line, c.get('collect_analytics', True))
     else:
         analytics.update_info(label_config_line, c.get('collect_analytics', True))
+    # configure project
+    if project is None:
+        project = Project(label_config=label_config_line)
+    # configure machine learning backend
+    if ml_backend is None:
+        ml_backend_params = c.get('ml_backend')
+        if ml_backend_params:
+            ml_backend = MLBackend.from_params(ml_backend_params)
+            project.connect(ml_backend)


 @app.template_filter('json')
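Note on `reload_config()`: `project` and `ml_backend` are only created while they are still `None`, so a later reload updates analytics but, as written, does not pick up a changed `ml_backend` section. A minimal sketch of that behaviour follows. The `Project` and `MLBackend` classes here are simplified stand-ins (the real ones live in `utils/models.py`, which is not shown, and their fields are assumptions), `configure` is a hypothetical global-free version of the added lines, and `"<View/>"` is just a placeholder label config.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MLBackend:
    """Stand-in for utils.models.MLBackend (fields are assumptions)."""
    url: str
    model_name: str

    @classmethod
    def from_params(cls, params):
        return cls(url=params["url"], model_name=params["model_name"])


@dataclass
class Project:
    """Stand-in for utils.models.Project (fields are assumptions)."""
    label_config: str
    ml_backend: Optional[MLBackend] = None

    def connect(self, ml_backend):
        self.ml_backend = ml_backend


def configure(config, label_config, project=None, ml_backend=None):
    """Same branching as the added lines, with state passed in and returned."""
    if project is None:
        project = Project(label_config=label_config)
    if ml_backend is None:
        params = config.get("ml_backend")
        if params:
            ml_backend = MLBackend.from_params(params)
            project.connect(ml_backend)
    return project, ml_backend


first = {"ml_backend": {"url": "http://localhost:9090", "model_name": "my_super_model"}}
second = {"ml_backend": {"url": "http://localhost:9091", "model_name": "other_model"}}

project, backend = configure(first, "<View/>")
project, backend = configure(second, "<View/>", project, backend)
print(backend.url)  # http://localhost:9090 (the second config is ignored)
```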
@@ -99,7 +118,7 @@ def index():
     task_id = request.args.get('task_id', None)

     if task_id is not None:
-        task_data = db.get_completions(task_id)
+        task_data = db.get_task_with_completions(task_id)
         if task_data is None:
             task_data = db.get_task(task_id)

Expand Down Expand Up @@ -128,25 +147,29 @@ def tasks_page():
completed_at=completed_at)


@app.route('/api/projects/1/next/', methods=['GET'])
@app.route(f'/api/projects/{DEFAULT_PROJECT_ID}/next/', methods=['GET'])
@exception_treatment
def api_generate_next_task():
""" Generate next task to label
"""
# try to find task is not presented in completions
completions = db.get_completions_ids()
for (task_id, task) in db.get_tasks().items():
for task_id, task in db.iter_tasks():
if task_id not in completions:
log.info(msg='New task for labeling', extra=task)
analytics.send(getframeinfo(currentframe()).function)
# try to use ml backend for predictions
if ml_backend:
task = deepcopy(task)
task['predictions'] = ml_backend.make_predictions(task, project)
return make_response(jsonify(task), 200)

# no tasks found
analytics.send(getframeinfo(currentframe()).function, error=404)
return make_response('', 404)


@app.route('/api/projects/1/task_ids/', methods=['GET'])
@app.route(f'/api/projects/{DEFAULT_PROJECT_ID}/task_ids/', methods=['GET'])
@exception_treatment
def api_all_task_ids():
""" Get all tasks ids
@@ -162,13 +185,13 @@ def api_tasks(task_id):
     """ Get task by id
     """
     # try to get task with completions first
-    task_data = db.get_completions(task_id)
+    task_data = db.get_task_with_completions(task_id)
     task_data = db.get_task(task_id) if task_data is None else task_data
     analytics.send(getframeinfo(currentframe()).function)
     return make_response(jsonify(task_data), 200)


-@app.route('/api/projects/1/completions_ids/', methods=['GET'])
+@app.route(f'/api/projects/{DEFAULT_PROJECT_ID}/completions_ids/', methods=['GET'])
 @exception_treatment
 def api_all_completion_ids():
     """ Get all completion ids
@@ -190,6 +213,9 @@ def api_completions(task_id):
         completion.pop('state', None) # remove editor state
         completion_id = db.save_completion(task_id, completion)
         log.info(msg='Completion saved', extra={'task_id': task_id, 'output': request.json})
+        # try to train model with new completions
+        if ml_backend:
+            ml_backend.update_model(db.get_task(task_id), completion, project)
         analytics.send(getframeinfo(currentframe()).function)
         return make_response(json.dumps({'id': completion_id}), 201)
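Taken together with the change to `api_generate_next_task()` above, the server now calls the backend at two points: `make_predictions` when a task is handed out, and `update_model` when a completion is saved. Below is a minimal sketch of both hooks without Flask or the database. `RecordingBackend`, its placeholder prediction, and the two helper functions are hypothetical names for illustration; the real prediction format is defined by the ML backend, and the deep copy is presumably there so the stored task is not modified.

```python
from copy import deepcopy


class RecordingBackend:
    """Hypothetical backend: fixed placeholder predictions, remembers updates."""

    def __init__(self):
        self.updates = []

    def make_predictions(self, task, project):
        return [{"result": "placeholder"}]

    def update_model(self, task, completion, project):
        self.updates.append((task["id"], completion))


def next_unlabeled_task(tasks, completed_ids, ml_backend=None, project=None):
    """First task without a completion; predictions are added to a copy."""
    for task_id, task in tasks.items():
        if task_id in completed_ids:
            continue
        if ml_backend:
            task = deepcopy(task)  # the stored task stays untouched
            task["predictions"] = ml_backend.make_predictions(task, project)
        return task
    return None  # the endpoint answers 404 in this case


def save_completion(tasks, task_id, completion, ml_backend=None, project=None):
    """Stand-in for the save step: notify the backend about the new completion."""
    if ml_backend:
        ml_backend.update_model(tasks[task_id], completion, project)


stored = {1: {"id": 1, "data": {"text": "a"}}, 2: {"id": 2, "data": {"text": "b"}}}
backend = RecordingBackend()

task = next_unlabeled_task(stored, completed_ids={1}, ml_backend=backend)
save_completion(stored, task["id"], {"result": []}, ml_backend=backend)

print(task["id"], "predictions" in task, "predictions" in stored[2])  # 2 True False
print(backend.updates)  # [(2, {'result': []})]
```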

@@ -236,15 +262,34 @@ def api_completion_update(task_id, completion_id):
     return make_response('ok', 201)


-@app.route('/api/projects/1/expert_instruction')
+@app.route(f'/api/projects/{DEFAULT_PROJECT_ID}/expert_instruction')
 @exception_treatment
 def api_instruction():
+    """ Instruction for annotators
+    """
     analytics.send(getframeinfo(currentframe()).function)
     return make_response(c['instruction'], 200)


+@app.route('/predict', methods=['POST'])
+@exception_treatment
+def api_predict():
+    """ Make ML prediction using ml_backend
+    """
+    task = request.json
+    if project.ml_backend:
+        predictions = project.ml_backend.make_predictions({'data': task}, project)
+        analytics.send(getframeinfo(currentframe()).function)
+        return make_response(jsonify(predictions), 200)
+    else:
+        analytics.send(getframeinfo(currentframe()).function, error=400)
+        return make_response(jsonify("No ML backend"), 400)
+
+
 @app.route('/data/<path:filename>')
-def get_image_file(filename):
+def get_data_file(filename):
+    """ External resource serving
+    """
     directory = request.args.get('d')
     return flask.send_from_directory(directory, filename, as_attachment=True)
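Note on `api_predict()`: the JSON body is wrapped as `{'data': task}` before it is handed to the backend, and the endpoint answers 400 with `"No ML backend"` when the project has no backend connected. The sketch below restates that branching as a plain function returning `(body, status)`; `handle_predict`, `EchoBackend` and `FakeProject` are hypothetical test doubles, not Label Studio classes.

```python
class EchoBackend:
    """Hypothetical backend that reports which task it received."""

    def make_predictions(self, task, project):
        return [{"received": task}]


class FakeProject:
    """Hypothetical project holding an optional backend."""

    def __init__(self, ml_backend=None):
        self.ml_backend = ml_backend


def handle_predict(payload, project):
    """Return (body, status) following the same two branches as api_predict()."""
    if project.ml_backend:
        predictions = project.ml_backend.make_predictions({"data": payload}, project)
        return predictions, 200
    return "No ML backend", 400


payload = {"image_url": "https://go.heartex.net/static/samples/kittens.jpg"}
print(handle_predict(payload, FakeProject(EchoBackend()))[1])  # 200
print(handle_predict(payload, FakeProject())[1])  # 400
```

In other words, the payload sent to `/predict` corresponds to the contents of a task's `data` field rather than a full task object.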

Binary file added backend/static/images/slack.png
12 changes: 12 additions & 0 deletions backend/templates/header.html
@@ -16,5 +16,17 @@
 <span class="delim">|</span>

 <a href="https://github.com/heartexlabs/label-studio" target="_blank"><img src="/static/images/github.svg" height="22"/></a>
+
+<a href="https://docs.google.com/forms/d/e/1FAIpQLSdLHZx5EeT1J350JPwnY2xLanfmvplJi6VZk65C2R4XSsRBHg/viewform?usp=sf_link"
+   target="_blank"><img src="/static/images/slack.png" height="22"/></a>
+
+<div class="fb-like"
+     style="top: -8px !important;"
+     data-href="https://www.facebook.com/heartexnet/"
+     data-width="" data-layout="button" data-action="like"
+     data-size="small" data-show-faces="false" data-share="false"
+     data-colorscheme="dark"></div>
+<script async defer crossorigin="anonymous" src="https://connect.facebook.net/en_US/sdk.js#xfbml=1&version=v5.0&appId=1384721251840630&autoLogAppEvents=1"></script>
+
 </ul>
 </div>
11 changes: 4 additions & 7 deletions backend/templates/index.html
@@ -54,12 +54,8 @@
       "completions:menu", // right menu with completion items
       "side-column" // entity
     ],
-    task: {
-      id: {{ task_data["id"] }},
-      data: {{ task_data["data"] | json | safe }},
-      completions: {{ task_data["completions"] | json | safe }},
-      // predictions: [], // the same as completions but will be displayed in predictions section
-    }
+    task: {{ task_data | json | safe }}
+
   });
 {% else %}
   // Label stream mode
@@ -70,11 +66,12 @@
     project: { id: 1 },
     interfaces: [
       "basic",
-      "load", // load next task
+      "load", // load next task automatically (label stream mode)
       "panel", // undo, redo, reset panel
       "controls", // all control buttons: skip, submit, update
       "submit", // submit button on controls
       "predictions", // show predictions from task.predictions = [{...}, {...}]
+      "predictions:menu", // right menu with prediction items
       "completions", // show completions
       "side-column" // entity
     ]