Commit 6a1fb78

Update README.md
1 parent 1c3572c commit 6a1fb78


README.md

Lines changed: 25 additions & 60 deletions
@@ -3,7 +3,7 @@
 Cortex is an open source platform that takes machine learning models—trained with nearly any framework—and turns them into production web APIs in one command. <br>

 <!-- Set header Cache-Control=no-cache on the S3 object metadata (see https://help.github.com/en/articles/about-anonymized-image-urls) -->
-![Demo](https://cortex-public.s3-us-west-2.amazonaws.com/demo/gif/v0.8.gif)<br>
+![Demo](https://cortex-public.s3-us-west-2.amazonaws.com/demo/gif/v0.8.gif)

 <br>

@@ -25,108 +25,73 @@ Cortex is an open source platform that takes machine learning models—trained w

 <br>

-## Quickstart
+## Usage

-Below, we'll walk through how to use Cortex to deploy OpenAI's GPT-2 model as a service on AWS. You'll need to [install Cortex](https://www.cortex.dev/install) on your AWS account before getting started.
+### Step 1: define your API

-<br>
+```python
+# predictor.py

-### Step 1: Define your deployment
+model = download_my_model()
+
+def predict(sample, metadata):
+    return model.predict(sample["text"])
+```

-The configuration below will download the model from the `cortex-examples` S3 bucket and deploy it as a web service that can serve real-time predictions.
+### Step 2: configure your deployment

 ```yaml
 # cortex.yaml

 - kind: deployment
-  name: text
+  name: sentiment

 - kind: api
-  name: generator
-  tensorflow:
-    model: s3://cortex-examples/tensorflow/text-generator/gpt-2/124M
-    request_handler: handler.py
+  name: classifier
+  predictor:
+    path: predictor.py
   compute:
     gpu: 1
 ```

-<!-- CORTEX_VERSION_README_MINOR -->
-You can run the code that generated the model [here](https://colab.research.google.com/github/cortexlabs/cortex/blob/0.10/examples/tensorflow/text-generator/gpt-2.ipynb).
-
-<br>
-
-### Step 2: Add request handling
-
-The model requires encoded data for inference, but the API should accept strings of natural language as input. It should also decode the inference output as human-readable text.
-
-```python
-# handler.py
-
-from encoder import get_encoder
-encoder = get_encoder()
-
-def pre_inference(sample, signature, metadata):
-    context = encoder.encode(sample["text"])
-    return {"context": [context]}
-
-def post_inference(prediction, signature, metadata):
-    response = prediction["sample"]
-    return encoder.decode(response)
-```
-
-<br>
-
-### Step 3: Deploy to AWS
-
-`cortex deploy` takes the declarative configuration from `cortex.yaml` and creates it on the cluster.
+### Step 3: deploy to AWS

 ```bash
 $ cortex deploy

 deployment started
-```

-You can track the status of a deployment using `cortex get`.

-```bash
-$ cortex get generator --watch
+$ cortex get classifier --watch

 status   up-to-date   available   requested   last update   avg latency
 live     1            1           1           8s            123ms

-url: http://***.amazonaws.com/text/generator
+url: http://***.amazonaws.com/sentiment/classifier
 ```

-Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity.
-
-<br>
-
-### Step 4: Serve real-time predictions
-
-Once you have your endpoint, you can make requests.
+### Step 4: serve real-time predictions

 ```bash
-$ curl http://***.amazonaws.com/text/generator \
+$ curl http://***.amazonaws.com/sentiment/classifier \
     -X POST -H "Content-Type: application/json" \
-    -d '{"text": "machine learning"}'
+    -d '{"text": "the movie was great!"}'

-Machine learning, with more than one thousand researchers around the world today, are looking to create computer-driven machine learning algorithms that can also be applied to human and social problems, such as education, health care, employment, medicine, politics, or the environment...
+positive
 ```

-Any questions? [chat with us](https://gitter.im/cortexlabs/cortex).
-
 <br>

 ## How Cortex works

-The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded from S3 into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), Flask, TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
+The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), Flask, TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.

 <br>

 ## More examples

 <!-- CORTEX_VERSION_README_MINOR x4 -->
 - [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.10/examples/tensorflow/sentiment-analysis) in TensorFlow with BERT
-- [Image classification](https://github.com/cortexlabs/cortex/tree/0.10/examples/tensorflow/image-classifier) in TensorFlow with Inception v3
-- [Text Generation](https://github.com/cortexlabs/cortex/tree/0.10/examples/pytorch/text-generator) in PyTorch with Hugging Face's DistilGPT2
+- [Image classification](https://github.com/cortexlabs/cortex/tree/0.10/examples/tensorflow/image-classifier) in TensorFlow with Inception
+- [Text generation](https://github.com/cortexlabs/cortex/tree/0.10/examples/pytorch/text-generator) in PyTorch with DistilGPT2
 - [Iris classification](https://github.com/cortexlabs/cortex/tree/0.10/examples/xgboost/iris-classifier) in XGBoost / ONNX
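
For readers trying the new `predictor` interface from Step 1 of the diff, here is a slightly fuller sketch of a `predictor.py`. Only the `predict(sample, metadata)` signature and the `sample["text"]` payload come from the README; the pickled scikit-learn model, the S3 bucket, and the object key are hypothetical stand-ins for the README's `download_my_model()` placeholder.

```python
# predictor.py (illustrative sketch, not part of the commit: the bucket, key,
# and scikit-learn model are assumptions; only predict(sample, metadata) and
# the "text" field come from the README)
import pickle

import boto3

# Download and deserialize the model once at import time so every request
# served by this replica reuses the same in-memory model.
s3 = boto3.client("s3")
s3.download_file("my-bucket", "sentiment/model.pkl", "/tmp/model.pkl")
with open("/tmp/model.pkl", "rb") as f:
    model = pickle.load(f)


def predict(sample, metadata):
    # "text" matches the JSON payload used in the Step 4 curl example
    return model.predict([sample["text"]])[0]
```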
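
The Step 4 request can also be sent from Python rather than curl; a minimal sketch using the `requests` library is below. The endpoint keeps the README's elided placeholder and should be replaced with the URL printed by `cortex get classifier`.

```python
# Client-side call equivalent to the Step 4 curl example; replace the
# placeholder host with the URL reported by `cortex get classifier`.
import requests

response = requests.post(
    "http://***.amazonaws.com/sentiment/classifier",
    json={"text": "the movie was great!"},
)
print(response.text)  # the README's example responds with: positive
```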
