Fine-tuning Hugging Face DistilBERT with the IMDB dataset

In this demo, we will use the Hugging Face transformers and datasets libraries with Amazon SageMaker to fine-tune a pre-trained transformer on binary text classification. In particular, we will fine-tune the pre-trained DistilBERT model on the IMDB dataset, and then deploy the resulting model for inference on a SageMaker Serverless Inference endpoint.
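As a rough sketch of that workflow with the SageMaker Python SDK (the entry-point script name, instance type, framework versions, hyperparameters, and S3 paths below are placeholders, not taken from this repo):

```python
import sagemaker
from sagemaker.huggingface import HuggingFace, HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

role = sagemaker.get_execution_role()  # IAM role used by the training job and endpoint

# Fine-tuning job that runs a training script on a managed GPU instance.
huggingface_estimator = HuggingFace(
    entry_point="train.py",            # hypothetical script name
    source_dir="./scripts",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters={"epochs": 1, "model_name": "distilbert-base-uncased"},
)
huggingface_estimator.fit(
    {"train": "s3://my-bucket/imdb/train", "test": "s3://my-bucket/imdb/test"}
)

# Wrap the trained model artifact and deploy it to a serverless endpoint.
huggingface_model = HuggingFaceModel(
    model_data=huggingface_estimator.model_data,
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
)
serverless_config = ServerlessInferenceConfig(memory_size_in_mb=4096, max_concurrency=5)
predictor = huggingface_model.deploy(serverless_inference_config=serverless_config)

print(predictor.predict({"inputs": "A wonderful, heartfelt movie."}))
```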

We'll be using DistilBERT, a distilled offshoot of BERT that is smaller, and therefore faster and cheaper for both training and inference. A pre-trained checkpoint is available in the Hugging Face transformers library.
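For reference, loading the pre-trained checkpoint and its tokenizer from transformers looks roughly like this (the distilbert-base-uncased checkpoint name is an assumption; the notebook may use a different variant):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the pre-trained DistilBERT checkpoint with a fresh binary
# classification head (2 labels: negative / positive review).
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
```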

IMDB is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets: 25,000 highly polar movie reviews for training and another 25,000 for testing, plus additional unlabeled reviews. It is available as the IMDB dataset on the Hugging Face Hub.
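The dataset can be pulled straight from the Hub with the datasets library; a minimal sketch:

```python
from datasets import load_dataset

# Download IMDB: 25,000 labeled reviews each for train and test,
# plus an "unsupervised" split of unlabeled reviews.
imdb = load_dataset("imdb")
print(imdb)  # shows the train / test / unsupervised splits
print(imdb["train"][0]["text"][:200], imdb["train"][0]["label"])
```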

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
