Fine-tuning Hugging Face DistilBERT on the IMDB dataset

In this demo, we use the Hugging Face transformers and datasets libraries with Amazon SageMaker to fine-tune a pre-trained transformer for binary text classification. Specifically, we fine-tune the pre-trained DistilBERT model on the IMDB dataset, then deploy the resulting model for inference on a SageMaker serverless endpoint.
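The overall flow (train with the SageMaker Hugging Face estimator, then deploy to a serverless endpoint) might look roughly like the sketch below. It is not the notebook's actual code: the entry point name `train.py`, the `scripts` source directory, the instance type, the framework versions, the hyperparameter names, and the serverless memory and concurrency settings are all assumptions chosen for illustration. The SageMaker imports are deferred into the function so the helper can be inspected without the SDK installed.

```python
def build_hyperparameters(model_name="distilbert-base-uncased", epochs=1, train_batch_size=32):
    """Hyperparameters passed to the training script (names are hypothetical)."""
    return {
        "model_name": model_name,
        "epochs": epochs,
        "train_batch_size": train_batch_size,
    }


def fine_tune_and_deploy(role_arn, train_s3_uri, test_s3_uri):
    """Run a training job, then deploy the trained model to a serverless endpoint."""
    # Imported lazily; requires the `sagemaker` Python SDK and AWS credentials.
    from sagemaker.huggingface import HuggingFace
    from sagemaker.serverless import ServerlessInferenceConfig

    estimator = HuggingFace(
        entry_point="train.py",          # assumption: training script name
        source_dir="scripts",            # assumption: directory holding the script
        instance_type="ml.p3.2xlarge",   # assumption: any supported GPU instance
        instance_count=1,
        role=role_arn,
        transformers_version="4.26",     # assumption: pick a supported version combo
        pytorch_version="1.13",
        py_version="py39",
        hyperparameters=build_hyperparameters(),
    )

    # Each channel name maps to an S3 prefix containing the prepared data.
    estimator.fit({"train": train_s3_uri, "test": test_s3_uri})

    # Serverless endpoints are sized by memory and capped by concurrent invocations.
    serverless_config = ServerlessInferenceConfig(
        memory_size_in_mb=4096,  # assumption
        max_concurrency=5,       # assumption
    )
    return estimator.deploy(serverless_inference_config=serverless_config)
```

The returned predictor can then typically be called with a payload such as `predictor.predict({"inputs": "A wonderful film."})`. Check the framework versions against those currently supported by the SageMaker Hugging Face containers before running.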

We use DistilBERT, a distilled variant of BERT that is smaller and therefore faster and cheaper for both training and inference. A pre-trained model is available in the transformers library from Hugging Face.
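As a minimal sketch of loading the pre-trained model for binary classification, assuming the `distilbert-base-uncased` checkpoint (the README does not name the exact checkpoint) and hypothetical label names:

```python
MODEL_NAME = "distilbert-base-uncased"  # assumption: checkpoint not named in the README

# IMDB uses label 0 for negative and 1 for positive reviews.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def load_model_and_tokenizer(model_name=MODEL_NAME):
    """Load the tokenizer and a DistilBERT model with a 2-class classification head."""
    # Imported lazily; requires the `transformers` package and network access.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

The classification head is newly initialized, so the model only produces meaningful sentiment predictions after fine-tuning.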

IMDB is a dataset for binary sentiment classification that contains substantially more data than previous benchmark datasets. It provides 25,000 highly polar movie reviews for training and 25,000 for testing, plus additional unlabeled data. It is available as the IMDB dataset on Hugging Face.
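A minimal sketch of loading and tokenizing the data with the datasets library follows. The dataset identifier `"imdb"`, the `max_length` of 512, and the padding strategy are assumptions rather than settings taken from the notebook; `tokenize_batch` is a hypothetical helper name.

```python
def load_imdb():
    """Download the IMDB dataset (train, test, and unlabeled splits)."""
    # Imported lazily; requires the `datasets` package and network access.
    from datasets import load_dataset

    return load_dataset("imdb")


def tokenize_batch(batch, tokenizer, max_length=512):
    """Tokenize a batch of reviews, truncating and padding to a fixed length."""
    return tokenizer(
        batch["text"],
        truncation=True,
        padding="max_length",
        max_length=max_length,
    )
```

With a tokenizer loaded, the helper can be applied to every split using `dataset.map(lambda batch: tokenize_batch(batch, tokenizer), batched=True)`.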

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
