Commit

Add files via upload
andysingal authored Jan 24, 2024
1 parent 8e5f91f commit 4a200a4
Showing 1 changed file with 62 additions and 0 deletions.
62 changes: 62 additions & 0 deletions Multimodal/TrOCR.ipynb
@@ -0,0 +1,62 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"gpuType": "T4"
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"source": [
"## TrOCR\n",
"Text recognition is a long-standing research problem for document digitalization. Existing approaches for text recognition are usually built based on CNN for image understanding and RNN for char-level text generation. In addition, another language model is usually needed to improve the overall accuracy as a post-processing step. In this paper, we propose an end-to-end text recognition approach with pre-trained image Transformer and text Transformer models, namely TrOCR, which leverages the Transformer architecture for both image understanding and wordpiece-level text generation. The TrOCR model is simple but effective, and can be pre-trained with large-scale synthetic data and fine-tuned with human-labeled datasets. Experiments show that the TrOCR model outperforms the current state-of-the-art models on both printed and handwritten text recognition tasks.\n",
"\n"
],
"metadata": {
"id": "2NXGTpw-OX_O"
}
},
{
"cell_type": "code",
"source": [
"%%capture\n",
"!pip install -q transformers\n",
"!pip install -q sentencepiece\n",
"!pip install -q jiwer\n",
"!pip install -q datasets\n",
"!pip install -q evaluate\n",
"!pip install -q -U accelerate\n",
"\n",
"!pip install -q matplotlib\n",
"!pip install -q protobuf==3.20.1\n",
"!pip install -q tensorboard"
],
"metadata": {
"id": "rfeboSP6PEVv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cUgITgyiOJ0n"
},
"outputs": [],
"source": []
}
]
}
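
The notebook in this commit ends with an empty code cell. As a hedged illustration only, the sketch below shows what a first inference-and-scoring cell could look like with the packages installed above. The checkpoint name `microsoft/trocr-base-handwritten`, the helper names `recognize_text` and `char_error_rate`, and the choice of character error rate as the metric are assumptions, not part of the commit. The pure-Python edit distance stands in for the `jiwer` package the notebook installs, so the metric can be checked without any downloads.

```python
def recognize_text(image, checkpoint="microsoft/trocr-base-handwritten"):
    """Run one TrOCR forward pass on a PIL image of a single text line.

    The checkpoint name is an assumption; any TrOCR checkpoint on the
    Hugging Face Hub should be usable here. Imports are kept inside the
    function so the metric below works without transformers installed.
    """
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # The image Transformer encoder expects normalized RGB pixel tensors.
    pixel_values = processor(
        images=image.convert("RGB"), return_tensors="pt"
    ).pixel_values

    # The text Transformer decoder generates wordpiece ids autoregressively.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein distance between the strings, divided by len(reference)."""
    if not reference:
        raise ValueError("reference must be non-empty")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)
```

In a notebook cell, something like `char_error_rate(label, recognize_text(Image.open(path)))` would score one prediction, where `label`, `path`, and the `PIL.Image` import are placeholders to be supplied by the reader. Loading the processor and model once outside the function would be preferable when scoring many images.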
