I am currently working on training a model for the Grantha script. What I have found is that both Tamil and Malayalam share almost the same characters and writing conventions as Grantha.
So I assume that if the two models are combined, the combined model could then be fine-tuned with my Grantha data. I am not sure whether that will work. What is your opinion? Or is the only option to fine-tune each model separately and then use a command like:
"tesseract image.png -l grantha+grantha1"
Have you had any success in developing a model for Grantha script? There are others willing to collaborate on such a project. Please respond with any progress you have made. Thanks.
Indeed. I trained a model fine-tuned from tam.traineddata (from the tessdata_best repository). It works reasonably well on printed images, but it fails badly on handwritten images due to segmentation issues.
Still, there are exceptions: I can train on handwritten images containing one to four lines, but beyond that I get the error "Compute CTC targets failed for ....". I raised this as an issue in #414.
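For reference, the fine-tuning itself can be driven with the tesstrain Makefile, which expects line-level image/transcription pairs; roughly like this (the directory names, tessdata path, and iteration count below are only illustrative):

# line images as data/grantha-ground-truth/*.tif with matching *.gt.txt files
make training MODEL_NAME=grantha START_MODEL=tam TESSDATA=/path/to/tessdata_best MAX_ITERATIONS=10000

Because the ground truth is prepared per line, cutting handwritten scans into single-line images before training may be the safer route given the segmentation problems above.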
After delving into this training approach, I came to a conclusion: some upgrades would be needed, both in the VGSL network specification, to allow more advanced network layers, and in the page segmentation mode (PSM) definitions, before handwritten images can be recognized properly with the current LSTM networks.
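To make the VGSL point concrete: the network is just a spec string handed to lstmtraining (via --net_spec when training from scratch), so the available layer vocabulary is the real limit. A spec of the kind documented for Tesseract's own LSTM models looks roughly like this (illustrative only; the number after O1c must match the size of the unicharset, so it would differ for Grantha):

[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]

That is a small convolution plus max-pooling feeding a stack of LSTM layers and a softmax; layer types beyond these are not expressible today, which is what I meant by an upgrade to the specification. On the PSM side, the closest existing workarounds I am aware of are --psm 7 (treat the image as a single text line) and --psm 13 (raw line, bypassing Tesseract-specific processing), but neither really solves the segmentation problem for handwriting.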
I would be grateful if I could be part of that project. It would really help me advance my career.