Updated inference script

ZappaBoy · ZappaBoy · commit 6652c78e3ed8 · 2024-10-27T10:41:21.000+01:00
diff --git a/README.md b/README.md
@@ -6,7 +6,8 @@ Lugano.
 ## Challenge
 
 Here is a brief explanation of the challenge:
-The challenge was proposed by **Ai4Privacy**, a company that builds global solutions that enhance **privacy protections**
+The challenge was proposed by **Ai4Privacy**, a company that builds global solutions that enhance
+**privacy protections**
 in the rapidly evolving world of **Artificial Intelligence**.
 The goal of the challenge is to create a machine learning model capable of detecting and masking **PII** (Personally Identifiable
 Information) in text data across several languages and locales. The task requires working with a synthetic dataset to
@@ -17,7 +18,9 @@ including client support, legal, and general data anonymization tools. Success i
 scaling privacy-conscious AI systems without compromising the UX or operational performance.
 
 ## Getting Started
+
 Create a `.env` file by copying `.env.example` to `.env`, then fill in the required values.
+
 ```bash
 cp .env.example .env
 ```
@@ -93,17 +96,17 @@ Here is a list of available BERT models that can be used for fine-tuning. Additi
 may also work with minimal modifications:
 
 - BERT classic
-  + `bert-base-uncased`, `bert-large-uncased`, `bert-base-cased`, `bert-large-cased`
+    + `bert-base-uncased`, `bert-large-uncased`, `bert-base-cased`, `bert-large-cased`
 - DistilBERT
-  + `distilbert-base-uncased`, `distilbert-base-cased`
+    + `distilbert-base-uncased`, `distilbert-base-cased`
 - RoBERTa
-  + `roberta-base`, `roberta-large`
+    + `roberta-base`, `roberta-large`
 - ALBERT
-  + `albert-base-v2`, `albert-large-v2`, `albert-xlarge-v2`, `albert-xxlarge-v2`
+    + `albert-base-v2`, `albert-large-v2`, `albert-xlarge-v2`, `albert-xxlarge-v2`
 - Electra
-  + `google/electra-small-discriminator`, `google/electra-base-discriminator`, `google/electra-large-discriminator`
+    + `google/electra-small-discriminator`, `google/electra-base-discriminator`, `google/electra-large-discriminator`
 - DeBERTa
-  + `microsoft/deberta-base`, `microsoft/deberta-large`
+    + `microsoft/deberta-base`, `microsoft/deberta-large`
 
 ### GLiNER Fine-Tuning
 
@@ -141,4 +144,12 @@ You can use the following GLiNER models for fine-tuning, though additional compa
 - `gliner-community/gliner_small-v2.5`
 
 ## Results
+
 A results folder is available in the repository to store the results of the various experiments and related metrics.
+
+## Other Information
+
+We also provide a solution to the issue raised in
+the [pii-masking-400k](https://huggingface.co/datasets/ai4privacy/pii-masking-400k/discussions/3) dataset repository.
+We created a method that transforms natural language text into a token-tag format that can be used to train a Named
+Entity Recognition (NER) model with the Hugging Face `AutoTrain` API.
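
The token-tag transformation described in the new "Other Information" section can be illustrated with a minimal sketch. This is not the repository's implementation: the function name `to_token_tags`, the span layout (`start`, `end`, `label`), whitespace tokenization, and the BIO tagging scheme are all assumptions made for illustration.

```python
import re


def to_token_tags(text, spans):
    """Convert text plus character-level entity spans into tokens and BIO tags.

    `spans` is a list of dicts with `start`, `end` (exclusive) and `label` keys.
    This layout is an assumption, not necessarily the dataset's exact schema.
    """
    tokens, tags = [], []
    # Whitespace tokenization that keeps the character offsets of each token.
    for match in re.finditer(r"\S+", text):
        tok_start, tok_end = match.span()
        tag = "O"
        for span in spans:
            # A token belongs to an entity if the two character ranges overlap.
            if tok_start < span["end"] and tok_end > span["start"]:
                # The first token of an entity gets B-, later tokens get I-.
                prefix = "B" if tok_start <= span["start"] else "I"
                tag = f"{prefix}-{span['label']}"
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags


# Hypothetical example sentence with two annotated entities.
text = "Contact John Smith at john@example.com today."
spans = [
    {"start": 8, "end": 18, "label": "NAME"},
    {"start": 22, "end": 38, "label": "EMAIL"},
]
tokens, tags = to_token_tags(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token is paired with exactly one tag, so the two lists can be stored as parallel columns in a training file. A subword tokenizer would need an extra alignment step that this sketch does not cover.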