Merge pull request kermitt2#1189 from kermitt2/update-doc

Fix internal links in the documentation
kp-forks · Nov 20, 2024 · 354132f
2 parents 9fd4c77 + 4e7fba4

Showing 10 changed files with 34 additions and 31 deletions.
diff --git a/doc/Deep-Learning-models.md b/doc/Deep-Learning-models.md
@@ -18,7 +18,7 @@ Current neural models can be up to 50 times slower than CRF, depending on the ar
 
 ## Recommended Deep Learning models
 
-By default, only CRF models are used by Grobid. You need to select the Deep Learning models you would like to use in the GROBID configuration yaml file (`grobid/grobid-home/config/grobid.yaml`). See [here](https://grobid.readthedocs.io/en/latest/Configuration/#configuring-the-models) for more details on how to select these models. The most convenient way to use the Deep Learning models is to use the full GROBID Docker image and pass a configuration file at launch of the container describing the selected models to be used instead of the default CRF ones. Note that the full GROBID Docker image is already configured to use Deep Learning models for bibliographical reference and affiliation-address parsing. 
+By default, only CRF models are used by Grobid. You need to select the Deep Learning models you would like to use in the GROBID configuration yaml file (`grobid/grobid-home/config/grobid.yaml`). See [here](Configuration.md#configuring-the-models) for more details on how to select these models. The most convenient way to use the Deep Learning models is to use the full GROBID Docker image and pass a configuration file at launch of the container describing the selected models to be used instead of the default CRF ones. Note that the full GROBID Docker image is already configured to use Deep Learning models for bibliographical reference and affiliation-address parsing. 
 
 For current GROBID version 0.8.1, we recommend considering the usage of the following Deep Learning models: 
 
@@ -46,7 +46,7 @@ However, if you need a "local" library installation and build, prepare a lot of
 
 #### Classic python and Virtualenv
 
-<span>0.</span> Install GROBID as indicated [here](https://grobid.readthedocs.io/en/latest/Install-Grobid/).
+<span>0.</span> Install GROBID as indicated [here](Install-Grobid.md).
 
 The following was tested with Java version up to 17.
 
@@ -130,7 +130,7 @@ INFO  [2020-10-30 23:04:07,756] org.grobid.core.jni.DeLFTModel: Loading DeLFT mo
 INFO  [2020-10-30 23:04:07,758] org.grobid.core.jni.JEPThreadPool: Creating JEP instance for thread 44
 ```
 
-It is then possible to [benchmark end-to-end](https://grobid.readthedocs.io/en/latest/End-to-end-evaluation/) the selected Deep Learning models as any usual GROBID benchmarking exercise. In practice, the CRF models should be mixed with Deep Learning models to keep the process reasonably fast and memory-hungry. In addition, note that, currently, due to the limited amount of training data, Deep Learning models perform significantly better than CRF only for a few models (`citation`, `affiliation-address`, `reference-segmenter`). This should of course certainly change in the future! 
+It is then possible to [benchmark end-to-end](End-to-end-evaluation.md) the selected Deep Learning models as any usual GROBID benchmarking exercise. In practice, the CRF models should be mixed with Deep Learning models to keep the process reasonably fast and memory-hungry. In addition, note that, currently, due to the limited amount of training data, Deep Learning models perform significantly better than CRF only for a few models (`citation`, `affiliation-address`, `reference-segmenter`). This should of course certainly change in the future! 
 
 #### Anaconda 
 

diff --git a/doc/Grobid-docker.md b/doc/Grobid-docker.md
@@ -57,7 +57,7 @@ Access the service:
   - open the browser at the address `http://localhost:8080`
   - the health check will be accessible at the address `http://localhost:8081`
 
-Grobid web services are then available as described in the [service documentation](https://grobid.readthedocs.io/en/latest/Grobid-service/).
+Grobid web services are then available as described in the [service documentation](Grobid-service.md).
 
 By default, this image runs Deep Learning models for:
 
@@ -113,7 +113,7 @@ Access the service:
   - open the browser at the address `http://localhost:8080`
   - the health check will be accessible at the address `http://localhost:8081`
 
-Grobid web services are then available as described in the [service documentation](https://grobid.readthedocs.io/en/latest/Grobid-service/).
+Grobid web services are then available as described in the [service documentation](Grobid-service.md).
 
 
 ## Configure using the yaml config file

diff --git a/doc/Grobid-service.md b/doc/Grobid-service.md
@@ -59,7 +59,7 @@ If required, modify the file under `grobid/grobid-home/config/grobid.yaml` for s
 
 See the [configuration page](Configuration.md) for details on how to set the different parameters of the `grobid.yaml` configuration file. Service and logging parameters are also set in this configuration file.
 
-If Docker is used, see [here](https://grobid.readthedocs.io/en/latest/Grobid-docker/#configure-using-the-yaml-config-file) on how to start a Grobid container with a modified configuration file. 
+If Docker is used, see [here](Grobid-docker.md#configure-using-the-yaml-config-file) on how to start a Grobid container with a modified configuration file. 
 
 ### Model loading strategy 
 You can choose to load all the models at the start of the service or lazily when a model is used the first time, the latter being the default. 
@@ -178,20 +178,20 @@ curl -v -H "Accept: application/x-bibtex" --form input=@./thefile.pdf localhost:
 
 Convert the complete input document into TEI XML format (header, body and bibliographical section).
 
-|  method   |  request type         |  response type       | parameters               | requirement     | description                                                                                                                                                                                                                                         |
-|---        |---                    |---                   |--------------------------|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| POST, PUT | `multipart/form-data` | `application/xml`    | `input`                  | required        | PDF file to be processed                                                                                                                                                                                                                            |
-|           |                       |                      | `consolidateHeader`      | optional        | `consolidateHeader` is a string of value `0` (no consolidation), `1` (consolidate and inject all extra metadata, default value), `2` (consolidate the citation and inject DOI only), or `3` (consolidate  using only extracted DOI - if extracted). |
-|           |                       |                      | `consolidateCitations`   | optional        | `consolidateCitations` is a string of value `0` (no consolidation, default value) or `1` (consolidate and inject all extra metadata), or `2` (consolidate the citation and inject DOI only).                                                        |
-|           |                       |                      | `consolidatFunders`      | optional        | `consolidateFunders` is a string of value `0` (no consolidation, default value) or `1` (consolidate and inject all extra metadata), or `2` (consolidate the funder and inject DOI only).                                                            |
-|           |                       |                      | `includeRawCitations`    | optional        | `includeRawCitations` is a boolean value, `0` (default, do not include raw reference string in the result) or `1` (include raw reference string in the result).                                                                                     |
-|           |                       |                      | `includeRawAffiliations` | optional        | `includeRawAffiliations` is a boolean value, `0` (default, do not include raw affiliation string in the result) or `1` (include raw affiliation string in the result).                                                                              |
-|           |                       |                      | `includeRawCopyrights`   | optional        | `includeRawCopyrights` is a boolean value, `0` (default, do not include raw copyrights/license string in the result) or `1` (include raw copyrights/license string in the result).                                                                  |
-|           |                       |                      | `teiCoordinates`         | optional        | list of element names for which coordinates in the PDF document have to be added, see [Coordinates of structures in the original PDF](Coordinates-in-PDF.md) for more details                                                                       |
-|           |                       |                      | `segmentSentences`       | optional        | Paragraphs structures in the resulting TEI will be further segmented into sentence elements <s>                                                                                                                                                     |
-|           |                       |                      | `generateIDs`            | optional        | if supplied as a string equal to `1`, it generates uniqe identifiers for each text component                                                                                                                                                        |
-|           |                       |                      | `start`                  | optional        | Start page number of the PDF to be considered, previous pages will be skipped/ignored, integer with first page starting at `1`, (default `-1`, start from the first page of the PDF)                                                                |
-|           |                       |                      | `end`                    | optional        | End page number of the PDF to be considered, next pages will be skipped/ignored, integer with first page starting at `1` (default `-1`, end with the last page of the PDF)                                                                          |
+|  method   |  request type         |  response type       | parameters               | requirement     | description                                                                                                                                                                                                                                                        |
+|---        |---                    |---                   |--------------------------|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| POST, PUT | `multipart/form-data` | `application/xml`    | `input`                  | required        | PDF file to be processed                                                                                                                                                                                                                                           |
+|           |                       |                      | `consolidateHeader`      | optional        | `consolidateHeader` is a string of value `0` (no consolidation), `1` (consolidate and inject all extra metadata, default value), `2` (consolidate the citation and inject DOI only), or `3` (consolidate  using only extracted DOI - if extracted).                |
+|           |                       |                      | `consolidateCitations`   | optional        | `consolidateCitations` is a string of value `0` (no consolidation, default value) or `1` (consolidate and inject all extra metadata), or `2` (consolidate the citation and inject DOI only).                                                                       |
+|           |                       |                      | `consolidatFunders`      | optional        | `consolidateFunders` is a string of value `0` (no consolidation, default value) or `1` (consolidate and inject all extra metadata), or `2` (consolidate the funder and inject DOI only).                                                                           |
+|           |                       |                      | `includeRawCitations`    | optional        | `includeRawCitations` is a boolean value, `0` (default, do not include raw reference string in the result) or `1` (include raw reference string in the result).                                                                                                    |
+|           |                       |                      | `includeRawAffiliations` | optional        | `includeRawAffiliations` is a boolean value, `0` (default, do not include raw affiliation string in the result) or `1` (include raw affiliation string in the result).                                                                                             |
+|           |                       |                      | `includeRawCopyrights`   | optional        | `includeRawCopyrights` is a boolean value, `0` (default, do not include raw copyrights/license string in the result) or `1` (include raw copyrights/license string in the result).                                                                                 |
+|           |                       |                      | `teiCoordinates`         | optional        | list of element names for which coordinates in the PDF document have to be added, see [Coordinates of structures in the original PDF](Coordinates-in-PDF.md) for more details                                                                                      |
+|           |                       |                      | `segmentSentences`       | optional        | Paragraphs structures in the resulting TEI will be further segmented into sentence elements <s>                                                                                                                                                                    |
+|           |                       |                      | `generateIDs`            | optional        | if supplied as a string equal to `1`, it generates uniqe identifiers for each text component                                                                                                                                                                       |
+|           |                       |                      | `start`                  | optional        | Start page number of the PDF to be considered, previous pages will be skipped/ignored, integer with first page starting at `1`, (default `-1`, start from the first page of the PDF)                                                                               |
+|           |                       |                      | `end`                    | optional        | End page number of the PDF to be considered, next pages will be skipped/ignored, integer with first page starting at `1` (default `-1`, end with the last page of the PDF)                                                                                         |
 
 Response status codes: