Commit
1 parent 66d9bbe, commit 69ffc7b
Showing 9 changed files with 1,202 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# DISCOver | ||
DISCOver: an interface to explore the DISCO corpus |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,153 @@ | ||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="UTF-8"> | ||
<title>About (the DISCOurse)</title> | ||
<meta http-equiv="content-type" content="text/html; charset=utf-8" /> | ||
<meta name="HandheldFriendly" content="true" /> | ||
<meta name="viewport" content="width=device-width, height=device-height, user-scalable=no" /> | ||
<script src="js/jquery-3.3.1.min.js"></script> | ||
<script src="js/jquery.tokeninput.js"></script> | ||
<script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script> | ||
<script src="https://code.highcharts.com/highcharts.js"></script> | ||
<script src="https://code.highcharts.com/modules/exporting.js"></script> | ||
<link rel="shortcut icon" href="img/favicon.ico" type="image/x-icon"> | ||
<link rel="icon" href="img/favicon.ico" type="image/x-icon"> | ||
<script src="js/menu.js"></script> | ||
<link rel="stylesheet" href="css/main.css" type="text/css"/> | ||
<link rel="stylesheet" href="css/jquery-ui.css"> | ||
<link rel="stylesheet" href="css/token-input.css" type="text/css" /> | ||
</head> | ||
<body> | ||
|
||
<!--#include virtual="ssi/menu.html"--> | ||
|
||
<main> | ||
<h1>About this corpus</h1> | ||
<h2 id="description">Corpus description</h2> | ||
<p>The corpus currently contains 4087 sonnets in Spanish: 2676 from the 19th | ||
century, 323 from the 18th century and 1088 from the so-called Spanish Golden Age (15th | ||
to 17th centuries). There are 1204 authors in total (from both Spain and Latin | ||
America). The corpus aims to provide a wide sample, inspired by distant reading approaches <a | ||
href="#moretti" target="_blank">(Moretti, 2005)</a>. The raw texts were in most cases extracted from the | ||
<a href="#cervantes" target="_blank">Biblioteca Virtual Miguel de Cervantes (1999)</a>, with some | ||
18th-century texts coming from <a href="https://wikisource.org/wiki/Main_Page" target="_blank">Wikisource</a>. Table 1 in the Data | ||
distribution section below summarizes these figures.</p> | ||
<p>The corpus is available in plain-text and TEI formats; XML-TEI P5 was chosen for this | ||
standard’s benefits in terms of reuse, storage, and retrieval. Author metadata (year and | ||
place of birth and death, and gender) were extracted or inferred from unstructured content | ||
in the sources and placed in the TEI header or, for the plain-text version, in a metadata | ||
table. For both TEI and plain-text formats, two versions of the texts | ||
are available: one collecting all sonnets by an author in a single file, the other encoding a single | ||
sonnet per file. For corpus preparation, we closely followed the TEI guidelines and | ||
RIDE’s criteria for Digital Text Collections <a href="#hk-n" target="_blank">(Henny-Krahmer and Neuber, | ||
2017)</a>.</p> | ||
<p>Additionally, authors have been assigned VIAF identifiers and described using RDFa | ||
attributes. This gives the corpus an entry point to the Linked Open Data cloud, | ||
enhancing its findability. The corpus is available as a GitHub repository and archived on | ||
Zenodo, in line with good practices for data use, reuse, and preservation.</p><h2 | ||
id="graphics">Data distribution</h2> | ||
<table class="desc"> | ||
<caption><span id="table1">Table 1</span>: Corpus data distribution per period, author gender and primary continent of | ||
literary activity</caption> | ||
<tr> | ||
<th>Period</th> | ||
<th>Number of sonnets</th> | ||
<th colspan="3">Number of authors</th> | ||
<th>Tokens</th> | ||
</tr> | ||
<tr> | ||
<td rowspan="4"><b>19th</b></td> | ||
<td rowspan="4">2676</td> | ||
<td rowspan="4">685</td> | ||
<td>Female</td> | ||
<td>48</td> | ||
<td rowspan="4">252,518</td> | ||
</tr> | ||
<tr> | ||
<td>Male</td> | ||
<td>637</td> | ||
</tr> | ||
<tr> | ||
<td>America</td> | ||
<td>334</td> | ||
</tr> | ||
<tr> | ||
<td>Europe</td> | ||
<td>348 (+3)</td> | ||
</tr> | ||
<tr> | ||
<td rowspan="4"><b>18th</b></td> | ||
<td rowspan="4">323</td> | ||
<td rowspan="4">42</td> | ||
<td>Female</td> | ||
<td>1</td> | ||
<td rowspan="4">29,006</td> | ||
</tr> | ||
<tr> | ||
<td>Male</td> | ||
<td>41</td> | ||
</tr> | ||
<tr> | ||
<td>America</td> | ||
<td>6</td> | ||
</tr> | ||
<tr> | ||
<td>Europe</td> | ||
<td>36</td> | ||
</tr> | ||
<tr> | ||
<td rowspan="4"><b>15th-17th</b><br />(Golden Age)</td> | ||
<td rowspan="4">1088</td> | ||
<td rowspan="4">477</td> | ||
<td>Female</td> | ||
<td>31</td> | ||
<td rowspan="4">99,779</td> | ||
</tr> | ||
<tr> | ||
<td>Male</td> | ||
<td>446</td> | ||
</tr> | ||
<tr> | ||
<td>America</td> | ||
<td>12</td> | ||
</tr> | ||
<tr> | ||
<td>Europe</td> | ||
<td>458 (+7)</td> | ||
</tr> | ||
</table> | ||
<h2>Bibliography</h2> | ||
<p id="cervantes">Biblioteca Virtual Miguel de Cervantes. 1999. <em>Biblioteca Virtual | ||
Miguel de Cervantes</em>. | ||
<a href="http://www.cervantesvirtual.com" target="_blank">http://www.cervantesvirtual.com</a>.</p> | ||
<p id="hk-n">Henny-Krahmer, Ulrike, and Frederike Neuber. 2017. “Criteria for Reviewing Digital Text Collections, Version 1.0.” <em>A Review Journal for Digital Editions and Resources</em>, no. 6. <a href="https://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version-1-0/" target="_blank">https://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version-1-0/</a>.</p> | ||
<p id="moretti">Moretti, Franco. 2005. <em>Graphs, Maps, Trees: Abstract Models for a Literary History</em>. Verso.</p> | ||
<hr /> | ||
<div style="text-align:center; font-size:smaller; font-style:italic"> | ||
<h3>Cálamo currante</h3> | ||
<p>Si escribir te propones un soneto,<br/> | ||
ve haciendo lo que yo, que, a fe, no es harto;<br/> | ||
tras el verso tercero saldrá el cuarto...<br/> | ||
¡Si es coser y cantar! ¡Mira: un cuarteto!</p> | ||
|
||
<p>Haz otro igual después, que te prometo<br/> | ||
que si aquesto es parir, es fácil parto;<br/> | ||
van seis versos, y el séptimo ya ensarto;<br/> | ||
otro, y van ocho, y al primer terceto.</p> | ||
|
||
<p>Todo es que el verso nono venga al baile<br/> | ||
y el décimo en la rueda esté metido.<br/> | ||
¿Hay consonante a baile y fraile? Haíle.</p> | ||
|
||
<p>Pues entonces, ya es esto pan comido,<br/> | ||
y cata a Periquillo hecho fraile,<br/> | ||
y cata el sonetejo concluido.</p> | ||
<p style="font-style:normal;">Francisco de Osuna</p> | ||
</div> | ||
</main> | ||
<!--#include virtual="ssi/footer.html"--> | ||
|
||
</body> | ||
</html> |
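Review note: the per-period figures in Table 1 of the page above can be cross-checked against the totals given in its corpus description (4087 sonnets, 1204 authors). The following is a minimal sketch in Python, not part of the commit. The numbers are copied by hand from the table, the names `Period`, `TABLE_1`, `check`, and `totals` are illustrative, and reading the "(+N)" in the Europe cells as authors counted outside both continents is an assumption that the page does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """One period row of Table 1; field names are hypothetical."""
    name: str
    sonnets: int
    authors: int
    female: int
    male: int
    america: int
    europe: int
    extra: int  # the "(+N)" next to Europe; its meaning is assumed here
    tokens: int


# Figures copied by hand from Table 1.
TABLE_1 = [
    Period("19th", 2676, 685, 48, 637, 334, 348, 3, 252_518),
    Period("18th", 323, 42, 1, 41, 6, 36, 0, 29_006),
    Period("15th-17th", 1088, 477, 31, 446, 12, 458, 7, 99_779),
]


def check(rows):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    for p in rows:
        if p.female + p.male != p.authors:
            problems.append(f"{p.name}: gender counts do not sum to authors")
        if p.america + p.europe + p.extra != p.authors:
            problems.append(f"{p.name}: continent counts do not sum to authors")
    return problems


# Column totals across the three periods.
totals = {
    "sonnets": sum(p.sonnets for p in TABLE_1),
    "authors": sum(p.authors for p in TABLE_1),
    "tokens": sum(p.tokens for p in TABLE_1),
}

if __name__ == "__main__":
    print(check(TABLE_1))
    print(totals)
```

Running a check like this whenever the table is edited would catch a period figure that drifts out of step with the totals quoted in the prose.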
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml"> | ||
<head> | ||
<title>About DISCO</title> | ||
<meta charset="utf-8" /> | ||
<link href="css/ancillary.css" rel="stylesheet" type="text/css" /> | ||
<link href="css/temporary.css" rel="stylesheet" type="text/css" /> | ||
</head> | ||
|
||
<body> | ||
<!--#include virtual="ssi/menu.html"--> | ||
<h1>Credits</h1> | ||
<p class='cite'>This interface visualizes and analyses the data available at:</p> | ||
<p class="cite">Ruiz Fabo, Pablo, Helena Bermúdez Sabel, Clara Martínez Cantón, and José | ||
Calvo Tello. 2017. <em>Diachronic Spanish Sonnet Corpus</em> (DISCO). Madrid: UNED. <a | ||
href="https://github.com/pruizf/disco" target="_blank" | ||
>https://github.com/pruizf/disco</a>. <a | ||
href="https://zenodo.org/badge/latestdoi/103841064" target="_blank"><img | ||
src="https://zenodo.org/badge/103841064.svg" alt="DOI" /></a> | ||
</p> | ||
<p class="cite">That dataset was enhanced, and rhyme annotation was added using the tool <a | ||
href="https://github.com/versotym/rhymeTagger" target="_blank">RhymeTagger</a>, | ||
developed by <a target="_blank" href="http://www.versologie.cz/en/plechac.html">Petr Plecháč</a> (Ústav pro českou literaturu AV ČR, v. v. i.).</p> | ||
</body> | ||
</html> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
<!DOCTYPE html> | ||
<html lang="en"> | ||
<head> | ||
<meta charset="UTF-8"> | ||
<title>Credits</title> | ||
<meta http-equiv="content-type" content="text/html; charset=utf-8" /> | ||
<meta name="HandheldFriendly" content="true" /> | ||
<meta name="viewport" content="width=device-width, height=device-height, user-scalable=no" /> | ||
<script src="js/jquery-3.3.1.min.js"></script> | ||
<script src="js/jquery.tokeninput.js"></script> | ||
<script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script> | ||
<script src="https://code.highcharts.com/highcharts.js"></script> | ||
<script src="https://code.highcharts.com/modules/exporting.js"></script> | ||
<link rel="shortcut icon" href="img/favicon.ico" type="image/x-icon"> | ||
<link rel="icon" href="img/favicon.ico" type="image/x-icon"> | ||
<script src="js/menu.js"></script> | ||
<link rel="stylesheet" href="css/main.css" type="text/css"/> | ||
<link rel="stylesheet" href="css/jquery-ui.css"> | ||
<link rel="stylesheet" href="css/token-input.css" type="text/css" /> | ||
</head> | ||
<body> | ||
|
||
<!--#include virtual="ssi/menu.html"--> | ||
|
||
<main> | ||
<h1>Credits</h1> | ||
<h2>How to cite</h2> | ||
<p><strong>Bermúdez Sabel, Helena, Clara Martínez Cantón, and Pablo Ruiz Fabo. 2019. <em>DISCOver: an interface to explore the DISCO corpus.</em> <a href="http://prf1.org/disco/">http://prf1.org/disco/</a></strong></p> | ||
<h2>Dataset</h2> | ||
<p>This interface visualizes and analyses the <strong>data</strong> available at:</p> | ||
<blockquote>Ruiz Fabo, Pablo, Helena Bermúdez Sabel, Clara Martínez Cantón, and José | ||
Calvo Tello. 2017. <em>Diachronic Spanish Sonnet Corpus</em> (DISCO). Madrid: UNED. <a | ||
href="https://github.com/pruizf/disco" target="_blank" | ||
>https://github.com/pruizf/disco</a>. <a | ||
href="https://zenodo.org/badge/latestdoi/103841064" target="_blank"><img | ||
src="https://zenodo.org/badge/103841064.svg" alt="DOI" /></a> | ||
</blockquote> | ||
<p>This dataset was enhanced, and <strong>rhyme annotation</strong> was added using the tool <a | ||
href="https://github.com/versotym/rhymeTagger" target="_blank">RhymeTagger</a>, | ||
developed by <a target="_blank" href="http://versologie.cz/v2/web_content/plechac.php?lang=en">Petr Plecháč</a> (Ústav pro českou literaturu AV ČR, v. v. i.).</p> | ||
<p>The <strong>rhyme database</strong> (including the query and visualization resources) builds on <a href="http://versologie.cz/v2/tool_gunstick" target="_blank">Gunstick</a>, the rhyme database and related tools developed | ||
by the <a href="http://www.versologie.cz/en/" target="_blank">Versologie</a> research group.</p> | ||
<p>The interface was developed thanks to a <a href="https://www.avcr.cz/en/academic-public/support-of-research/josef-dobrovsky-fellowship/" target="_blank">Josef Dobrovský Fellowship</a>, funded by the Akademie věd České republiky (2018).</p> | ||
</main> | ||
<!--#include virtual="ssi/footer.html"--> | ||
|
||
</body> | ||
|
||
</html> |