Commit 778d582

GH-2665: SHACL-C HTML generator
1 parent f63360c commit 778d582

15 files changed, +1013 -153 lines changed

jena-arq/Grammar/jj2tokens (mode 100644 → 100755; +30 -5)
@@ -2,45 +2,70 @@
 ## Licensed under the terms of http://www.apache.org/licenses/LICENSE-2.0
 
 # Extract tokens (will need more editting)
+# Fixes:
+## IRIref
+## Replace " for '
+
+## WARNING: This script will do the bulk work of translation but it is imperfect.
+## The output will need editing for us in HTML as W3C BNF
 
 $/ = undef ;
 $_ = <> ;
+
+## JavaCC Comments
 s!//.*!!g ;
 
+## Find TOKEN { } blocks, terminated by } at start of line.
+# A block is \WTOKEN { ... }
 # Not greedy to find end brace
-@t = m/TOKEN\s*(?:\[IGNORE_CASE\])?\s*:\s*\n\{(.*?)\n\}/sg ;
+@t = m/[^_]TOKEN\s*(?:\[IGNORE_CASE\])?\s*:\s*\n\{(.*?)\n\}/sg ;
 
 #{\s*([^{}]*)}/sg ;
 
-# Fixups:
-
+## For each block of TOKENS
 for $t (@t)
 {
 $t =~ s/\r//g ;
 
 #print "\nTEXT:\nT:",$t,":\n" ;
 
-
+## Split on | to get individual tokens
+
 @s = split(/\n\|/,$t) ;
 for $s (@s)
 {
+## Trim
+$s =~ s/^\s+//s;
+$s =~ s/\s+$//s;
+
 ($name, $rule) = split(/:/,$s,2) ;
 
+
 ## Leading < and excess whitespace
 $name =~ s/^\s*\<\s*// ;
 $name =~ s/\s+$// ;
+
+## Remove # for internal tokens
+$name =~ s/^#// ;
 
 ## Trailing > and excess whitespace
 $rule =~ s/^\s+// ;
 $rule =~ s/\s*\>\s*$// ;
 
+## Flatten around |
 $rule =~ s/\|\s*\n\s*/\|/sg ;
 $rule =~ s/\n\s*\|/\|/sg ;
 
+## Replace wrapping " with '
+## This may corrupt a token but cover the majority of cases - check output
+$rule =~ s/^"/'/;
+$rule =~ s/"$/'/;
+
 ## print "NAME: /",$name , "/\n" ;
 ## print "--> ", $rule , "\n" ;
 
-$spc = ' ' x (10-length($name)) ;
+## Format and output
+$spc = ' ' x (15-length($name)) ;
 
 print "<",$name,">", $spc, " ::= ",$rule,"\n" ;
 }
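
To see what the jj2tokens change does in practice, the following is a minimal sketch, not the committed script. It wraps the same regular expressions in subroutines and runs them on a made-up grammar fragment. The subroutine names (`token_blocks`, `format_token`, `to_bnf`), the `$guard` flag, and the sample tokens are assumptions introduced for illustration. Unlike the script, the sketch clamps the padding at zero so that long names do not trigger a negative repeat count. The `[^_]` prefix presumably exists to stop the pattern matching inside names such as SPECIAL_TOKEN; the commit itself does not say so.

```perl
use strict;
use warnings;

# Return the bodies of TOKEN { ... } blocks found in $text.
# $guard true: pattern from this commit ([^_] before TOKEN).
# $guard false: the older pattern, which also matches inside SPECIAL_TOKEN.
sub token_blocks {
    my ($text, $guard) = @_;
    my $re = $guard
        ? qr/[^_]TOKEN\s*(?:\[IGNORE_CASE\])?\s*:\s*\n\{(.*?)\n\}/s
        : qr/TOKEN\s*(?:\[IGNORE_CASE\])?\s*:\s*\n\{(.*?)\n\}/s;
    my @blocks = $text =~ /$re/g;
    return @blocks;
}

# Turn one "< NAME: rule >" entry into a "<NAME>   ::= rule" line.
sub format_token {
    my ($s) = @_;
    $s =~ s/^\s+//s;                      # trim
    $s =~ s/\s+$//s;
    my ($name, $rule) = split(/:/, $s, 2);
    $name =~ s/^\s*<\s*//;                # leading < and whitespace
    $name =~ s/\s+$//;
    $name =~ s/^#//;                      # internal-token marker
    $rule =~ s/^\s+//;
    $rule =~ s/\s*>\s*$//;                # trailing > and whitespace
    $rule =~ s/\|\s*\n\s*/|/sg;           # flatten around |
    $rule =~ s/\n\s*\|/|/sg;
    $rule =~ s/^"/'/;                     # wrapping " becomes '
    $rule =~ s/"$/'/;
    my $pad = 15 - length($name);
    $pad = 0 if $pad < 0;                 # assumption: clamp, the script does not
    return "<$name>" . (' ' x $pad) . " ::= $rule\n";
}

# Whole pipeline: strip // comments, find blocks, split on leading |, format.
sub to_bnf {
    my ($text, $guard) = @_;
    $text =~ s!//.*!!g;
    my @out;
    for my $block (token_blocks($text, $guard)) {
        $block =~ s/\r//g;
        push @out, map { format_token($_) } split(/\n\|/, $block);
    }
    return @out;
}

# Hypothetical JavaCC fragment; token names and rules are illustrative only.
my $sample = <<'EOF';
SPECIAL_TOKEN :
{
  < WS: " " >
}

TOKEN [IGNORE_CASE] :
{
  < BASE: "base" >
| < #DIGIT: ["0"-"9"] >
}
EOF

print to_bnf($sample, 1);
```

With the guard enabled the sketch emits rules for BASE and DIGIT only. Calling `to_bnf($sample, 0)` also picks up the WS entry from the SPECIAL_TOKEN block, which is the behaviour the new pattern avoids.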

jena-shacl/shaclc/.gitignore (+3)

@@ -0,0 +1,3 @@
+## Intermediate files.
+X.html
+Y.html

jena-shacl/shaclc/README (+14)

@@ -17,3 +17,17 @@ and cleans up the output files to remove or surpress Java warnings.
 
 The generated javacc java is checked into git so you don't need to install
 javacc to build this module unless you want to change the parser.
+
+== To produce BNF HTML
+
+Run the parser generator - shaclc-parser
+
+Produce the tokens.txt file.
+
+The script 'jj2tokens' will do bulk translation but the output needs fixup and
+replacing some rules with better format.
+
+Run shaclc2html
+
+This produces X.html, the HTML table between HTML comments for GRAMMAR and
+Y.html a displayable HTML file with styling.

jena-shacl/shaclc/grammarExtracts (+49)

@@ -0,0 +1,49 @@
+#!/usr/bin/perl
+## Licensed to the Apache Software Foundation (ASF) under one
+## or more contributor license agreements. See the NOTICE file
+## distributed with this work for additional information
+## regarding copyright ownership. The ASF licenses this file
+## to you under the Apache License, Version 2.0 (the
+## "License"); you may not use this file except in compliance
+## with the License. You may obtain a copy of the License at
+##
+## http://www.apache.org/licenses/LICENSE-2.0
+##
+## Unless required by applicable law or agreed to in writing, software
+## distributed under the License is distributed on an "AS IS" BASIS,
+## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+## See the License for the specific language governing permissions and
+## limitations under the License.
+
+# Grammar HTML to a form of an HTML page suitable for cut&paste as fragments.
+
+$DOC = 1 ;
+
+if ( $DOC )
+{
+print <<'EOF'
+<?xml version="1.0" encoding="utf-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html>
+<head>
+<title>SHACL-C Grammar</title>
+<link rel="stylesheet" type="text/css" href="https://www.w3.org/StyleSheets/TR/base.css" />
+<link rel="stylesheet" type="text/css" href="https://www.w3.org/2001/sw/DataAccess/rq23/local.css" />
+</head>
+<body>
+EOF
+}
+
+while(<>)
+{
+s/\<a id="([^=\"]*)" name="([^=\"]*)"\>/<a href="#$1">/ ;
+print ;
+}
+
+if ( $DOC )
+{
+print <<'EOF'
+</body>
+</html>
+EOF
+}
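
The core of grammarExtracts is the single substitution inside the `while` loop. As a minimal sketch of its effect, the snippet below applies an equivalent pattern to one made-up line of grammar HTML. The subroutine name `anchor_to_link` and the rule name `rShaclDoc` are assumptions for illustration and do not come from the commit.

```perl
use strict;
use warnings;

# Rewrite a definition anchor <a id="X" name="Y"> into a link <a href="#X">.
# Without /g, only the first such anchor on a line is rewritten, as in the script.
sub anchor_to_link {
    my ($line) = @_;
    $line =~ s/<a id="([^="]*)" name="([^="]*)">/<a href="#$1">/;
    return $line;
}

# Hypothetical input line; real grammar HTML may be laid out differently.
my $in = qq{<td><a id="rShaclDoc" name="rShaclDoc">shaclDoc</a></td>\n};
print anchor_to_link($in);
```

Lines with no matching anchor pass through unchanged, so the script can be run over a whole HTML file without filtering it first.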
