script for extracting all used passage tags
honzaflash committed Oct 7, 2021
1 parent 285101b commit 1732064
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions perl_scripts/gather-tags.pl
@@ -0,0 +1,32 @@
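#!/usr/bin/perl
use strict;
use warnings;

my $usage =
"usage: ./gather-tags.pl TWINE_HTML [-l]\n" .
"  prints all existing passage tags (-l: one tag per line)\n";


if (not defined $ARGV[0]) {
    print STDERR $usage;
    exit 1;
}

open(my $html, '<', $ARGV[0]) or die "couldn't open the file $ARGV[0]: $!\n";

# collect each distinct tag from the tags="..." attribute of every passage
my %tags;
while (my $line = <$html>) {
    # /g picks up every <tw-passagedata> element, in case several share a line
    while ($line =~ /<tw-passagedata\b[^>]*\stags="([^"]*)"/g) {
        $tags{$_} = 1 for split ' ', $1;
    }
}
close($html);

# -l prints one tag per line; the default is a single comma-separated line
my $one_per_line = defined $ARGV[1] && $ARGV[1] eq '-l';
my @sorted = sort keys %tags;
print join($one_per_line ? "\n" : ", ", @sorted), "\n" if @sorted;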

