semtool

A tool for semantic actions on any language or file format, driven by PEG grammars.

TODOs

Implement a simple parser for a PEG-like grammar with comb.
Get search file definition for SemGrep/OpenGrep.
Define a set of keywords/rules.
Implement parser for search files with comb.
Refine own PEG grammar parser.
Implement AST comparison.
Implement replacements with Go text templates.

Own Syntax

= is used for rule assignment.
/ is used for alternatives (FirstSuccessful).
Whitespace between multiple rules is used for sequences (Sequence).
* and + are used for repetitions (Many0 and Many1).
,* and ,+ are used for lists (Separated0 and Separated1) that do NOT parse a trailing separator.
;* and ;+ are used for lists (Separated0 and Separated1) that DO parse an optional trailing separator.
? is used for optional parsers (Optional).
-> is used for parsing until another parser matches (Until).
. is used for any character, or for any byte in a binary parser (AnyChar or AnyByte respectively).
' and " are used for string literals.
( and ) are used for grouping.
! is used for negative lookahead.
& is used for positive lookahead.
[ and ] are used for character classes.
In string literals, ANSI escape sequences are supported, as are \377, \xabcdef and \u00abcdef for octal, hex and Unicode escapes.
Comments start with # and continue to the end of the line.
Whitespace is ignored.
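
The difference between the ,* and ;* list forms is easiest to see on an input that ends with a separator. The following Go program is a minimal sketch of that behaviour only. parseList and digitRun are hypothetical helpers written for this illustration; they are not part of comb or semtool, and items are assumed to be plain runs of ASCII digits.

```go
package main

import (
	"fmt"
	"strings"
)

// digitRun returns the length of the run of ASCII digits at the start of s.
func digitRun(s string) int {
	n := 0
	for n < len(s) && s[n] >= '0' && s[n] <= '9' {
		n++
	}
	return n
}

// parseList is a hypothetical helper (NOT the comb API). It parses zero or
// more digit items separated by sep and returns the items and the unconsumed
// rest. With allowTrailing (the ;* behaviour) one separator after the last
// item is consumed too; without it (the ,* behaviour) it is left in rest.
func parseList(input, sep string, allowTrailing bool) ([]string, string) {
	var items []string
	rest := input
	for {
		n := digitRun(rest)
		if n == 0 {
			return items, rest
		}
		items = append(items, rest[:n])
		rest = rest[n:]
		if !strings.HasPrefix(rest, sep) {
			return items, rest
		}
		after := rest[len(sep):]
		if digitRun(after) == 0 && !allowTrailing {
			// No further item: leave the trailing separator unconsumed.
			return items, rest
		}
		rest = after
	}
}

func main() {
	items, rest := parseList("1,2,3,", ",", false)
	fmt.Printf("%q %q\n", items, rest) // ["1" "2" "3"] ","

	items, rest = parseList("1,2,3,", ",", true)
	fmt.Printf("%q %q\n", items, rest) // ["1" "2" "3"] ""
}
```

In the first call the trailing comma stays in the input for the next parser to handle; in the second call it is consumed as part of the list.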

Predefined rules

EOF parses the end of the input.
EOL parses the end of a line ('\r', '\n' or '\r\n').
FLOAT parses a floating point number (without a sign).
INTEGER parses an integer number (without a sign).
SPACE parses any amount of Unicode whitespace (including none).
MUST_SPACE parses Unicode whitespace (at least one character).
NAME parses a name (a Unicode letter followed by zero or more Unicode letters, Unicode digits or underscores).
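
As an illustration of the NAME description above, here is a minimal Go sketch built on the standard unicode package. parseName is a hypothetical helper, not the actual semtool or comb implementation, and it assumes that "Unicode digit" means what unicode.IsDigit accepts (decimal digits).

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// parseName is a hypothetical helper (NOT the semtool implementation).
// It accepts a Unicode letter followed by zero or more Unicode letters,
// Unicode digits or underscores, and returns the name and the rest.
func parseName(input string) (name, rest string, ok bool) {
	first, size := utf8.DecodeRuneInString(input)
	if size == 0 || !unicode.IsLetter(first) {
		return "", input, false
	}
	end := size
	for end < len(input) {
		r, n := utf8.DecodeRuneInString(input[end:])
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_' {
			break
		}
		end += n
	}
	return input[:end], input[end:], true
}

func main() {
	name, rest, ok := parseName("größe_2 = 1")
	fmt.Printf("%q %q %v\n", name, rest, ok) // "größe_2" " = 1" true

	_, _, ok = parseName("2abc")
	fmt.Println(ok) // false: a name must start with a letter
}
```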

Predefined character classes

ALPHA parses a Unicode letter.
DIGIT parses a Unicode digit or number.
WORD is ALPHA and DIGIT combined into a single class.

Rules a user has to define

GRAMMAR is the root rule of a grammar.
VARIABLE parses a variable name (e.g. '$' NAME) in code snippets for searching or replacing. A variable can stand for any syntactically valid subtree of the current parse tree (AST).
PLACEHOLDER parses a placeholder (e.g. '_' or '$PLACEHOLDER') in code snippets for searching. Like a variable, a placeholder can stand for any syntactically valid subtree of the current parse tree (AST). Unlike a variable, a placeholder can't be referenced later, so it can't be used in replacements.
BINARY is more a variable than a rule. It can only be set to true or false. The default value is false, so it can be omitted.
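
Since replacements are planned to use Go text templates, the following sketch shows one way captured variables could feed such a template. It rests on assumptions: the render function is hypothetical, captures are assumed to be available as a map from variable name (without the '$') to matched source text, and the template field syntax {{.A}} is only one possible mapping for a variable $A. None of this is confirmed semtool behaviour.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render is a hypothetical helper (NOT the semtool API). It fills a
// replacement template with the text captured for each variable and
// fails if the template references a variable that was never captured.
func render(replacement string, captures map[string]string) (string, error) {
	tmpl, err := template.New("replacement").
		Option("missingkey=error").
		Parse(replacement)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, captures); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Assumed result of matching the search snippet assertEquals($A, $B).
	captures := map[string]string{"A": "got", "B": "want"}

	out, err := render("assertEquals({{.B}}, {{.A}})", captures)
	fmt.Println(out, err) // assertEquals(want, got) <nil>
}
```

A placeholder would have no entry in such a map, which matches the rule that placeholders can't be used in replacements.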
