Fails in lexer with a file using the unicode characters #37

@codelion

The parser fails to parse a file containing Unicode characters, such as the following:


  class Queue

    def clear
    end
    alias_method :💣, :clear
  end

I ruled out character-encoding problems by making sure the file is read as UTF-8. The character appears correctly when I print the file just before calling jruby-parser, which then throws the exception shown below:

  class Queue

    def clear
    end
    alias_method :💣, :clear
  end

org.jrubyparser.lexer.SyntaxException
    at org.jrubyparser.lexer.Lexer.identifier(Lexer.java:1888)
    at org.jrubyparser.lexer.Lexer.yylex(Lexer.java:1478)
    at org.jrubyparser.lexer.Lexer.nextToken(Lexer.java:483)
    at org.jrubyparser.parser.Ruby20Parser.yyparse(Ruby20Parser.java:1515)
    at org.jrubyparser.parser.Ruby20Parser.yyparse(Ruby20Parser.java:1466)
    at org.jrubyparser.parser.Ruby20Parser.parse(Ruby20Parser.java:4666)
    at org.jrubyparser.Parser.parse(Parser.java:86)
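
For comparison, here is a minimal sketch of how one might confirm that the snippet is well-formed UTF-8 and is accepted by a standard Ruby interpreter. `DemoQueue` is a hypothetical stand-in for the class above (so the built-in `Queue` is not reopened), and the `:cleared` return value is an assumption added only so the alias can be observed.

```ruby
# Hypothetical stand-in for the class in the report; DemoQueue and the
# :cleared return value are illustrative assumptions, not from the original.
source = <<~RUBY
  class DemoQueue
    def clear
      :cleared
    end
    alias_method :💣, :clear
  end
RUBY

# Check that the source string is tagged as UTF-8 and contains no invalid bytes.
source_is_utf8 = source.encoding == Encoding::UTF_8 && source.valid_encoding?

# Evaluate the snippet; a SyntaxError here would mean Ruby itself rejects it.
eval(source)

# Call the aliased method through its non-ASCII name.
alias_result = DemoQueue.new.public_send(:💣)

puts "valid UTF-8: #{source_is_utf8}"
puts "alias result: #{alias_result.inspect}"
```

If this runs cleanly, the input is valid Ruby and the failure is more likely in how the lexer handles non-ASCII identifier characters than in the file itself.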

This is not a test case I made up; I was trying to parse real Ruby source code (e.g. it is used here). After running into this issue, I also tried a few other characters, and they fail to parse as well:

  class Queue

    def clear
    end
    alias_method :☂, :clear
  end

org.jrubyparser.lexer.SyntaxException
    at org.jrubyparser.lexer.Lexer.identifier(Lexer.java:1888)
    at org.jrubyparser.lexer.Lexer.yylex(Lexer.java:1478)
    at org.jrubyparser.lexer.Lexer.nextToken(Lexer.java:483)
    at org.jrubyparser.parser.Ruby20Parser.yyparse(Ruby20Parser.java:1515)
    at org.jrubyparser.parser.Ruby20Parser.yyparse(Ruby20Parser.java:1466)
    at org.jrubyparser.parser.Ruby20Parser.parse(Ruby20Parser.java:4666)
    at org.jrubyparser.Parser.parse(Parser.java:86)
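
It may help to note how the two characters differ. The sketch below prints the code point and UTF-8 byte length of each; the characters are written as escapes so the script does not depend on the file's encoding. This is only a diagnostic aid, not a claim about where the lexer goes wrong.

```ruby
# The two characters from the examples above, written as Unicode escapes.
chars = ["\u{1F4A3}", "\u2602"] # bomb, umbrella

info = chars.map do |ch|
  {
    codepoint: format("U+%04X", ch.ord), # code point in conventional notation
    utf8_bytes: ch.bytesize,             # number of bytes in the UTF-8 encoding
    bmp: ch.ord <= 0xFFFF                # true if inside the Basic Multilingual Plane
  }
end

info.each { |row| puts row.inspect }
```

Since one character lies outside the Basic Multilingual Plane and the other inside it, and both fail at the same place in `Lexer.identifier`, the problem does not appear to be limited to surrogate-pair characters.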
