Commit b5f5a21

Author: Matt Morrison
Merge pull request #50 from MattDMo/unicode
Unicode back to master!
2 parents b148a91 + 9d9c2b1 commit b5f5a21

11 files changed

+50149 -287 lines changed

PythonImproved.YAML-tmLanguage

Lines changed: 403 additions & 114 deletions
Large diffs are not rendered by default.

PythonImproved.tmLanguage

Lines changed: 582 additions & 168 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 14 additions & 4 deletions
@@ -1,6 +1,8 @@
 # Python Improved
 
-A better Python `.tmLanguage` syntax highlighting definition for [Sublime Text](http://www.sublimetext.com) and [TextMate](http://www.macromates.com). It includes support for both Python 2 and Python 3. Inspired by:
+A better Python `.tmLanguage` syntax highlighting definition for [Sublime Text](http://www.sublimetext.com) and [TextMate](http://www.macromates.com). It includes support for both Python 2 and Python 3, and unlike any other Python syntax definition now fully supports Unicode identifiers anywhere in your code! It also provides its own improved regex syntax definition for inline highlighting of raw string literals.
+
+Inspired by:
 
 - the original TextMate and Sublime Text `Python.tmLanguage` files
 - facelessuser's [Better Python](https://github.com/facelessuser/sublime-languages)
@@ -21,9 +23,7 @@ If you prefer to modify your own color scheme, here is a list of new/modified sc
 - `support.ipython.in` and `support.ipython.out`: [IPython](http://ipython.org) `In [1]:`/`Out [1]:` fields — designed for use with [SublimeREPL](https://sublime.wbond.net/packages/SublimeREPL). The cell number can be themed with a different color using `support.ipython.cell-number`.
 - `constant.numeric.integer.(long).binary.python`: binary literals `0b00101010`, `0b00101010L`
 - `keyword.control.import.python` now contains `import`, `from`, _and_ `as`
-- `keyword.other.python` now only contains `assert` — `as`, `del`, `exec`, and `print` have been relocated
 - `support.type.exception.python` now matches any identifier that ends with `Exception` or `Error`, not just the built-in ones like `IndentationError` or `RuntimeException`, allowing for the highlighting of custom exceptions such as those included in third-party modules
-- Miscellaneous changes to `support.function.builtin.python` and `support.type.python` — a lot of personal judgement went in to deciding which word went where (for example, `list` is a built-in function, but it's also a type, so I put it in `type`), so if you have a good reason for disagreeing please tell me.
 - [Function annotation](http://www.python.org/dev/peps/pep-3107/) support for Python 3, thanks to [@facelessuser](https://github.com/facelessuser). New scopes added: `punctuation.separator.annotation.python`, `punctuation.separator.annotation.result.python`, `punctuation.definition.parameters-group.begin.python`, and `punctuation.definition.parameters-group.end.python`.
 - You can now have comments in multi-line function definitions:
 
@@ -44,11 +44,21 @@ def myfunc(self, # gotta have self
 
 - `constant.other.allcaps.python` captures variable names that are in all caps (`OPENING_PORT`, for example), assuming the convention that these are generally treated as constants in the code. Matches `CONSTANT`, `class.CONSTANT` and the `CONSTANT` part of `CLASS.CONSTANT`, but not `CLASS.function()`, `class.FUNCTION()`, or `FUNCTION()`.
 - Fixed the octal integers so the Python 3-style `0o123` is matched as well as the old-style `0123`
-- Built-in functions like `any()`, `dict()`, `len()`, `raw_input()`, etc. now have their arguments highlighted just like any other function. Many thanks to [@facelessuser](https://github.com/facelessuser) for the regex, and [@FichteFoll](https://github.com/FichteFoll) for valuable discussion. For those working with Python 2, `print` is still a standalone keyword, as is `del`. If you can think of any others that should be as well, please [let me know](https://github.com/MattDMo/PythonImproved/issues/8).
+- Built-in functions like `any()`, `dict()`, `len()`, `raw_input()`, etc. now have their arguments highlighted just like any other function. Many thanks to [@facelessuser](https://github.com/facelessuser) for the regex, and [@FichteFoll](https://github.com/FichteFoll) for valuable discussion. For those working with Python 2, `print` is still a standalone keyword (as are `assert` and `del`).
+- `support.function.magic` and `support.function.builtin` have now been split in two — `name` and `call`, so that `__init__` (`support.function.magic.name.python`), for example, can be themed differently than `__init__()` (`support.function.magic.call.python`).
+- Relatedly, magic function names (and calls), also known as the "dunder" methods for being surrounded by double underscores, have been collated from the 2.7 and 3.5 Data Model docs and cleaned up so that as much as possible is included there, but outdated or incorrect things are not. The same is true of the magic variables (`support.variable.magic`).
+- `support.type` now contains *only* what's defined in https://docs.python.org/X/library/functions.html and stdtypes.html (where `X` is `2` or `3`) *where the item is a class*. They are highlighted as such only if not followed by an opening parenthesis — if it is, it's highlighted as `support.function.builtin.call`. This addresses [#16](https://github.com/MattDMo/PythonImproved/issues/16).
+- Defined escaped characters (like `\n`, `\'`, `\\`, etc.) are now individually named as `constant.character.escape.*`, where `*` is `newline`, `single-quote`, `backslash`, etc.
+- And probably some more stuff I forgot about...
+
+
+## Notes
+
 - To facilitate hacking, I'm also including my `.YAML-tmLanguage` file in the repo, which I use for my day-to-day work (I really hate debugging regexes embedded in XML). Install [`AAAPackageDev`](https://sublime.wbond.net/packages/AAAPackageDev) for syntax highlighting, and tools for converting between YAML, JSON, and XML/Plist formats. [Neon](https://sublime.wbond.net/packages/Neon%20Color%20Scheme) of course has great coloring for the `.YAML-tmLanguage` format, and especially the regexes :)
 - All Django-related stuff has been removed. If you want it back, just dig through the repo's history and you can find it. It was just too distracting.
 - I removed the SQL-related stuff from the string definitions, because 1) somebody complained, and 2) like Django, it was distracting. It didn't cover all of SQL, only highlighted some keywords, and just wasn't worth it.
 - Unicode escapes should now appear correctly in all strings, as with Python 3 all strings are Unicode. I think I got it right, if you think otherwise just let me know.
+- I've begun working on correctly highlighting all the various elements of the new-style string formatting mini-language, but I haven't applied it to the most recent release while I work out the kinks. Feel free to [join the discussion](https://github.com/MattDMo/PythonImproved/issues/38).
 
 ## Issues
 
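Three of the matching rules described in the README diff above (`support.type.exception.python`, `constant.other.allcaps.python`, and the octal integer fix) can be approximated with Python's `re` module. This is only a rough sketch: the patterns and function names below are hypothetical stand-ins written for illustration, not the regexes shipped in `PythonImproved.tmLanguage`, and they ignore context such as strings and comments.

```python
import re

# Hypothetical approximations of the README's rules, NOT the grammar's own regexes.

# Any identifier that ends with "Exception" or "Error"
# (the behaviour described for support.type.exception.python).
EXCEPTION_NAME = re.compile(r'\b(?!\d)\w*(?:Exception|Error)\b')

# An all-caps name that is not followed by "." or "("
# (the behaviour described for constant.other.allcaps.python).
ALLCAPS_CONSTANT = re.compile(r'\b[A-Z][A-Z0-9_]*\b(?!\s*[.(])')

# Old-style (0123) and Python 3-style (0o123) octal integers.
OCTAL_INTEGER = re.compile(r'\b0[oO]?[0-7]+\b')


def find_exception_names(source):
    """Return identifiers in source that end with Exception or Error."""
    return EXCEPTION_NAME.findall(source)


def find_allcaps_constants(source):
    """Return all-caps names in source that are not called or dotted into."""
    return ALLCAPS_CONSTANT.findall(source)


def find_octal_integers(source):
    """Return old-style and new-style octal literals found in source."""
    return OCTAL_INTEGER.findall(source)


if __name__ == "__main__":
    line = "OPENING_PORT = cfg.TIMEOUT + CLASS.CONSTANT + CLASS.function() + FUNCTION()"
    print(find_allcaps_constants(line))
    print(find_exception_names("raise MyCustomError(msg)"))
    print(find_octal_integers("mode = 0o755; old = 0755"))
```

The negative lookahead in `ALLCAPS_CONSTANT` is what keeps `CLASS` in `CLASS.function()` and `FUNCTION` in `FUNCTION()` from being treated as constants, mirroring the exclusions listed in the README.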
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+# [PackageDev] target_format: plist, ext: tmLanguage
+comment: Matches Python's regular expression syntax.
+name: Regular Expressions (PythonImproved)
+scopeName: source.regexp.python.improved
+fileTypes: [re]
+uuid: DD867ABF-1EC6-415D-B047-687F550A1D51
+
+patterns:
+- name: keyword.control.anchor.regexp
+  match: \\[bBAZzG]|\^|\$
+
+- name: keyword.other.back-reference.regexp
+  match: \\[1-9][0-9]?
+
+- name: keyword.operator.quantifier.regexp
+  match: '[?+*][?+]?|\{(\d+,\d+|\d+,|,\d+|\d+)\}\??'
+
+- name: keyword.operator.or.regexp
+  match: \|
+
+- name: comment.block.regexp
+  begin: \(\?\#
+  end: \)
+
+- comment: We are restrictive in what we allow to go after the comment character to
+    avoid false positives, since the availability of comments depend on regexp flags.
+  name: comment.line.number-sign.regexp
+  match: (?<=^|\s)#\s[[a-zA-Z0-9,. \t?!-:][^\x{00}-\x{7F}]]*$
+
+- name: keyword.other.option-toggle.regexp
+  match: \(\?[iLmsux]+\)
+
+- name: keyword.other.back-reference.named.regexp
+  match: (\()(\?P=([a-zA-Z_][a-zA-Z_0-9]*\w*))(\))
+
+- name: meta.group.assertion.regexp
+  begin: (\()((\?=)|(\?!)|(\?<=)|(\?<!))
+  beginCaptures:
+    '1': {name: punctuation.definition.group.regexp}
+    '2': {name: punctuation.definition.group.assertion.regexp}
+    '3': {name: meta.assertion.look-ahead.regexp}
+    '4': {name: meta.assertion.negative-look-ahead.regexp}
+    '5': {name: meta.assertion.look-behind.regexp}
+    '6': {name: meta.assertion.negative-look-behind.regexp}
+  end: (\))
+  endCaptures:
+    '1': {name: punctuation.definition.group.regexp}
+  patterns:
+  - include: $self
+
+- comment: we can make this more sophisticated to match the | character that separates
+    yes-pattern from no-pattern, but it's not really necessary.
+  name: meta.group.assertion.conditional.regexp
+  begin: (\()(\?\(([1-9][0-9]?|[a-zA-Z_][a-zA-Z_0-9]*)\))
+  beginCaptures:
+    '1': {name: punctuation.definition.group.regexp}
+    '2': {name: punctuation.definition.group.assertion.conditional.regexp}
+    '3': {name: entity.name.section.back-reference.regexp}
+  end: (\))
+  patterns:
+  - include: $self
+
+- name: meta.group.regexp
+  begin: (\()((\?P<)([A-Za-z]\w*)(>)|(\?:))?
+  beginCaptures:
+    '1': {name: punctuation.definition.group.regexp}
+    '3': {name: punctuation.definition.group.capture.regexp}
+    '4': {name: entity.name.section.group.regexp}
+    '5': {name: punctuation.definition.group.capture.regexp}
+    '6': {name: punctuation.definition.group.no-capture.regexp}
+  end: (\))
+  endCaptures:
+    '1': {name: punctuation.definition.group.regexp}
+  patterns:
+  - include: $self
+
+- include: '#character-class'
+
+repository:
+  character-class:
+    patterns:
+    - match: |-
+        (?x)\\
+        (
+        (w) |
+        (W) |
+        (s) |
+        (S) |
+        (d) |
+        (D)
+        )
+      captures:
+        '2': {name: constant.character.character-class.word.regexp}
+        '3': {name: constant.character.character-class.non-word.regexp}
+        '4': {name: constant.character.character-class.whitespace.regexp}
+        '5': {name: constant.character.character-class.non-whitespace.regexp}
+        '6': {name: constant.character.character-class.digit.regexp}
+        '7': {name: constant.character.character-class.non-digit.regexp}
+    - name: constant.character.escape.backslash.regexp
+      match: \\.
+    - name: constant.other.character-class.set.regexp
+      begin: (\[)(\^)?
+      beginCaptures:
+        '1': {name: punctuation.definition.character-class.regexp}
+        '2': {name: keyword.operator.negation.regexp}
+      end: (\])
+      endCaptures:
+        '1': {name: punctuation.definition.character-class.regexp}
+      patterns:
+      - include: '#character-class'
+      - name: constant.other.character-class.range.regexp
+        match: ((\\.)|.)\-((\\.)|[^\]])
+        captures:
+          '2': {name: constant.character.escape.backslash.regexp}
+          '4': {name: constant.character.escape.backslash.regexp}
+foldingStartMarker: (/\*|\{|\()
+foldingStopMarker: (\*/|\}|\))
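
To see how the numbered captures in this grammar line up with the text they match, two of its patterns can be tried out directly in Python. The sketch below copies the `match` of `keyword.operator.quantifier.regexp` and the `begin` of `meta.group.regexp` verbatim; both happen to be valid under Python's `re` module, although the grammar itself is evaluated by the editor's regex engine, and not every rule in the file should be assumed portable in the same way. The helper names are hypothetical and exist only for this illustration.

```python
import re

# Both patterns are copied verbatim from the grammar above.
QUANTIFIER = re.compile(r'[?+*][?+]?|\{(\d+,\d+|\d+,|,\d+|\d+)\}\??')
GROUP_BEGIN = re.compile(r'(\()((\?P<)([A-Za-z]\w*)(>)|(\?:))?')


def find_quantifiers(pattern_text):
    """Return every substring the quantifier rule would match.

    Unlike the real grammar, this helper has no escape or character-class
    rules competing with it, so an escaped "\\+" is also reported.
    """
    return [m.group(0) for m in QUANTIFIER.finditer(pattern_text)]


def group_begin_captures(pattern_text):
    """Return {capture number: text} for a group opener at the start of pattern_text."""
    m = GROUP_BEGIN.match(pattern_text)
    if m is None:
        return {}
    return {i: m.group(i) for i in range(1, 7) if m.group(i) is not None}


if __name__ == "__main__":
    print(find_quantifiers(r'a+b*?c{2,3}d{4}?e{,5}'))
    print(group_begin_captures(r'(?P<year>\d{4})'))
    print(group_begin_captures(r'(?:abc)'))
```

For a named group, captures 3, 4, and 5 hold `?P<`, the group name, and `>`, which is why `beginCaptures` assigns `entity.name.section.group.regexp` to capture 4; for a non-capturing group only captures 1, 2, and 6 participate.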
