Commit f068523

Add support for _ and , separators for Python formats (#68)
* Support _ as well as comma for pyfmt
* Implement and cleanup support for , and _ separators for Python formats
* Add more tests for Python separators
1 parent f4c9b77 commit f068523
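
For context on the separators named in the commit message: Python's format specification places an optional grouping option (`,` or `_`) between the width and the precision. The Julia sketch below only illustrates that grammar. `SPEC_RE` and `grouping_option` are hypothetical names, the `z` flag of newer Python versions is ignored, and the package's actual parser is not part of the hunks shown on this page.

```julia
# Hypothetical sketch, not Format.jl's parser: pull the grouping option
# out of a Python-style format spec such as ">12,.2f" or "_d".
# Layout assumed: [[fill]align][sign][#][0][width][grouping][.precision][type]
const SPEC_RE = r"^(?:(.)?([<>=^]))?([-+ ])?(#)?(0)?(\d+)?([,_])?(?:\.(\d+))?([a-zA-Z%])?$"

function grouping_option(spec::AbstractString)
    m = match(SPEC_RE, spec)
    m === nothing && throw(ArgumentError("invalid format spec: $spec"))
    sep = m.captures[7]          # seventh capture group is the grouping option
    sep === nothing ? nothing : first(sep)
end
```

With this sketch, `grouping_option("_d")` yields `'_'`, `grouping_option(">12,.2f")` yields `','`, and a spec without a grouping option yields `nothing`. A `_` used as a fill character (as in `"_>12d"`) is not mistaken for a separator because the fill is only recognised directly before an alignment character.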

7 files changed, +344 -246 lines changed

README.md

+14 -16
@@ -34,7 +34,16 @@ This package offers Python-style general formatting and c-style numerical format

 ## Getting Started

-This package is pure Julia. It is now registered, so it can be added simply with `Pkg.add("Format")`.
+This package is pure Julia. Setting up this package is like setting up other Julia packages:
+
+```julia
+Pkg.add("Format")
+```
+or
+```julia
+]add Format
+```
+
 It is forked off of [Formatting.jl](https://github.com/JuliaIO/Formatting.jl), and I try to keep the oldmaster branch up to date with the master branch of that, and cherry pick or port all necessary changes to `Format`).

 To start using the package, you can simply write
@@ -43,7 +52,7 @@ To start using the package, you can simply write
 using Format
 ```

-This package depends on Julia of version 0.6 or above, and. The package is MIT-licensed.
+This package depends on Julia of version 1.4 or above, and. The package is MIT-licensed.


 ## Python-style Types and Functions
@@ -133,10 +142,6 @@ One can use ``pyfmt`` to format a single value into a string, or ``format`` to f

 At this point, this package implements a subset of Python's formatting language (with slight modification). Here is a summary of the differences:

-- ``g`` and ``G`` for floating point formatting have not been supported yet. Please use ``f``, ``e``, or ``E`` instead.
-
-- The package currently provides default alignment, left alignment ``<`` and right alignment ``>``. Other form of alignment such as centered alignment ``^`` has not been supported yet.
-
 - In terms of argument specification, it supports natural ordering (e.g. ``{} + {}``), explicit position (e.g. ``{1} + {2}``). It hasn't supported named arguments or fields extraction yet. Note that mixing these two modes is not allowed (e.g. ``{1} + {}``).

 - The package provides support for filtering (for explicitly positioned arguments), such as ``{1|>lowercase}`` by allowing one to embed the ``|>`` operator, which the Python counter part does not support.
@@ -171,19 +176,12 @@ fmtrfunc = generate_formatter( fmt ) # usage 2. This bypass repeated lookup of c
 s = fmtrfunc( 3.14159 )

 s = format( 3.14159, precision=3 ) # usage 3. Most flexible, with some non-printf options. Least performant.
-```
-### Speed
-
-`cfmt`: Speed penalty is about 20% for floating point and 30% for integers.
-
-If the formatter is stored and used instead (see the example using `generate_formatter` above),
-the speed penalty reduces to 10% for floating point and 15% for integers.

 ### Commas

-This package also supplements the lack of thousand separator e.g. `"%'d"`, `"%'f"`, `"%'s"`.
+This package also supplements the lack of thousand separator e.g. `"%'d"`, `"%'f"`, `"%'g"`.

-Note: `"%'s"` behavior is that for small enough floating point (but not too small),
+Note: `"%'g"` behavior is that for small enough floating point (but not too small),
 thousand separator would be used. If the number needs to be represented by `"%e"`, no
 separator is used.

@@ -198,7 +196,7 @@ s = format( 1234, commas=true ) # 1,234
 s = format( -1234, commas=true, parens=true ) # (1,234)
 ```

-The keyword arguments are (Bold keywards are not printf standard)
+The keyword arguments are (Bold keywords are not printf standard)

 * width. Integer. Try to fit the output into this many characters. May not be successful.
     Sacrifice space first, then commas.

src/Format.jl

+3 -10
@@ -121,23 +121,16 @@ using Format
 fmt = "%10.3f"
 s = cfmt( fmt, 3.14159 ) # usage 1. Quite performant. Easiest to switch to.

-fmtrfunc = generate_formatter( fmt ) # usage 2. This bypass repeated lookup of cached function. Most performant.
+fmtrfunc = generate_formatter( fmt ) # usage 2. This bypasses repeated lookup of cached function. Most performant.
 s = fmtrfunc( 3.14159 )

 s = format( 3.14159, precision=3 ) # usage 3. Most flexible, with some non-printf options. Least performant.
 ```
-### Speed
-
-`cfmt`: Speed penalty is about 20% for floating point and 30% for integers.
-
-If the formatter is stored and used instead (see the example using `generate_formatter` above),
-the speed penalty reduces to 10% for floating point and 15% for integers.
-
 ### Commas

-This package also supplements the lack of thousand separator e.g. `"%'d"`, `"%'f"`, `"%'s"`.
+This package also supplements the lack of thousand separator e.g. `"%'d"`, `"%'f"`, `"%'g"`.

-Note: `"%'s"` behavior is that for small enough floating point (but not too small),
+Note: `"%'g"` behavior is that for small enough floating point (but not too small),
 thousand separator would be used. If the number needs to be represented by `"%e"`, no
 separator is used.


src/cformat.jl

+43 -12
@@ -10,9 +10,9 @@ function _get_formatter(fmt)
     _formatters[fmt] = FmtSpec(fmt)
 end

-_cfmt_comma(fspec::FmtSpec, x) = addcommasreal(_cfmt(fspec, x))
-_cfmt_comma(fspec::FmtSpec{FmtStr}, x::Rational) = addcommasrat(_cfmt(fspec, x))
-_cfmt_comma(fspec::FmtSpec{<:FmtInts}, x) = checkcommas(_cfmt(fspec, x))
+_cfmt_comma(fspec::FmtSpec, x) = addcommasreal(_cfmt(fspec, x), Char(fspec.tsep))
+_cfmt_comma(fspec::FmtSpec{FmtStr}, x::Rational) = addcommasrat(_cfmt(fspec, x), Char(fspec.tsep))
+_cfmt_comma(fspec::FmtSpec{<:FmtInts}, x) = checkcommas(_cfmt(fspec, x), Char(fspec.tsep))

 function _cfmt(fspec::FmtSpec, x)
     sv = Base.StringVector(23) # Trust that lower level code will expand if necessary
@@ -29,29 +29,29 @@ function generate_formatter(fmt::ASCIIStr)
     fspec.tsep == 0 ? x -> _cfmt(fspec, x) : x -> _cfmt_comma(fspec, x)
 end

-function addcommasreal(s)
+function addcommasreal(s, sep)
     len = length(s)
     dpos = findfirst( isequal('.'), s )
-    dpos !== nothing && return addcommas(s, len, dpos-1)
+    dpos !== nothing && return addcommas(s, len, dpos-1, sep)
     # find the rightmost digit
     for i in len:-1:1
-        isdigit( s[i] ) && return addcommas(s, len, i)
+        isdigit( s[i] ) && return addcommas(s, len, i, sep)
     end
     s
 end

 # commas are added to only the numerator
-addcommasrat(s) = addcommas(s, length(s), findfirst( isequal('/'), s )-1)
+addcommasrat(s, sep) = addcommas(s, length(s), findfirst( isequal('/'), s )-1, sep)

-function checkcommas(s)
+function checkcommas(s, sep)
     len = length(s)
     for i in len:-1:1
-        isdigit( s[i] ) && return addcommas(s, len, i)
+        isdigit( s[i] ) && return addcommas(s, len, i, sep)
     end
     s
 end

-function addcommas(s::T, len, lst) where {T<:AbstractString}
+function addcommas(s::T, len, lst, sep) where {T<:AbstractString}
     lst < 4 && return s
     beg = 1
     while beg < len
@@ -69,12 +69,12 @@ function addcommas(s::T, len, lst) where {T<:AbstractString}
     pos = beg - 1
     for i = beg:lst-3
         sv[pos += 1] = s[i]
-        (cnt -= 1) == 0 && (cnt = 3; sv[pos += 1] = ',')
+        (cnt -= 1) == 0 && (cnt = 3; sv[pos += 1] = sep)
     end
     for i = lst-2:len; sv[i+commas] = s[i]; end
     T(sv)
 end
-addcommas(s) = (l = length(s); addcommas(s, l, l))
+addcommas(s, sep) = (l = length(s); addcommas(s, l, l, sep))

 function generate_format_string(;
     width::Int=-1,
@@ -108,6 +108,37 @@ function generate_format_string(;
     String(append!(s, _codeunits(conversion)))
 end

+"""
+Format a value, using the following keyword arguments to control formatting
+(Bold keywords are not printf standard):
+
+* width. Integer. Try to fit the output into this many characters. May not be successful.
+    Sacrifice space first, then commas.
+* precision. Integer. How many decimal places.
+* leftjustified. Boolean
+* zeropadding. Boolean
+* commas. Boolean. Thousands-group separator.
+* signed. Boolean. Always show +/- sign?
+* positivespace. Boolean. Prepend an extra space for positive numbers? (so they align nicely with negative numbers)
+* **parens**. Boolean. Use parenthesis instead of "-". e.g. `(1.01)` instead of `-1.01`. Useful in finance. Note that
+    you cannot use `signed` and `parens` option at the same time.
+* **stripzeros**. Boolean. Strip trailing '0' to the right of the decimal (and to the left of 'e', if any ).
+    * It may strip the decimal point itself if all trailing places are zeros.
+    * This is true by default if precision is not given, and vice versa.
+* alternative. Boolean. See `#` alternative form explanation in standard printf documentation
+* conversion. length=1 string. Default is type dependent. It can be one of `aAeEfFoxX`. See standard
+    printf documentation.
+* **mixedfraction**. Boolean. If the number is rational, format it in mixed fraction e.g. `1_1/2` instead of `3/2`
+* **mixedfractionsep**. Default `_`
+* **fractionsep**. Default `/`
+* **fractionwidth**. Integer. Try to pad zeros to the numerator until the fractional part has this width
+* **tryden**. Integer. Try to use this denominator instead of a smaller one. No-op if it'd lose precision.
+* **suffix**. String. This strings will be appended to the output. Useful for units/%
+* **autoscale**. Symbol, default `:none`. It could be `:metric`, `:binary`, or `:finance`.
+    * `:metric` implements common SI symbols for large and small numbers e.g. `M`, `k`, `μ`, `n`
+    * `:binary` implements common ISQ symbols for large numbers e.g. `Ti`, `Gi`, `Mi`, `Ki`
+    * `:finance` implements common finance/news symbols for large numbers e.g. `b` (billion), `m` (millions)
+"""
 function format( x::T;
     width::Int=-1,
     precision::Int= -1,
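
The `sep` parameter threaded through `addcommasreal`, `checkcommas`, and `addcommas` in this file can be pictured with a simplified, self-contained sketch. `group_digits` is a hypothetical helper written for illustration only. It assumes ASCII input, handles only an optional leading sign and a decimal point, and does not reproduce the package's buffer-based implementation.

```julia
# Hypothetical helper, not part of Format.jl: insert `sep` between
# groups of three digits in the integer part of a numeric string.
# Assumes ASCII input, so string indices equal character positions.
function group_digits(s::AbstractString, sep::Char)
    # The integer part ends just before the decimal point, if any.
    dpos = findfirst(isequal('.'), s)
    last_int = dpos === nothing ? lastindex(s) : dpos - 1
    # Skip a leading sign so it is never followed by a separator.
    first_int = (!isempty(s) && s[1] in ('+', '-')) ? 2 : 1
    n_int = last_int - first_int + 1
    n_int < 4 && return String(s)

    io = IOBuffer()
    print(io, s[1:first_int-1])
    for (k, i) in enumerate(first_int:last_int)
        print(io, s[i])
        remaining = n_int - k
        # Emit a separator whenever a multiple of three digits remains.
        remaining > 0 && remaining % 3 == 0 && print(io, sep)
    end
    print(io, s[last_int+1:end])
    String(take!(io))
end
```

Passing the separator as an argument, as the diff does, means the same grouping routine serves both `,` and `_` without duplicating the loop.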

src/fmt.jl

+3 -13
@@ -141,17 +141,10 @@ end
 function _optional_commas(x::Real, s::AbstractString, fspec::FormatSpec)
     prevwidth = length(s)
     dpos = findfirst( isequal('.'), s)
-    s = addcommas(s, prevwidth, dpos === nothing ? prevwidth : dpos - 1)
-
+    ns = addcommas(s, prevwidth, dpos === nothing ? prevwidth : dpos - 1, fspec.sep)
     # check for excess width from commas
     w = length(s)
-    if fspec.width > 0 && w > fspec.width && w > prevwidth
-        # we may have made the string too wide with those commas... gotta fix it
-        # left or right alignment ('<' is left)
-        # TODO: handle center alignment
-        s = fspec.align == '<' ? rpad(strip(s), fspec.width) : lpad(strip(s), fspec.width)
-    end
-    s
+    (fspec.width > 0 && w > fspec.width && w > prevwidth) ? s : ns
 end
 _optional_commas(x, s::AbstractString, fspec::FormatSpec) = s

@@ -198,10 +191,7 @@ function fmt end
 function fmt(x; kwargs...)
     fspec = fmt_default(x)
     isempty(kwargs) || (fspec = FormatSpec(fspec; kwargs...))
-    s = pyfmt(fspec, x)
-    # TODO: allow other thousands separators besides comma
-    # add the commas now... I was confused as to when this is done currently
-    fspec.tsep ? _optional_commas(x, s, fspec) : s
+    pyfmt(fspec, x)
 end

 # some helper method calls, which just convert to kwargs
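
The `width` rule quoted in the docstring ("Sacrifice space first, then commas") and the width check in `_optional_commas` can be illustrated with a small standalone sketch. This is one reading of that rule, not the package's code: `fit_width` and its `leftjustified` keyword are hypothetical, and the caller is assumed to supply both the plain and the separator-grouped rendering of the same number.

```julia
# Hypothetical sketch of "sacrifice space first, then commas".
# `plain` and `grouped` are two renderings of the same number,
# without and with thousands separators.
function fit_width(plain::AbstractString, grouped::AbstractString, width::Integer;
                   leftjustified::Bool=false)
    # A non-positive width means no width was requested: keep the separators.
    width <= 0 && return String(grouped)
    pad = leftjustified ? rpad : lpad
    # Step 1: give up padding spaces and see whether the grouped form fits.
    g = strip(grouped)
    length(g) <= width && return pad(g, width)
    # Step 2: still too wide, so fall back to the form without separators.
    pad(strip(plain), width)
end
```

For example, `fit_width("  1234567", "  1,234,567", 9)` keeps the separators by dropping the padding, while a width of 8 falls back to `" 1234567"`.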
