MessageFormat formatters #563

Open
nkovacs opened this issue Dec 4, 2015 · 23 comments
Comments

@nkovacs
Contributor

nkovacs commented Dec 4, 2015

The messageformat library supports custom formatter functions. You could register globalize's formatters, so they could be used in messages, e.g. "Balance: {0, currency}", or "Posted at {0, datetime, long}".

It would also be great if there was a way to pass custom formatter functions to MessageFormat.

@rxaviers
Member

Thanks for your message and sorry for the delayed answer.

Please, use variable replacement instead, e.g.: "Balance: {currency}" or "Posted at {date}" and have the variable formatted using the appropriate formatter in your code, e.g.:

Globalize.formatMessage("message", {date: Globalize.formatDate(new Date())});

If you find any problem using variable replacement instead or if you have further questions feel free to post additional comments.

PS:

> The messageformat library supports custom formatter functions

... and I was one of the early advocates for having SlexAxton/messageformat.js (the library Globalize uses for message formatting under the hood) adopt such an API (link). 😄 (And Alex and Eemeli did great work updating the library.) Having said that, since variable replacement can be used instead with no drawback in that case, we opted for that.

If you want to update Globalize's message format to support such a feature, feel free to contribute the change and I'd be happy to consider it: first send informal messages to discuss the new API, then send a pull request with the implementation.

@nkovacs
Contributor Author

nkovacs commented May 11, 2016

The problem with using a formatter in the variable is that it doesn't allow you to change the format in the message file. It's hard-coded.
This would not only allow the format to be customized for each language, it would also allow changing it without touching the code. E.g. if you have an interface where an admin can change the message files used by your app, this change would allow an admin to customize the format used in a message.

nkovacs added a commit to nkovacs/globalize that referenced this issue May 11, 2016
@nkovacs
Contributor Author

nkovacs commented May 11, 2016

I made a quick proof of concept. The issues with it are:

Globalize.b955419430 = messageFormatterFn((function(  ) {
  return function (d) { return "Hello World " + fmt.date(d.now, ["en"], "long"); }
})());

Ideally I'd like the compiler to automatically detect the call to fmt.date, and compile the dateformatter as well, but I don't know if that's possible with the current version of MessageFormat.
This is the test file I used: https://gist.github.com/nkovacs/d6e429f7a5e0871ceb392e739031c100

@rxaviers
Member

As an earlier step, could you please show me a map between each Globalize formatter option and its inlined message format representation? For example, above you mentioned long: is long the value for date, time, or datetime? Also, how would a skeleton be passed?

> Ideally I'd like the compiler to automatically detect the call to fmt.date, and compile the dateformatter...

Yeap the compiler could statically parse the message and do that (i.e., reuse the message formatter parser to deduce the formatters).

@nkovacs
Contributor Author

nkovacs commented May 11, 2016

The message was {now, date, long} in the example, so it becomes {date: 'long'}. {now, time, long} would be {time: 'long'}, and {now, datetime, long} would be {datetime: 'long'}.

This is similar to what ICU does (http://icu-project.org/apiref/icu4j/com/ibm/icu/text/MessageFormat.html), except ICU only has date and time.

ICU also accepts a raw format (if the parameter is not one of the short formats), so {now, date, yyyy-MM-dd} would become {raw: 'yyyy-MM-dd'} if you wanted to emulate that.

Skeleton could be implemented in a few different ways:

  • {now, date, skeleton:GyMMMd}: this means you can't have {raw: 'skeleton:yyyy-MM-dd'}, but I don't think that's a big problem.
  • {now, date, skeleton, GyMMMd}: presumably no-one wants to use 'skeleton' as a raw value (or if the last value is missing, skeleton becomes the raw value, so there's no conflict)
  • {now, dateskeleton, GyMMMd}
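
As a rough illustration of the first option, the style argument could be turned into dateFormatter options along these lines. This is only a sketch: `dateOptionsFromStyle` and the `DATE_PRESETS` list are hypothetical names, and returning an empty object when no style is given (so Globalize falls back to its own default) is an assumption, not something decided in this thread.

```javascript
// Preset names understood by the date, time and datetime options.
const DATE_PRESETS = ["full", "long", "medium", "short"];

// Hypothetical helper: maps the style part of {now, <kind>, <style>}
// to an options object for Globalize's dateFormatter.
// `kind` is "date", "time" or "datetime".
function dateOptionsFromStyle(kind, style) {
  if (style === undefined || style === "") {
    return {}; // assumption: let Globalize pick its default format
  }
  if (style.startsWith("skeleton:")) {
    return { skeleton: style.slice("skeleton:".length) };
  }
  if (DATE_PRESETS.includes(style)) {
    return { [kind]: style };
  }
  // Anything else is treated as a raw pattern, as ICU does.
  return { raw: style };
}
```

With this scheme a raw pattern that itself starts with "skeleton:" cannot be expressed, which is the trade-off noted in the first bullet.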

The basic mapping could look like this:

  • {now, date, long} -> dateFormatter({date: 'long'})
  • {now, time, long} -> dateFormatter({time: 'long'})
  • {now, datetime, long} -> dateFormatter({datetime: 'long'})
  • {x, relativetime, day} -> relativeTimeFormatter('day')
  • {x, relativetime, day, short} -> relativeTimeFormatter('day', {form: 'short'})
  • {x, number} -> numberFormatter()
  • {x, number, percent} -> numberFormatter({style: 'percent'})
  • {x, currency, USD} -> currencyFormatter('USD')
  • {x, currency, USD, accounting} -> currencyFormatter('USD', {style: 'accounting'})
  • {x, unit, second} -> unitFormatter('second')
  • {x, unit, second, short} -> unitFormatter('second', { form: "short" })
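
The mapping above could be sketched as a single lookup function. `formatterFor` is a hypothetical name, `globalize` is assumed to be a Globalize instance such as Globalize("en") with the relevant modules loaded, and `args` is assumed to hold the comma-separated values that follow the formatter name in the placeholder.

```javascript
// Hypothetical helper: given the formatter name and the remaining
// placeholder arguments, build the matching Globalize formatter.
function formatterFor(globalize, type, args) {
  const [first, second] = args;
  switch (type) {
    case "date":
    case "time":
    case "datetime":
      // {now, date, long} -> dateFormatter({date: 'long'})
      return globalize.dateFormatter({ [type]: first });
    case "relativetime":
      return second === undefined
        ? globalize.relativeTimeFormatter(first)
        : globalize.relativeTimeFormatter(first, { form: second });
    case "number":
      return first === undefined
        ? globalize.numberFormatter()
        : globalize.numberFormatter({ style: first });
    case "currency":
      return second === undefined
        ? globalize.currencyFormatter(first)
        : globalize.currencyFormatter(first, { style: second });
    case "unit":
      return second === undefined
        ? globalize.unitFormatter(first)
        : globalize.unitFormatter(first, { form: second });
    default:
      throw new Error("Unknown message formatter: " + type);
  }
}
```

Because the function only depends on the instance it is handed, the same Globalize instance that renders the message can also supply the formatters.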

But one reason I'd like to be able to customize the formatters is that I wanted to integrate globalize with Yii, and I could write custom formatters that work the same way ICU in php does (http://www.yiiframework.com/doc-2.0/guide-tutorial-i18n.html#message-formatting). That way I could use the same messages in php and in javascript, which is kind of a pain right now (I have to render everything in php to get pluralization and such things, and then return the html in the ajax response).

> Yeap the compiler could statically parse the message and do that (i.e., reuse the message formatter parser to deduce the formatters).

Yeah, but for globalize I think it would be better if the custom formatter function received the Globalize instance, the same one used to render the message, so you don't have to instantiate another one and compile a new formatter each time the message is rendered (if you're not using pre-compiled files). But that requires modifying messageformat or using messageformat-parser directly. The runtime binding code is already a bit hacky, it could also be cleaned up.

@rxaviers
Member

I like it so far. /cc @jzaefferer and @alunny for their input.

@nkovacs
Contributor Author

nkovacs commented May 11, 2016

I've hacked messageformat so that the custom formatter function can use the same globalize instance, and the dependent formatters can be passed to globalize-compiler: nkovacs@21cb3b3#diff-731a3fca6b201d79e2639fe1456b8787L156

I'll try to clean it up and get it into messageformat.js, but for now I just wanted to show how it could be done in globalize.

@ccschneidr

This sounds great. Any news on it?

@rxaviers rxaviers reopened this Aug 2, 2016
nkovacs added a commit to nkovacs/globalize that referenced this issue Oct 5, 2016
This is needed so that the compiler can see the runtime information
attached to the function.

Refs globalizejs#563
nkovacs added a commit to nkovacs/globalize that referenced this issue Oct 5, 2016
@nkovacs
Contributor Author

nkovacs commented Oct 5, 2016

Compilation now works. The only thing that needs to be changed in globalize-compiler is the compilation order.

Messageformat.js has since released a major new version, so I'll have to update that too.

@nkovacs
Contributor Author

nkovacs commented Oct 16, 2016

Messageformat.js 1.0 has changed so much that the hacks used to integrate it into globalize no longer work. In particular, since the runtime is no longer static, I was unable to extract it and inject it into globalize's message-runtime module.

So instead I copied the messageformat compiler and runtime into globalize.js, and used messageformat-parser (which has since been extracted into a separate npm package). Since I now had direct access to the compiler, I was also able to remove the regexp hacks in messageFormatterRuntimeBind (the compiler can tell the runtime binding function what features are needed, e.g. plurals, select etc.).

Here's the commit: nkovacs@1586e12

What do you think?

@rxaviers
Member

rxaviers commented Nov 1, 2016

These presets look nice:

{now, date, long} -> dateFormatter({date: 'long'})
{now, time, long} -> dateFormatter({time: 'long'})
{now, datetime, long} -> dateFormatter({datetime: 'long'})

Any of the below look nice to me too, except for the fact that adding a time pattern in the skeleton below will result in a datetime output, which sounds inconsistent with date since we have message formatters named time or datetime. Do you see what I mean? I have no suggestion at the moment though.

{now, date, skeleton, GyMMMd}
{now, dateskeleton, GyMMMd}

@rxaviers
Member

rxaviers commented Nov 1, 2016

> Messageformat.js 1.0 has changed so much that the hacks used to integrate it into globalize no longer work. In particular, since the runtime is no longer static, I was unable to extract it and inject it into globalize's message-runtime module.
>
> So instead I copied the messageformat compiler and runtime into globalize.js, and used messageformat-parser (which has since been extracted into a separate npm package). Since I now had direct access to the compiler, I was also able to remove the regexp hacks in messageFormatterRuntimeBind (the compiler can tell the runtime binding function what features are needed, e.g. plurals, select etc.).

The existing "live-patch" isn't great, but I want to avoid copying dependencies and modifying them, because this is even harder to maintain over time. We need a better approach... Are there any changes we could propose in their code that would make it easier on our side? Is there any sort of JavaScript patch we could use instead of the bunch of replaces in the Gruntfile?

@rxaviers
Member

rxaviers commented Nov 1, 2016

About plural requiring cardinal + ordinal data... I want to avoid that. I'm wondering if {plural, ... could use a cardinal formatter, and {selectordinal, ... could use an ordinal formatter?
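
One way this could work is for the compiler to walk the parsed message and record which plural types it actually uses, then request only those from pluralGenerator (type "cardinal" or "ordinal") instead of "both". The sketch below assumes a token shape modeled on what messageformat-parser produces (plain strings, plus objects with a `type` and, for plural-like tokens, a `cases` array of `{ key, tokens }`); that shape and the name `pluralTypesNeeded` are assumptions for illustration.

```javascript
// Hypothetical helper: returns a Set containing "cardinal" and/or "ordinal",
// depending on whether the message uses {x, plural, ...} or
// {x, selectordinal, ...}. Nested cases are searched recursively.
function pluralTypesNeeded(tokens, found = new Set()) {
  for (const token of tokens) {
    if (typeof token === "string") {
      continue; // literal text never needs plural data
    }
    if (token.type === "plural") {
      found.add("cardinal");
    } else if (token.type === "selectordinal") {
      found.add("ordinal");
    }
    // plural, selectordinal and select tokens all carry nested cases
    for (const c of token.cases || []) {
      pluralTypesNeeded(c.tokens, found);
    }
  }
  return found;
}
```

A message that only contains {x, plural, ...} would then need cardinal data alone, and a message without either construct would need no plural data at all.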

@nkovacs
Contributor Author

nkovacs commented Nov 2, 2016

> Any of the below look nice to me too, except for the fact that adding a time pattern in the skeleton below will result in a datetime output, which sounds inconsistent with date since we have message formatters named time or datetime.

I didn't implement the skeleton and raw options yet because I'm not sure how to do that. The rest are done: 4c95d94.

> About plural requiring cardinal + ordinal data... I want to avoid that. I'm wondering if {plural, ... could use a cardinal formatter, and {selectordinal, ... could use an ordinal formatter?

With the custom compiler, yes. I'm not sure if it's doable if using messageformat.js directly, in the current version of globalize. It probably is, but it won't be pretty. That ties into your next question.

> The existing "live-patch" isn't great, but I want to avoid copying dependencies and modifying them, because this is even harder to maintain over time. We need a better approach... Are there any changes we could propose in their code that would make it easier on our side? Is there any sort of JavaScript patch we could use instead of the bunch of replaces in the Gruntfile?

I've used messageformat-parser from npm (it's not available in bower), so I only had to copy and rewrite the compiler and the runtime, which is relatively small, and that allowed me to customize it to globalize's needs. For example, the new messageFormatterRuntimeBind is much better.
I think this is a better approach than heavily patching messageformat.js in the Gruntfile.
It might be possible to use messageformat.js from npm and use only the compiler (compiler.js), but that's internal, so you'd again be left with something that can change and break at any time, plus some monkey-patching would still be needed.
The changes required to messageformat.js to make it usable in globalize would be extensive. They'd have to make it possible to customize the compiler. I doubt they'd want to add that complexity to messageformat.js, when you can just use messageformat-parser and write your own simple compiler.

@rxaviers
Member

rxaviers commented Nov 2, 2016

> I only had to copy and rewrite the compiler and the runtime

Could you please show me a diff?

@rxaviers
Member

rxaviers commented Nov 2, 2016

Yeap, but a diff from the original compiler and runtime to their rewritten versions would make it easier to see what the changes are. Don't worry if you don't have a diff handy, I can generate one...

Basically, I'm in line with your suggestion of using a newer messageformat. Although, I want to better understand the changes and impact.

@nkovacs
Contributor Author

nkovacs commented Nov 2, 2016

It's a bit hard to see it here because of the whitespace changes required by the coding standard:

compiler.js: https://gist.github.com/nkovacs/8dea134c8af7345c1c7ed921e9dc7aad/revisions

runtime.js: https://gist.github.com/nkovacs/11f320e6ae60b1dccf943768367dab4d/revisions

The first revision is messageformat.js's version indented with 4 spaces (original is 2 spaces), second revision is my version.

@rxaviers
Member

rxaviers commented Nov 2, 2016

I used your gists and created this diff that ignores white space changes:

@rxaviers
Member

rxaviers commented Nov 2, 2016

@nkovacs how do you suggest we maintain these files? For instance, let's suppose messageformat publishes new releases with updates to those files and we want to bring those updates in.

@rxaviers
Member

rxaviers commented Nov 2, 2016

Another question is, what are the challenges and cost of using the new messageformat as is? From your above comments, one of them is "They'd have to make it possible to customize the compiler", what customization would be required please?

@nkovacs
Contributor Author

nkovacs commented Nov 2, 2016

The problems I ran into trying to use messageformat 1.0.2:

The problems with using messageformat in general (this applies to 0.3.0 as well):

The problem is that messageformat.js compiles {now, date, short} to something like fmt.date(d.now, 'short'), but globalize.js needs the 'short' parameter at compile time to be able to compile an appropriate formatter function.

The minimum change required in messageformat.js would be to return the compiler's runtime property and add the arguments to its formatters object.
Globalize's compiler could then generate the appropriate wrappers and an fmt object with a special wrapper fmt.date function, and bind the compiled dateformatter as a dependency.

My version does it slightly differently: {now, date, short} is compiled to fmt[0](d.now), and the wrapper function receives an fmt array, where the 0th element is Globalize.dateFormatter({date: 'short'}).
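
The idea of resolving formatters once and referring to them by index can be sketched as follows. This is a simplified illustration, not the code from the commit: it builds a closure rather than generating source text, it handles only literals, plain arguments and formatter placeholders, and both the token shape (`{ type: "function", arg, key, params }`, modeled on messageformat-parser) and the `makeFormatter` callback are assumptions.

```javascript
// Hypothetical mini-compiler: every formatter placeholder is resolved once,
// at compile time, and stored in the `fmt` array; rendering only calls
// fmt[index](value), mirroring the fmt[0](d.now) form described above.
function compileMessage(tokens, makeFormatter) {
  const fmt = [];
  const parts = tokens.map((token) => {
    if (typeof token === "string") {
      return () => token; // literal text
    }
    if (token.type === "function") {
      // e.g. {now, date, short}: key is "date", params is ["short"]
      const index = fmt.push(makeFormatter(token.key, token.params || [])) - 1;
      return (d) => fmt[index](d[token.arg]);
    }
    return (d) => String(d[token.arg]); // plain {name} replacement
  });
  const render = (d) => parts.map((part) => part(d)).join("");
  render.formatters = fmt; // exposed so a compiler could list the dependencies
  return render;
}
```

Here `makeFormatter` would be something that returns, for example, Globalize("en").dateFormatter({date: 'short'}) for the key "date" and params ["short"], so no formatter is created while the message is being rendered.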

Here's a complete compiled example:

Globalize.b955419430 = messageFormatterFn((function(plural, fmt, en) {
    return function(d) {
        return "Hello World number( " + plural(d.x, 0, en, {
            one: "one task " + fmt[0](d.now) + " ",
            other: d.x + " tasks " + fmt[1](d.now) + " "
        });
    }
})(messageFormat.plural, [Globalize("en").dateFormatter({
    "date": "long"
}), Globalize("en").dateFormatter({
    "date": "short"
})], Globalize("en").pluralGenerator({
    type: "both"
})), Globalize("en").pluralGenerator({
    "type": "both"
}), Globalize("en").dateFormatter({
    "date": "long"
}), Globalize("en").dateFormatter({
    "date": "short"
}));

and the original message was:

'Hello World number( {x, plural, one {one task {now, date, long} } other {{x} tasks {now, date, short} } }'

I'm not sure why the extra parameters are passed to messageFormatterFn, but I think that's already happening with the current version of globalize.js and pluralGenerator.

nkovacs added a commit to nkovacs/globalize that referenced this issue Jan 5, 2017
Most globalize modules can now be used directly
from within messages.

Also fixes selectOrdinal not using the correct plural function.

The messageformat compiler and runtime are forked from
messageformat.js.

Fixes globalizejs#563
nkovacs added a commit to nkovacs/globalize that referenced this issue May 15, 2017
Most globalize modules can now be used directly
from within messages.

Also fixes selectOrdinal not using the correct plural function.

The messageformat compiler and runtime are forked from
messageformat.js.

Fixes globalizejs#563
nkovacs added a commit to nkovacs/globalize that referenced this issue Jun 7, 2017
Most globalize modules can now be used directly
from within messages.

Also fixes selectOrdinal not using the correct plural function.

The messageformat compiler and runtime are forked from
messageformat.js.

Fixes globalizejs#563
nkovacs added a commit to nkovacs/globalize that referenced this issue Jul 3, 2017
Most globalize modules can now be used directly
from within messages.

Also fixes selectOrdinal not using the correct plural function.

The messageformat compiler and runtime are forked from
messageformat.js.

Fixes globalizejs#563
@jrsearles

Any update on this? I am running into this as well. The messages for me are potentially user defined so formatting the value passed in isn't an option.
