Re: Parameterized String Externalization
On 2009-04-30 13:36:30 -0400, Marco <zakmck@iol.it> said:
Hi all,
I need to extract all the string constants in a package, transform
them into patterns (using java.text.MessageFormat), put the patterns
in a resource bundle and replace the strings in the code with calls to
a class that uses such a bundle. For instance this:
System.out.println ( "Process terminated " + n + " file(s) analyzed" );
would be replaced by something like:
System.out.println ( i18n.tr ( "procTerminated", n ) );
where i18n is a class that wraps the usage of the RB and MessageFormat
(yes, the names are inspired by gettext). In addition to code replacement,
a new entry would be created in the resource bundle:
procTerminated = Process terminated {0} file(s) analyzed
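As a rough illustration of the wrapper described above, here is a minimal sketch. The class name I18nSketch, the tr() signature and the fromText() helper are assumptions made for the example, not an existing API; the bundle is built from inline text only so that the sketch runs without a .properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.text.MessageFormat;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical wrapper around a ResourceBundle and MessageFormat.
final class I18nSketch {
    private final ResourceBundle bundle;

    I18nSketch(ResourceBundle bundle) {
        this.bundle = bundle;
    }

    // Looks up the pattern stored under 'key' and fills {0}, {1}, ... from 'args'.
    String tr(String key, Object... args) {
        return MessageFormat.format(bundle.getString(key), args);
    }

    // Assumption for the demo: parse the bundle from a string instead of a file.
    static I18nSketch fromText(String properties) {
        try {
            return new I18nSketch(
                new PropertyResourceBundle(new StringReader(properties)));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        I18nSketch i18n = fromText(
            "procTerminated = Process terminated {0} file(s) analyzed\n");
        int n = 3;
        System.out.println(i18n.tr("procTerminated", n));
    }
}
```

Two things to keep in mind with MessageFormat patterns: a single quote is a quoting character (a literal apostrophe has to be written as two), and numeric arguments go through the locale's number format, so large values may pick up grouping separators.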
I don't actually need this for localization, although the result could
be used for that too. It's just to gather all the messages in one
place and let other people review them.
Now, Eclipse has a feature that does something similar, but in
practice it works only for constants; in a case like the above it
produces two messages in the RB (!).
Does anyone know if there is some more advanced tool, which is able to
recognize the parameterization?
Thanks in advance for any help.
It may be too late to take advantage of this now, but if you write your
messages using MessageFormat.format (or, more recently, String.format)
in the first place, then it's much easier to externalize the format
string to a resource bundle later:
System.out.println ( String.format ( "Process terminated %1$d file(s) analyzed", n ) );
or even
System.out.printf ( "Process terminated %1$d file(s) analyzed\n", n );
(I habitually use positional specifiers -- %1$d, %2$s, etc -- in format
strings to permit localization to re-order the placeholders. Using
non-positional format specifiers means that the placeholders MUST
appear in the same order as the parameters to String.format. I'd love
it if there were named placeholders, as with Python's format
specifiers, but without a syntax for named parameters it'd be painfully
verbose to implement.)
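To make the point about re-ordering concrete, here is a small sketch. The two patterns and the class name PositionalDemo are made up for illustration; the argument list stays the same while the pattern decides the order in which the values appear.

```java
import java.util.Locale;

final class PositionalDemo {
    // The arguments are always passed as (file, count); only the pattern changes.
    static String render(String pattern, String file, int count) {
        return String.format(Locale.ROOT, pattern, file, count);
    }

    public static void main(String[] unused) {
        // Hypothetical patterns, e.g. as they might come from two bundles.
        String countFirst = "%2$d error(s) in %1$s";
        String fileFirst = "%1$s: %2$d error(s)";

        System.out.println(render(countFirst, "Main.java", 4)); // 4 error(s) in Main.java
        System.out.println(render(fileFirst, "Main.java", 4));  // Main.java: 4 error(s)
    }
}
```

With plain %s and %d the first pattern could not be written without also swapping the arguments in the calling code.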
Analyzing the source code for this case isn't too hard _in theory_,
since the structure you're looking for is any place where the parse
tree contains
<string constant expression> ['+' <non-constant expression> ['+' <string constant expression>]?]+
The tricky bit is in guessing the correct format specifier for each
non-constant expression: %N$s (or {N}), which for most types just calls
toString(), is probably an acceptable default, but you'll still need to
go over the replacements and change them where you want, e.g., a
specific date or number format instead of the default.
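The rewriting step itself, once a parser has already split the '+' chain into its operands, could look roughly like the sketch below. Everything here is hypothetical (the Part and Result types, the class name ConcatToPattern); it emits MessageFormat-style {N} placeholders as the default and leaves the choice of a more specific format to a later manual pass.

```java
import java.util.ArrayList;
import java.util.List;

final class ConcatToPattern {
    // One operand of a '+' chain: a string literal's value, or the
    // source text of a non-constant expression.
    static final class Part {
        final boolean literal;
        final String text;

        private Part(boolean literal, String text) {
            this.literal = literal;
            this.text = text;
        }

        static Part lit(String value) { return new Part(true, value); }
        static Part expr(String source) { return new Part(false, source); }
    }

    // The externalized pattern plus the expressions to pass as arguments.
    static final class Result {
        final String pattern;
        final List<String> args;

        Result(String pattern, List<String> args) {
            this.pattern = pattern;
            this.args = args;
        }
    }

    static Result convert(List<Part> parts) {
        StringBuilder pattern = new StringBuilder();
        List<String> args = new ArrayList<>();
        for (Part p : parts) {
            if (p.literal) {
                pattern.append(escape(p.text));
            } else {
                // Default placeholder: {N}, numbered in order of appearance.
                pattern.append('{').append(args.size()).append('}');
                args.add(p.text);
            }
        }
        return new Result(pattern.toString(), args);
    }

    // In a MessageFormat pattern ' is the quote character and { opens a
    // placeholder, so both need escaping when they occur in literal text.
    static String escape(String literal) {
        return literal.replace("'", "''").replace("{", "'{'");
    }
}
```

Feeding it the operands of the example from the question (the literal "Process terminated ", the expression n, the literal " file(s) analyzed") yields the pattern "Process terminated {0} file(s) analyzed" and the argument list [n].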
I'm not aware of any existing tools that implement this. There are
existing Java grammars for Antlr and JavaCC, though, so you could
probably write one yourself without too much trouble.
-o