Re: sorting std::vector<string> ignoring case
On Thu, 10 May 2007 09:05:32 -0600, Jerry Coffin <jcoffin@taeus.com> wrote:
This is one place where C++ does quite a bit better than C. In C you
have only a single, global locale that applies to everything at any
given time.
C++ has a global locale (inherited from C) that applies to things like
the C string functions, but it also allows you to associate locale
objects with individual items so you can (usually) avoid the scenario
you describe above, having to constantly change the global locale. For
example, each iostream has an associated locale.
Right.
Unfortunately, in the case of strings, the locale-like information for
the string is an implicit property of the char_traits class, so it's not
entirely straightforward to do a string comparison using the ordering
from a specific locale. Worse, the char_traits for a string is a
template parameter, so two strings that use different char_traits are
completely separate types and you can't compare one to another, assign
one to another, etc. That's not always a major problem, but sometimes it
becomes a serious pain.
Tell me about it. :) Long ago, back in the Wild West days of standardized
C++, I wrote my own string class that supported true COW and modeled it as
closely as possible on the draft basic_string, and to make string types
assignment-compatible, I split my char_traits into "assign_traits" and
"compare_traits" so that my normal, case-insensitive, and filename string
types would all be assignment-compatible but not comparable. (More
generally, "assign_traits" applied to all things that were
comparison-agnostic.) The class header for my basic_string-alike looked
like this:
template<
    class CharT,
    class TraitsT = my_string_compare_char_traits<CharT>,
    class AllocT = my_allocator<CharT> >
class my_basic_string
    : public my_basic_string_base<
          CharT, typename TraitsT::assign_traits, AllocT>
Functions that had "string" parameters and performed comparison-agnostic
operations (including functions like "compare" which don't use the string
parameter's traits) on them would use my_basic_string_base parameters, and
all three of my "standard" string types were assign_traits-compatible,
which was nice. However, you couldn't compare, say, a my_string to a
my_filename without casting one of them to the base class and thus
neutralizing it, because they differed in their compare_traits, which was
also nice. After nearly 10 years of using these classes, I wouldn't want to
use std::string instead of them in any program that made significant use of
strings. Unfortunately, the whole idea behind functions like
char_traits::lt has kind of gone out the window. So, to get back on topic,
I'd hope that defining a new char_traits is not the current state of the
art for locale-aware string comparison in C++.
Finally, the simple fact is that most documentation on this part of the
C++ standard library just plain sucks. Josuttis covers it pretty well,
but I haven't seen much else that could even claim mediocre coverage.
This is also an area that's decidedly different from the STL-inspired
parts of the library -- this is a much more object-oriented setup, where
doing almost anything typically involves deriving a new class from one
of the base classes in the library. In quite a few cases, the overhead
can seem a bit ridiculous.
Yep, I've always found C++ locales fairly impenetrable. Besides Josuttis,
ISTR Stroustrup published a C++PL appendix concerning C++ locales on his
web site several years ago, which I did read, but I pretty much lost
interest in the subject since then.
--
Doug Harrison
Visual C++ MVP