Re: Is this String class properly implemented?

From:

"Daniel T." <daniel_t@earthlink.net>

Newsgroups:

comp.lang.c++

Date:

Sat, 18 Apr 2009 12:39:41 -0400

Message-ID:

<daniel_t-276FBB.12394118042009@earthlink.vsrv-sjc.supernews.net>

kevintse.onjee@gmail.com wrote:

I am new to the C++ language, I implement this String class just for
practising.
I want to know what to improve to make it *qualified* to be used in
real programs, or what code should be changed to make it more
efficient...yea, I know you will recommend me to using std::string,
(-:just practising...

No problem. Where I work, we don't use the STL (for mainly historical
reasons) and it is part of my job to maintain several different string
classes.

Here are some comments about the interface/implementation you chose:

In the functions: String::String(const int&) and String::String(const
long&), you are putting the null terminator at value[size] but in other
functions you are putting the null terminator at value[size + 1].

Also, those two functions are identical. They shouldn't be identical,
but even so, if properly implemented they would be almost identical.
Consider breaking out the parts that are alike into a separate function.

Your String::String(String& s) copy constructor is missing the 'const',
which limits where/how it can be used.

For String::String(const wchar_t* str, int offset, int count), you
decided not to handle the abnormal case of str == 0, but you handled the
case where offset < 0, and the case where count > wcslen(str) - offset.
Why handle some but not all?

For const wchar_t& String::wchar_tAt(int index) const... Doing something
within a throw that might cause a throw (in this case, allocating
memory) is bad form.

String String::substring(int beginIndex, int endIndex) const... in the
last constructor I mentioned above, you corrected offset and count when
they were bad values, but here you throw if beginIndex and endIndex are
bad values. Consistency is important.
Also, you have two 'new's in this function, (one outright, and the
other within the constructor of the 's' object. I suggest you look for
ways to remove one of them. You will probably need to add functionality
to the String class (a new member-function or two,) then you will have
to decide if these new methods should be public or private...

You have an "operator const wchar_t* () const" and a "const wchar_t*
toCString() const" that both do the same thing. Others have already
suggested you remove the former. Generally, having two member-functions
that do the exact same thing is unnecessary duplication (though it
sometimes makes sense in the face of inheritance, but even then one of
the functions should be implemented in terms of the other.)

Your trim functions should all be const.

If you are going to consistently put a null terminator on the end of
your 'value' array, you don't really need the 'size' variable at all;
it's a rather minor optimization that is more than overwhelmed by the
fact that your class has to reallocate the value array every time
someone increases the length of the string.

Now that I have said the above about your current implementation... I'm
not sure that this is an appropriate implementation. There are a lot of
different ways to implement a string class and different implementations
are better for different situations. For example:

Your string class is very inefficient whenever the length() of the
object grows yet you provide member-functions that grow the string (op+=
for example,) maybe you should either remove those functions or change
the implementation to make them more efficient.

If you choose to remove the op+=, then you should consider also removing
the op+ as well as the global op+ functions. (I consider it bad form to
provide a op+ but not a op+=.)

That said, your class is largely immutable, the sole mutators are the
two op+= and the two op= functions. If you went all the way and removed
all four of them, you could make your class even more time efficient by
using a data sharing implementation. If you choose to keep these four
functions, then maybe you should consider including more functions for
mutation and go with an implementation that is more efficient in the
face of mutation.

Lastly, you have to consider usage patterns. If 80% of your strings are
less than some limit (say 16 characters,) then you might want to
consider optimizing the class so it doesn't have to allocate any heap
memory in those cases.

There are several different ways to implement a "string class" and each
method presents different time and space usage measurements. The best
implementation depends on issues that simply cannot be addressed within
the string class itself. The string class provided in the standard has
fair tradeoffs that make it good for general use, but not the best
choice in all cases. If you have a program that makes heavy use of
strings, you might be better off providing your own implementation that
has different tradeoffs or even multiple string classes within the same
program.