Re: code critique please.
On Jul 10, 4:56 am, matt <ptrifoli...@hotmail.com> wrote:
this is my first program in this language ever (besides 'hello
world'), can i get a code critique, please? it's purpose is to
read through an input file character by character and tally
the occurrence of each input character. it seems to compile
and run, so i'm looking for the opinions of old-timers here
plz.
For starters, I'd like a bit more white space.
/*
* File: occurrenceTally.cpp
* Author: matthew
*
* Created on July 9, 2008, 6:15 PM
*/
#include <stdlib.h>
#include <fstream.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
I'm not sure what logic you're using to decide which headers to
include, but for this problem (done correctly), I think:
#include <iostream>
#include <fstream>
#include <vector>
#include <cstdlib>
should suffice. Add <cstdlib>, if you really insist on using
printf, and either <climits> or <limits>, if you want to be
truely portable (e.g. to platforms where char is not 8 bits).
And there should definitely be at least one empty line between
the last include and the first line of your code.
int main(int argCount, char** argvArray) {
The standard names are argc and argv. They may not be good
names, but anything else is unexpected, and anything which is
unexpected leads to confusion.
For a number of reasons (some probably historical), I also
insist on the name of the function starting the line, and the
opening brace of a function or a class also being in the first
column, i.e.:
int
main( int argc, char** argv )
{
printf("filename ");
printf("%s",argvArray[1]);
printf("\n");
Which is a core dump waiting to happen:
if ( argc != 2 ) {
std::cerr << "syntax is: progname filename" << std::endl ;
return EXIT_FAILURE ;
}
std::cout << "filename " << argv[ 1 ] << std::endl ;
Don't ever count on a user invoking your program correctly.
Verify.
If you really insist on printf, there's nothing wrong with:
std::printf( "filename %s\n", argv[ 1 ] ) ;
ifstream OpenFile(argvArray[1]);
That's "std::ifstream". And did you open a file? You need to
follow this line with:
if ( ! OpenFile ) {
std::cerr << "cannot open " << argv[ 1 ] << std::endl ;
return EXIT_FAILURE ;
}
char inputCharacter;
int characters[256];
Better:
std::vector< int > characterCount( ... ) ;
If you want a C style array:
int characterCount[ ... ] = {} ;
A C style array is probably acceptable here if you're an expert,
but you're far better off with std::vector---compiling with the
necessary options for maximum debugging (including
-D_GLIBCXX_CONCEPT_CHECKS -D_GLIBCXX_DEBUG
-D_GLIBCXX_DEBUG_PEDANTIC if you're using g++).
What you use as a dimension (and later how you index) is the
real problem; I'd go with something like CHAR_MAX - CHAR_MIN + 1
(with CHAR_MAX and CHAR_MAX from <climits>---<limits> has
similar values, in a more C++ way, but it's a lot wordier).
Also, what's in the array are counts, so the name should reflect
this: characterCount, or some such.
Finally, there's an issue of consistency in your naming
convention. Both OpenFile and inputCharacter are variables.
They should follow the same convention; in other words, either
both start with a capital, or both with a small letter. (I
think the latter is by far the most frequent.)
int i;
for (i = 1; i < 256; i++) {
characters[i]=0;
}
If you'd have declared the variable as above, this loop isn't
necessary.
while(!OpenFile.eof())
{
Be consistent. Inside a function, it's more or less a question
of style as to whether you put the opening brace on a line by
itself, or at the end of the if/while/for statement, but you
should choose one, and stick with it. Either put the opening
brace of the for loop, above, on a line by itself, or put this
opening brace at the end of the while. (The loop condition is
wrong, too, but I'll hit on that later.)
OpenFile.get(inputCharacter);
This loop doesn't work. Use the standard:
while ( OpenFile.get( inputCharacter ) ) {
Basically, the results of OpenFile.eof() aren't useful until
after the operation: OpenFile.eof() means that the next
operation will fail, but ! OpenFile.eof() doesn't mean that it
will succeed. You must test *after* the read. And
OpenFile.eof() will be false if the read failed because of a
hardware error, or some such. Not a likely situation, but for
this reason, using the istream as a boolean is the preferred
idiom; it does the right thing.
(FWIW: the above idiom is in some ways much like argc and argv:
you can definitely argue that it's not very nice to actively do
something in a condition. But the idiom is ubiquious, to the
point where anytime an experienced programmer sees something
else, he starts by trying to figure out why. In such cases,
it's best to do like everyone else.)
//printf("%c",inputCharacter);
characters[int(inputCharacter)]++;
And this line core dumps on my machine, at least if I use
std::vector. (Which is why you should use std::vector with the
debug options---the code is wrong.) On many machines (PC's,
Sun's, etc.) char is signed, at least by default, and even if
you carefully compile with options to make it unsigned, it's
best not to count on it. But if char is signed, CHAR_MIN is
negative, and you end up with a negative index. Which is
undefined behavior, and will cause an immediate core dump with
g++ or VC++, if you are using vector, and all of the debugging
options are turned on (and they should be).
There are two ways of handling this: convert the char to
unsigned char, or shift your index, i.e. either:
++ characterCount[
static_cast< unsigned char >( inputCharacter) ] ;
or
++ characterCount[ inputCharacter - CHAR_MIN ] ;
(I don't know where I got into the habit of using the latter; I
suspect the former is more frequent.)
Better yet, wrap the array in a class (CharacterIndexArray?), in
order to ensure that all accesses obey the same convention. (If
you use the first convention sometimes, and the second others,
you'll not get a core dump, but the results are likely to be a
bit strange.)
}
OpenFile.close();
Theoretically, you should probably check as to why your input
finished. There are three possible reasons: end of file, a
format error in the input, and a hardware error. Practically: a
format error isn't possible for character input, like you're
doing here. (It occurs when you try to read an int, and the
input is "abcd".) And many implementations don't really check
for hardware errors (which are exceedingly rare on modern
systems). So you can probably skip it.
Note that the same rule *doesn't* apply to output. One
important hardware error is disk full, and it does occur, even
on the most robust systems. And if you've had a hardware error,
the data you've written isn't there, and the user will want to
know about it, because he may be counting on that data.
for (i = 32; i < 126; i++) {
What's 32? And what's 126? Either:
for ( int i = 0 ; i <= UCHAR_MAX ; ++ i ) {
or:
for ( int i = CHAR_MIN ; i <= CHAR_MAX ; ++ i ) {
(It depends on what you used to index into the array: an
unsigned char, or a char, with the value shifted.)
char outputCharacter;
outputCharacter = char(i);
Note that the above is exactly the same as:
char outputCharacter = i ;
or
char outputCharacter( i ) ;
And that the variable is in fact totally unnecessary.
if ( characters[i]>0 ) {
printf("%c ", outputCharacter);
printf("%i\n", characters[i]);}
Again, who needs printf, and if you use it, why not:
std::printf( "%c %d\n", (char)i, characters[ asIndex( i ) ]) ;
(Replace asIndex( i ) with whatever is necessary to conform with
the way you indexed the array above, and the type and values of
the loop.)
Much better, of course:
std::cout << (char)i << ' ' << characterCount[ ... ] <<
std::endl ;
And you should check for printable first, and do something
differenti if isprint is false. (Use the version of isprint in
<locale>, *not* the one in <clocale>, if you are using char's.
The one in <clocale> only works for unsigned char.)
}
return (EXIT_SUCCESS);
And only do this if you've really succeeded.
}
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orient=E9e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S=E9mard, 78210 St.-Cyr-l'=C9cole, France, +33 (0)1 30 23 00 34