Re: elementary string processing question
On Nov 1, 4:28 am, tonywh00t <tony.s...@gmail.com> wrote:
> I have a "simple" question, especially for people familiar
> with regex. I need to parse strings that have the form:
>     1:3::5:9
> which indicates the set of integers {1 3 4 5 9}. In other
> words, I have a set of numbers separated by ":", where "::"
> indicates a range from lo to hi inclusive. It is desirable to
> error check this string (i.e., it should start and end with a
> number, and be composed only of numbers, "::", and ":"). I'm
> currently using the Boost C++ library, and I've worked out
> some pretty ugly solutions. If anyone has a suggestion, I'd
> very much appreciate it. Thanks!

I presume that the number of entries in the string may vary;
otherwise, of course, you said it yourself, regex. I'd still
use regex to validate the string, something like
"^\\d+(:\\d+|::\\d+)*$", I think would do the trick. (It would
be really elegant if you could use capture, but capture doesn't
work well within closures---only the last match is captured.)
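
As a rough sketch of that validation step, here is the same pattern used with std::regex from C++11 instead of Boost (that substitution is an assumption made for the example; the expression itself is unchanged). The name isValidSetSpec is made up for illustration.

```cpp
#include <regex>
#include <string>

//  Hypothetical helper: true if text is a number followed by any
//  number of ":number" or "::number" groups, and nothing else.
bool
isValidSetSpec( std::string const& text )
{
    //  regex_match only succeeds if the whole string matches, so
    //  the ^ and $ anchors are not needed here.
    static std::regex const pattern( "\\d+(:\\d+|::\\d+)*" ) ;
    return std::regex_match( text, pattern ) ;
}
```

With this, "1:3::5:9" is accepted, while "1:::3", ":1", "1:" and the empty string are rejected. The check is purely syntactic: it says nothing about whether the bounds of a range are in order.
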
Then I'd simply break the string up into substrings at each ':':

    #include <algorithm>
    #include <string>
    #include <vector>

    //  Splits source at each ':'.  Two adjacent colons produce an
    //  empty field between them.
    std::vector< std::string >
    parse( std::string const& source )
    {
        typedef std::string::const_iterator
                            TextIter ;
        std::vector< std::string >
                            result ;
        TextIter            current = source.begin() ;
        TextIter const      end = source.end() ;
        while ( current != end ) {
            TextIter            fieldBegin = current ;
            current = std::find( current, end, ':' ) ;
            result.push_back( std::string( fieldBegin, current ) ) ;
            if ( current != end ) {
                ++ current ;        //  skip the ':'
            }
        }
        return result ;
    }

This gives you an array of strings, with an empty string between
the two colons of each "::" (so when you see an empty string, you
know you have a range).
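
To make the empty-field marker concrete, here is a small alternative sketch that does the same splitting with std::getline instead of the std::find loop. It is a different technique from the parse function above, shown only for comparison; the name splitOnColon is made up.

```cpp
#include <sstream>
#include <string>
#include <vector>

//  Hypothetical alternative to the hand-written loop: split on ':'
//  with std::getline.  Two adjacent colons yield an empty field.
std::vector< std::string >
splitOnColon( std::string const& source )
{
    std::vector< std::string > result ;
    std::istringstream         input( source ) ;
    std::string                field ;
    while ( std::getline( input, field, ':' ) ) {
        result.push_back( field ) ;
    }
    return result ;
}
```

For "1:3::5:9" this yields the five fields "1", "3", "", "5" and "9"; the empty third field is what signals the range from 3 to 5.
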
So you could do something like:

    #include <algorithm>
    #include <sstream>
    #include <stdexcept>
    #include <string>
    #include <vector>

    //  No error checking here: this relies on the regex precheck
    //  to ensure that the argument consists only of digits.
    int
    toInt( std::string const& string )
    {
        std::istringstream  cvt( string ) ;
        int                 result = 0 ;
        cvt >> result ;
        return result ;
    }

    std::vector< int >
    convert( std::vector< std::string > const& source )
    {
        typedef std::vector< std::string >::const_iterator
                            FieldIter ;
        std::vector< int >  result ;
        FieldIter           current = source.begin() ;
        FieldIter const     end = source.end() ;
        while ( current != end ) {
            result.push_back( toInt( *current ) ) ;
            ++ current ;
            if ( current != end && *current == "" ) {
                //  Empty field, so we have "lo::hi"; lo is already
                //  in result.
                int                 bottom = result.back() ;
                ++ current ;
                int                 top = toInt( *current ) ;
                if ( top <= bottom ) {
                    throw std::invalid_argument(
                        "range: top must be greater than bottom" ) ;
                }
                while ( ++ bottom <= top ) {
                    result.push_back( bottom ) ;
                }
                ++ current ;
            }
        }
        std::sort( result.begin(), result.end() ) ;
        //  Or you might want to track the last seen to ensure
        //  that the input was correctly sorted.
        return result ;
    }

Note that all of the above code supposes the precheck on the
format using regex. Otherwise, you'll need a lot more error
handling and special cases.
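
To give an idea of what that extra error handling looks like at the lowest level, here is a sketch of a stricter conversion that could stand in for toInt when there is no precheck. The name toIntChecked and the choice of std::invalid_argument are assumptions made for the example.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

//  Hypothetical stricter variant of toInt: rejects empty input,
//  non-numeric text, trailing characters and out-of-range values.
int
toIntChecked( std::string const& text )
{
    std::istringstream  cvt( text ) ;
    int                 result = 0 ;
    //  noskipws, so that leading white space is an error too.
    if ( !( cvt >> std::noskipws >> result ) ) {
        throw std::invalid_argument( "not an integer: '" + text + "'" ) ;
    }
    //  Anything left in the stream means trailing garbage.
    if ( cvt.peek() != std::istringstream::traits_type::eof() ) {
        throw std::invalid_argument( "trailing characters in '" + text + "'" ) ;
    }
    return result ;
}
```

Note that this still accepts a leading sign, and it only covers a single field. Without the regex you would also have to handle a leading or trailing ':', more than two colons in a row, and a "::" with nothing after it, before convert dereferences the next field.
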
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34