Re: "Fixing" the O(1) splice / O(1) size std::list problem?
In article
<howard.hinnant-80A36F.15120914022007@johnf2.biosci.ohio-state.edu>,
Howard Hinnant <howard.hinnant@gmail.com> wrote:
In article <d8ednfFYt6vBwE7YnZ2dnUVZ_silnZ2d@giganews.com>,
pjp@dinkumware.com ("P.J. Plauger") wrote:
I still haven't seen a compelling use case for O(1) "partial
splice", with or without a promised element count.
<g> That is ironic, since it is the desire for this very function (with
the O(1) complexity) that has been the basis of a decade-long debate.
Ok, here's a reasonable use case:
I have a master database consisting of a list of Record:
#include <list>
#include <map>
#include <string>

struct Record
{
    std::string department;
    std::string data;
};
I want to create separate databases, one for each department, so that I
can send separate smaller lists to each individual department for
parallel processing. An "unmerge" if you will:
std::map<std::string, std::list<Record> >
unmerge(std::list<Record>& c)
{
    std::map<std::string, std::list<Record> > r;
    for (std::list<Record>::iterator i = c.begin(), e = c.end(); i != e;)
    {
        std::string department = i->department;
        std::list<Record>::iterator j = i;
        std::list<Record>::size_type count = 1;
        // find the end of the current run of equal departments
        for (++j; j != e; ++j, ++count)
            if (j->department != i->department)
                break;
        std::list<Record>& t = r[department];
        // the proposed counted overload: count == distance(i, j)
        t.splice(t.end(), c, i, j, count);
        i = j;
    }
    return r;
}
* This is reasonable code.
* It makes use of splice-some-from-other.
* splice-some-from-other is within a loop, so keeping it as efficient as
possible would be nice.
* The count of the number of nodes one wants to splice at a time is
trivially computed without affecting the performance of the entire
algorithm (see the sketch after this list).
* This algorithm purposefully makes a single pass through the master
list in an effort to minimize cache misses.
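For concreteness, here is a minimal sketch of why the caller-supplied count
is what makes the operation O(1) on an implementation that caches its size.
This is not std::list itself; the class and member names below are made up
for illustration, and the payload is omitted. The point is that the pointer
surgery for a whole range is a fixed number of writes, and the count lets
both size caches be adjusted without walking the range:

#include <cstddef>

// Toy doubly-linked list with a sentinel node and a cached size.
struct Node
{
    Node* prev;
    Node* next;
    // payload omitted; only the links matter for this sketch
};

struct MiniList
{
    Node        head;   // sentinel: head.next is begin(), &head is end()
    std::size_t size_;  // cached so that size() can be O(1)

    MiniList() : size_(0) { head.prev = head.next = &head; }

    // Move [first, last) out of 'other' to just before 'pos', trusting
    // the caller's promise that n == distance(first, last).
    void splice_counted(Node* pos, MiniList& other,
                        Node* first, Node* last, std::size_t n)
    {
        if (first == last)
            return;
        Node* tail = last->prev;        // last node actually moved
        // Unlink [first, tail] from 'other': two pointer writes.
        first->prev->next = last;
        last->prev = first->prev;
        // Link [first, tail] in front of 'pos': four pointer writes.
        first->prev = pos->prev;
        tail->next = pos;
        pos->prev->next = first;
        pos->prev = tail;
        // Bookkeeping: O(1) because n was supplied; no walk of the range.
        size_       += n;
        other.size_ -= n;
    }
};

Without the count, an implementation that wants to keep size() O(1) has to
walk [first, last) to find out how many nodes moved, which is exactly the
O(distance) cost the counted overload is meant to avoid.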
Yes, but what is the significant gain over this:
// (drop-in replacement for the loop in unmerge() above; r and c as there)
while (!c.empty())
{
    std::string const& department = c.front().department;
    std::list<Record>& t = r[department];
    while (!c.empty() && c.front().department == department)
        t.splice(t.end(), c, c.begin());
}
The number of compares is the same; the only difference is the number of
internal pointer changes, and that is also O(N). So both versions are
O(N), where N == c.size(). I suppose if you had 10 million entries in c
and only 10 distinct departments it might be significant, but then is a
big std::list a good idea for holding the original data? It seems like a
contrived example.
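To put rough numbers on that (assuming, purely for the sake of arithmetic,
that the 10 departments appear as D = 10 contiguous runs among the
N = 10,000,000 records):

    string compares, either version:         ~N
    node relinks, element-at-a-time splice:  ~N  (a few pointer writes each)
    node relinks, counted range splice:      ~D
    size bookkeeping from the count:         ~D

Both versions are therefore Theta(N); the counted overload only strips out
the cheap O(N) relink term, while the O(N) compares remain either way.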
I hold no opinion on adding a fourth splice overload; I am just wondering
what is gained by the addition.
I don't see a case where I know the iterators i and j and distance(i, j)
independently of traversing the list. If I have to traverse the list
anyway, little is gained over splicing to a temporary list while I
traverse, or, as in this case, splicing each item to the proper list while
traversing the original list, as the loop above does.