Re: Passing multi-dimensional arrays to functions
On Wednesday, 27 March 2013 13:10:02 UTC+2, Seungbeom Kim wrote:
On 2013-03-26 19:40, Öö Tiib wrote:
On Tuesday, 26 March 2013 07:50:05 UTC+2, Seungbeom Kim wrote:
You'll have M+1 allocations, worse locality, and weird and much
more verbose declarations:
std::vector<std::vector<double>> matrix(M, std::vector<double>(N));
std::vector<std::vector<double>>::size_type sz = matrix.size();
std::vector<std::vector<double>>::const_iterator it = matrix.end();
That factor is not too important ... it would be more idiomatic to
assume that the typedefs are already done ...:
typedef std::vector<double> Column;
typedef std::vector<Column> Matrix;
Matrix::size_type sz = matrix.size();
I often do that, too. And that turns a one-line declaration of an
array ('double matrix[M][N];') into three lines. Which is not too
bad, but while I like typedefs that yield meaningful abstractions,
this one rather just replaces repeating long names with repeating
short names, and/or typing too much on a single line with typing
reasonable amounts on more lines, which feels like a necessary evil.
The typedefs are part of a class that has such a multidimensional
relation. A multidimensional relation is complex and expensive.
Expensive things create a desire to refactor, for example to
consider replacing Column with 'std::map<int,double>'. Without
typedefs it is difficult to measure the performance of such
alternatives; typedefs together with 'auto' often make it almost
effortless.
...also 'auto' was repurposed exactly because of that:
auto it = matrix.end();
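(A minimal compilable sketch of that, assuming C++11; M and N are
hypothetical dimensions, not from the posts above. Only the Column
typedef needs to change to try an alternative carrier, and 'auto'
keeps the using code untouched:)

#include <vector>
#include <cstddef>

typedef std::vector<double> Column;  // change only this line to try, e.g., std::map<int,double>
typedef std::vector<Column> Matrix;

int main()
{
    std::size_t const M = 4, N = 5;  // hypothetical dimensions
    Matrix matrix(M, Column(N));     // construction may need adjusting for other Columns
    auto sz = matrix.size();         // tracks Matrix::size_type automatically
    auto it = matrix.end();          // tracks Matrix::iterator automatically
    (void)sz; (void)it;
}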
This is great, as long as you can use C++11 (which is, sadly, not
always true).
The feature is in the language. There will be a constantly diminishing
number of poor souls who can't use it. The most helpful suggestion for
them is to migrate. What else can we suggest?
in exchange for the nice access syntax 'm[i][j]'.
That is the nicest but rarely the optimal access syntax.
Is it suboptimal because of the underlying representation of a
vector of vectors, or is there anything fundamental in the access
syntax that prevents the access through it from being optimal?
Even with a raw array it is commonly suboptimal. Say we have 'T
m[X][Y];'; the address of 'm[i][j]' is at 'm + i * sizeof(T[Y]) + j *
sizeof(T)' (counting in bytes), so each access costs a multiplication
on top of the additions. That can lose 20% of performance in edge
cases (which rarely matters, of course).
The algorithms that traverse multidimensional relations are usually
quite sequential, so iterators (or pointers in the case of a C array)
are usually closer to optimal (one addition per increment). The
standard algorithm library, for example, often performs quite close to
optimal (and takes iterators as a rule).
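(To illustrate the difference, a sketch of mine, not from the thread;
modern optimizers often strength-reduce the indexed form anyway, so
measure before trusting either:)

#include <cstddef>

double sum_indexed(double const m[][4], std::size_t X)  // Y fixed at 4 for the example
{
    double s = 0.0;
    for (std::size_t i = 0; i < X; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            s += m[i][j];  // address recomputed: m + i*sizeof(double[4]) + j*sizeof(double)
    return s;
}

double sum_sequential(double const* first, double const* last)
{
    double s = 0.0;
    for (; first != last; ++first)  // one pointer increment per element
        s += *first;
    return s;
}

// usage: sum_sequential(&m[0][0], &m[0][0] + X * 4);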
On the other hand, I have written and used a wrapper that hides an
M*N-element vector inside and provides a nice access syntax
m[i][j] or m(i, j), but not being standardized, it doesn't tend
to last long.
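(Such a wrapper might look roughly like this; my sketch of the idea,
not the actual code being described:)

#include <vector>
#include <cstddef>

template <typename T>
class Matrix2D {
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    T& operator()(std::size_t i, std::size_t j)             { return data_[i * cols_ + j]; }
    T const& operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<T> data_;  // single M*N-element allocation, contiguous, good locality
};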
IMHO the choice of container should always be an implementation
detail. In the OOP sense a two-dimensional container is just a 1-to-M*N
relation. It is not good to expose the carriers of relations for any
passer-by to adjust.
I'm not sure if I implied anything opposite.
I was talking more about the 'm[i][j]' syntax itself, which is usually
suboptimal anyway, so the only thing I would consider it for is
exposing the relation in an interface.
If we need numbers then again we can often save some time with a
math library (like Boost's Basic Linear Algebra Library, uBLAS) ... in
comparison with std::valarray.
For more serious work, yes. But we don't want to have to learn and
use big third-party libraries for simpler problems, and
std::valarray is simple and already in the standard.
And my question was more towards comparison with std::vector: even
for numeric matrices, I have seen many people recommend std::vector
but not as many recommend std::valarray. Why is that?
I was just answering your question ("why not valarray?"). My
impression is that it is not generic enough when discussing arrays,
and not feature-rich enough when discussing computations with dense
matrices.
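(For comparison, a minimal sketch of mine of 2D indexing over
std::valarray; it works, but row access goes through std::slice and
element access through manual arithmetic, which is exactly the missing
convenience:)

#include <valarray>
#include <cstddef>

int main()
{
    std::size_t const M = 3, N = 4;    // hypothetical dimensions
    std::valarray<double> m(M * N);    // one flat allocation

    m[std::slice(1 * N, N, 1)] = 2.0;  // fill row 1 via a slice
    double x = m[1 * N + 2];           // element (1,2) by hand-written index math
    (void)x;
}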
If it is an image that we are processing, then modern processors are
so powerful and memory speed is so limiting that it is often best
to keep images in a compressed format most of the time.
Externally, yes. But not while doing some actual processing on the
image, which is exactly when we need a good 2D or 3D array.
Usually you do such computations using specialized coprocessors (even
mobile phones are starting to have those) and keep 3D models in a
compressed format *all* *the* *time*.
There's also Boost.MultiArray, but when asked "What's the best
way to have a multidimensional array in C++?", I hate to have to
answer "Oh, you have to install Boost first, and ..."
Why? The sooner novices realize that the standard library is only a
small and homey introduction to the ocean of libraries we use (where
Boost is also among the friendliest) ... the better. Boost has a lot
of interesting (and often close to optimal) containers in it.
I agree that Boost has many interesting things, but I'm not so sure
about your "the sooner, the better" part. Novices already have too
many things to learn from the beginning, don't they?
The OP was puzzled by passing the C array 'int xxarray[nsamp][nsamp];'
around with pointers. That array-to-pointer decay is the hardest and
most error-prone thing in C. He would have far fewer problems if he
were told "we do not use those here" and advised to pass
'Xxarray& xxarray' instead, where 'typedef boost::multi_array<int,2> Xxarray;'.
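(A sketch of that suggestion, assuming Boost.MultiArray is installed;
the 'fill' function is a hypothetical example, only 'Xxarray' comes
from the post above:)

#include <boost/multi_array.hpp>
#include <cstddef>

typedef boost::multi_array<int, 2> Xxarray;

void fill(Xxarray& xxarray, int value)  // no array-to-pointer decay to explain
{
    for (std::size_t i = 0; i < xxarray.shape()[0]; ++i)
        for (std::size_t j = 0; j < xxarray.shape()[1]; ++j)
            xxarray[i][j] = value;
}

// usage: Xxarray xxarray(boost::extents[nsamp][nsamp]); fill(xxarray, 0);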
What's the canonical way to have a multidimensional array in C++?
What a pity I haven't found the answer yet. So I just allow
myself to use the C arrays 'T a[M][N];', at least for simple
cases.
There cannot be a canonical way, since the purpose of such an array
can't be canonical.
Let's suppose we're in a situation where a built-in array is good
enough as long as the dimensions are constant or we have the VLA
from C99.
If the dimensions are constant then 'std::array<std::array<T,Y>,X>'
works fine (a minimal sketch follows the quote below). The rest of the
cases I also listed here:
If the space of usage is dim then it is fine to start from
std::vector, std::array or boost::multi_array and refactor it
later into a more suitable container.
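(For the constant-dimension case just mentioned, a minimal sketch;
T, X, Y replaced with hypothetical concrete values:)

#include <array>
#include <cstddef>

std::size_t const X = 3, Y = 4;                     // compile-time dimensions
typedef std::array<std::array<double, Y>, X> Grid;

double sum(Grid const& g)                           // passed by reference, no decay
{
    double s = 0.0;
    for (auto const& row : g)
        for (double v : row)
            s += v;
    return s;
}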
We'd have to pick one and start from it, of course. What I don't
like is that it may not be very obvious from the beginning which one
is the best, and that switching costs.
Then do not switch to the best; continue using the suboptimal one
until it actually hurts. What do you want, then? Do you want a magic
container that profiles itself and switches to the perfect container
under the hood? What helps to switch between containers I already
answered at the start of the post ... typedefs and auto.
I also have a lot of usages of C arrays for immutable containers
like ...
T const A[][N] = {{ /*...*/ }, /*...*/ };
Note that the key here is immutability and that the compiler
calculates the count of rows thanks to aggregate initialization (no
such luxury for non-aggregates). If things become mutable then a C
array is dangerously low-level.
I don't see why immutability is the key for safety here.
If you take and eyeball the historic records of effort spent in a
large project on issues that involved stomping memory, then you start
to see. Raw dynamic arrays (or raw pointers to such) are the usual
reason for stomping memory (besides often being suboptimal).
Containers have a checked at(), and the rest of the operations and
iterators are checked in debug builds.
Immutability is key since you can't use an immutable array for
stomping memory. It is always there (static duration), you do not
write to it, and it can be pre-sorted. Search in a pre-sorted array is
O(log N) if you use binary search, or O(log log N) on average if you
use interpolation search ... so it is hard to have something even more
optimal.
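(A minimal sketch of that pattern with made-up data: a static,
immutable, pre-sorted table searched with the standard O(log N)
binary search:)

#include <algorithm>
#include <iterator>

int const primes[] = { 2, 3, 5, 7, 11, 13 };  // immutable, static duration, pre-sorted

bool is_small_prime(int x)
{
    return std::binary_search(std::begin(primes), std::end(primes), x);
}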
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]