Re: Profiling code with optimisation on

From:

"Ondrej Spanel" <ondrej.nospam@bistudionospam.com>

Newsgroups:

microsoft.public.vc.language

Date:

Fri, 22 Sep 2006 10:13:51 +0200

Message-ID:

<O9p#K8h3GHA.4392@TK2MSFTNGP05.phx.gbl>

Passing values in command line, as suggested in the other reply, is
possible, but parsing the input may be a little bit painfull.

If you want to simulate a non-constant input at some point, you can use
volatile to make sure compiler is reading the value from memory and not
evaluating it as a constant. Be sure sure to limit using volatile variables
only in init phase, not during the loop, as this could prevent optimizing
the loop properly. See below.

int main()
{
   volatile float in1 = 1;
   volatile float in2 = 2;
   volatile float in3 = 3;
   typedef Vector<float[3]> Vector3f;
   Vector3f v0 = {in1, in2, in3};
   Vector3f v1 = {in2, in2, in3};

    float result = VectorDotProd<Vector3f>::Apply(v0, v1);
    _RPTF1(_CRT_WARN, "v0.v1 = %f\n", result);

    printf("%f\n", result);

    return 0;
}

Ondrej

--
---------------------------------------
Ondrej Spanel
Lead Programmer
Bohemia Interactive Studio
www.bistudio.com
www.flashpoint1985.com

"Someone Somewhere" <someone@somewhere.com> wrote in message
news:BaYPg.610$PD.567@fe2.news.blueyonder.co.uk...

Hi,

I have some templated vector code to unroll a loop, see below. I am
trying to test this code against a simple a*a + b*b + c*c vector dot
product to see if it truly is producing the same code. I ran the code
below and had a look at the assembly listing and it does look like the
inlining works, but unfortunatly the optimising power of the compiler is
so great that it substituted my code for a simple:

    int main()
    {
        printf("%f", 14.0f);
        return 0;
    }

which is great, but not what i want for profiling purposes :-). I still
want some optimisation for inlining the code, just not so much. NOTE: I
only have the basic optimise for speed settings, nothing fancy like global
intrinsic functions....

template<typename _Value>
struct Vector
{
    _Value _value;
};

template<typename _Vector> struct VectorDotProd {};

template<typename _Type, int _Length>
struct VectorDotProd<Vector<_Type[_Length]> >
{
    typedef Vector<_Type[_Length]> _Vector;

    __inline static float Apply(_Vector const & vector0,
_Vector const & vector1)
    {
        return Impl<_Type, _Length-1>::Apply(vector0, vector1);
    }

    template<typename _Type, int _Index>
    struct Impl
    {
        __inline static float Apply(_Vector const & vector0,
_Vector const & vector1)
        {
            return vector0._value[_Index] * vector1._value[_Index]
                + Impl<_Type, _Index-1>::Apply(vector0, vector1);
        }
    };

    template<typename _Type>
    struct Impl<_Type, 0>
    {
        __inline static float Apply(_Vector const & vector0,
_Vector const & vector1)
        {
            return vector0._value[0] * vector1._value[0];
        }
    };

};

int main()
{
    typedef Vector<float[3]> Vector3f;
    Vector3f v0 = {1.0f, 2.0f, 3.0f};
    Vector3f v1 = {1.0f, 2.0f, 3.0f};

    float result = VectorDotProd<Vector3f>::Apply(v0, v1);
    _RPTF1(_CRT_WARN, "v0.v1 = %f\n", result);

    printf("%f\n", result);

    return 0;
}