Re: Is this standard c++...

From:
"Chris Thomasson" <cristom@comcast.net>
Newsgroups:
comp.lang.c++
Date:
Fri, 2 Mar 2007 03:48:36 -0800
Message-ID:
<IsOdnV3Z4OSSknXYnZ2dnUVZ_uGjnZ2d@comcast.com>
"Chris Thomasson" <cristom@comcast.net> wrote in message
news:XfqdnfVjNYfminvYnZ2dnUVZ_sapnZ2d@comcast.com...

"kwikius" <andy@servocomm.freeserve.co.uk> wrote in message
news:1172634410.345742.116020@m58g2000cwm.googlegroups.com...
[...]

Yep. I figured it out eventually, I think.
It seems to be possible, but at the expense of always allocating your
char array oversize by alignment_of<T> - 1. It's not possible to know
where on the stack the lmem object will go.

[...]

But consider a variant, and you are smokin': treating the stack like the heap
with no alloc overhead...

This may be the purpose behind the device ...

[...]

Indeed it is. I believe that I could make use of the following function,
'ac_malloc_aligned':

http://appcore.home.comcast.net/appcore/src/appcore_c.html
(2nd to last function in the file...)
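
Just to give the general idea -- this is only a sketch of the usual
over-allocate-and-round-up technique, NOT the actual ac_malloc_aligned code,
so go look at the real thing at the link above -- an aligned malloc basically
boils down to something like this (assuming alignsz is a power of two and
size_t is wide enough to hold a pointer):

<pseudo-code/sketch>
---------

#include <cstdlib>
#include <cstddef>
#include <cassert>

/* hand back 'sz' bytes aligned to 'alignsz'; the real base pointer is
   stashed just below the aligned block so the matching free can recover it */
void* xmalloc_aligned(std::size_t sz, std::size_t alignsz) {
  void *base = std::malloc(sz + alignsz - 1 + sizeof(void*));
  if (! base) { return 0; }
  std::size_t raw = reinterpret_cast<std::size_t>(base) + sizeof(void*);
  std::size_t aligned = (raw + alignsz - 1) & ~(alignsz - 1);
  assert(! (aligned % alignsz));
  reinterpret_cast<void**>(aligned)[-1] = base;  /* remember the real base */
  return reinterpret_cast<void*>(aligned);
}

void xfree_aligned(void *ptr) {
  if (ptr) { std::free(reinterpret_cast<void**>(ptr)[-1]); }
}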


Okay. I was thinking of something kind of like:

<pseudo-code/sketch>
---------

#include <cstdio>
#include <cstddef>
#include <cassert>
#include <new>

template<size_t T_basesz, size_t T_metasz, size_t T_basealign>
class lmem {
  // raw storage: user area + metadata area + worst-case alignment slack
  unsigned char m_basebuf[T_basesz + T_metasz + T_basealign - 1];
  unsigned char *m_alignbuf;

private:
  // round 'buf' up to the next multiple of 'alignsz'
  static unsigned char* alignptr(unsigned char *buf, size_t alignsz) {
    ptrdiff_t base = buf - static_cast<unsigned char*>(0);
    ptrdiff_t offset = base % alignsz;
    ptrdiff_t result = (! offset) ? base : base + alignsz - offset;
    assert(! (result % alignsz));
    return static_cast<unsigned char*>(0) + result;
  }

public:
  lmem() : m_alignbuf(alignptr(m_basebuf, T_basealign)) {}
  // storage suitably aligned for a T; sizeof(T) stands in for its
  // alignment here, since there is no portable alignof yet
  template<typename T>
  void* loadptr() const {
    // room for one T plus worst-case padding of sizeof(T) - 1 bytes
    assert(T_basesz >= (sizeof(T) * 2) - 1);
    return alignptr(m_alignbuf, sizeof(T));
  }
  // the metadata region lives directly after the T_basesz user bytes
  void* loadmetaptr() const {
    return m_alignbuf + T_basesz;
  }
};

namespace detail {
  namespace os {
  namespace cfg {
    enum config_e {
      PAGE_SZ = 8192
    };
  }}

  namespace arch {
  namespace cfg {
    enum config_e {
      L2_CACHELINE_SZ = 64
    };
  }}

  namespace lheap {
  namespace cfg {
    enum config_e {
      BUF_SZ = os::cfg::PAGE_SZ * 2,
      BUF_ALIGN_SZ = arch::cfg::L2_CACHELINE_SZ,
      BUF_METADATA_SZ = sizeof(void*)
    };
  }}
}

// runs the destructor only; the storage itself belongs to the lmem
// object, so there is nothing to free here
template<typename T>
class autoptr_calldtor {
  T *m_ptr;
public:
  autoptr_calldtor(T *ptr) : m_ptr(ptr) {}
  ~autoptr_calldtor() {
    if (m_ptr) { m_ptr->~T(); }
  }
  T* loadptr() const {
    return m_ptr;
  }
};

namespace lheap {
  using namespace detail::lheap;
}

class foo {
public:
  foo() { printf("(%p) foo::foo()\n", (void*)this); }
  ~foo() { printf("(%p) foo::~foo()\n", (void*)this); }
};

int main() {
  {
    lmem<lheap::cfg::BUF_SZ,
         lheap::cfg::BUF_METADATA_SZ,
         lheap::cfg::BUF_ALIGN_SZ> foomem;

    autoptr_calldtor<foo> f(new (foomem.loadptr<foo>()) foo);
  }

  printf("\npress any key to exit...\n"); getchar();
  return 0;
}

The lmem object is meant to be a barebones, low-level buffer object in the
"system-code" part of the C++ memory allocator library I am currently
developing. Basically, I am going for a fairly thin wrapper over the
allocator pseudo-code I posted; you can follow the link to the invention to
look at it. Humm... As you can probably tell by now, I am a hard-core
C programmer, and I must admit that my C++ skills could be improved
upon... So, any ideas for interface designs, or even system-level design,
are welcome; one possible direction is sketched just below...
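
For instance, I keep going back and forth on hiding the placement-new
boilerplate behind a tiny typed helper; a rough sketch, assuming the lmem and
autoptr_calldtor classes above (the name 'lmem_construct' is made up for this
post):

<pseudo-code/sketch>
---------

// default-construct a T in the aligned storage handed out by an lmem
// object; destructor duty stays with autoptr_calldtor at the call site
template<typename T, size_t B, size_t M, size_t A>
T* lmem_construct(lmem<B, M, A> &mem) {
  return new (mem.template loadptr<T>()) T;
}

// usage shrinks down to something like:
//   autoptr_calldtor<foo> f(lmem_construct<foo>(foomem));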

I was thinking about using a single lmem object per thread, and then
using it to allocate all of the per-thread data structures my allocator
algorithms rely upon. So, essentially, every single data structure that
makes up my multi-threaded allocator design can be based entirely in the
stacks of a plurality of threads. Wow, this has the potential for simply
excellent scalability and performance characteristics; anyway... ;^)

So, since lmem is all I "really" need, and I don't want to post any of the
implementation details w.r.t. the lock-free algorithms, etc., what else can I
discuss here that's on topic... I am going to need to finally decide on
exactly how I will be laying out the per-thread structures in the buffer
managed by lmem; perhaps something along the lines of the rough sketch
below...
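
Off the top of my head, the simplest layout is probably just carving each
structure out of the buffer in order, bumping a cursor and re-aligning as I
go. A rough sketch, again assuming the lmem class above; the structure names
and the lmem_carver helper are placeholders for this post, not the real
per-thread structures:

<pseudo-code/sketch>
---------

// placeholder per-thread structures; the real ones are the allocator's
// business and are not shown here
struct per_thread_heap   { unsigned char pad[512]; };
struct per_thread_caches { unsigned char pad[1024]; };

// carve objects out of an lmem buffer front-to-back
template<size_t B, size_t M, size_t A>
class lmem_carver {
  unsigned char *m_cur;
  unsigned char *m_end;
public:
  explicit lmem_carver(lmem<B, M, A> &mem)
    : m_cur(static_cast<unsigned char*>(mem.template loadptr<unsigned char>())),
      m_end(static_cast<unsigned char*>(mem.loadmetaptr())) {}

  // default-construct a T at the next suitably aligned spot, or return 0
  // if the buffer is exhausted; sizeof(T) again stands in for alignment
  template<typename T>
  T* carve() {
    size_t skew = reinterpret_cast<size_t>(m_cur) % sizeof(T);
    size_t pad = skew ? sizeof(T) - skew : 0;
    if (static_cast<size_t>(m_end - m_cur) < pad + sizeof(T)) { return 0; }
    unsigned char *raw = m_cur + pad;
    m_cur = raw + sizeof(T);
    return new (raw) T;
  }
};

// e.g., inside the per-thread setup:
//   lmem_carver<lheap::cfg::BUF_SZ,
//               lheap::cfg::BUF_METADATA_SZ,
//               lheap::cfg::BUF_ALIGN_SZ> carver(foomem);
//   per_thread_heap   *h = carver.carve<per_thread_heap>();
//   per_thread_caches *c = carver.carve<per_thread_caches>();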

Then I need to think about how I am going to ensure that a thread's stack
does not go away while any of its allocator structures are still in use by
another thread. The following technique currently works fine:

<pseudo-code>

class mylib_thread {
 // ...
public:
  ~mylib_thread() {
    /*
     special atomic-decrement-and-wait function;
     off-topic, not shown here...
     we can discuss the lock-free aspects of my algorithm
     over on comp.programming.threads...
   */
  }
};

void user_thread_entry(mylib_thread &_this) {
  // user application code
}

void libsys_thread_entry(...) {
  // library system code

  lmem<lheap::cfg::BUF_SZ,
       lheap::cfg::BUF_METADATA_SZ,
       lheap::cfg::BUF_ALIGN_SZ> _thismem;

  autoptr_calldtor<mylib_thread>
    _this(new (_thismem.loadptr<mylib_thread>()) mylib_thread);

  user_thread_entry(*_this.loadptr());
}

Any thoughts?
