Re: Why is java consumer/producer so much faster than C++
On Thu, 26 Jul 2012 23:12:54 +0200
Luca Risolia <luca.risolia@studio.unibo.it> wrote:
On 26/07/2012 23:01, Melzzzzz wrote:
On Thu, 26 Jul 2012 22:15:33 +0200
Luca Risolia <luca.risolia@studio.unibo.it> wrote:
On 26/07/2012 21:56, Melzzzzz wrote:
f = ++f % cap_;
compiler warns about undefined behavior here.
Ouch..split that:
++f;
f %= cap_;
I corrected that already ;)
I have installed gcc-snapshot compiler
gcc version 4.8.0 20120314 (experimental) [trunk revision 185382]
(Ubuntu/Linaro 20120314-0ubuntu2)
Your timings with array assignment and rand included, are much worse
than gcc 4.6.3. Definitely something is wrong if
other thread does something else besides waiting on
spin lock.
Yes, it's worth to strace std::rand() to see what it does exactly.
Out of my curiosity, try to yield() after releasing the lock in both
put() and take():
m_.unlock();
boost::this_thread::yield(); // add this line
do you see any improvements?
I have solved your problem. Seems that your program suffers same problem
as my original version. When applied Howards algorithm to your version
I've got fastest time both for rand array assignment and without:
take a look:
(merged version with rand array assignment)
bmaxa@maxa:~/examples$ time ./consprod5
real 0m4.731s
user 0m5.164s
sys 0m2.804s
(merged version without rand array assignment,
wow this is fast ;)
bmaxa@maxa:~/examples$ time ./consprod5
real 0m2.345s
user 0m2.624s
sys 0m0.820s
code:
#include <thread>
#include <cstdlib>
#include <functional>
#include <atomic>
#include <memory>
#include <type_traits>
class spinlock_mutex {
std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
void lock() {
while (flag.test_and_set(std::memory_order_acquire)) {
std::this_thread::yield();
};
}
bool try_lock(){
if(flag.test_and_set(std::memory_order_acquire))
return false;
else
return true;
}
void unlock() {
flag.clear(std::memory_order_release);
}
};
template <class T, class A = std::allocator<T >>
class BlockingQueue {
const std::size_t cap_;
A alloc_;
T* queue_;
std::size_t n = 0, b = 0 , f = 0;
spinlock_mutex m_;
public:
BlockingQueue(const std::size_t cap, const A& alloc = A())
: cap_(cap), alloc_(alloc), queue_(alloc_.allocate(cap)) { }
~BlockingQueue() {
alloc_.deallocate(queue_, cap_);
}
static_assert(std::is_nothrow_move_constructible<T>::value,
"BlockingQueue cannot be exception-safe");
void put(T t) {
bool locked = m_.try_lock();
int retry = 0;
while(!locked)
{
if(n > cap_ / 4 && ++retry < 1000)
{
std::this_thread::yield();
}
else
{
locked = m_.try_lock();
}
}
alloc_.construct(&queue_[f], std::move(t));
++f;
if(f == cap_)f = 0;
++n;
m_.unlock();
}
T take() {
bool locked = m_.try_lock();
int retry = 0;
while(!locked)
{
if(n < 3 * cap_ / 4 && ++retry < 1000)
{
std::this_thread::yield();
}
else
{
locked = m_.try_lock();
}
}
T t = std::move(queue_[b]);
alloc_.destroy(&queue_[b]);
++b;
if(b == cap_) b = 0;
--n;
m_.unlock();
return t;
}
};
int main() {
BlockingQueue<int> produced(1000);
const int nitems = 100000000;
std::srand(12345);
std::function<void() > f_prod = [&](){
int i = nitems;
while (i-- > 0) {
produced.put(i);
}
};
std::thread producer1(f_prod);
std::function<void() > f_cons = [&](){
const int size = 10000;
int arr[size];
int i = nitems;
while (i-- > 0) {
arr[std::rand() % size] =
produced.take();
}
};
std::thread consumer1(f_cons);
producer1.join();
consumer1.join();
}