Re: Why is java consumer/producer so much faster than C++
On Jul 22, 2:59 pm, Melzzzzz <m...@zzzzz.com> wrote:
I have played a little bit with the new C++11 features and compared
Java performance to C++.
Actually this was meant to be GC vs RAII memory management,
but boiled down to speed of BlockingQueue class in java,
and mine in c++.
It seems that java implementation is so much more efficient
but I don't know why. I even tried c++ without dynamic
memory management (except queue itself) and that is *even slower*.
Must be some quirks with a queue ;)
These are timings:
(java openjdk 1.7)
bmaxa@maxa:~/examples$ time java consprod
real 0m13.411s
user 0m19.853s
sys 0m0.960s
(c++ gcc 4.6.3)
bmaxa@maxa:~/examples$ time ./consprod
real 0m28.726s
user 0m34.518s
sys 0m6.800s
Example programs follow (I think implementations of
blocking queues are similar):
// java
import java.util.concurrent.*;
import java.util.Random;
class Vars{
public final static int nitems = 100000000;
public final static Random rnd = new Random(12345);
public final static int size = 10000;
}
class Producer implements Runnable {
private final BlockingQueue<Integer> queue;
Producer(BlockingQueue<Integer> q) { queue = q; }
public void run() {
try {
int i = Vars.nitems;
while(i-- > 0) { queue.put(produce(i)); }
} catch (InterruptedException ex)
{
}
}
Integer produce(int i) { return new Integer(i); }
}
class Consumer implements Runnable {
private final BlockingQueue<Integer> queue;
Consumer(BlockingQueue<Integer> q)
{
queue = q;
}
public void run() {
try {
Integer[] arr = new Integer[10000];
int i = Vars.nitems;
while(i-- > 0) { consume(queue.take(),arr); }
} catch (InterruptedException ex)
{
}
}
void consume(Integer x, Integer[] arr)
{
arr[Vars.rnd.nextInt(Vars.size)] = x;
}
}
public class consprod {
public static void main(String [] args) {
try{
BlockingQueue<Integer> q = new ArrayBlockingQueue<Integer>(100000);
Producer p = new Producer(q);
Consumer c = new Consumer(q);
new Thread(p).start();
new Thread(c).start();
} catch(Exception e)
{
e.printStackTrace();
}
}
}
//-----------------------------------------
// c++
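#include <condition_variable>
#include <mutex>
#include <thread>
#include <deque>
#include <cstdlib>
#include <functional> // std::function
#include <memory>     // std::unique_ptr
#include <utility>    // std::move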
template <class T>
class BlockingQueue{
public:
BlockingQueue(unsigned cap):capacity_(cap)
{
}
void put(T t)
{
std::unique_lock<std::mutex> lock(m_);
while(queue_.size() >= capacity_)c_full_.wait(lock);
queue_.push_back(std::move(t));
c_empty_.notify_one();
}
T take()
{
std::unique_lock<std::mutex> lock(m_);
while(queue_.empty())c_empty_.wait(lock);
T tmp = std::move(queue_.front());
queue_.pop_front();
c_full_.notify_one();
return tmp; // a local is moved (or elided) on return; std::move here only blocks elision
}
bool empty()
{
std::unique_lock<std::mutex> lock(m_);
return queue_.empty();
}
private:
std::mutex m_;
std::condition_variable c_empty_,c_full_;
std::deque<T> queue_;
unsigned capacity_;
};
int main()
{
BlockingQueue<std::unique_ptr<int>> produced(100000);
const int nitems = 100000000;
std::srand(12345);
std::function<void()> f_prod = [&]() {
int i = nitems;
while(i-- > 0){
produced.put(std::unique_ptr<int>(new int(i)));
}
};
std::thread producer1(f_prod);
std::function<void()> f_cons = [&]() {
const int size = 10000;
std::unique_ptr<int> arr[size];
int i = nitems;
while(i-- > 0)
{
arr[std::rand()%size] = produced.take();
}
};
std::thread consumer1(f_cons);
producer1.join();
consumer1.join();
}
What g++ optimization options did you use? If you didn't compile with
proper optimization flags (e.g. at least -O2), then the numbers you
have are meaningless.
Also, you might be testing the difference between std::rand() and
Java's Random.
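To take the generator out of the comparison, something like the following could stand in for std::rand() in the consumer. This is only a sketch: IndexSource is a name I made up, and std::minstd_rand is my choice because java.util.Random is documented as a linear congruential generator as well. Whether std::rand() takes a lock internally depends on the C library, so I am not claiming this is where the time goes.

```cpp
#include <random>

// Hypothetical helper (not from the original post): yields indices in
// [0, size) from an engine owned by the calling thread, so no global
// generator state is shared the way it is with std::rand().
class IndexSource {
public:
    IndexSource(unsigned seed, int size)
        : engine_(seed), dist_(0, size - 1) {}

    // Returns the next index; the distribution bounds are inclusive.
    int next() { return dist_(engine_); }

private:
    std::minstd_rand engine_;                  // small linear congruential engine
    std::uniform_int_distribution<int> dist_;  // maps engine output to [0, size - 1]
};
```

In the consumer lambda you would construct IndexSource idx(12345, size); once before the loop and write arr[idx.next()] = produced.take(); in place of the std::rand() line.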
Also, why did you use dynamic allocation in the C++ code for an int?
If your original goal was to compare the Java way to the C++ way, you
are not doing it right. This is especially troubling after you
apparently went through the effort to make the queue support move-only
types. (Not sure. I'm still new to move-semantics.)
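For reference, this is roughly what I mean by passing the int by value. It is a sketch, not a claim about speed: you say a variant without dynamic allocation was even slower, so all this does is remove new/delete from the picture so the queue itself can be measured. The queue is your class repeated so the snippet compiles alone; run_by_value is a hypothetical driver of mine.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Same shape as the queue in the original post.
template <class T>
class BlockingQueue {
public:
    explicit BlockingQueue(std::size_t cap) : capacity_(cap) {}

    void put(T t) {
        std::unique_lock<std::mutex> lock(m_);
        while (queue_.size() >= capacity_) c_full_.wait(lock);
        queue_.push_back(std::move(t));
        c_empty_.notify_one();
    }

    T take() {
        std::unique_lock<std::mutex> lock(m_);
        while (queue_.empty()) c_empty_.wait(lock);
        T tmp = std::move(queue_.front());
        queue_.pop_front();
        c_full_.notify_one();
        return tmp;
    }

private:
    std::mutex m_;
    std::condition_variable c_empty_, c_full_;
    std::deque<T> queue_;
    std::size_t capacity_;
};

// Hypothetical driver: sends nitems plain ints through the queue with no
// per-item heap allocation, and returns the sum of everything consumed.
long long run_by_value(int nitems, std::size_t capacity) {
    BlockingQueue<int> q(capacity);
    long long sum = 0;  // written only by the consumer, read after join()
    std::thread producer([&] { for (int i = nitems; i-- > 0;) q.put(i); });
    std::thread consumer([&] { for (int i = nitems; i-- > 0;) sum += q.take(); });
    producer.join();
    consumer.join();
    return sum;
}
```

Calling run_by_value(100000000, 100000) from main() and timing it would mirror the sizes in your post. With gcc it needs -std=c++0x (or later) and -pthread.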
Also,
std::unique_ptr<int> arr[size];
....
arr[std::rand()%size] = produced.take();
I wonder if it's possible to optimize that into a simple assignment,
or whether there will be a call to delete() - or at least a branch to
test if the internal member pointer is null.
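As far as I understand the standard, move assignment on unique_ptr behaves like reset(other.release()), so the old pointee is deleted whenever the slot was not empty. That can be observed with a counting deleter. A minimal sketch, where CountingDeleter and deletes_from_overwrites are names I invented for the demonstration:

```cpp
#include <memory>

// Hypothetical deleter that counts how many objects it destroys.
struct CountingDeleter {
    int* count;
    void operator()(int* p) const {
        ++*count;
        delete p;
    }
};

typedef std::unique_ptr<int, CountingDeleter> CountedPtr;

// Move-assigns into one initially empty slot `times` times and returns how
// many deletes the assignments themselves triggered. The count is read
// before the slot's own destructor runs, so the final delete is excluded.
int deletes_from_overwrites(int times) {
    int count = 0;
    CountedPtr slot(nullptr, CountingDeleter{&count});
    for (int i = 0; i < times; ++i) {
        // The first assignment finds an empty slot and deletes nothing;
        // every later one has to delete the previous int.
        slot = CountedPtr(new int(i), CountingDeleter{&count});
    }
    return count;
}
```

So deletes_from_overwrites(5) gives 4: one delete for each overwrite of a non-empty slot. In your consumer loop that means nearly every assignment into arr carries a null test plus a delete once the array has filled up, unless the optimizer can prove something I don't expect it to.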
Just off the top of my head.