Re: Random weighted selection...
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
---910079544-1363231449-1243525163=:23744
Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-15; FORMAT=flowed
Content-Transfer-Encoding: 8BIT
Content-ID: <alpine.DEB.1.10.0905290026521.14054@urchin.earth.li>
On Wed, 27 May 2009, B?gus Excepti?n wrote:
On May 10, 9:27?am, Tom Anderson <t...@urchin.earth.li> wrote:
interface Job extends Runnable {
? ? ? ? public int priority();
}
Then:
Job pickJob(List<Job> jobs, Random rnd) {
? ? ? ? int totalPriority = 0;
? ? ? ? for (Job job: jobs) totalPriority += job.priority();
? ? ? ? int index = rnd.nextInt(totalPriority);
? ? ? ? for (Job job: jobs) {
? ? ? ? ? ? ? ? index -= job.priority();
? ? ? ? ? ? ? ? if (index < 0) return job;
? ? ? ? }
? ? ? ? assert false; return null; // should be unreachable!
}
This seems to be what Seamus suggested, no?
Yes.
class QueuedJob implements Comparable<QueuedJob> {
? ? ? ? public final Job job;
? ? ? ? public final int serialNumber;
? ? ? ? public QueuedJob(Job job, ?int serialNumber) {
? ? ? ? ? ? ? ? this.job = job
? ? ? ? ? ? ? ? this.serialNumber = serialNumber;
? ? ? ? }
? ? ? ? public int compareTo(QueuedJob that) {
? ? ? ? ? ? ? ? int difference = -compare(this.job.priority(), that.job.priority());
? ? ? ? ? ? ? ? if (difference == 0) difference = compare(this.serialNumber, that.serialNumber);
? ? ? ? ? ? ? ? return difference;
? ? ? ? }
? ? ? ? private int compare(Integer a, Integer b) {
? ? ? ? ? ? ? ? return a.compareto(b);
? ? ? ? }
? ? ? ? public boolean equals(Object obj) {
? ? ? ? ? ? ? ? if ((obj == null) || !(obj instanceof QueuedJob)) return false;
? ? ? ? ? ? ? ? QueuedJob that = (QueuedJob)obj;
? ? ? ? ? ? ? ? return (this.job.equals(that.job)) && (this.serialNumber == that.serialNumber);
? ? ? ? }
}
class JobQueue {
? ? ? ? private SortedSet<QueuedJob> qjobs = new TreeSet<QueuedJob>();
? ? ? ? private int counter = 0;
? ? ? ? private Random rnd = new Random();
? ? ? ? private float p = 1.0 / 3.0; // or whatever
? ? ? ? public void enqueue(Job job) {
? ? ? ? ? ? ? ? qjobs.add(new QueuedJob(job, counter));
? ? ? ? ? ? ? ? ++counter; // ignore obvious problem with wrapping
? ? ? ? }
? ? ? ? public Job pop() {
? ? ? ? ? ? ? ? Iterator<QueuedJob> it = qjobs.iterator();
? ? ? ? ? ? ? ? while (it.hasNext()) {
? ? ? ? ? ? ? ? ? ? ? ? if (rnd.nextFloat() <= p) return pop(it);
? ? ? ? ? ? ? ? }
? ? ? ? ? ? ? ? // might not picky any of the above, so:
? ? ? ? ? ? ? ? return pop(qjobs.iterator());
? ? ? ? }
? ? ? ? private Job pop(Iterator<QueuedJob> it) {
? ? ? ? ? ? ? ? Job job = it.next().job;
? ? ? ? ? ? ? ? it.remove();
? ? ? ? ? ? ? ? remove job;
? ? ? ? }
}
Tom, is there an alternative to the costly and time consuming
iteration?
Yes - if you're talking about implementing the same probabilistic
mechanism as in the code immediately above. Instead of a TreeSet, use an
ArrayList, and keep it sorted, so that you have O(1) access to any
element. You then think about the maths: in the above scheme, every
element has a probability of being picked, and the probabilities for all
elements plus the possibility of none being picked sum to one (of course).
That is, there's a cumulative distribution over the options. You just have
to figure out what the function is, reverse it, then you can inject a
random number between 0 and 1, and get out an item, with exactly the same
probability distribution as with the iterative approach.
Having done a bit of scribbling on the back of an index card and a python
prompt, i think the magic function (where p is your probability of picking
at every step, and x is a random number from a uniform distribution
between 0 and 1) is:
index(x) = ceil(log_p(x))
And if index(x) >= number of items, then take it as 0.
Bear in mind that:
log_a(b) = log(b) / log(a)
Java doesn't have a function to do logarithms to an arbitrary base, but
that identity will let you compute them.
This was not part of what I had recalled in the initial post in this
thread, but...
What is the jobs were in an ArrayList, or other Collection(), and the
basic parameters were known about the list/collection/array.
You know what is easy/fast to learn:
The total # of elements
The highest priority
The lowest priority
How many elements will be selected (pulled out of the array)
Why couldn't you do _one_ iteration, and make your choice as to
whether to include an item at the same time that item comes up in the
iteration?
If you need to pick several items, rather than just one, then yes, you
could do them all at once.
Would sorting/ordering even be necessary? Without testing, that would
seem to be the least task intensive way to make a selection- even if not
a random weighted one.
If you're following the first algorithm, as at the top of this post, the
jobs don't need to be sorted. You do need to iterate over them. You can
pick several jobs at once (although you have to be a bit careful not to
pick the same job more than once).
You could also do something clever with enfilade trees that would let you
do an O(log n) lookup instead of an O(n) iteration, but that's a whole
other story.
tom
--
China Mieville has shown us how to be a good socialist and a bad science
fiction writer. -- The Times
---910079544-1363231449-1243525163=:23744--