Consider the simple scheduling problem:
Given N tasks, the $i^{th}$ needs $T_i$ seconds to be done. Find a schedule such that the total waiting time of all tasks is minimum.
The intuitive (and optimal) schedule is to do the shorter tasks earlier.
2. A generalized version
In the above problem, if a task i is started at time t, it contributes $t$ seconds to the total waiting time.
Consider the generalized version of the above problem in which if a task i is started at time t, it contributes a linear function of $t$ to the total cost, i.e. $A_it + B_i$ for two coefficients $A_i$ and $B_i$ of the task i. What is the optimal schedule in this problem?
Consider an arbitrary schedule in which $i$ precedes $i^\prime$. Suppose $i$ is started at time t. Swap $i$ and $i^\prime$, the cost incurred by the remaining tasks are still the same.
Thus, the cost difference between the latter and the former case is:
$D = A_i^\prime t + B_i^\prime + A_i (t+T_i^\prime) + B_i - (A_it + B_i + A_i^\prime(t+T_i) + B_i^\prime)$
$= A_iT_i^\prime - A_i^\prime T_i$
We have:
$D < 0 \Leftrightarrow \frac{A_i}{T_i} < \frac{A_i^\prime}{T_i^\prime}$
It implies that the larger the value of $\frac{A_i}{T_i}$ , the earlier the task i should be done.
We call $\frac{A_i}{T_i}$ the cost rate (CR) of i.
3. Restriction on tree ordering
In the above problem, there's no restriction on the ordering of the tasks: every task can be done first.
What can be said if the relationship between the tasks form a tree, i.e. in order to do a task, all of its ancestors must be finished first?
Let q be the task with largest CR.
If q has no parent, there always be an optimal schedule in which q is the first job to be done. In this case, we add q to the first slot in the schedule and continue the process with the remaining tasks.
Otherwise, let p be q's parent.
Consider an arbitrary schedule of the task:
$... p ... q ...$
If we move q to the left in the schedule such that q follows right after p, the total cost will not decrease.
That implies there always be an optimal schedule in which q follows right after p. Thus, we could "merge" p and q into a single node P.
Coloring node P at time t is equivalent to coloring node p at time t and then coloring node q at time t+T_p. The cost incurred is:
$A_pt+B_p + A_q(t+T_p) + B_q = (A_p+A_q)t + B_p + A_qT_p + B_q$
Thus, the coefficients of the new node P is:
- $A_P = A_p + A_q$
- $B_P = B_p + A_qT_p + B_q$
- $T_P = T_p + T_q$
We continue locate the task with largest CR. At each step, the number of tasks is decreased by 1. When there is no task left, we have an optimal schedule.
4. Algorithm
We can use a priority queue to locate the nodes with largest CR and obtain an O(nlogn) algorithm.
Pseudocode:
While #node > 0
q ← task with largest CR A[q]T[q]
if q.parent = null
add q to the schedule
for each (child w of q)
w.parent = null
else
p ← q.parent
A[p[ ← A[p] + A[q]
B[p] ← B[p] + A[q]T[p] + B[q]
T[p] ← T[p] + T[q]
for each (child w of q)
w.parent ← p
child[p].add(w);
remove q
5. Note
Idea: in this problem, the optimal solution satisfies a criteria which allows us to reduce to a problem's instance with smaller size. It's all about reduction.
If I have understood above correctly, for the tree:
ReplyDelete1 -> 2,
2 -> 3,
2 -> 4,
1 -> 5,
5 -> 6
cost of vertex 1 is 0,
cost of vertex 2 is 0,
cost of vertex 3 is 20,
cost of vertex 4 is 21,
cost of vertex 5 is 12,
cost of vertex 6 is 13.
Then you will give schedule as:
vertices 1 5 6 2 4 3,
cost of ordering is 288
Whereas the following schedule:
vertices 1 2 4 3 5 6,
has cost 281
Please tell me where I went wrong.