Sunday, November 1, 2009

[SPOJ] MKPAIRS

This task is an application of the y-intercept optimization problem in the previous post.

From an initial $O(n^3)$ DP algorithm, we could speed up to $O(n^2logn)$, which still got TLE on SPOJ.
Since for each hull, all the query slopes are decreasing, we could speed up the algorithm further to $O(n^2)$.

See the USACO analysis for more details.

The y-intercept optimization problem

Given a set of pairs $(x_1, y_1), ..., (x_k, y_k)$, and a number $a$, how do we find: $min \{ax_i+y_i\} $

This problem can be viewed geometrically as the following.
Consider $(x_1, y_1), ..., (x_k, y_k)$ as points on the plane.

For a given $a$, the line equation of the line passing through $(x_i, y_i)$ with slope $-a$ is:
$y = a(x_i-x) + y_i$
When $x=0$, $y=ax_i + y_i$, thus $ax_i+y_i$ is the y-intercept of the line passing through $(x_i, y_i)$ with slope $-a$

Our problem becomes to find the minimum y-intercept of the lines passing through $(x_i, y_i)$ with slope $-a$

It can be proved (and intuitively) that we only need to keep the convex hulls of $(x_1, y_1), ..., (x_k, y_k)$

The problem of maximizing the y-intercept is similar.

For the minimization problem, we only need to keep the lower hull. It is a set of connected segments with increasing slopes:



If we need to query the minimum intercept of lines with slope $a$, the optimum point $(x_i, y_i)$ is the point such that $a$ lies between the slope of the segment $(i-1), i$ and the segment $i, (i+1)$ (see figure above).

In other words, to query the minimum intercept, we can use binary search on the sequence of slopes.

If we know that the slope being queried is always increasing, we can just go through the set of segments linearly.

To keep track of the lower hull, if the given points are in increasing order of x, we could use a stack and obtain linear time amortized complexity:
for i ← 1 to k do
while (size(s)>2 and (s[top-1], s[top], p[i]) is a right turn
pop(s)
s.push(p[i])

A direct application of this problem is the task "Line Country" from NUS Selection Test #2, 2009. Since the slope being queried is not always increasing, we need to use binary search to obtain a O(nlogn) algorithm.
Implementation

Saturday, October 31, 2009

[SPOJ] PALDR (Solved)

PALDR Solved
Same algorithm but replaced palindromic checking by hashing (use int64 hash range):

for (int j=i; j < n; ++j) {
h=31*h+s[j];
hR=hR+p*s[j];
p*=31;

if ((j-i)%2==1 && h==hR) {
cont=true;
i=j+1;
break;
}
}

[SPOJ] MYSTIC

Let N be the number of types.

We have S trials and N types. N must occur $u \geq need[N]$ times in the sequence of trials. Removing these occurrences of N, we obtain a subproblem with S-u trials and N-1 types. This lead us to the following DP solution:

Let C(S,T) be the number of outcomes to satisfy the first S types when we have T trials. We have:
$C(S,T) = \sum_{u \geq need[S]}{_{u}^{T}{C}\times C(S-1,T-u)}$

We would like to compute the probability, not the cardinality C(S,T).
Let $F(S,T) = \frac{C(S,T)}{N^T}$
We have:
$F(S,T) = \frac{\sum_{u \geq need[S]}{_{u}^{T}C\times F(S-1,T-u)\times N^{T-u}}}{N^T}$
$= \sum_{u \geq need[S]}{\frac{_{u}^{T}C\times F(S-1,T-u)}{N^u}}$

F(N, G) is the result probability.

Note that F(S, T) is not a probability value. The correct probability value should be $\frac{C(S, T)}{S^T}$, not $\frac{C(S, T)}{N^T}$. To define F(S,T) is a trick to get the result probability faster.

Probability and cardinality could be used interchangeably.

When constructing a DP solution, use cardinality to be clearer. After that, divide the cardinality to the total outcome to get the probability function.

Of course, we could also define $F(S,T) = \frac{C(S,T)}{S^T}$, which is the correct probability value. A little bit more machine computations is required.

[SPOJ] VPALIN

Let $P=A^m=\underset{m \: times}{\underbrace{AA..A}}$
It's easy to prove that: if P is a palindrome then A is also a palindrome

Let Base(P) = the shortest string A such that $P=A^m$
Let P, Q be two palindromes. We can prove that:
PQ is a palindrome iff Base(P) = Base(Q)

Therefore, the problem is reduced to finding the period of a string, which can be done in O(n) using KMP's Next table.

To efficiently count the number of appearances of a base, we can use a hash function from a string into int64 range.
int64 hash(char *s, int n) {
int64 h=0;
FOR(i,n) h=31*h+s[i];
return h;
}

[SPOJ] Bookcase

Observe that: one of the maximum height is the largest height among all books

There are 3 book cases, let H1 H2 H3 be the maximum height of a book on case 1, 2, 3
WLOG, we always put the book with largest height on case 3
Let W1 be the width of case 1, W2 be the width of case 2
=> the width of case 3 = totalWidth - W1 - W2

Thus, if for every possible values (W1,W2), we could compute min{H1 + H2}, we are done.
This can be computed by DP:
let f[W1, W2] be the minimum value of H1+H2, among all possible arrangements of books such that case 1 has width W1 and case 2 has width W2 and the book width largest height is on case 3.

To see this, consider the 1D version: find a subset of book such that the sum of width is W and the maximum height is minimized. Sort the book in decreasing height. Let F[i][w] = minimum {max height | sum of width = w, using first i books}

If width[i] <= w:
F[i][w]=F[i-1][w] + (width[i] < w?F[i-1][w-width[i]]:(f[i-1][0]+height[i]))

The 2D case is similar: F[i][w1][w2]

Wednesday, October 28, 2009

[SPOJ] CSUBSEQS

Task: Given K sequences, determine the number of distinct common subsequences of them.

Suppose $\Sigma$ is the alphabet. In this problem, K=4 and $|\Sigma|=26$.

I have two algorithms for this problem:

Algorithm 1: $O(kn^k|\Sigma|)$:

Idea: partition the set of common subsequences by the ending character. If C is a common subsequence, it must either end by 'a', 'b', ..., or 'z'.

Let $S_{i_1,i_2,...,i_k}$ be the set of distinct common subsequences of $s_1[1..i_1], s_2[1..i_2], ..., s_k[1..i_k]$

We have:
$S_{i_1,i_2,...,i_k} = \bigcup_{c \in \Sigma}{c \cup ({S_{p_1-1,p_2-1,...,p_k-1}} \times \{ c \}})$
in which $p_i$ is the index of the last appearance of c in $s_i$.

In other words, the common subsequences ended by c are either:
  • The one-element subsequence "c"

  • A concatenation of a common subsequence of $S_{p_1-1,p_2-1,...,p_k-1}$ and c

Note that the character c must have been appeared in all sequences so far.

Speed up: We could use memoization instead of a bottom-up DP approach since not every entry in the table needs to be computed. Which entries are computed depends on the input string. My submission using memoization and map to store states got the top speed on SPOJ.

Algorithm 2: $O(2^{k+1}n^k)$
Let $S_{i_1,i_2,...,i_k}$ be the set of distinct common subsequences of $s_1[1..i_1], s_2[1..i_2], ..., s_k[1..i_k]$
Partition the common subsequences into to two set:
  • At least there is a string $s_t$ in which the common subsequence does not end at $s_t[i_t]$: we could compute the number of elements of this set by using the Inclusion Exclusion formula.

  • The common subsequence ends at $s_1[i_1], s_2[i_2], ..., s_k[i_k]$: we compute how many new subsequences in this set. This could also be computed using the Inclusion Exclusion formula by looking at the set of indexes $p_1, p_2, ... ,p_k$ in which $p_t$ is the index of the last appearance before $i_t$ of the character $s_t[i_t]$


My solution for algorithm 2 got TLE.

Note
To deal with distinct subsequences, observe that if a subsequence in the set end with a character c, it could always end with the rightmost c. An algorithm often processes indexes array such as p[i] = max{j, j < i AND s[j] = s[j]} (last appearance of the same element, or p[i][c] = max{j, j<=i AND s[j]=c} (last appearance of c)

Tuesday, October 27, 2009

[SPOJ] O(n^2) test case for PALDR's simple algorithm

Thanks to misof, there's a counterexample to show that the simple algorithm for PALDR is still O(n^2):

(abcd)^n xy (dcba)^n

Problem is rejudged. Some previously AC solutions got TLEs.

[SPOJ] EPALIN

Consider the longest border (LB) problem:

Input: string s[1..n] of length n
Output: LB(s) = max{ i | i < n AND s[1..i] = s[n-i+1..n] }

LB(s) can be computed in O(n) (this problem was introduced in the KMP algorithm)

EPALIN is essentially to find the longest palindrome suffix:
LS(s) = min{i | s[i..n] is a palindrome}

We reduce LS to LB. Consider the string:
$s^\prime = s_ns_{n-1}...s_1 \alpha\beta s_1s_2...s_n$ in which $\alpha \neq \beta$

It is easy to prove that LS(s') = LB(s)

Note: palindromic property can sometimes be handled by consider the string $SS^R$

Sunday, October 25, 2009

[SPOJ] MTREECOL - Color A Tree

1. A simple scheduling problem
Consider the simple scheduling problem:

Given N tasks, the $i^{th}$ needs $T_i$ seconds to be done. Find a schedule such that the total waiting time of all tasks is minimum.

The intuitive (and optimal) schedule is to do the shorter tasks earlier.

2. A generalized version
In the above problem, if a task i is started at time t, it contributes $t$ seconds to the total waiting time.

Consider the generalized version of the above problem in which if a task i is started at time t, it contributes a linear function of $t$ to the total cost, i.e. $A_it + B_i$ for two coefficients $A_i$ and $B_i$ of the task i. What is the optimal schedule in this problem?

Consider an arbitrary schedule in which $i$ precedes $i^\prime$. Suppose $i$ is started at time t. Swap $i$ and $i^\prime$, the cost incurred by the remaining tasks are still the same.
Thus, the cost difference between the latter and the former case is:
$D = A_i^\prime t + B_i^\prime + A_i (t+T_i^\prime) + B_i - (A_it + B_i + A_i^\prime(t+T_i) + B_i^\prime)$
$= A_iT_i^\prime - A_i^\prime T_i$
We have:
$D < 0 \Leftrightarrow \frac{A_i}{T_i} < \frac{A_i^\prime}{T_i^\prime}$

It implies that the larger the value of $\frac{A_i}{T_i}$ , the earlier the task i should be done.

We call $\frac{A_i}{T_i}$ the cost rate (CR) of i.

3. Restriction on tree ordering
In the above problem, there's no restriction on the ordering of the tasks: every task can be done first.

What can be said if the relationship between the tasks form a tree, i.e. in order to do a task, all of its ancestors must be finished first?

Let q be the task with largest CR.

If q has no parent, there always be an optimal schedule in which q is the first job to be done. In this case, we add q to the first slot in the schedule and continue the process with the remaining tasks.

Otherwise, let p be q's parent.
Consider an arbitrary schedule of the task:
$... p ... q ...$
If we move q to the left in the schedule such that q follows right after p, the total cost will not decrease.
That implies there always be an optimal schedule in which q follows right after p. Thus, we could "merge" p and q into a single node P.

Coloring node P at time t is equivalent to coloring node p at time t and then coloring node q at time t+T_p. The cost incurred is:

$A_pt+B_p + A_q(t+T_p) + B_q = (A_p+A_q)t + B_p + A_qT_p + B_q$

Thus, the coefficients of the new node P is:
  • $A_P = A_p + A_q$

  • $B_P = B_p + A_qT_p + B_q$

  • $T_P = T_p + T_q$

We continue locate the task with largest CR. At each step, the number of tasks is decreased by 1. When there is no task left, we have an optimal schedule.

4. Algorithm
We can use a priority queue to locate the nodes with largest CR and obtain an O(nlogn) algorithm.

Pseudocode:

While #node > 0
q ← task with largest CR A[q]T[q]
if q.parent = null
add q to the schedule
for each (child w of q)
w.parent = null
else
p ← q.parent
A[p[ ← A[p] + A[q]
B[p] ← B[p] + A[q]T[p] + B[q]
T[p] ← T[p] + T[q]
for each (child w of q)
w.parent ← p
child[p].add(w);
remove q


5. Note
Idea: in this problem, the optimal solution satisfies a criteria which allows us to reduce to a problem's instance with smaller size. It's all about reduction.

[TopCoder] SRM451 - Div1 - Level 1

Simple math help solve the problem: a*(1 + 10 + ... + 10^n) = x => brute force for n

Saturday, October 17, 2009

[SPOJ] MPART - Jamie's Contact Groups

Side notes: I was trying to figure out how a solution was very faster than mine. However, it turns out that solution was wrong (using pascal's string to read input instead of ansistring ) but still passed a special test case. So funny. When I corrected that solution, it was no better than mine.

Some notes on implementing Maximum Flow:

Use some wrapper ID functions to make the code readable and less exposed to errors:

int idSource() {return 0;}
int idTarget() {return n+m+1;}
int idPerson(int i) {return 1+i;}
int idGroup(int i) {return 1+n+i;}

Friday, October 16, 2009

[SPOJ] PALDR

Suppose s is concatenation of even palindromes
Let p=min{i | s[0..i] is an even palindrome}
Then there is a concatenation using s[0..p]
Proof: let s[0..q] be the first even palindrome in a decomposition of s (q>p)
Since s[0..q] is a palindrome, s[0..p] = s[q..q-p] = an even palindrome
=> s[0..p] s[p+1..q-p-1] s[q-p..q] is also a decomposition of s[0..q] (QED)
Algorithm:
1. search for p
2. continue the process with the string s[p+1 .. n-1]
The number of times we test for palindrome is O(n)
However, the complexity of each palindrome test could not be all O(n) and thus, the total complexity could not reach O(n^2)

[SPOJ] AEB52

Let c[i] = (#occurences of i) - 1, for i=1..n
The sequence is valid iff:
c[i] + c[i+1] + ... + c[n] >= 0 for all i=1..n <=> min(c[i] + c[i+1] + ... + c[n]) >= 0

Use an interval tree to represent c[i] and query: t=min(c[i] + c[i+1] + ... + c[n])

Wednesday, October 14, 2009

[SPOJ] MOBIVINA

This problem is a straight application of the maximum flow / minimum cut algorithm.

Construct a graph that has N+2 vertices.
N vertices corresponding to N people. 1 corresponds to Mobi, 1 corresponds to Vina.

Person i - Person j: capacity C[i,j]
Mobi - Person i : capacity M[i]
Person i - Vina: capacity V[i]

A cut Mobi-Vina corresponds to a way to assign people to services.
Cut's capacity = cost
Thus, maximum flow <=> minimum cut.

[SPOJ] STREDUCE

Dynamic programming O($N^3$)

Compute boolean array trans[i][j][t] = true (t=0..5) <=> s[i..j] can transform into A, B, AA, BB, AB, BA

Based on trans, we can easily compute f[i][j] = mininum length that s[i..j] can reduce to

Tuesday, October 13, 2009

[SPOJ] KPGAME

I assume this fact (seems reasonable although I could not prove it):

- If N is odd: taking a coin is always better
- If N is even: not taking a coin is always better

Let $F_N$ be the probability that the first player will win if there are N stones

With the above fact, we can derive the recurrence formula for $F_N$:

$\footnotesize{F_N = \begin{cases}
P[QF_{N-1} + (1-Q)F_{N-2}]+ (1-P)[QF_{N-1}+(1-Q)F_N)]& \text{ if N is odd} \\
(1-P)[(1-Q)F_{N-1}+QF_{N-2}]+ P[(1-Q)F_{N-1} + QF_N]& \text{ if N is even}
\end{cases}}$

$\footnotesize{F_N = \begin{cases}
QF_{N-1} + P(1-Q)F_{N-2} + (1-P)(1-Q)F_N & \text{ if N is odd} \\
(1-Q)F_{N-1}+(1-P)QF_{N-2} + PQF_N& \text{ if N is even}
\end{cases}}$

$\small{F_N = \begin{cases}
\frac{QF_{N-1} + P(1-Q)F_{N-2}}{1-(1-P)(1-Q)} & \text{ if N is odd} \\ \\
\frac{(1-Q)F_{N-1}+(1-P)QF_{N-2}}{1-PQ}& \text{ if N is even}
\end{cases}}$

Let's use matrix multiplication to compute $F_N$ quickly. We have:

$\begin{bmatrix}
F_N\\
F_{N-1}
\end{bmatrix}
=
\begin{cases}
\begin{bmatrix}
\frac{Q}{1-(1-P)(1-Q)}& \frac{P(1-Q)}{1-(1-P)(1-Q)}\\
1 & 0
\end{bmatrix}
\begin{bmatrix}
F_{N-1}\\
F_{N-2}
\end{bmatrix} & \text{ if N is odd}
\\
\\
\begin{bmatrix}
\frac{1-Q}{1-PQ}& \frac{(1-P)Q}{1-PQ}\\
1 & 0
\end{bmatrix}
\begin{bmatrix}
F_{N-1}\\
F_{N-2}
\end{bmatrix} & \text{ if N is even}
\end{cases}$

And:
$\begin{bmatrix}
F_N\\
F_{N-1}
\end{bmatrix}
=
\begin{cases}

AB^{\frac{N-1}{2}}
\begin{bmatrix}
F_1 \\
F_0
\end{bmatrix} & \text{ if N is odd}
\\
\\
B(AB)^{\frac{N}{2}-1}
\begin{bmatrix}
F_1 \\
F_0
\end{bmatrix} & \text{ if N is even}
\end{cases}$

Where A is the first matrix and B is the second matrix.

Base case:
$\begin{bmatrix}
F_1 \\
F_0
\end{bmatrix}
=
\begin{bmatrix}
\frac{P}{1-(1-P)(1-Q)} \\
0
\end{bmatrix}$

Implementation: calculation up to N = 10^8 would cause floating points error and using arbitrary length arithmetic would be too slow (e.g. Python's decimal). Trick: let N = min(N, 1000+N%2), the answer wouldn't be much different.

Test LaTeX

$\sum_{i=1}^n x_i$

To install LaTeX on Blogger