Editorial for DMOPC '21 Contest 8 P5 - Tree Building - Olympiads Online Judge

Editorial for DMOPC '21 Contest 8 P5 - Tree Building

Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Author: zhouzixiang2004

Subtask 1

Hint

Which sequences of

N

numbers can be the degree sequence of a tree with

N

nodes?

Solution

For subtask 1, note that any sequence of integers

d_{1}, d_{2}, \dots, d_{N}

with

d_{i} \geq 1

for all

i

and

\sum_{i = 1}^{N} d_{i} = 2 N - 2

can be the degree sequence of a tree with

N

nodes.

Proof

Clearly, these conditions are necessary. To prove they are sufficient, induct on

N

with the

N = 2

case being clear. As

\sum_{i = 1}^{N} d_{i} = 2 N - 2

, there exists indices

i

and

j

with

d_{i} \geq 2

and

d_{j} = 1

, so we can subtract

1

from

d_{i}

, delete

d_{j}

, inductively construct a tree for the remaining sequence, and attach a leaf to node

i

labelled

j

We can now perform a 2D knapsack dynamic programming, where

d p [i] [j]

represents the minimum

\sum_{k = 1}^{i} a_{d_{k}}

\sum_{k = 1}^{i} d_{k} = j

, with

K

transitions at each state.

Time Complexity:

O (N^{2} K)

Subtask 2

Hint

The subtask 1 solution performs the same transitions

N

times. There is a technique, similar to fast exponentiation, that reuses previously computed results so that we only need to perform the transitions

O (\log N)

times.

Solution

The subtask 1 solution transitions only from

d p [i] [∙]

d p [i + 1] [∙]

by looping through all possible values of

d_{i + 1}

(with additional cost

a_{d_{i + 1}}

). However, we can also transition directly from

d p [i] [∙]

d p [2 i] [∙]

by looping through all possible values of

d_{i + 1} + d_{i + 2} + \dots + d_{2 i}

(with additional cost

d p [i] [d_{i + 1} + d_{i + 2} + \dots + d_{2 i}]

). This allows us to compute

d p [N] [∙]

using only

O (\log N)

transitions.

This technique is known as the "

\times 2 + 1

" trick. If we view the transitions as performing min-plus convolutions (and

d p [N] [∙]

as a min-plus convolution to the power of

N

), the similarities between the

\times 2 + 1

trick and fast exponentiation may become more clear.

Time Complexity:

O (N^{2} \log N)

Subtask 3

Hint

Find a dynamic programming approach that only uses

O (N)

states.

Hint 2

The subtask 1 approach may be misleading. Try to find a recurrence that works more directly with the tree structure.

Solution

Consider a group of

j

leaves with the same parent. If we delete all of these leaves, we have a smaller tree, and the degree of the parent node is

1

which gets replaced by a degree of

j + 1

. This leads to a new dynamic programming formulation, where we let

d p [i]

be the minimum cost of a tree with

i

nodes, with transitions of the form

d p [i] = {min}_{j = 1}^{K - 1} d p [i - j] + a [j + 1] + (j - 1) a [1]

Time Complexity:

O (N K)

Subtask 4

Hint

View the subtask 3 solution as an instance of unbounded knapsack with

K - 1

items. When

N

is really large, what items should we take?

Solution

We can write the subtask 3 recurrence in the form

d p [i] = {min}_{j = 1}^{K - 1} d p [i - j] + b [j]

where

b [j] = a [j + 1] + (j - 1) a [1]

for

1 \leq j \leq K - 1

, which is an instance of unbounded knapsack.

Intuitively, when

N

is really large, we should repeatedly take the

j

with the smallest

\frac{b [j]}{j}

as this minimizes the cost per unit of "weight." In fact, this is true whenever

N > K^{2}

Proof

Suppose

N > K^{2}

and we must never use this minimal

j

. As each item occupies a "weight" of at most

K - 1

, we must have used

k > K > j

items, say

u_{1}, u_{2}, \dots, u_{k}

to form

d p [N]

(possibly with duplicates). It is well known that in any set of

k \geq j

numbers, some nonempty subset of them sums to a multiple of

j

(Proof: Pigeonhole on the

k + 1

prefix sums modulo

j

and take their difference). Applying this lemma to

u_{1}, u_{2}, \dots, u_{k}

, some nonempty subset of them sum to a multiple of

j

. We can replace this subset with copies of item

j

, which doesn't make the result worse since

\frac{b [j]}{j}

is minimal, giving a contradiction.

How tight is this bound?

It is tight up to terms linear in

K

. We may have a test case where

b [K - 2]

and

b [K - 1]

are the only

b [i]

that aren't too big to be useful, with

\frac{b [K - 2]}{K - 2} > \frac{b [K - 1]}{K - 1}

. If the capacity of the knapsack is

\equiv 1 (\mod K - 1)

, then we must use at least

K - 2

copies of

b [K - 2]

to fully fill the knapsack, occupying a total space of

(K - 2)^{2} = K^{2} - O (K)

Thus we can do the dynamic programming up to

K^{2}

and do some math to simulate repeatedly using item

j

after that.

Time Complexity:

O (K^{3})

Subtask 5

Hint

Recall the subtask 2 solution. Can we apply a similar doubling idea?

Hint 2

It suffices to know

d p [⌈ \frac{i - K}{2} ⌉ \dots ⌊ \frac{i + K}{2} ⌋]

to calculate

d p [i]

Solution

Note that it suffices to know

d p [⌈ \frac{i - K}{2} ⌉ \dots ⌊ \frac{i + K}{2} ⌋]

to calculate

d p [i]

, as we can partition the items used in forming

d p [i]

into two subsets with sums between

⌈ \frac{i - K}{2} ⌉

and

⌊ \frac{i + K}{2} ⌋

. If we were to compute

d p [N]

recursively in this way, then the only states we visit are between

\frac{N}{2^{k}} - K

and

\frac{N}{2^{k}} + K

for some

k \geq 0

, and it follows that there are at most

O (K \log N)

states, with

O (K)

transitions each. Implementing this idea iteratively (in descending order of

k

) gives an

O (K^{2} \log N)

solution.

A similar idea with a more similar flavour as the

\times 2 + 1

trick is described here (Knapsack Speedup 3).

Time Complexity:

O (K^{2} \log N)

Alternative Solution

Trying to extend the idea from subtask 4, it is natural to wonder: Does it suffice to only consider the top few

j

with the smallest

\frac{b [j]}{j}

? It turns out that if we only consider the

⌈ \frac{2 K^{2}}{i} ⌉ + 1

best

j

when calculating

d p [i]

, then this solution is provably correct.

Proof

Suppose this is false for the first time at

i

. Let

m = ⌈ \frac{2 K^{2}}{i} ⌉ + 1

, let

j_{1}, j_{2}, \dots, j_{m}

be the

m

j

with the smallest

\frac{b [j]}{j}

, and let

u_{1}, u_{2}, \dots, u_{k}

be the items we used to form

d p [i]

. The main idea is that we will find a nonempty subset of

u_{1}, u_{2}, \dots, u_{k}

that can be represented as the sum of a linear combination of

j_{1}, j_{2}, \dots, j_{m}

, each coefficient being positive. Towards this, note that:

A theorem of Erdos and Graham (1972) on Frobenius numbers states that for any sequence $p_1 < p_2 < \dots < p_n$ $p_{1} < p_{2} < \dots < p_{n}$ with $\gcd(p_1, p_2, \dots, p_n) = 1$ $gcd (p_{1}, p_{2}, \dots, p_{n}) = 1$ , every number larger than $\frac{2p_1p_n}{n}-p_1$ $\frac{2 p_{1} p_{n}}{n} - p_{1}$ can be represented as the sum of a linear combination of $p_1, p_2, \dots, p_n$ $p_{1}, p_{2}, \dots, p_{n}$ with all positive coefficients. We can extend this theorem to the case when $g = \gcd(p_1, p_2, \dots, p_n) \ne 1$ $g = gcd (p_{1}, p_{2}, \dots, p_{n}) \neq 1$ by applying the theorem to $\frac{p_1}{g}, \frac{p_2}{g}, \dots, \frac{p_n}{g}$ $\frac{p_{1}}{g}, \frac{p_{2}}{g}, \dots, \frac{p_{n}}{g}$ , giving the result that every number larger than $\frac{2p_1p_n}{gn}-p_1$ $\frac{2 p_{1} p_{n}}{g n} - p_{1}$ can be represented in this way.
To apply this theorem, we need to first bound $g = \gcd(j_1, j_2, \dots, j_m)$ $g = gcd (j_{1}, j_{2}, \dots, j_{m})$ . Sort $j_1, j_2, \dots, j_m$ $j_{1}, j_{2}, \dots, j_{m}$ in increasing order of $j$ $j$ . As $j_1, j_2, \dots, j_m$ $j_{1}, j_{2}, \dots, j_{m}$ are all between $1$ $1$ and $K-1$ $K - 1$ , some difference of adjacent terms $d = j_{x+1}-j_x$ $d = j_{x + 1} - j_{x}$ is at most $\frac{K-2}{m-1}$ $\frac{K - 2}{m - 1}$ . It follows that $g \le \frac{K-2}{m-1}$ $g \leq \frac{K - 2}{m - 1}$ .
Now we need to find a subset of $u_1, u_2, \dots, u_k$ $u_{1}, u_{2}, \dots, u_{k}$ with a large sum that is a multiple of $g$ $g$ . Recall the lemma that every set of $k \ge g$ $k \geq g$ numbers has a subset that sums to a multiple of $g$ $g$ . We can repeatedly use this lemma to extract a subset of unused elements of $u_1, u_2, \dots, u_k$ $u_{1}, u_{2}, \dots, u_{k}$ , summing to a multiple of $g$ $g$ , as long as the sum of the remaining unused elements is at least $(K-1)(g-1)+1$ $(K - 1) (g - 1) + 1$ . This way, we can find a subset $S$ $S$ of $u_1, u_2, \dots, u_k$ $u_{1}, u_{2}, \dots, u_{k}$ that sums to a multiple of $g$ $g$ and is at least $i-(K-1)(g-1)$ $i - (K - 1) (g - 1)$ .
To apply the theorem to $j_1 < j_2 < \dots < j_m$ $j_{1} < j_{2} < \dots < j_{m}$ and the sum of elements in $S$ $S$ , we need to check that $\displaystyle i-(K-1)(g-1) > \frac{2j_1j_m}{gm}-j_1$ $i - (K - 1) (g - 1) > \frac{2 j_{1} j_{m}}{g m} - j_{1}$ Which follows from $\displaystyle g \le \frac{K-2}{m-1} < \frac{K}{\frac{2K^2}{i}} = \frac{i}{2K} < \frac{i}{K-1} \implies \frac{i}{g} > K-1$ $g \leq \frac{K - 2}{m - 1} < \frac{K}{\frac{2 K^{2}}{i}} = \frac{i}{2 K} < \frac{i}{K - 1} ⟹ \frac{i}{g} > K - 1$ $\displaystyle \implies \frac{i(g-1)}{g} \ge (K-1)(g-1) \implies i - \frac{i}{g} \ge (K-1)(g-1)$ $⟹ \frac{i (g - 1)}{g} \geq (K - 1) (g - 1) ⟹ i - \frac{i}{g} \geq (K - 1) (g - 1)$ $\displaystyle \implies i-(K-1)(g-1) \ge \frac{i}{g} > \frac{2K^2}{gm} > \frac{2j_1j_m}{gm}-j_1$ $⟹ i - (K - 1) (g - 1) \geq \frac{i}{g} > \frac{2 K^{2}}{g m} > \frac{2 j_{1} j_{m}}{g m} - j_{1}$ as needed.

Now replacing this subset with copies of items

j_{1}, j_{2}, \dots, j_{m}

as needed gives the contradiction.

Time Complexity:

O (K^{2} \log K)

Comments

There are no comments at the moment.