Editorial for DMOPC '21 Contest 8 P5 - Tree Building


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

Author: zhouzixiang2004

Subtask 1

Hint Which sequences of N numbers can be the degree sequence of a tree with N nodes?
Solution For subtask 1, note that any sequence of integers d1,d2,,dN with di1 for all i and i=1Ndi=2N2 can be the degree sequence of a tree with N nodes.
Proof Clearly, these conditions are necessary. To prove they are sufficient, induct on N with the N=2 case being clear. As i=1Ndi=2N2, there exists indices i and j with di2 and dj=1, so we can subtract 1 from di, delete dj, inductively construct a tree for the remaining sequence, and attach a leaf to node i labelled j.
We can now perform a 2D knapsack dynamic programming, where dp[i][j] represents the minimum k=1iadk if k=1idk=j, with K transitions at each state.

Time Complexity: O(N2K)

Subtask 2

Hint The subtask 1 solution performs the same transitions N times. There is a technique, similar to fast exponentiation, that reuses previously computed results so that we only need to perform the transitions O(logN) times.
Solution The subtask 1 solution transitions only from dp[i][] to dp[i+1][] by looping through all possible values of di+1 (with additional cost adi+1). However, we can also transition directly from dp[i][] to dp[2i][] by looping through all possible values of di+1+di+2++d2i (with additional cost dp[i][di+1+di+2++d2i]). This allows us to compute dp[N][] using only O(logN) transitions.

This technique is known as the "×2+1" trick. If we view the transitions as performing min-plus convolutions (and dp[N][] as a min-plus convolution to the power of N), the similarities between the ×2+1 trick and fast exponentiation may become more clear.

Time Complexity: O(N2logN)

Subtask 3

Hint Find a dynamic programming approach that only uses O(N) states.
Hint 2 The subtask 1 approach may be misleading. Try to find a recurrence that works more directly with the tree structure.
Solution Consider a group of j leaves with the same parent. If we delete all of these leaves, we have a smaller tree, and the degree of the parent node is 1 which gets replaced by a degree of j+1. This leads to a new dynamic programming formulation, where we let dp[i] be the minimum cost of a tree with i nodes, with transitions of the form dp[i]=minj=1K1dp[ij]+a[j+1]+(j1)a[1] Time Complexity: O(NK)

Subtask 4

Hint View the subtask 3 solution as an instance of unbounded knapsack with K1 items. When N is really large, what items should we take?
Solution We can write the subtask 3 recurrence in the form dp[i]=minj=1K1dp[ij]+b[j] where b[j]=a[j+1]+(j1)a[1] for 1jK1, which is an instance of unbounded knapsack.

Intuitively, when N is really large, we should repeatedly take the j with the smallest b[j]j as this minimizes the cost per unit of "weight." In fact, this is true whenever N>K2.
Proof Suppose N>K2 and we must never use this minimal j. As each item occupies a "weight" of at most K1, we must have used k>K>j items, say u1,u2,,uk to form dp[N] (possibly with duplicates). It is well known that in any set of kj numbers, some nonempty subset of them sums to a multiple of j (Proof: Pigeonhole on the k+1 prefix sums modulo j and take their difference). Applying this lemma to u1,u2,,uk, some nonempty subset of them sum to a multiple of j. We can replace this subset with copies of item j, which doesn't make the result worse since b[j]j is minimal, giving a contradiction.
How tight is this bound? It is tight up to terms linear in K. We may have a test case where b[K2] and b[K1] are the only b[i] that aren't too big to be useful, with b[K2]K2>b[K1]K1. If the capacity of the knapsack is 1(modK1), then we must use at least K2 copies of b[K2] to fully fill the knapsack, occupying a total space of (K2)2=K2O(K).
Thus we can do the dynamic programming up to K2 and do some math to simulate repeatedly using item j after that.

Time Complexity: O(K3)

Subtask 5

Hint Recall the subtask 2 solution. Can we apply a similar doubling idea?
Hint 2 It suffices to know dp[iK2i+K2] to calculate dp[i].
Solution Note that it suffices to know dp[iK2i+K2] to calculate dp[i], as we can partition the items used in forming dp[i] into two subsets with sums between iK2 and i+K2. If we were to compute dp[N] recursively in this way, then the only states we visit are between N2kK and N2k+K for some k0, and it follows that there are at most O(KlogN) states, with O(K) transitions each. Implementing this idea iteratively (in descending order of k) gives an O(K2logN) solution.

A similar idea with a more similar flavour as the ×2+1 trick is described here (Knapsack Speedup 3).

Time Complexity: O(K2logN)
Alternative Solution Trying to extend the idea from subtask 4, it is natural to wonder: Does it suffice to only consider the top few j with the smallest b[j]j? It turns out that if we only consider the 2K2i+1 best j when calculating dp[i], then this solution is provably correct.
Proof Suppose this is false for the first time at i. Let m=2K2i+1, let j1,j2,,jm be the m j with the smallest b[j]j, and let u1,u2,,uk be the items we used to form dp[i]. The main idea is that we will find a nonempty subset of u1,u2,,uk that can be represented as the sum of a linear combination of j1,j2,,jm, each coefficient being positive. Towards this, note that:
  • A theorem of Erdos and Graham (1972) on Frobenius numbers states that for any sequence p1<p2<<pn with gcd(p1,p2,,pn)=1, every number larger than 2p1pnnp1 can be represented as the sum of a linear combination of p1,p2,,pn with all positive coefficients. We can extend this theorem to the case when g=gcd(p1,p2,,pn)1 by applying the theorem to p1g,p2g,,png, giving the result that every number larger than 2p1pngnp1 can be represented in this way.
  • To apply this theorem, we need to first bound g=gcd(j1,j2,,jm). Sort j1,j2,,jm in increasing order of j. As j1,j2,,jm are all between 1 and K1, some difference of adjacent terms d=jx+1jx is at most K2m1. It follows that gK2m1.
  • Now we need to find a subset of u1,u2,,uk with a large sum that is a multiple of g. Recall the lemma that every set of kg numbers has a subset that sums to a multiple of g. We can repeatedly use this lemma to extract a subset of unused elements of u1,u2,,uk, summing to a multiple of g, as long as the sum of the remaining unused elements is at least (K1)(g1)+1. This way, we can find a subset S of u1,u2,,uk that sums to a multiple of g and is at least i(K1)(g1).
  • To apply the theorem to j1<j2<<jm and the sum of elements in S, we need to check that i(K1)(g1)>2j1jmgmj1 Which follows from gK2m1<K2K2i=i2K<iK1ig>K1 i(g1)g(K1)(g1)iig(K1)(g1) i(K1)(g1)ig>2K2gm>2j1jmgmj1 as needed.
Now replacing this subset with copies of items j1,j2,,jm as needed gives the contradiction.
Time Complexity: O(K2logK)

Comments

There are no comments at the moment.