Editorial for JOI '21 Open P2 - Financial Report


Remember to use this editorial only when stuck, and not to copy-paste code from it. Please be respectful to the problem author and editorialist.
Submitting an official solution before solving the problem yourself is a bannable offence.

In this editorial, we explain a solution for each of the 6 subtasks. We first explain how to solve Subtask 1 by a brute-force search algorithm whose time complexity is \mathcal{O}(2^N N). Then we explain a solution to Subtask 2 by Dynamic Programming whose time complexity is \mathcal{O}(N^3). An improvement of it gives a solution to Subtask 3 whose time complexity is \mathcal{O}(N^2).

Then we explain faster algorithms which are valid in special circumstances such as Subtask 4 (D = 1) and Subtask 5 (D = N). Finally, we explain the full score solution, whose time complexity is \mathcal{O}(N \log N). It is an improvement of the solution of Subtask 3 using data structures.

If you are in a hurry to read the full score solution, please see the "Full Score Solution" section.

Subtask 1 (N \le 20)

First, we consider a brute-force search algorithm to choose the data for the presentation. There are two choices for the data of each day: Bitaro either chooses it or does not choose it. In total, there are 2 \times 2 \times \dots \times 2 = 2^N ways to choose the data. The answer to this task is the maximum of the impression scores over all choices satisfying the conditions.

How can we list all of the 2^N possible choices by a brute-force algorithm? There are several methods of implementation.

  1. Search all of the 2^N possible choices by a nested loop. (It is an unusual method because the implementation is hard.)
  2. Use a recursive function.
  3. Use a brute-force bit search using binary digit numbers.

Here we explain the third method because the implementation is easy. The idea of brute-force bit search is to associate an integer between 0 and 2^N-1, inclusive, to each of the 2^N choices. Precisely, we associate it in the following way.

Consider the binary digits of an integer i (0 \le i \le 2^N-1). If the digit in the 2^j-th place is 1, Bitaro chooses the j-th data. If the digit in the 2^j-th place is 0, Bitaro does not choose the j-th data.

We can accomplish a brute-force search of the 2^N possible choices by writing a for loop from i = 0 to 2^N-1. For each i, we can check whether the corresponding choice satisfies the conditions and calculate its impression score in \mathcal{O}(N) time. The total time complexity is \mathcal{O}(2^N N). This is fast enough under the constraints of Subtask 1. Thus we can get 15 points. The time complexity is the same if we use other implementation methods such as recursive functions.
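
The following is a minimal sketch of this brute-force bit search in C++. It assumes that the input consists of N and D followed by A_1, A_2, \dots, A_N, and that a choice is valid when the N-th data is chosen and the difference of any two consecutive chosen dates is at most D days; please adapt the input handling to the actual problem format.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int N, D;
    cin >> N >> D;
    vector<long long> A(N);
    for (auto &x : A) cin >> x;

    int best = 0;
    for (int mask = 1; mask < (1 << N); mask++) {
        // Day j (0-indexed) is chosen iff bit j of mask is 1.
        if (!((mask >> (N - 1)) & 1)) continue;  // the latest data must be chosen
        int prev = -1, score = 0;
        long long curmax = LLONG_MIN;
        bool ok = true;
        for (int j = 0; j < N; j++) {
            if (!((mask >> j) & 1)) continue;
            if (prev != -1 && j - prev > D) { ok = false; break; }  // gap between consecutive chosen days
            if (A[j] > curmax) { curmax = A[j]; score++; }          // a record-high sales amount
            prev = j;
        }
        if (ok) best = max(best, score);
    }
    cout << best << '\n';
}
```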

Subtask 2 (N \le 400)

A brute-force search algorithm is not efficient because if N increases by 1, the running time is multiplied by 2. Obviously it cannot give an answer in time when N = 400. We shall use Dynamic Programming (DP) to improve the algorithm. In the following, we consider an algorithm that chooses the data in chronological order and sequentially updates the values of the following DP table.

dp[pos][curmax]: the maximum number of times of making record-high sales amounts under the assumption that the final chosen data is the data of the pos-th day and the maximum sales amount among the chosen data so far is curmax yen.

Let (pos, curmax) be the state that dp[pos][curmax] indicates. If Bitaro chooses the data of the nxt-th day when the state is (pos, curmax), the state becomes (nxt, \max(curmax, A_{nxt})). Thus we can update the values in the DP table in the following way.

  1. In the beginning, all the values of dp[pos][curmax] are the initial value -\infty.
  2. Perform the operations 3-4 in increasing order of pos (and, for each pos, in increasing order of curmax).
  3. If curmax = A_{pos}, the value of dp[pos][curmax] becomes \max(dp[pos][curmax], 1). It means the first data chosen by Bitaro is the data of the pos-th day.
  4. For every nxt (pos+1 \le nxt \le \min(pos+D, N)), we update the value of dp[nxt][\max(curmax, A_{nxt})] to be \max(dp[nxt][\max(curmax, A_{nxt})], dp[pos][curmax]+c), where c = 1 if A_{nxt} > curmax (the nxt-th data makes a new record high) and c = 0 otherwise. The restriction nxt \le pos+D ensures that the difference of the two consecutive chosen dates is at most D days.
  5. The answer is the maximum of the final values of dp[N][curmax], since the data of the N-th day must be chosen.

However, we cannot calculate these values efficiently because the value of curmax can be as large as 10^9. Here we shall use the coordinate compression technique, because the answer depends only on the relative order of the values A_i. We may assume that all values are compressed into the range 0 \le A_i \le N-1. Then we have 0 \le curmax \le N-1, and we can finish the calculation of the DP table in \mathcal{O}(N^3) time.
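
Below is a sketch of this \mathcal{O}(N^3) DP in C++ (0-indexed, with the same assumptions about the input format as before). The transition only considers nxt \le pos + D, and 1 is added only when a new record high is made, as described above.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int N, D;
    cin >> N >> D;
    vector<long long> raw(N);
    for (auto &x : raw) cin >> x;

    // Coordinate compression: map every A_i into 0..N-1, keeping the relative order.
    vector<long long> vals = raw;
    sort(vals.begin(), vals.end());
    vals.erase(unique(vals.begin(), vals.end()), vals.end());
    vector<int> A(N);
    for (int i = 0; i < N; i++)
        A[i] = lower_bound(vals.begin(), vals.end(), raw[i]) - vals.begin();

    const int NEG = -1e9;
    // dp[pos][curmax]: maximum number of record highs, given that the last chosen day is pos
    // and the maximum chosen (compressed) value so far is curmax.
    vector<vector<int>> dp(N, vector<int>(N, NEG));

    for (int pos = 0; pos < N; pos++) {
        dp[pos][A[pos]] = max(dp[pos][A[pos]], 1);  // pos is the first chosen day
        for (int curmax = 0; curmax < N; curmax++) {
            if (dp[pos][curmax] == NEG) continue;
            for (int nxt = pos + 1; nxt <= min(pos + D, N - 1); nxt++) {
                int nmax = max(curmax, A[nxt]);
                int gain = (A[nxt] > curmax) ? 1 : 0;  // new record high only if A[nxt] exceeds curmax
                dp[nxt][nmax] = max(dp[nxt][nmax], dp[pos][curmax] + gain);
            }
        }
    }

    int ans = 0;  // the data of the last day must be chosen, so read the answer at pos = N-1
    for (int curmax = 0; curmax < N; curmax++) ans = max(ans, dp[N - 1][curmax]);
    cout << ans << '\n';
}
```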

Subtask 3 (N \le 7000)

We shall speed up the DP algorithm as above. We are now calculating the values of the DP table dp[pos][curmax] sequentially. In order to speed up the calculation, we need to do one of the following:

  • We shall speed up the calculation of each dp[pos][curmax].
  • We shall reduce the total number of possible states of DP. Currently, the number of possible values of (pos, curmax) is \mathcal{O}(N^2).

Here we take the second approach and reduce the number of states. We shall perform the DP calculation without keeping curmax in the state. The reason why we need to consider curmax is to decide whether a record-high sales amount is achieved or not. In order to eliminate this consideration, we shall change the state of the DP only when a record-high sales amount is achieved. Let us calculate the following DP table sequentially.

dp[pos]: the maximum number of times of making record-high sales amounts under the assumption that the final chosen data is the data of the pos-th day and the maximum sales amount is achieved at that day.

We can calculate the values in the DP table as follows: if it is possible to change the state from dp[pre] to dp[pos] (i.e., it is possible to achieve a maximum sales amount on the pos-th day after achieving a maximum sales amount on the pre-th day), then we update the value of dp[pos] to be \max(dp[pos], dp[pre]+1). It is possible to change the state as dp[pre] \to dp[pos] if and only if the following condition is satisfied.

(Condition) A_{pre} < A_{pos}, and, from the pre+1, pre+2, \dots, pos-1-th days, we can choose data whose sales amounts are at most A_{pre} yen so that the difference of any two consecutive chosen dates (from the pre-th day to the pos-th day) is at most D days.

In order to satisfy the above condition, the optimum strategy is to choose all of the data whose sales amount is at most A_{pre} yen. Hence we can rewrite this condition as follows, and can determine it easily.

(Condition) A_{pre} < A_{pos}, and, among the pre+1, pre+2, \dots, pos-1-th days, there are no D consecutive days on which the sales amounts all exceed A_{pre} yen.

It takes \mathcal{O}(N^3) time to obtain the answer if, when we calculate the value of dp[pos], we individually determine whether it is possible to change the state as dp[pre] \to dp[pos] for each pre. We can calculate whether it is possible to change the state for every pair (pre, pos) in advance. Then, we can obtain the answer in \mathcal{O}(N^2) time in total. The answer of this task is the maximum of dp[1], dp[2], \dots, dp[N].

Method of precalculation (Part 1)

There are several methods of precalculation. Here is one method. First, we note that the condition "among the pre+1, pre+2, \dots, pos-1-th days, there are no D consecutive days on which the sales amounts all exceed A_{pre} yen" becomes stronger as the value of pos increases. Therefore, the condition is satisfied exactly for a range of the form pre+1 \le pos \le R_{pre}.

For each i, the value of R_i is the first position j \ge i+D such that the D consecutive values A_{j-D+1}, A_{j-D+2}, \dots, A_j (all of which come after the i-th day) exceed A_i. (If such a position does not exist, then R_i = N.) Thus, it can be calculated in \mathcal{O}(N) time for each i by a single scan. After we calculate the values of R_1, R_2, \dots, R_N, we can easily determine whether it is possible to change the states; the change of states dp[pre] \to dp[pos] is possible if and only if A_{pre} < A_{pos} and pre+1 \le pos \le R_{pre}. We can calculate this for every pair (pre, pos) in advance in \mathcal{O}(N^2) time.

The method of calculating R_1, R_2, \dots, R_N is related to the full score solution.
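
Putting the pieces together, here is a sketch of the \mathcal{O}(N^2) solution for Subtask 3 in C++ (0-indexed, same input format assumption). R[i] is computed by a direct scan as described above, and the DP is then evaluated with the simple pairwise check.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int N, D;
    cin >> N >> D;
    vector<long long> A(N);
    for (auto &x : A) cin >> x;

    // R[i]: the largest pos such that the transition i -> pos is not blocked, i.e. the end of the
    // first run of D consecutive days after i whose values all exceed A[i] (N-1 if no such run).
    vector<int> R(N, N - 1);
    for (int i = 0; i < N; i++) {
        int run = 0;
        for (int j = i + 1; j < N; j++) {
            run = (A[j] > A[i]) ? run + 1 : 0;
            if (run == D) { R[i] = j; break; }
        }
    }

    // dp[pos]: maximum number of record highs, given that pos is the last chosen day so far
    // and A[pos] is the maximum among the chosen values.
    vector<int> dp(N, 1);
    int ans = 0;
    for (int pos = 0; pos < N; pos++) {
        for (int pre = 0; pre < pos; pre++)
            if (A[pre] < A[pos] && pos <= R[pre])
                dp[pos] = max(dp[pos], dp[pre] + 1);
        ans = max(ans, dp[pos]);
    }
    cout << ans << '\n';
}
```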

Subtask 4 (D = 1)

Let us recall the conditions. When D = 1, we need to choose the data so that the following two conditions are satisfied.

  • Bitaro will show the latest sales amount.
  • For any two consecutive data, the difference of the dates is at most 1 day. In other words, Bitaro has to choose the data of consecutive days.

Thus Bitaro has to choose the data of the x, x+1, \dots, N-th days for some x (1 \le x \le N). There are N possible choices. However, if we calculate the impression score for each choice independently, the total time complexity becomes \mathcal{O}(N^2), and we cannot solve this subtask within the time limit.

In order to solve it, we shall calculate the answer for x = N, N-1, \dots, 1, in this order. We consider an algorithm using the data structure called a "stack" to store the sales amounts of the days on which record-high sales amounts are achieved in the current suffix.

  • Perform the operations 1-3 for i = N, N-1, \dots, 1, in this order.
    1. Delete (pop) the last element until the stack becomes empty or the last element of the stack becomes at least A_i+1.
    2. Add A_i to the end of the stack.
    3. At that time, the size of the stack is the impression score when x = i.
  • The answer is the maximum of the impression scores calculated above.

It may happen that a single execution of operation 1 takes \mathcal{O}(N) time. However, the total time complexity is \mathcal{O}(N) because values are added to the stack N times in total, so the total number of deletions performed in operation 1 is also at most N.
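
A sketch of this stack-based solution for Subtask 4 in C++ (0-indexed; the same input format is assumed):

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int N, D;
    cin >> N >> D;  // D = 1 in this subtask
    vector<long long> A(N);
    for (auto &x : A) cin >> x;

    // st holds the record-high values of the current suffix,
    // from the bottom (latest, largest) to the top (earliest, smallest).
    vector<long long> st;
    int ans = 0;
    for (int i = N - 1; i >= 0; i--) {
        // Values not exceeding A[i] are no longer record highs in the suffix starting at day i.
        while (!st.empty() && st.back() <= A[i]) st.pop_back();
        st.push_back(A[i]);
        ans = max(ans, (int)st.size());  // stack size = impression score when the suffix starts at day i
    }
    cout << ans << '\n';
}
```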

Subtask 5 (D = N)

When D = N, we need to choose the data so that the following two conditions are satisfied.

  • Bitaro will show the latest sales amount.
  • For any two consecutive data, the difference of the dates is at most N days.

We can ignore the second condition because it is always satisfied for any choice of the data for presentation. Thus we shall consider the first condition only. Since the data of the last day comes after all other chosen data, being forced to include it never decreases the impression score. Moreover, the impression score remains the same if we delete all of the chosen data where a record-high sales amount is not achieved. Therefore, we may assume that every chosen data (except possibly the data of the last day) achieves a record-high sales amount. Hence we need to calculate the following.

The maximum number of data we can choose so that the sales amounts of the chosen data are strictly increasing in chronological order.

This is a famous problem called "Longest Increasing Subsequence (LIS)." It is well-known that it can be solved in \mathcal{O}(N \log N) time. There are several algorithms to solve it. Here is one such algorithm.

  1. In the beginning, we set the values of s_1, s_2, \dots, s_N to be \infty.
  2. Perform the following operation for i = 1, 2, \dots, N, in this order.
    • Find the minimum pos satisfying s_{pos} \ge A_i by a binary search technique, and update the value of s_{pos} to be A_i.
  3. Finally, the number of k such that the value of s_k is different from \infty is the length of the Longest Increasing Subsequence (LIS).

Note that the sequence s_1, s_2, \dots, s_N calculated by the above algorithm is not necessarily an example of Longest Increasing Subsequence (LIS).
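
A sketch of this LIS computation for Subtask 5 in C++. Instead of keeping explicit \infty entries, the vector s only stores the finite values, so its final size is the LIS length:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int N, D;
    cin >> N >> D;  // D = N in this subtask
    vector<long long> A(N);
    for (auto &x : A) cin >> x;

    vector<long long> s;  // kept sorted; positions that would still be "infinity" are not stored
    for (int i = 0; i < N; i++) {
        // Smallest index pos with s[pos] >= A[i] (this yields a strictly increasing LIS).
        auto it = lower_bound(s.begin(), s.end(), A[i]);
        if (it == s.end()) s.push_back(A[i]);
        else *it = A[i];
    }
    cout << s.size() << '\n';
}
```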

Full Score Solution

Previously, we calculated the DP table in chronological order (i.e., in the order of the data A_1, A_2, \dots, A_N). However, to speed up the DP solution of Subtask 3, it is more convenient to calculate it in the reverse order (i.e., in the order of the data A_N, A_{N-1}, \dots, A_1). Hence we shall consider calculating the following DP table in the opposite order.

dp[pos]: the maximum number of times of making record-high sales amounts (counted in chronological order) when we choose the data in reverse chronological order and the final chosen data, i.e., the earliest chosen day, is the data of the pos-th day.

Note that we change the DP states in the opposite order. As in Subtask 3, the change of states dp[pre] \to dp[pos] is possible if and only if the following condition is satisfied.

(Condition) A_{pre} > A_{pos}, and, among the pos+1, pos+2, \dots, pre-1-th days, there are no D consecutive days on which the sales amounts all exceed A_{pos} yen.

As in the solution of Subtask 3, the value of R_i is the first position j \ge i+D such that the D consecutive values A_{j-D+1}, A_{j-D+2}, \dots, A_j all exceed A_i. (If such a position does not exist, then R_i = N.) Then the state can change to dp[pos] from the positions pre in the range pos+1 \le pre \le R_{pos} satisfying A_{pre} > A_{pos}. We can calculate the DP table by the following simple formula. (If we calculate the table directly by this formula, the total time complexity becomes \mathcal{O}(N^2).)

\displaystyle dp[pos] = \max\left(\max_{pos+1 \le i \le R_{pos}, A_i > A_{pos}} dp[i], 0\right)+1

We need to speed up this algorithm using a data structure. In order not to have to check the condition A_i > A_{pos} explicitly, we shall calculate the values of dp[pos] in decreasing order of A_{pos}. We have the following algorithm.

  1. We set the values of dp[1], dp[2], \dots, dp[N] to be -\infty.
  2. Perform the following operation in decreasing order of the value of A_{pos} (in increasing order of pos if the values are the same).
    • We set dp[pos] = \max(dp[pos+1], dp[pos+2], \dots, dp[R_{pos}], 0)+1.
  3. Finally, the answer is the maximum of dp[1], dp[2], \dots, dp[N].

Naively, it takes \mathcal{O}(N) time to calculate one dp[pos]. Since taking the maximum over a range is an RMQ (Range Maximum Query), we can answer each query in \mathcal{O}(\log N) time using a segment tree, updating the tree at position pos whenever dp[pos] is determined. Thus, once R_1, R_2, \dots, R_N are calculated, we can obtain the answer in \mathcal{O}(N \log N) time in total.
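
Here is a sketch of this part in C++ (0-indexed): a point-update / range-maximum segment tree answers the queries, and the positions are processed in decreasing order of A. For brevity, R is computed by the naive \mathcal{O}(N^2) scan from Subtask 3; replacing it with the precalculation of the next section gives the full \mathcal{O}(N \log N) solution.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Point-update / range-maximum segment tree.
struct SegTree {
    int n;
    vector<int> t;
    SegTree(int n_) : n(n_), t(2 * n_, INT_MIN) {}
    void update(int i, int v) {                 // set position i to v
        for (t[i += n] = v; i > 1; i >>= 1) t[i >> 1] = max(t[i], t[i ^ 1]);
    }
    int query(int l, int r) {                   // maximum over [l, r), INT_MIN if empty
        int res = INT_MIN;
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res = max(res, t[l++]);
            if (r & 1) res = max(res, t[--r]);
        }
        return res;
    }
};

int main() {
    int N, D;
    cin >> N >> D;
    vector<long long> A(N);
    for (auto &x : A) cin >> x;

    // R[i] as in Subtask 3, computed naively here; see the next section for O(N log N).
    vector<int> R(N, N - 1);
    for (int i = 0; i < N; i++) {
        int run = 0;
        for (int j = i + 1; j < N; j++) {
            run = (A[j] > A[i]) ? run + 1 : 0;
            if (run == D) { R[i] = j; break; }
        }
    }

    // Process positions in decreasing order of A (increasing index on ties).
    vector<int> order(N);
    iota(order.begin(), order.end(), 0);
    sort(order.begin(), order.end(), [&](int a, int b) {
        return A[a] != A[b] ? A[a] > A[b] : a < b;
    });

    SegTree st(N);
    int ans = 0;
    for (int pos : order) {
        int best = (pos + 1 <= R[pos]) ? st.query(pos + 1, R[pos] + 1) : INT_MIN;
        int dp = max(best, 0) + 1;
        st.update(pos, dp);
        ans = max(ans, dp);
    }
    cout << ans << '\n';
}
```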

Method of precalculation (Part 2)

The remaining task is to calculate the values of R_1, R_2, \dots, R_N. We can calculate them in \mathcal{O}(N^2) time by the algorithm explained in Subtask 3, but we need to speed this up. Recall that R_i is the first position after the i-th day at which D consecutive values all exceed A_i. In other words, it is the first index j \ge i+D such that all of the values A_{j-D+1}, A_{j-D+2}, \dots, A_j exceed A_i.

Here we put M_j = \min(A_{j-D+1}, A_{j-D+2}, \dots, A_j). Then R_i is the minimum j in the range j \ge i+D such that M_j \ge A_i+1 is satisfied. Hence we can calculate them in the following way.

  1. For j = D, D+1, \dots, N, calculate M_j = \min(A_{j-D+1}, A_{j-D+2}, \dots, A_j). For example, they can be calculated in \mathcal{O}(N \log N) time if we use a range-minimum segment tree.
  2. Then R_i is equal to the leftmost index j among j = i+D, i+D+1, \dots, N such that M_j exceeds A_i (if no such index exists, then R_i = N). For example, all of them can be calculated in \mathcal{O}(N \log N) time in total if we use a binary search (tree descent) on a segment tree storing the range maxima of M_D, M_{D+1}, \dots, M_N.

Hence we can calculate the values of R_1, R_2, \dots, R_N in \mathcal{O}(N \log N) time in total.

There are several ways to calculate the values of R_1, R_2, \dots, R_N in advance. Here we only explain one way. Some contestants may solve this task by other methods. For example, in step 1, we can calculate M_D, M_{D+1}, \dots, M_N in \mathcal{O}(N) time by a "sliding minimum" technique. There are also completely different methods. For example, we can calculate R_i in increasing order of A_i using std::set.
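
The following sketch (C++, 0-indexed) combines these ideas: step 1 uses the sliding-minimum technique mentioned above, and step 2 performs the binary search by descending a segment tree built over the maxima of the M values. The helper name computeR is ours, not part of the problem.

```cpp
#include <bits/stdc++.h>
using namespace std;

// R[i] = smallest j >= i + D such that min(A[j-D+1..j]) > A[i], or N-1 if no such j exists.
vector<int> computeR(const vector<long long>& A, int D) {
    int N = A.size();
    const long long NEG = LLONG_MIN;

    // M[j] = min(A[j-D+1..j]) for j >= D-1, via a sliding-window minimum (monotone deque).
    vector<long long> M(N, NEG);
    deque<int> dq;
    for (int j = 0; j < N; j++) {
        while (!dq.empty() && A[dq.back()] >= A[j]) dq.pop_back();
        dq.push_back(j);
        if (dq.front() <= j - D) dq.pop_front();
        if (j >= D - 1) M[j] = A[dq.front()];
    }

    // Segment tree storing range maxima of M, used to find the leftmost index with M[j] > x.
    int sz = 1;
    while (sz < N) sz *= 2;
    vector<long long> t(2 * sz, NEG);
    for (int j = 0; j < N; j++) t[sz + j] = M[j];
    for (int v = sz - 1; v >= 1; v--) t[v] = max(t[2 * v], t[2 * v + 1]);

    // Leftmost j in [lo, N) with M[j] > x, or N-1 if none.
    auto leftmost = [&](int lo, long long x) {
        int res = -1;
        function<void(int, int, int)> go = [&](int v, int l, int r) {
            if (res != -1 || r <= lo || t[v] <= x) return;  // nothing useful in this node
            if (r - l == 1) { res = l; return; }
            int m = (l + r) / 2;
            go(2 * v, l, m);
            go(2 * v + 1, m, r);
        };
        go(1, 0, sz);
        return res == -1 ? N - 1 : res;
    };

    vector<int> R(N, N - 1);
    for (int i = 0; i < N; i++)
        if (i + D < N) R[i] = leftmost(i + D, A[i]);
    return R;
}

int main() {
    int N, D;
    cin >> N >> D;
    vector<long long> A(N);
    for (auto &x : A) cin >> x;
    vector<int> R = computeR(A, D);
    for (int i = 0; i < N; i++) cout << R[i] << (i + 1 < N ? ' ' : '\n');
}
```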

