Editorial for JOI '22 Open P2 - Giraffes
Submitting an official solution before solving the problem yourself is a bannable offence.
Review: Hirotaka Yoneda (square1001)
In this review, we explain solutions for the 4 subtasks of the task "Giraffes" from the JOI Open Contest 2022.
We begin with an outline of the solutions. Read that section if you only want a short review; however, it is difficult to fully understand the solutions from the outline alone. Unless you are already a confident contestant, we recommend reading the subtask-by-subtask solutions in Section 2 and later sections. The solutions for Subtasks 1–3 are explained in a relatively comprehensive way. On the other hand, the solution for the full score is rather difficult; we explain it for advanced participants.
1. Brief outline of solutions
A brief outline of a solution for each subtask is as follows. Here we only explain a brief outline; see Section 2 and later sections for detailed explanations of the solutions. In the following, the initial arrangement of the giraffes is written as P = (P_1, P_2, ..., P_N), and the arrangement of the giraffes after rearrangement is written as Q = (Q_1, Q_2, ..., Q_N).
- Subtask 1 We can solve this subtask by checking all possible arrangements of the giraffes after rearrangement. The time complexity is O(N^3 * N!).
- Subtask 2 For the condition to hold, it is necessary that at least one of Q_1 = 1, Q_1 = N, Q_N = 1, Q_N = N is satisfied. By recursively enumerating the permutations satisfying this condition, we can solve this subtask in O(4^N) time.
- Subtask 3 Using the same idea as Subtask 2, we can solve this subtask by Dynamic Programming (DP). Let dp(x, y, l) be "the maximum number of unchanged elements when the part corresponding to the square with lower-left vertex (x, y) and side length l is rearranged so that the condition is satisfied." We calculate dp from smaller values of l. The time complexity of this solution is O(N^3).
- Full Score Solution Here we use the fact that, on average, the maximum number of unchanged elements (= N minus the answer) is O(sqrt(N)). In Subtask 3, the region described by dp(x, y, l) can be considered as a square region x <= X <= x + l, y <= Y <= y + l. We slightly modify the DP of Subtask 3. We put dp(d, j, t) to be "the minimum side length of a square such that the point (j, P_j) is the vertex of direction d (d = 1, 2, 3, 4) of the square and the number of unchanged elements is t." We shall calculate the values of dp(d, j, t) from smaller values of t. Since we have O(sqrt(N)) possible values of t on average, we can solve this task in O(N^2 sqrt(N)) time on average. We speed up the DP using a plane sweep algorithm with segment trees. Then we can solve this task in O(N sqrt(N) log N) time on average, and get the full score.
Note that for the solutions of Subtasks 1, 2, 3, we do not use the randomness of the input data. However, the full score solution does use it. In the worst case, the time complexity of the full score solution becomes O(N^2 log N).
2. Subtask 1
There are N! permutations of the giraffes. We can obtain the correct answer if we search all the possible permutations and, among the arrangements of the giraffes satisfying the condition of the visual quality, calculate the minimum number of giraffes which are moved.
How can we write a program to search all the permutations?1 There are several ways to search the permutations. The following are the major techniques.
- Call functions recursively to search all the possibilities.
- Use std::next_permutation to search all the possibilities.
Here we explain how to search all the possibilities using the C++ standard library function std::next_permutation. It is a rather easy technique. This function transforms an array into "the permutation next to it in the lexicographic order." We give some examples.
- std::next_permutation transforms the array (1, 2, 3) into (1, 3, 2).
- std::next_permutation transforms the array (1, 3, 2) into (2, 1, 3).
- std::next_permutation transforms the array (2, 1, 3) into (2, 3, 1).
Thus, if we apply std::next_permutation to (1, 2, 3) five times, the array is transformed as follows: (1, 2, 3) -> (1, 3, 2) -> (2, 1, 3) -> (2, 3, 1) -> (3, 1, 2) -> (3, 2, 1). This way, we can search all the possibilities.
Note that std::next_permutation returns false only if there is no next permutation (i.e., the array is in decreasing order). Otherwise, it returns true. We can search all the permutations of (1, 2, ..., N) in the following way.
vector<int> perm(N);
for (int i = 0; i < N; i++) {
    perm[i] = i + 1; // Initially, set perm = (1, 2, ..., N)
}
do {
    // In this loop, do some calculation for the permutation perm
} while (next_permutation(perm.begin(), perm.end()));
By a triple loop over the possible values of i and j and the elements between them, we can determine whether each permutation satisfies the condition of the visual quality or not. The time complexity of this check is O(N^3). Therefore, we can obtain the answer in O(N^3 * N!) time. We can solve Subtask 1.
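As a concrete illustration, the whole Subtask 1 solution can be sketched as follows. This is our own code, not the official implementation; in particular, the helper good_quality paraphrases the condition of the visual quality as "for every range of giraffes, the tallest or the shortest giraffe in the range stands at one of its ends," following the reasoning used for Property 1 below.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Assumed reading of the condition: the photo of giraffes i..j is fine
// iff the tallest or the shortest giraffe of the range stands at i or j.
bool good_quality(const vector<int>& Q) {
    int N = Q.size();
    for (int i = 0; i < N; i++) {
        for (int j = i + 1; j < N; j++) {
            int mn = Q[i], mx = Q[i];
            for (int k = i; k <= j; k++) {
                mn = min(mn, Q[k]);
                mx = max(mx, Q[k]);
            }
            if (Q[i] != mn && Q[i] != mx && Q[j] != mn && Q[j] != mx)
                return false; // both extremes are strictly inside
        }
    }
    return true;
}

// Try every permutation Q; among the good ones, minimize the number of
// positions where Q differs from the input P (i.e., moved giraffes).
int solve_bruteforce(const vector<int>& P) {
    int N = P.size();
    vector<int> Q(N);
    for (int i = 0; i < N; i++) Q[i] = i + 1; // Q = (1, 2, ..., N)
    int best = N;
    do {
        if (!good_quality(Q)) continue;
        int moved = 0;
        for (int i = 0; i < N; i++)
            if (Q[i] != P[i]) moved++;
        best = min(best, moved);
    } while (next_permutation(Q.begin(), Q.end()));
    return best;
}
```

For example, the identity permutation already satisfies the condition, so its answer is 0.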
3. Subtask 2
In Subtask 2 and later subtasks, a "permutation" means a permutation of (1, 2, ..., N) if not otherwise specified.
First, when we solve such a task, it is important to think of an easy characterization. For this problem, we should think about "which permutations satisfy the condition of the visual quality?" The reason is that, in its original form, the condition in the task statement is too complicated to obtain an efficient solution. To start, let us consider which permutations satisfy the condition, and which permutations do not.
For example, when N = 5, which permutations satisfy the condition? Observing the situation, you may notice that one of the ends is either 1 or 5. The reason for this is that if both of the ends are among 2, 3, 4, the visual quality becomes bad when we take a picture with i = 1, j = 5. In general, the following holds.
Property 1 If both Q_1 and Q_N are between 2 and N - 1, inclusive, then the permutation does not satisfy "the condition of the visual quality."
The reason is that if both Q_1 and Q_N are between 2 and N - 1, inclusive, then the following hold:
- One of Q_2, ..., Q_{N-1} is equal to N (we write it as Q_x). Then we have 1 < x < N.
- One of Q_2, ..., Q_{N-1} is equal to 1 (we write it as Q_y). Then we have 1 < y < N.
Therefore, the visual quality becomes bad if we choose i = 1, j = N. Conversely, we obtain a necessary condition for "the condition of the visual quality" to hold.
Property 2 For the permutation Q to satisfy "the condition of the visual quality," it is necessary that at least one of Q_1 = 1, Q_1 = N, Q_N = 1, Q_N = N holds.
(TODO: insert diagram)
If Q_1 = 1, the remaining part of the permutation (it is a permutation of (2, 3, ..., N)) has to satisfy Property 2. Thus, if we choose one of the ends among the four possibilities, the remaining part (surrounded by a blue box) of length N - 1 in the above figures has to satisfy Property 2 too. Similarly, for each of the four possibilities, the remaining part of length N - 2 has to satisfy Property 2. The same is true for the remaining part of length N - 3, and so on. Therefore, for "the condition of the visual quality" to hold, it is necessary that the permutation Q satisfies the following condition.
Property 3 For "the condition of the visual quality" to hold, it is necessary that the permutation Q is generated in the following way.
- In the beginning, we set l = 1, r = N, a = 1, b = N. Initially, none of Q_1, ..., Q_N is determined.
- Repeat the following N times.
  - Choose one of the following operations (a)-(d), and perform it.
    (a) Set Q_l = a, increase l by 1, and increase a by 1.
    (b) Set Q_l = b, increase l by 1, and decrease b by 1.
    (c) Set Q_r = a, decrease r by 1, and increase a by 1.
    (d) Set Q_r = b, decrease r by 1, and decrease b by 1.
At most 4^N permutations are generated by Property 32. We see this from a tree diagram, since we have 4 possibilities for each of the N steps. Moreover, we see that the permutations generated in the above way always satisfy the condition.
Property 4 Every permutation Q generated by Property 3 satisfies "the condition of the visual quality."
Proof It is enough to prove that the visual quality does not become bad for any choice of i, j (1 <= i < j <= N). Among Q_i and Q_j, let Q_k be the element which is chosen first. Note that each element Q_m with i < m < j is chosen later than Q_k, because position m becomes an end of the remaining interval only after position i or position j is used.
- If Q_k is chosen by (a) or (c), the elements Q_m chosen later than Q_k are larger than the value of a (= Q_k) at the moment Q_k is chosen. Thus there does not exist m (i <= m <= j) satisfying Q_m < Q_k. The visual quality does not become bad.
- If Q_k is chosen by (b) or (d), the elements Q_m chosen later than Q_k are smaller than the value of b (= Q_k) at the moment Q_k is chosen. Thus there does not exist m (i <= m <= j) satisfying Q_m > Q_k. The visual quality does not become bad.
Since the visual quality does not become bad in every case, the permutation satisfies the condition.
If we generate permutations by Property 3, we obtain only permutations satisfying the condition (Property 4). Moreover, all the permutations satisfying the condition can be obtained in this way (Properties 2 and 3). Among them, the minimum number of "elements which are changed from the initial permutation" is the answer. Thus we can solve this task in O(4^N) time in total. We can solve Subtask 2 in this way.
Incidentally, with the algorithm explained above, the same permutation may appear several times. In fact, the number of permutations satisfying the condition is much smaller than 4^N; there are only Theta((2 + sqrt(2))^N) permutations satisfying it3. If we implement the algorithm appropriately so that the same permutation does not appear more than once, we can solve this task in O((2 + sqrt(2))^N) time. For each input data of Subtask 2, we can calculate the answer within 0.01 seconds.
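The recursive enumeration can be sketched as follows (this is our own code with hypothetical names, not the official implementation). It follows Property 3 literally, so the same permutation may be generated several times, and the running time is O(4^N).

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Grow the permutation from the outside in: [l, r] is the window of
// undetermined positions, [a, b] is the range of unused values, and
// "matched" counts placed elements that agree with the input P.
void rec(const vector<int>& P, int l, int r, int a, int b,
         int matched, int& best) {
    if (l > r) { // every position has been fixed
        best = max(best, matched);
        return;
    }
    rec(P, l + 1, r, a + 1, b, matched + (P[l] == a), best); // (a)
    rec(P, l + 1, r, a, b - 1, matched + (P[l] == b), best); // (b)
    rec(P, l, r - 1, a + 1, b, matched + (P[r] == a), best); // (c)
    rec(P, l, r - 1, a, b - 1, matched + (P[r] == b), best); // (d)
}

int solve_subtask2(const vector<int>& P) {
    int N = P.size();
    int best = 0;
    rec(P, 0, N - 1, 1, N, 0, best); // positions 0-indexed, values 1..N
    return N - best; // minimize the number of moved giraffes
}
```

Maximizing the number of matched elements and minimizing the number of moved giraffes are the same thing, which is why the recursion tracks "matched."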
4. Subtask 3
Using the ideas of the solution of Subtask 2, we can solve this task in O(N^3) time by Dynamic Programming (DP)4. Since many tasks in Informatics Olympiads can be solved by DP, you are encouraged to learn about it.
First, in order to understand the solution graphically, we draw a 2-dimensional picture describing the setting of the task5. Let the x-coordinates be the indices (= the positions of the giraffes) of the permutation, and the y-coordinates be the values (= the heights of the giraffes) of the permutation. We draw a picture representing the generation of permutations in Property 3. Then we have the following stratum structure, like a Baumkuchen. (The following picture describes the generation of a permutation satisfying the condition.)
(TODO: insert diagram)
In this picture, the operation in Property 3 corresponds to the following operation: "choose and 'fix' one of the four vertices of the square, and shrink the side length of the square by 1 toward the opposite direction."
(TODO: insert diagram)
Namely, the task can be rephrased as "when we make a stratum structure as above, what is the maximum number of fixed vertices which coincide with one of the initial points (1, P_1), ..., (N, P_N)?" (The following picture corresponds to the permutation in Sample Input 1. If we generate the permutation as in the right figure, the number of coinciding points is maximized, and the answer is N minus this number.)
(TODO: insert diagram)
Then we perform DP using this idea. We shall calculate the following values from smaller values of l.
dp(x, y, l) = (the maximum number of fixed vertices which coincide with the initial points (i, P_i) inside the square whose lower-left vertex is (x, y) and side length is l)
There are four possible choices when we shrink the side length of the square by 1. For each choice, from the states of the form dp(x', y', l - 1) (x <= x' <= x + 1, y <= y' <= y + 1), we can calculate "the maximum number of points which coincide with the initial points inside the square of dp(x, y, l)" as follows. Here [·] is 1 if the condition inside holds and 0 otherwise.
- If we choose (a), the number of points which coincide is dp(x + 1, y + 1, l - 1) + [P_x = y]
- If we choose (b), the number of points which coincide is dp(x + 1, y, l - 1) + [P_x = y + l]
- If we choose (c), the number of points which coincide is dp(x, y + 1, l - 1) + [P_{x+l} = y]
- If we choose (d), the number of points which coincide is dp(x, y, l - 1) + [P_{x+l} = y + l]
The value of dp(x, y, l) is the maximum of the above 4 values. Thus, if we use the results for smaller squares, we can immediately calculate the values of dp(x, y, l).
Since the maximum number of points which coincide is dp(1, 1, N - 1), the answer to this task is N - dp(1, 1, N - 1). There are O(N) choices for each of x, y and l, so we can solve this task in O(N^3) time in total. We can solve Subtask 3 in this way.
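The DP above can be sketched as follows (our own code, not the official implementation). Positions and heights are 0-indexed here, so P is a permutation of 0, ..., N - 1, and only two 2-dimensional layers of the DP table are kept in memory at a time.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// dp[x][y] holds, for the current side length l, the maximum number of
// fixed vertices inside the square with lower-left corner (x, y) and
// side length l that coincide with the initial points (i, P[i]).
int solve_subtask3(const vector<int>& P) {
    int N = P.size();
    // l = 0: a single point (x, y) coincides iff P[x] == y.
    vector<vector<int>> dp(N, vector<int>(N));
    for (int x = 0; x < N; x++)
        for (int y = 0; y < N; y++)
            dp[x][y] = (P[x] == y) ? 1 : 0;
    for (int l = 1; l < N; l++) {
        vector<vector<int>> ndp(N - l, vector<int>(N - l));
        for (int x = 0; x + l < N; x++) {
            for (int y = 0; y + l < N; y++) {
                int best = dp[x + 1][y + 1] + (P[x] == y);        // (a) lower-left
                best = max(best, dp[x + 1][y] + (P[x] == y + l)); // (b) upper-left
                best = max(best, dp[x][y + 1] + (P[x + l] == y)); // (c) lower-right
                best = max(best, dp[x][y] + (P[x + l] == y + l)); // (d) upper-right
                ndp[x][y] = best;
            }
        }
        dp = move(ndp);
    }
    return N - dp[0][0]; // minimum number of moved giraffes
}
```

The outermost square (l = N - 1, lower-left corner (0, 0)) covers the whole picture, so dp[0][0] at the end is the maximum number of unmoved giraffes.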
Actually, we can improve the constant factor of this solution drastically using the following two techniques.
- DP reusing memory.
- Optimization using #pragma.
We explain the first technique. The original implementation uses a 3-dimensional array of size O(N^3). When N becomes large, it uses a large amount of memory. It also slows down the execution of the program because we cannot use cache memory efficiently. For the DP in this case, we can implement it reusing a 2-dimensional array6.
Next, we explain the latter technique. Perhaps you might be surprised if you haven't heard about it before. The execution of the program can be drastically speeded up if you add the following two lines in the beginning of the source code.
#pragma GCC target("avx2")
#pragma GCC optimize("O3")
By these two lines, the auto-vectorization of GCC is enabled, and a part of the program is compiled using fast SIMD instructions. SIMD provides instructions which perform 128-bit or 256-bit calculations at once, and it can be used on most PCs today7. Using this, for example, we can perform 8 addition operations on 32-bit integers simultaneously, and the execution is expected to become about 4 times faster than before8.
Since the above DP program is rather simple, we can expect a speed-up by SIMD. In fact, adding the above two lines, we can calculate the answer within about 1 second even for the maximum constraints of Subtask 3.
5. How to "use the randomness of input data"
Up to this point, we have never used the special condition that "P_1, ..., P_N in the input data are generated randomly." To obtain a full score, we essentially use this condition.
First of all, if the input data is random, what is the difference from the general case? Generally speaking, some input data are unbiased, while some input data are adversarial. If the randomness is guaranteed, we do not have to consider biased data, and using this we can sometimes reduce the computational complexity. Since some of you may see this kind of task for the first time, we give an example.
Problem Given a sequence A_1, ..., A_N of N real numbers, write a program which sorts it in ascending order9. Here the values A_i (1 <= i <= N) are greater than or equal to 0 and strictly less than 1, and these values are generated by uniform random numbers.
If you have some knowledge of algorithms, you may know this problem can be solved in O(N log N) time. However, if we use the fact that the inputs are random, we can solve it in O(N) time on average as follows.
- For j = 0, 1, ..., N - 1, we define an array B_j to be "the array consisting of all A_i (1 <= i <= N) which are larger than or equal to j/N and strictly less than (j + 1)/N."
- We sort each of the N arrays B_0, ..., B_{N-1} in ascending order.
- Concatenating the N sorted arrays B_0, ..., B_{N-1} from the beginning successively, we obtain a sorted sequence.
If we consider general inputs, this does not work: for example, if all the input values are smaller than 1/N, then all the values A_1, ..., A_N come to the array B_0, and sorting B_0 takes as long as sorting the whole sequence after all. However, if the inputs are random (and you are not too unlucky), this cannot happen. Since the expected value of |B_j|^2 is O(1), even if sorting each B_j takes O(|B_j|^2) time, the whole task is solved in O(N) time in total10. This example shows that we can sometimes reduce the computational complexity of an algorithm using "the unbiasedness" of the input data.
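The bucket-sort idea above can be sketched as follows (our own code; the names are our own). The correctness does not depend on the randomness at all; only the running time does.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Distribute the values into N buckets by value, sort each bucket,
// and concatenate the buckets in order.
vector<double> bucket_sort(const vector<double>& A) {
    int N = A.size();
    vector<vector<double>> B(N);
    for (double a : A) {
        int j = min((int)(a * N), N - 1); // bucket j holds [j/N, (j+1)/N)
        B[j].push_back(a);
    }
    vector<double> out;
    out.reserve(N);
    for (auto& bucket : B) {
        sort(bucket.begin(), bucket.end()); // buckets are tiny on average
        out.insert(out.end(), bucket.begin(), bucket.end());
    }
    return out;
}
```

Even if the inner sort took time quadratic in the bucket size, the expected total time would stay O(N) by the computation in footnote 10.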
Now we return to the main theme of this section. In the sorting problem above, the nice property of random inputs is that "the expected value of |B_j|^2 is small," and we used this property to achieve the time complexity O(N). In the task of the giraffes, what is the nice property satisfied by random inputs? The answer is that, in the average case, the maximum number of unmoved giraffes (= N minus the answer) is O(sqrt(N)).
Intuitively, it may seem that many giraffes could stay unmoved, but this is not the case. Let us consider the reason.
First, since we have "a stratum structure" as in Subtask 3, we see that any permutation satisfying "the condition of the visual quality" can be decomposed into an increasing sequence and a decreasing sequence. To see this, each time we shrink the side length of the square by 1, we put the fixed element into the sequence A if we choose the lower-left or the upper-right vertex, and into the sequence B if we choose the upper-left or the lower-right vertex. The sequence A is always increasing and the sequence B is always decreasing. Thus we have the following conclusion.
Property 5 (the maximum number of unmoved giraffes) <= (the length of the longest increasing subsequence of P) + (the length of the longest decreasing subsequence of P).
Therefore, if the lengths of the longest increasing subsequence (LIS) and the longest decreasing subsequence are sufficiently small in the average case, the number of unmoved giraffes is sufficiently small. This is indeed true, and the following is known.
Property 6 The expected value of the length of the LIS of a randomly chosen permutation of length N is approximately equal to 2 sqrt(N).
More precisely, it is known that the expected value of the length of the LIS is 2 sqrt(N) + o(sqrt(N)), and the standard deviation is O(N^{1/6}) (Odlyzko & Rains, 1998). Perhaps this may sound difficult, but we can show the following related result relatively easily.
We consider the expected value of the number of increasing subsequences of length L. There are C(N, L) ways to choose a subsequence of length L. The probability that a chosen subsequence is increasing is 1/L!. Therefore, when L = c sqrt(N), the expected value of the number of increasing subsequences of length L is approximately equal to11
C(N, L) / L! ≈ N^L / (L!)^2 ≈ (e^2 N / L^2)^L / (2πL) = (e/c)^{2L} / (2πL).
Therefore, when L > e sqrt(N) holds12, the expected value of the number of increasing subsequences of length L becomes smaller than 1. If L is larger than this threshold, the probability that there exists an increasing subsequence of length L tends to 0.
Combining these two properties, we have the following fact.
Property 7 In the average case, the maximum number of unmoved giraffes is approximately less than 2e sqrt(N).
Moreover, we experimentally know that the actual value is a smaller constant times sqrt(N). We reduce the computational complexity using this fact. It is explained in the following sections.
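Property 6 is easy to check empirically. The following sketch (our own code) computes the LIS length with the standard patience-sorting method and averages it over random permutations; for N around 1000, the average comes out close to 2 sqrt(N).

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <random>
#include <vector>
using namespace std;

// O(N log N) LIS: tails[k] = smallest possible tail value of an
// increasing subsequence of length k + 1 seen so far.
int lis_length(const vector<int>& a) {
    vector<int> tails;
    for (int x : a) {
        auto it = lower_bound(tails.begin(), tails.end(), x);
        if (it == tails.end()) tails.push_back(x);
        else *it = x;
    }
    return (int)tails.size();
}

// Average LIS length over several uniformly random permutations.
double average_lis(int N, int trials, unsigned seed) {
    mt19937 rng(seed);
    vector<int> p(N);
    iota(p.begin(), p.end(), 0);
    long long total = 0;
    for (int t = 0; t < trials; t++) {
        shuffle(p.begin(), p.end(), rng);
        total += lis_length(p);
    }
    return (double)total / trials;
}
```

For N = 1000, 2 sqrt(N) is about 63, and the measured average typically lands a little below that, consistent with the negative lower-order correction mentioned above.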
6. A solution of computational complexity O(N^2 sqrt(N))
In the above discussion, we saw that the maximum number of points which coincide with the initial points is O(sqrt(N)) on average. Here we explain how to use this fact to reduce the computational complexity.
As a first step, we recall the knapsack problem, which is a famous problem for DP.
0-1 Knapsack Problem There are N items. The i-th item (1 <= i <= N) has value v_i and weight w_i. When we choose items so that the total weight does not exceed W, what is the maximum total value?
When the constraints on the weights are small, this problem can be solved by DP in O(NW) time by putting dp[i][j] = (the maximum total value if we have decided about the first i items and their total weight is j). On the other hand, when the constraints on the weights are large but the constraints on the values are small, we have a similarly efficient solution. It can be solved by DP in O(NV) time, where V is the sum of the values, by putting dp[i][j] = (the minimum total weight if we have decided about the first i items and their total value is j).
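The "DP over values" variant mentioned above can be sketched as follows (our own code; the names are our own). This is the shape of DP that the full score solution imitates.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>
using namespace std;

// dp[v] = minimum total weight needed to reach total value exactly v.
// Returns the maximum total value reachable with total weight <= W.
long long knapsack_by_value(const vector<long long>& value,
                            const vector<long long>& weight,
                            long long W) {
    const long long INF = 1000000000000000000LL;
    long long V = accumulate(value.begin(), value.end(), 0LL);
    vector<long long> dp(V + 1, INF);
    dp[0] = 0;
    for (size_t i = 0; i < value.size(); i++) {
        for (long long v = V; v >= value[i]; v--) { // 0-1: go downward
            if (dp[v - value[i]] != INF)
                dp[v] = min(dp[v], dp[v - value[i]] + weight[i]);
        }
    }
    long long best = 0;
    for (long long v = 0; v <= V; v++)
        if (dp[v] <= W) best = v; // largest affordable value
    return best;
}
```

Note that the running time O(NV) is independent of W, which is exactly what we want when weights are large and values are small.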
The current task has the property that "the side length of a square," which corresponds to the weight in the knapsack problem, is large, whereas "the maximum number of points which coincide with the initial points," which corresponds to the value in the knapsack problem, is small. Thus it is natural to expect that the computational complexity may be reduced if we use a solution similar to the knapsack DP over small values. We consider a DP calculating the following values.
dp(d, j, t) = (the minimum side length of a square such that the point (j, P_j) is the vertex of direction d (d = 1, 2, 3, 4) of the square and t points in the square coincide with the initial points (i, P_i))
Although (x, y, l) is used to represent a state in the original DP, we use (d, j, t) here. The reason is that we need the information of the direction: there are 4 directions in which the count t can increase, and the count is increased by 1 exactly when, for some j, the point (j, P_j) is fixed as the vertex of direction d of the current square. We can combine a number of state transitions which do not change the count t into a single state transition. Thus, as in the following figure, we can perform a state transition whenever the new square entirely contains a smaller square.
(TODO: insert diagram)
Now we have an idea of the solution. We shall consider how to perform the DP calculation. First, "the square described by dp(d, j, t)" is given as follows, where we write s = dp(d, j, t) for its side length.
- The square described by dp(1, j, t): j <= X <= j + s, P_j <= Y <= P_j + s (the vertex (j, P_j) is the lower-left corner)
- The square described by dp(2, j, t): j <= X <= j + s, P_j - s <= Y <= P_j (upper-left corner)
- The square described by dp(3, j, t): j - s <= X <= j, P_j <= Y <= P_j + s (lower-right corner)
- The square described by dp(4, j, t): j - s <= X <= j, P_j - s <= Y <= P_j (upper-right corner)
Thus, for each t, we have one square for each pair (d, j). In total, we have at most 4N squares. We calculate dp(d, j, t + 1) using these 4N squares as follows (the contained square must not touch the vertex (j, P_j), because that point itself contributes the (t + 1)-st coincidence).
dp(1, j, t + 1) = (the minimum value of l such that the square region j <= X <= j + l, P_j <= Y <= P_j + l contains one of the 4N squares entirely)
dp(2, j, t + 1) = (the minimum value of l such that the square region j <= X <= j + l, P_j - l <= Y <= P_j contains one of the 4N squares entirely)
dp(3, j, t + 1) = (the minimum value of l such that the square region j - l <= X <= j, P_j <= Y <= P_j + l contains one of the 4N squares entirely)
dp(4, j, t + 1) = (the minimum value of l such that the square region j - l <= X <= j, P_j - l <= Y <= P_j contains one of the 4N squares entirely)
Each value can be calculated in O(N) time by checking all the squares. Since there are approximately 4N^2 combinations of the triples (d, j, t), you may wonder if the computational complexity is O(N^3). However, in the average case, the number of points which coincide with the initial points is O(sqrt(N)). Thus, if we quit the loop as soon as no square remains, we need to calculate the DP values for t = O(sqrt(N)) only. Therefore, in the average case, this task can be solved in O(N^2 sqrt(N)) time. We give a sample implementation below. Note that we store the DP values in a 2-dimensional array: at the beginning of step t, the value dp[i][j] corresponds to dp(i + 1, j + 1, t) (with 0-indexed i and j), and after the update it corresponds to dp(i + 1, j + 1, t + 1). In the source code, we omit the input and output parts, #include directives, etc.
class square {
public:
    int x, y, l;
    square() : x(0), y(0), l(0) {};
    square(int x_, int y_, int l_) : x(x_), y(y_), l(l_) {};
};

int solve(int N, const vector<int>& P) {
    const int INF = 1012345678;
    vector<vector<int> > dp(4, vector<int>(N, 0));
    int t = 1;
    while (true) {
        // Collect the squares described by the current DP values
        vector<square> sq;
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < N; j++) {
                if (dp[i][j] != INF) {
                    int x = j - (i & 2 ? dp[i][j] : 0);
                    int y = P[j] - (i & 1 ? dp[i][j] : 0);
                    sq.push_back(square(x, y, dp[i][j]));
                }
            }
        }
        int S = sq.size();
        dp = vector<vector<int> >(4, vector<int>(N, INF));
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < N; j++) {
                int minl = INF;
                for (int k = 0; k < S; k++) {
                    int xdiff = (i & 2 ? j - (sq[k].x + sq[k].l) : sq[k].x - j);
                    int ydiff = (i & 1 ? P[j] - (sq[k].y + sq[k].l) : sq[k].y - P[j]);
                    if (xdiff > 0 && ydiff > 0) {
                        minl = min(minl, max(xdiff, ydiff) + sq[k].l);
                    }
                }
                int xlim = (i & 2 ? j : (N - 1) - j);
                int ylim = (i & 1 ? P[j] : (N - 1) - P[j]);
                if (minl <= min(xlim, ylim)) {
                    dp[i][j] = minl;
                }
            }
        }
        if (dp == vector<vector<int> >(4, vector<int>(N, INF))) {
            break; // no square with t + 1 coinciding points exists
        }
        t += 1;
    }
    return N - t;
}
This solution has a larger constant factor than the other DP solutions, and we cannot expect a speed-up by SIMD. For moderately large inputs, the actual execution is slower than the optimized solution of Subtask 3. However, as explained in the next section, we can speed up this solution using data structures, which leads to a full score solution.
7. Full score solution
In the previous solution, we calculate O(N sqrt(N)) DP values on average. Since each value is calculated in O(N) time, the total computational complexity is O(N^2 sqrt(N)). We shall speed this up using data structures: we consider how to calculate each value dp(1, j, t + 1), for example, in amortized O(log N) time.
To do this, we need to treat the following problem. Since it is slightly complicated, we recommend you draw figures on paper.
Problem There are S squares. The lower-left vertex of the square k (1 <= k <= S) is (a_k, b_k), and the upper-right vertex of the square k is (c_k, d_k). We need to answer Q queries. The q-th query (1 <= q <= Q) is as follows. (It is allowed to read the queries beforehand.)
- Given a point (x, y), what is the minimum value of l such that the square region x <= X <= x + l, y <= Y <= y + l contains at least one of the S squares entirely?
The constraint of the problem is that the squares and the points given by the queries are entirely contained in the region 1 <= X <= N, 1 <= Y <= N.
Here we consider the problem only for the direction d = 1. The cases of the directions d = 2, 3, 4 can be treated similarly by rotation.
We shall show this problem can be solved in O((N + S + Q) log N) time by a plane sweep algorithm and a segment tree. To see this, we summarize the situation of the problem. For a query (x, y), the region contains the square k entirely if and only if a_k >= x, b_k >= y, x + l >= c_k and y + l >= d_k. Hence the answer to each query can be calculated in the following way.
- The minimum of max(c_k - x, d_k - y) for k satisfying a_k >= x, b_k >= y.
- Namely, the smaller of the following two values.
  - Problem 1: The minimum of c_k - x for k satisfying b_k >= y and c_k - d_k >= x - y. (In Case #1, the gray point is contained in the blue region.)
  - Problem 2: The minimum of d_k - y for k satisfying a_k >= x and c_k - d_k <= x - y. (In Case #2, the gray point is contained in the blue region.)
Here we used that the squares satisfy a_k - b_k = c_k - d_k; for example, in Case #1, the conditions b_k >= y and c_k - d_k >= x - y imply a_k >= x automatically.
(TODO: insert diagram)
Perhaps advanced contestants may notice that we can use a plane sweep algorithm to solve these problems. A key to a plane sweep algorithm is that the conditions separate the values of the query and the values of the data. We achieve this by decomposing the original problem into the two sub-problems above.
Now we shall solve Problem 1 by a plane sweep algorithm. We can also solve Problem 2 similarly, since the shape of the region is almost the same. Note that the conditions of Problem 1 are given in terms of b_k and c_k - d_k. It is therefore easier to consider the (b, c - d)-coordinates than the usual (x, y)-coordinates.
(TODO: insert diagram)
The rest is a typical plane sweep problem. Let the sweeping line be b = y. We sweep the plane by decreasing the value of y. (In the above figure, we move the line from right to left.) Then the answer to a query is calculated as "the minimum of c_k among the currently added points whose (c - d)-coordinate is larger than or equal to x - y," minus x. This value can be quickly calculated by an RMQ13 segment tree which records the current minimum value of c_k for each (c - d)-coordinate. More precisely, the algorithm is described as follows.
- We prepare an array m_{-N}, m_{-N+1}, ..., m_N of length 2N + 1. Let the initial values be infinity.
- We sort the S data points (b_k, c_k - d_k) and the Q query points (y, x - y) together, from the larger values of the first coordinate. If a data point and a query point have the same first coordinate, we assume that the data point comes before the query point14.
- For the points in the sorted order, we perform the following operations.
  - If the current point is a data point (b_k, c_k - d_k), we let m_{c_k - d_k} be min(m_{c_k - d_k}, c_k).
  - If the current point is a query point (y, x - y), the answer to Problem 1 for this query is min(m_{x-y}, m_{x-y+1}, ..., m_N) - x.
Here the required operations are "update one entry" and "obtain the minimum value on an interval." We can perform each of them in O(log N) time using a segment tree. Sweeping the plane and calculating the answers to all the queries at once, we solve Problem 1 in O((N + S + Q) log N) time.
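The sweep for Problem 1 can be sketched as follows (our own code; the names and the index shift by N are our choices, not the official implementation). Data points are the pairs (b_k, c_k - d_k) weighted by c_k, query points are (y, x - y), and the segment tree answers suffix-minimum queries over the (c - d)-coordinate.

```cpp
#include <algorithm>
#include <array>
#include <utility>
#include <vector>
using namespace std;

const long long SWEEP_INF = 1000000000000000000LL; // "no square qualifies"

// Iterative segment tree for point chmin / range minimum over [l, r).
struct MinSegtree {
    int n;
    vector<long long> t;
    MinSegtree(int n_) : n(n_), t(2 * n_, SWEEP_INF) {}
    void update(int i, long long v) {
        i += n;
        t[i] = min(t[i], v);
        for (i /= 2; i >= 1; i /= 2) t[i] = min(t[2 * i], t[2 * i + 1]);
    }
    long long query(int l, int r) {
        long long res = SWEEP_INF;
        for (l += n, r += n; l < r; l /= 2, r /= 2) {
            if (l & 1) res = min(res, t[l++]);
            if (r & 1) res = min(res, t[--r]);
        }
        return res;
    }
};

// Squares sq[k] = {a, b, c, d}; queries qs[q] = (x, y). Returns, per query,
// min{ c_k - x : b_k >= y, c_k - d_k >= x - y }, or SWEEP_INF if none.
vector<long long> sweep_case1(const vector<array<int, 4>>& sq,
                              const vector<pair<int, int>>& qs, int N) {
    MinSegtree st(2 * N + 1);        // c - d lies in [-N, N]; shift by N
    vector<array<int, 3>> ev;        // {sweep key, type, index}
    for (int k = 0; k < (int)sq.size(); k++) ev.push_back({sq[k][1], 0, k});
    for (int q = 0; q < (int)qs.size(); q++) ev.push_back({qs[q].second, 1, q});
    // Larger sweep key first; on ties, data points before queries.
    sort(ev.begin(), ev.end(), [](const array<int, 3>& u, const array<int, 3>& v) {
        if (u[0] != v[0]) return u[0] > v[0];
        return u[1] < v[1];
    });
    vector<long long> ans(qs.size(), SWEEP_INF);
    for (auto& e : ev) {
        if (e[1] == 0) { // add square: m[c - d] = min(m[c - d], c)
            int c = sq[e[2]][2], d = sq[e[2]][3];
            st.update(c - d + N, c);
        } else {         // answer query: suffix minimum over c - d >= x - y
            int x = qs[e[2]].first, y = qs[e[2]].second;
            int lo = max(0, x - y + N);
            long long mc = st.query(lo, 2 * N + 1);
            if (mc < SWEEP_INF) ans[e[2]] = mc - x;
        }
    }
    return ans;
}
```

Problem 2 is symmetric (swap the roles of the two coordinates), so the same subroutine can be reused after a reflection.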
Similarly, we can solve Problem 2 in O((N + S + Q) log N) time. In total, we can solve the Problem stated in Section 7 in O((N + S + Q) log N) time. Therefore, we can calculate the values of dp(d, j, t + 1) for all (d, j) in O(N log N) time.
This way, we can solve every test case of this task in O(N^2 log N) time. As explained before, since the maximum value of t is O(sqrt(N)) on average, the time complexity is O(N sqrt(N) log N) on average, and we get the full score.
If we rotate the situation appropriately, we do not need separate source code for each of the directions d = 1, 2, 3, 4. Moreover, we can reduce Problem 2 to Problem 1. Thus we actually need to implement the plane sweep algorithm in only one subroutine. Devising the implementation in this way decreases the possibility of bugs in the program.
Finally, to implement the plane sweep algorithm, we can use a data structure similar to a Binary Indexed Tree (BIT)15 instead of a segment tree. Perhaps you have the impression that a BIT only answers sum queries. In this task, however, the values of the array only ever decrease, and we need the minimum of a prefix (or suffix) of the array, not a sum. Under these conditions, a data structure similar to a BIT can manage the minimum values. This way, we can slightly speed up the program.
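A prefix-minimum variant of a BIT can be sketched as follows (our own code). It is correct precisely under the conditions above: entries only ever decrease (no arbitrary overwrites), and only prefix minima are queried; suffix queries can be handled by flipping indices.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Fenwick-tree-like structure for "chmin at a point, minimum of a prefix."
struct MinBIT {
    int n;
    vector<long long> t;
    MinBIT(int n_) : n(n_), t(n_ + 1, (long long)1e18) {}
    void chmin(int i, long long v) {      // position i (1-indexed)
        for (; i <= n; i += i & (-i)) t[i] = min(t[i], v);
    }
    long long prefix_min(int i) {         // min over positions 1..i
        long long res = (long long)1e18;
        for (; i > 0; i -= i & (-i)) res = min(res, t[i]);
        return res;
    }
};
```

Compared with a segment tree, this halves the memory and roughly halves the per-operation work, which is where the slight speed-up comes from.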
8. Conjecture: Can we reduce the computational complexity further?
Some of you might have noticed that if the input data is random, a used point (an unmoved giraffe) tends to be close to the line Y = X or the line Y = N + 1 - X.
This is also true for a LIS in a random permutation: we can prove16 that the distance between any point in a LIS and the line Y = X is at most about N^{5/6}. We conjecture that a similar property holds for this task.
Conjecture 1 The distance between a used point and the line Y = X or the line Y = N + 1 - X is at most about N^{5/6}.
If this conjecture is true, we can prove the following.
Conjecture 2 If Conjecture 1 is true, this task can be solved in a strictly smaller time complexity on average.
Even in 2022, there are many problems whose computational complexities are still being improved. Is the above conjecture true? Can the computational complexity of this task be improved further? We finish this review by asking these questions.
A permutation means an arrangement of (1, 2, ..., N). In competitive programming contests, tasks which enumerate all possible "permutations," like the arrangements of the giraffes, appear frequently.↩
Of course, some permutations appear more than once.↩
Let a_N be the number of permutations of length N satisfying the condition. We have the recurrence formula a_1 = 1, a_2 = 2, a_N = 4a_{N-1} - 2a_{N-2}. Here, we subtract 2a_{N-2} because a permutation in which "Q_1 = 1 and Q_N = N are satisfied" or in which "Q_1 = N and Q_N = 1 are satisfied" is counted twice. Calculating the general term of the sequence, we obtain a_N = Theta((2 + sqrt(2))^N).↩
In Dynamic Programming (DP), we decompose the problem into smaller problems, and calculate the answers recursively from smaller problems to larger ones. If you do not know about it, please search the websites.↩
In some tasks, we can easily understand the solutions when we draw a picture in the 2-dimensional plane.↩
Generally speaking, when we implement a DP which reuses memory, we have to be careful about the order in which we calculate (update) the DP values. In this case, we cannot obtain a correct answer if we update them from larger values of x and y.↩
Of course, it can be used in most programming contests. If we cannot use AVX2, we can sometimes use AVX or SSE4 to speed up the program.↩
If you want to learn about SIMD (Single Instruction Multiple Data), please search the websites. For a reference, see https://yukicoder.me/wiki/auto_vectorization (in Japanese).↩
This means rearranging the sequence from smaller values. For example, if the sequence (0.3, 0.1, 0.2) is given, we obtain the sequence (0.1, 0.2, 0.3) after sorting it in ascending order.↩
The reason why the expected value of |B_j|^2 is O(1) is seen as follows. For 1 <= i <= N, we put X_i = 1 if A_i is contained in B_j, and X_i = 0 otherwise. Then, we have |B_j| = X_1 + ... + X_N. Since the expected value of X_i^2 is 1/N (the probability that A_i is contained in B_j is 1/N), and the expected value of X_i X_k (i != k) is 1/N^2 (the probability that both A_i and A_k are contained in B_j is 1/N^2), the expected value is the sum E[|B_j|^2] = N * (1/N) + N(N - 1) * (1/N^2) = 2 - 1/N.↩
We use Stirling's formula L! ≈ sqrt(2πL) (L/e)^L to estimate the value. Here e = 2.718... is the base of the natural logarithm.↩
When L > e sqrt(N), precisely.↩
The abbreviation of Range Minimum Query. Here we consider queries which update one entry and obtain the minimum value on an interval. We can perform each of these queries in O(log N) time.↩
It takes O((S + Q) log(S + Q)) time if we use std::sort. If we sort them by counting sort, it can be done in O(N + S + Q) time.↩
It is also called a Fenwick Tree.↩
For example, we ask whether a point has "a possibility that it is used in a LIS" when it gets a certain distance away from the line Y = X. Suppose the point is (x, y) with y = x + d. Among the elements before position x, there are approximately xy/N elements which are smaller than y. Also, among the elements after position x, there are approximately (N - x)(N - y)/N elements which are larger than y. Therefore, the expected value of the length of a LIS which contains the point is about 2 sqrt(xy/N) + 2 sqrt((N - x)(N - y)/N) ≈ 2 sqrt(N) - d^2/N^{3/2} (around the middle of the sequence). But there is a stochastic variation of the length, which is approximately equal to N^{1/6}. When d ≈ N^{5/6}, the decrease of the expected length is equal to the stochastic variation. From this, we may argue that if the distance is longer than this value, then the point will not be used in a LIS. We can prove the claim in this way.↩