Editorial for JOI '16 Open P2 - Selling RNA Strands
Submitting an official solution before solving the problem yourself is a bannable offence.
Subtask 1 [10 points]
,
,
,
,
are small. We can solve this subtask by checking all RNA sequences for each order.
Subtask 2 [25 points]
Let's consider a simpler version of this problem:
There are
RNA sequences
and
orders
. For each
![]()
, compute the number of
such that the first
characters of
are
.
We can store all of to a trie. Now we identify a string
and the node of the trie which corresponds to
. For any two strings
, the following are equivalent:
is a prefix of
, that is, the first
characters of
are
.
- Node
is an ancestor of node
.
Let be the number of occurrences of
in
. Then the problem is computing
for each
.
This can be done by precalculation with DFS, but here we use a different approach. Using the Euler-Tour technique, checking whether node is an ancestor of node
can be reduced to checking whether a point (corresponding to
) is included in a range (corresponding to
). Now we get an algorithm for this problem:
- Store
to a trie.
- Using the Euler-Tour technique, compute the range and the point for nodes in the trie. Note that we need not compute the values for nodes which don't correspond to any of
.
- For each
, count the number of
such that the point corresponding to node
is included in the range corresponding to node
.
Let's go back to the original problem. Now we have the constraint by as well as by
. We saw that handling
can be done by a trie and the Euler-Tour technique. It can be easily checked that the same approach can be used for handling
if we consider
(where
is the reverse string of
).
Finally, we get an algorithm as follows:
- Store
to a trie.
- Using the Euler-Tour technique, compute the range and the point for nodes in the trie.
- Do the same for
.
- For each
, compute the answer by checking two ranges.
Subtask 3 [25 points]
Let ,
,
be
,
,
, respectively.
We saw that this problem becomes relatively easy if we can ignore or
. Now consider the following algorithm:
- For each
, extract only RNA sequences whose last
characters are
. Then compute the number of RNA sequences in the extracted ones whose first
characters are
.
The later part of the algorithm can be done by using a trie. The former part also can be done with a trie. This algorithm is not so efficient because it may be possible that all of the RNA sequences are extracted and thus stored to a trie for each . In this case, it works in
. But we can add a (seemingly small) optimization:
- Extract RNA sequences only once for the same
.
In fact, it significantly reduces the complexity to .
Here is a proof. For each ,
is extracted only if
is a suffix of
(that is, the last
characters of
are
). Let
be the set of such
's. Each string of
has different length, so
. Also
holds, so we have
. Thus
. It means that each
is added to a trie
times. Adding
to a trie once can be done in
. So adding strings to tries can be completed overall in
. Now we have tries, so computing the answer can be done by accessing node
of a trie. This can be done in
. Also, checking all of
itself takes
time.
After all, it was proved that this algorithm works in .
Subtask 4 [40 points]
The approach to the solution for Subtask 2 is reducing the problem to the following:
There are
points
and
rectangles
(whose axes are parallel to the
or
axis) on a 2D plane. For each rectangle, compute the number of points which are included in it.
If we ignore irrelevant nodes of tries, the coordinates of the points and the rectangles will be integers between and
.
Let be the number of
such that
and
. As the coordinates of the points are not less than
, we have
. So we just need to compute
for some queries
efficiently. This can be done by the following algorithm:
- Sort the points and the queries by
(for points) or
(for queries). If a point and a query have the same value, make the point appear earlier in the sequence.
- Initialize an array
with
.
- Scan the sorted sequence from the beginning and do the following:
- For a point
, set
.
- For a query
, answer
.
- For a point
In this algorithm, we need to do array operations efficiently. This can be done with a Binary Indexed Tree.
Comments