Editorial for COI '19 #1 Paint
Submitting an official solution before solving the problem yourself is a bannable offence.
The first subtask can simply be solved by traversing the monochrome neighborhoods using a DFS or BFS. In that case, the complexity of one coloring operation is linear with respect to the number of pixels in a component (monochrome neighborhood). In the worst case, the component will have pixels, so the total time complexity of this approach equals
.
The second subtask is a special case in which the image is one dimensional. Some of the approaches are:
- Maintaining a set of components (continuous intervals) using an efficient data structure (such as
std::set
in C++) that enables fast retrieval of neighboring elements and their merging (insertion and deletion). - Maintaining a set of connected components using union find.
- Dividing an array into parts of length
and keeping a flag for each part which tells us whether the whole part is a single component (i.e. is it monochrome).
The third subtask is special in the following way: in each step either there will be no changes (bucket tool is filled with the same color as the pixel it is applied on) or the component of the pixel will change its color and connect itself with all of its neighboring components. Maintaining these components can be done using the union find data structure, along which we will keep track of a set of active neighbors (e.g. using std::set
in C++). When traversing the adjacency list of the current component, i.e. when connecting it with one of its neighbors, we delete that neighbor from its set of active neighbors (and vice versa). There will be no more than connections and each has a complexity of deleting an element from the set,
. The total time complexity of this approach is therefore
.
On first glance, it may seem that the algorithm from the third subtask can be used to obtain the full score. But, the problem resides in the fact that we don't necessarily need to connect the current component to all of its neighbors in a single application of the bucket tool. Perhaps we can solve the problem by keeping the neighbors somehow grouped by color (e.g. using std::map
in C++), but when changing color of a current component, we would still need to traverse over all of its neighbors in order to let them know we have changed our color. Let's keep all of these observations in mind, but return to the solution of the first subtask.
Note that, if the average size of components was sufficiently small, say pixels, then the algorithm from the first subtask would be fast enough, i.e.
. What if the average size of components was larger? Something like
pixels should be good enough as well, a
should also be squeezable. What about
pixels? Unlikely. Luckily, increasing the size of an average component decreases the number of components that have at least that many pixels. In other words, at each moment there are at most
components that have
or more pixels (we will choose the exact value of
later). This property could be of use. The intuition is as follows: we will use one of the naive approaches to solve for components with
or less pixels, and some other approach to solve for components with more than
pixels. The important thing is that this approach doesn't need to be efficient when the component has less than
pixels.
More precisely, for each component we track the following values: color, size and a set of large neighbors (those with more than pixels – note that there is at most
of those). When connecting two components, we use union find to maintain these values. Additionally, for large components we keep around a list of its neighbors grouped by color. We will explain the algorithm using the following three operations:
Operation
– applies the bucket tool on a pixel
:
- Component label
to which pixel
belongs to is determined.
- We use the operation
which determines the labels of neighboring components of the given color,
.
- These components (
) are connected with the current component
(
).
- After connecting, we go through all big neighbors of a component and we notify them on the color change (i.e. the label of a current component
is added into lists of neighboring components for a given color).
- Component label
Operation
– finds neighboring components of a given color:
- If component
is small, traverse all of its neighbors (at most
of them) and for each we check what color is it.
- If component
is big, we use a list of neighbors grouped by color. Observation: After traversing that list, it can (has to) be deleted (after connecting, there will no longer be a neighbor of a given color – they will be a part of the same component).
- If component
Operation
– connects components
and
:
- Size of the resulting components equals to the sum of sizes of components
and
.
- The list of big components is obtained using small-to-large technique. (It's possible that after this operation we have some labels in our list that have been destroyed in the meantime, i.e. became connected in some larger component. We leave this implementation detail to the reader to resolve). the implementation detail discussed further in the text).
- In case of connecting two large components, we need to obtain a joint list of neighbors grouped by color. This is also done using small-to-large technique.
- Size of the resulting components equals to the sum of sizes of components
Operation will be done at most
times. Smaller to larger technique secures the total amortized complexity of
. The complexity of all operations
in case of small components is
, while the amortized complexity in case of big components is
. Finally, the complexity of all
operations is
. The total complexity also depends on a parameter
and equals
. It follows that the optimal runtime should be obtained when
.
Comments