The problem asks us to calculate the depth of each new node and keep track of the sum of depths.
To start with, note that we cannot just directly simulate the algorithm in the problem statement. If the tree is not balanced (and the authors of the test data will ensure that it is not), its depth will grow to
and the overall complexity will be
, way too much with
up to
.
Because the tree takes
memory per node, we can afford to construct it explicitly (so that we know for each node its left child, right child, parent and depth), as long as the construction is efficient – for example,
per number or faster. As it turns out, in this problem it suffices to keep just the depths.
Let
be the number we are inserting,
be the largest already inserted number smaller than
, and
be the smallest already inserted number larger than
.
We can show by contradiction that
will necessarily be an ancestor of
or vice versa. Also, because of the binary search tree property, after insertion
will become the left child of
or the right child of
, depending on which of the two is deeper in the tree.
So if we could efficiently determine the number
and
for every
, it would be easy to maintain the entire tree and calculate the numbers sought in the problem.
Assuming they made these observations, contestants using C++ had it easy because the built-in data structures set and map (which are themselves implemented as binary search trees behind the scenes, but balanced) support these operations – efficiently finding the element just above and just below a given element. The overall complexity of such a solution is
.
In Pascal or C, such a data structure is not available so we need to keep thinking.
Suppose we build a doubly linked list and initialize it to contain the numbers
through
in sorted order. If we remove the numbers in the input in reverse order, the list will be able to give us exactly what we want. At every point, the list will contain exactly the numbers that would have been inserted up until then, and it's easy to retrieve
and
, given
; these are the predecessor and successor of
in the list. The overall complexity of this solution is
.
Another intriguing approach is to keep in an array of length
, for each number that has been inserted, its predecessor and successor. When inserting a new number
, we do a bidirectional (breadth-first) search around
to find a number that has already been inserted. If we first find
, its successor is the number
. If we first find
, its predecessor is the number
. It is not obvious, but because this array gets dense fast, the overall complexity of inserting all numbers is
, even though individual insertions can take a number of steps linear in
.
Comments