Baltic OI '06 P5 - RLE Compression
View as PDFRLE is a simple compression algorithm used to compress sequences containing
subsequent repetitions of the same character. By compressing a particular sequence,
we obtain its code. The idea is to replace repetitions of a given character (like aaaaa)
with a counter saying how many repetitions there are. Namely, we represent it by a
triple containing a repetition mark, the repeating character and an integer representing
the number of repetitions. For example, aaaaa can be encoded as #a5 (where #
represents the repetition mark).
We need to somehow represent the alphabet, the repetition mark, and the counter. Let
the alphabet consist of  characters represented by integers from the set
. The code of a sequence of characters from 
 is also a sequence
of characters from 
. At any moment, the repetition mark is represented by a
character from 
, denoted by 
. Initially 
 is 0, but it may change during the coding.
The code is interpreted as follows:
- any character 
in the code, except the repetition mark, represents itself,
 - if the repetition mark 
occurs in the code, then the two following characters have special meaning:
- if 
is followed by
, then it represents
repetitions of
,
 - otherwise, if 
is followed by
(where
), then
will be the repetition mark from that point on,
 - otherwise, if 
is followed by
(where
and
), then it represents
repetitions of
.
 
 - if 
 
Using the above scheme, we can encode any sequence of characters from . For
instance, for 
, the sequence 
1002222223333303020000 can be encoded as
10010230320100302101. First character of the code 1 means simply 1. Next 001
encodes 00. Then, 023 represents 222222, 032 represents 33333, and 010 switches the
repetition mark to 1. Then 0302 represents itself and finally 101 encodes 0000.
A sequence may be encoded in many ways and code length may vary. Given an already encoded sequence, your task is to find a code with the least number of characters.
Write a program that:
- Reads the size of the alphabet and the code of a sequence.
 - Finds the shortest code for that sequence.
 - Writes the result.
 
Input Specification
The first line contains one integer 
: the size of the alphabet. The second line contains one integer 
: the length of the code. The last line contains 
 integers from the
set 
 separated by single spaces, representing the code of a sequence.
Output Specification
The first line should contain
one integer : the least number of characters in a code representing the given
sequence. The last line of the output should contain 
 integers from the set
 separated by single spaces: the code of the sequence. If there exist
several shortest sequences, your program should output any one of them.
Sample Input 1
4
20
1 0 0 1 0 2 3 0 3 2 0 1 0 0 3 0 2 1 0 1
Sample Output 1
19
1 0 1 0 0 0 1 2 3 1 3 2 0 3 0 2 1 0 1
Sample Input 2
14
15
10 10 10 0 10 0 10 10 13 10 10 13 10 10 13
Sample Output 2
9
0 10 13 0 10 13 0 10 10
Comments