CCO '99 P2 - Common Words
Canadian Computing Competition: 1999 Stage 2, Day 1, Problem 2
Given a sequence of $m$ words from a newspaper article and an integer $k$, find the $k^{\text{th}}$ most common word(s).
Input Specification
Input will consist of an integer $n$ followed by $n$ data sets. Each data set begins with a line containing $m$ and $k$, followed by $m$ lines, each containing a word of up to $20$ lowercase letters. There will be no more than $1000$ words per data set.
Output Specification
For each input data set, determine the $k^{\text{th}}$ most common word(s). To be precise, a word $w$ is the $k^{\text{th}}$ most common if exactly $k - 1$ distinct words occur more frequently than $w$ in the data set. Note that $w$ might be multiply defined (i.e. there is a tie for the $k^{\text{th}}$ most common word) or $w$ might not exist (i.e. there is no $k^{\text{th}}$ most common word). For each data set, print a title line indicating $k$ using normal ordinal notation (1st, 2nd, 3rd, 4th, 5th, …) followed by a number of lines giving all the possible values for the $k^{\text{th}}$ most common word. A blank line should follow the last word for each data set.
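The definition above can be checked fairly directly: count every word, group the distinct words by frequency, and walk the groups from most to least frequent while tracking how many distinct words have already been passed. The following is only a minimal Python sketch under the assumption that the words of one data set are already in a list; the name `kth_most_common` is hypothetical and not part of the problem.

```python
from collections import Counter


def kth_most_common(words, k):
    """Return every word w such that exactly k - 1 distinct words
    occur more frequently than w (hypothetical helper name)."""
    counts = Counter(words)

    # Group distinct words by their frequency.
    by_freq = {}
    for word, c in counts.items():
        by_freq.setdefault(c, []).append(word)

    more_frequent = 0  # distinct words seen so far with a higher count
    for c in sorted(by_freq, reverse=True):
        if more_frequent == k - 1:
            return by_freq[c]
        if more_frequent > k - 1:
            break  # a tie higher up skipped over the k-th place
        more_frequent += len(by_freq[c])
    return []  # no k-th most common word exists


print(kth_most_common(["the", "brown", "the", "fox", "red", "the", "red"], 2))
# ['red']
```

The statement as given here does not say in which order tied words should be printed; this sketch returns them in order of first appearance, which happens to match the sample.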
Sample Input
3
7 2
the
brown
the
fox
red
the
red
1 3
the
2 1
the
wash
Sample Output
2nd most common word(s):
red

3rd most common word(s):

1st most common word(s):
the
wash

Comments
What if
I WA'd 15 times before I realized that I read the question wrong
What do you mean by "WA'd"?
Output Specification
To be precise, a word $w$ is the $k^{\text{th}}$ most common if exactly $k - 1$ distinct words occur more frequently than $w$ in the data set.
What happens if, for example, there is an $m$ value of … and a $k$ value of …? If there is more than one 'most common word(s)' (e.g. both the words 'hi' and 'hello' appear 3 times), would the third most common word be 'hey', appearing 2 times, due to the fact that there are 2 distinct words appearing as 'most common', or would 'hey' be the second most common? If the former, what would you do in a scenario where there are more words tied for 'most common word(s)' than the value of $k$? Would you just output all the 'most common word(s)'?
EDIT: I suppose my question is, if you have 2 words tied for the same place, do they register as both of those places, or just as the first one? (i.e. does searching for 'most common' and '2nd most common' output the same if there are 2 'most common' words?)
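Going by the definition quoted above, a word's place is one more than the number of distinct words that occur strictly more often, so tied words would share only the first of the places they span, and the places in between stay empty. A small Python sketch of that reading, using the hi/hello/hey example from the question; the function name `places` is made up for illustration.

```python
from collections import Counter


def places(words):
    """Map each distinct word to its place: 1 + the number of
    distinct words that occur strictly more often."""
    counts = Counter(words)
    return {
        w: 1 + sum(1 for other in counts.values() if other > c)
        for w, c in counts.items()
    }


words = ["hi"] * 3 + ["hello"] * 3 + ["hey"] * 2
print(places(words))  # {'hi': 1, 'hello': 1, 'hey': 3}
```

Under this reading 'hey' is the 3rd most common word, and no word is 2nd, so a query for the 2nd most common would print only the title line.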
Now for $k = 3$. This time, there are two words to output, storm and brook, because they both have the same number of occurrences. Each of these words has exactly two words with more occurrences. This shows that we sometimes need to output more than one word.
Expand on "normal ordinal notation": 11th or 11st?
1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th, 16th, 17th, 18th, 19th, 20th, 21st, ...
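The list above follows the usual English rule: the suffix depends on the last digit, except that numbers ending in 11, 12 or 13 take "th". A short Python sketch of that rule, assuming a positive integer argument; the function name `ordinal` is only illustrative.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers ending in 11, 12 or 13 take 'th' even though they end in 1, 2, 3.
    if n % 100 in (11, 12, 13):
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


print([ordinal(i) for i in (1, 2, 3, 4, 11, 12, 13, 21)])
# ['1st', '2nd', '3rd', '4th', '11th', '12th', '13th', '21st']
```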