String Finding

View as PDF

Submit solution

Points: 5
Time limit: 1.0s
Memory limit: 64M

Author:
Problem type

You have a string (indexed from 0) with no more than 10^6 lowercase characters. Find the first occurrence of a string T (1 \le |T| \le |S| \le 10^6), or print -1 if T is not a substring of S.

Input Specification

The first line will have the string S.
The second line will have the string T.

Output Specification

Print the index of the first occurrence of the string T in S, or -1 if it is not a substring of S.

Sample Input

higuyshowsitgoinghiguys
higuys

Sample Output

0

Comments


  • 0
    pattti  commented on Aug. 6, 2023, 1:38 a.m.

    yo how u pass tle using java


    • 2
      Kirito  commented on Aug. 6, 2023, 5:38 a.m.

      The point value is admittedly slightly misleading thanks to the problem being very trivial in certain languages (e.g. C/++, Python 3).

      You can either try making me sad by checking if Java has some built-in linear time string matching function, or implement KMP, Rabin-Karp, Z-algorithm, or any number of other linear-time string matching algorithms.


  • 2
    aeternalis1  commented on July 7, 2017, 1:57 p.m. edited

    Is it possible to create a solution for test case # 15 using python 3? I can't seem to run it under the time limit.


    • 5
      Kirito  commented on July 7, 2017, 3:28 p.m. edited

      Python's builtin find uses Boyer-Moore, which has a worst-case runtime of \mathcal O(|S| \times |T|). Rabin-Karp and KMP are two algorithms which have a worst-case runtime of \mathcal O(|S| + |T|).


      • 0
        TechnobladeNeverDies  commented on Sept. 18, 2022, 1:39 p.m.

        This is incorrect; Boyer-Moore runs in linear time when the pattern does not occur in the text, which is not the case for Python.


        • 1
          Kirito  commented on Sept. 20, 2022, 3:22 p.m. edited

          To be pedantic: it's described as "a simplication of Boyer-Moore, incorporating ideas from Horspool and Sunday". The \mathcal O(|S||T|) worst-case bound still holds. Whether you want to consider this "using Boyer-Moore" or not is up to you.

          If you want to get more specific, the CPython implementation is described in their code as

          /* fast search/count implementation, based on a mix between boyer-moore and horspool, with a few more bells and whistles on the top. for some more background, see: https://web.archive.org/web/20201107074620/http://effbot.org/zone/stringlib.htm */

          /* note: fastsearch may access s[n], which isn't a problem when using Python's ordinary string types, but may cause problems if you're using this code in other contexts. also, the count mode returns -1 if there cannot possibly be a match in the target string, and 0 if it has actually checked for matches, but didn't find any. callers beware! */

          /* If the strings are long enough, use Crochemore and Perrin's Two-Way algorithm, which has worst-case O(n) runtime and best-case O(n/k). Also compute a table of shifts to achieve O(n/k) in more cases, and often (data dependent) deduce larger shifts than pure C&P can deduce. See stringlib_find_two_way_notes.txt in this folder for a detailed explanation. */

          Source: https://github.com/python/cpython/blob/main/Objects/stringlib/fastsearch.h

          A challenge for anyone who's sufficiently bored would be figuring out if it is possible to force CPython to switch to the Two-Way algorithm on the given test data.