28. Find the Index of the First Occurrence in a String
Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
KMP (Knuth, Morris, Pratt):
- KMP is mainly used for string matching.
- When a mismatch occurs, part of the text has already been matched; KMP uses that information to avoid restarting the match from the beginning.
- The next array is a prefix table: next[i] is the length of the longest proper prefix of the pattern's first i + 1 characters that is also a suffix of them.
- The prefix table drives the fallback: it records where in the pattern matching should resume when the pattern fails to match the main string (text string).
- Let n be the length of the text string and m the length of the needle string. The matching pass is O(n), and the next array must be built beforehand in O(m), so the time complexity of the whole KMP algorithm is O(n + m).
- The brute-force solution is O(n × m).
- Space complexity: O(m)
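To make the prefix table described above concrete, here is a minimal sketch that builds it for a sample pattern. The function name build_prefix_table and the pattern "aabaaf" are illustrative choices, not part of the problem statement.

```python
def build_prefix_table(pattern: str) -> list:
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of pattern[:i + 1]."""
    table = [0] * len(pattern)
    j = 0  # length of the prefix matched so far
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while j > 0 and pattern[i] != pattern[j]:
            j = table[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        table[i] = j
    return table


print(build_prefix_table("aabaaf"))  # [0, 1, 0, 1, 2, 0]
```

Reading the result: at index 4 the prefix "aabaa" has "aa" as both a proper prefix and a suffix, so the entry is 2. At index 5 the "f" breaks every candidate and the entry drops to 0.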
Brute-force solution:
class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        n = len(needle)
        # Try every start index where needle could still fit.
        for m in range(len(haystack) - n + 1):
            if haystack[m:m + n] == needle:
                return m
        return -1
Cheat solution (built-in str.find):
class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        return haystack.find(needle)
KMP solution:
class Solution:
    def getNext(self, s: str, next: list) -> None:
        j = 0
        next[0] = 0
        # Start from 1: a single character such as "a" has no repeated prefix.
        # Here the pattern is compared against itself.
        for i in range(1, len(s)):
            # May loop several times, until j == 0 or s[i] == s[j].
            while j > 0 and s[i] != s[j]:
                j = next[j - 1]
            if s[i] == s[j]:
                j += 1
            next[i] = j

    def strStr(self, haystack: str, needle: str) -> int:
        if len(needle) == 0:
            return 0
        j = 0
        next = [0] * len(needle)
        self.getNext(needle, next)
        # Start from 0: now the two strings are compared with each other.
        for i in range(len(haystack)):
            while j > 0 and haystack[i] != needle[j]:
                j = next[j - 1]
            if haystack[i] == needle[j]:
                j += 1
            if j == len(needle):
                return i - len(needle) + 1
        return -1
459. Repeated Substring Pattern
Given a string s, check if it can be constructed by taking a substring of it and appending multiple copies of the substring together.
To determine whether s consists of a repeated substring, concatenate two copies of s and drop the first and last characters of the result. If s still appears inside, then s is composed of a repeated substring.
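Using find to cheat:
class Solution:
    def repeatedSubstringPattern(self, s: str) -> bool:
        ss = s[1:] + s[:-1]  # s + s without its first and last characters
        return ss.find(s) != -1
Time complexity: O(n), assuming find runs in linear time
Space complexity: O(n), since ss is a new string of length 2n - 2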
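The doubling trick is easy to sanity-check against the definition. The sketch below compares it with a direct divisor check on every string over the letters "a" and "b" up to length 8; the function names and the choice of alphabet and length limit are assumptions made for this check only.

```python
from itertools import product


def repeated_by_doubling(s: str) -> bool:
    """Doubling trick: s must reappear in s + s with both end characters removed."""
    return s in (s + s)[1:-1]


def repeated_by_definition(s: str) -> bool:
    """Direct check: some prefix of length k (k divides n, k <= n // 2) rebuilds s."""
    n = len(s)
    return any(n % k == 0 and s[:k] * (n // k) == s for k in range(1, n // 2 + 1))


# Exhaustively compare both checks on all short strings over {"a", "b"}.
for length in range(1, 9):
    for letters in product("ab", repeat=length):
        s = "".join(letters)
        assert repeated_by_doubling(s) == repeated_by_definition(s), s

print(repeated_by_doubling("abab"))  # True
print(repeated_by_doubling("aba"))   # False
```

Removing the first and last characters matters: without that step, s would trivially match at position 0 and at position n of s + s.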
Brute-force solution:
class Solution:
    def repeatedSubstringPattern(self, s: str) -> bool:
        n = len(s)
        for i in range(1, n // 2 + 1):  # The maximum length of substr is half of n.
            if n % i == 0:  # Only lengths that divide n can tile s exactly.
                substr = s[:i]
                if substr * (n // i) == s:
                    return True
        return False
Time complexity: O(n^2)
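KMP solution:
The next array here stores the prefix lengths directly, so each value is 1 larger than in the variant that shifts the whole table down by one.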
Time complexity: O(n)
Space complexity: O(n)
class Solution:
    def getNext(self, s: str, nxt: list) -> None:
        j = 0
        nxt[0] = 0
        for i in range(1, len(s)):
            while j > 0 and s[i] != s[j]:
                j = nxt[j - 1]
            if s[i] == s[j]:
                j += 1
            nxt[i] = j

    def repeatedSubstringPattern(self, s: str) -> bool:
        nxt = [0] * len(s)    # s   = [a,b,c,a,b,c,a,b,c,a,b,c]
        self.getNext(s, nxt)  # nxt = [0,0,0,1,2,3,4,5,6,7,8,9]
        n = len(s)
        # A repeated prefix/suffix exists, and n is a multiple of the
        # candidate unit length n - nxt[-1]: here 12 % (12 - 9) == 0.
        if nxt[-1] != 0 and n % (n - nxt[-1]) == 0:
            return True
        return False
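The same condition also tells you what the repeated unit is: n - nxt[-1] is its length. A minimal sketch, assuming a hypothetical helper named smallest_repeating_unit that returns s itself when no repeated unit exists:

```python
def smallest_repeating_unit(s: str) -> str:
    """Return the shortest substring whose repetition builds s,
    or s itself when no such substring exists."""
    n = len(s)
    if n == 0:
        return s
    # Standard prefix table, built the same way as getNext above.
    table = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and s[i] != s[j]:
            j = table[j - 1]
        if s[i] == s[j]:
            j += 1
        table[i] = j
    period = n - table[-1]  # candidate unit length
    if table[-1] != 0 and n % period == 0:
        return s[:period]
    return s


print(smallest_repeating_unit("abcabcabcabc"))  # abc
print(smallest_repeating_unit("abab"))          # ab
print(smallest_repeating_unit("abac"))          # abac
```

For "abcabcabcabc" the last table entry is 9, so the unit length is 12 - 9 = 3 and the unit is "abc". For "aba" the last entry is 1, giving a candidate length of 2, which does not divide 3, so no repeated unit exists.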