Given a string, find the length of the longest substring without repeating characters.
Example 1:
Input: "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Example 2:
Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.
Example 3:
Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Note that the answer must be a substring; "pwke" is a subsequence, not a substring.
Approach 1: Brute Force
Intuition
Check every substring one by one to see whether it contains a duplicate character.
Algorithm
Suppose we have a function boolean allUnique(String substring) that returns true if the characters in the substring are all unique and false otherwise. We can iterate through all possible substrings of the given string s and call allUnique on each one. Whenever it returns true, we update our answer, the maximum length of a substring without duplicate characters, if this substring is longer.
Now let's fill in the missing parts:
- To enumerate all substrings of a given string, we enumerate their start and end indices. Suppose the start and end indices are i and j, respectively. Then 0 ≤ i < j ≤ n (the end index j is exclusive by convention). Thus, using two nested loops with i from 0 to n−1 and j from i+1 to n, we can enumerate all the substrings of s.
- To check whether a string has duplicate characters, we can use a set. We iterate through all the characters in the string and put them into the set one by one. Before adding a character, we check whether the set already contains it. If so, we return false. After the loop, we return true.
import java.util.HashSet;
import java.util.Set;

class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length();
        int ans = 0;
        // Enumerate every substring s[i, j), with j exclusive.
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j <= n; j++) {
                if (allUnique(s, i, j)) ans = Math.max(ans, j - i);
            }
        }
        return ans;
    }

    // Returns true if s[start, end) contains no repeated character.
    public boolean allUnique(String s, int start, int end) {
        Set<Character> set = new HashSet<>();
        for (int i = start; i < end; i++) {
            Character ch = s.charAt(i);
            if (set.contains(ch)) return false;
            set.add(ch);
        }
        return true;
    }
}
Complexity Analysis
- Time complexity: O(n^3).
To verify that the characters in the index range [i, j) are all unique, we need to scan all of them, which costs O(j - i) time.
For a given i, the total time spent over all j ∈ [i+1, n] is $\sum_{j=i+1}^{n} O(j-i)$.
Thus, the total time is:
$O\left(\sum_{i=0}^{n-1}\left(\sum_{j=i+1}^{n}(j-i)\right)\right) = O\left(\sum_{i=0}^{n-1}\frac{(1+n-i)(n-i)}{2}\right) = O(n^3)$
- Space complexity: O(min(n, m)). We need O(k) space to check that a substring has no duplicate characters, where k is the size of the Set. The size of the Set is bounded above by the size of the string n and the size of the charset/alphabet m.
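As a check on the O(n^3) time bound for the brute-force approach, the double sum can be evaluated in closed form. The substitution k = n − i is introduced here only for illustration. The result is an upper bound on the number of character checks, since allUnique may return early when it finds a duplicate.

```latex
\sum_{i=0}^{n-1}\sum_{j=i+1}^{n}(j-i)
  = \sum_{i=0}^{n-1}\frac{(n-i)(n-i+1)}{2}
  = \sum_{k=1}^{n}\frac{k(k+1)}{2}
  = \frac{n(n+1)(n+2)}{6}
  = O(n^3)
```

For n = 3 this gives 10, matching the total length of the six substrings of a three-character string (1 + 1 + 1 + 2 + 2 + 3).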
Approach 2: Sliding Window
Algorithm
The naive approach is very straightforward, but it is too slow. So how can we optimize it?
In the naive approach, we repeatedly check a substring to see whether it has a duplicate character, but this is unnecessary. If a substring s_ij from index i to j - 1 has already been checked to have no duplicate characters, we only need to check whether s[j] is already in the substring s_ij.
To check if a character is already in the substring, we can scan the substring, which leads to an O(n^2) algorithm. But we can do better.
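The scan-based check can be sketched as follows. This is a minimal illustration, not the final solution: the class name ScanWindow and the method longestUnique are hypothetical, and the scan uses String.indexOf(int ch, int fromIndex) to look for s[j] inside the current duplicate-free window. Each scan touches at most the window, so the total work is O(n^2) in the worst case.

```java
final class ScanWindow {
    // Hypothetical helper: length of the longest substring without repeated
    // characters, checking s[j] against the window by a linear scan.
    static int longestUnique(String s) {
        int n = s.length(), ans = 0, i = 0;
        for (int j = 0; j < n; j++) {
            // s[i, j) is already known to be duplicate-free, so only s[j] is checked.
            // indexOf returns the first occurrence of s[j] at or after i (at most j).
            int pos = s.indexOf(s.charAt(j), i);
            if (pos < j) {
                // s[j] already occurs in the window; restart just after that copy.
                i = pos + 1;
            }
            ans = Math.max(ans, j - i + 1);
        }
        return ans;
    }
}
```

Replacing the linear scan with a constant-time membership test is what brings the cost down further.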
By using a HashSet as a sliding window, checking whether a character is in the current window can be done in O(1).
A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string, usually defined by its start and end indices, i.e. [i, j) (left-closed, right-open). A sliding window is a window that "slides" its two boundaries in a certain direction. For example, if we slide [i, j) to the right by 1 element, it becomes [i+1, j+1).
Back to our problem. We use a HashSet to store the characters in the current window [i, j) (j = i initially). Then we slide the index j to the right: if s[j] is not in the HashSet, we add it and slide j further, and we keep doing so until s[j] is already in the HashSet. At this point, we have found the longest substring without duplicate characters that starts at index i. To move on to the next i, we remove s[i] from the HashSet and increment i, keeping the rest of the window. If we do this for all i, we get our answer.
import java.util.HashSet;
import java.util.Set;

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length();
        Set<Character> set = new HashSet<>(); // characters in the window [i, j)
        int ans = 0, i = 0, j = 0;
        while (i < n && j < n) {
            // try to extend the range [i, j)
            if (!set.contains(s.charAt(j))) {
                set.add(s.charAt(j++));
                ans = Math.max(ans, j - i);
            } else {
                // s[j] is a duplicate: shrink the window from the left
                set.remove(s.charAt(i++));
            }
        }
        return ans;
    }
}
Complexity Analysis
- Time complexity: O(2n) = O(n). In the worst case each character will be visited twice, once by i and once by j.
- Space complexity: O(min(m, n)). Same as the previous approach. We need O(k) space for the sliding window, where k is the size of the Set. The size of the Set is bounded above by the size of the string n and the size of the charset/alphabet m.
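The 2n bound on the set-based sliding window can be observed directly by counting loop iterations. The sketch below is an instrumented copy of the same loop; the class WindowSteps and the method countSteps are hypothetical names added for illustration. Every iteration advances either i or j by one, so the count cannot exceed 2n.

```java
import java.util.HashSet;
import java.util.Set;

final class WindowSteps {
    // Hypothetical helper: number of loop iterations the set-based
    // sliding window performs on s.
    static int countSteps(String s) {
        int n = s.length(), i = 0, j = 0, steps = 0;
        Set<Character> window = new HashSet<>();
        while (i < n && j < n) {
            steps++;
            if (!window.contains(s.charAt(j))) {
                window.add(s.charAt(j++));    // j advances
            } else {
                window.remove(s.charAt(i++)); // i advances
            }
        }
        return steps;
    }
}
```

For example, countSteps("bbbbb") performs 5 insertions and 4 removals, 9 iterations in total, which is within 2n = 10.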
Approach 3: Sliding Window Optimized
The above solution requires at most 2n steps. In fact, it can be optimized to require only n steps. Instead of using a set to tell whether a character exists, we can define a mapping from each character to its index. Then we can skip characters immediately when we find a repeated character.
The reason is that if s[j] has a duplicate in the range [i, j) at index j', we don't need to increase i little by little. We can skip all the elements in the range [i, j'] and set i to j' + 1 directly.
Java (Using HashMap)
import java.util.HashMap;
import java.util.Map;

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length(), ans = 0;
        // maps each character to (index of its latest occurrence) + 1
        Map<Character, Integer> map = new HashMap<>();
        // try to extend the range [i, j]
        for (int j = 0, i = 0; j < n; j++) {
            if (map.containsKey(s.charAt(j))) {
                // jump past the previous occurrence, but never move i backwards
                i = Math.max(map.get(s.charAt(j)), i);
            }
            ans = Math.max(ans, j - i + 1);
            map.put(s.charAt(j), j + 1);
        }
        return ans;
    }
}
Java (Assuming ASCII 128)
The previous implementations make no assumption about the charset of the string s.
If we know that the charset is rather small, we can replace the Map with an integer array used as a direct-access table.
Commonly used tables are:
- int[26] for letters 'a' - 'z' or 'A' - 'Z'
- int[128] for ASCII
- int[256] for Extended ASCII
public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length(), ans = 0;
        // index[c] = (index of the latest occurrence of c) + 1, or 0 if c is unseen
        int[] index = new int[128];
        // try to extend the range [i, j]
        for (int j = 0, i = 0; j < n; j++) {
            i = Math.max(index[s.charAt(j)], i);
            ans = Math.max(ans, j - i + 1);
            index[s.charAt(j)] = j + 1;
        }
        return ans;
    }
}
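The int[26] table from the list above can be used the same way when the input is known to contain only lowercase letters. The following is a sketch under that assumption; the class name LowercaseWindow is hypothetical, and any character outside 'a' - 'z' would produce an out-of-range index.

```java
final class LowercaseWindow {
    // Assumption: s contains only the letters 'a'..'z'.
    static int lengthOfLongestSubstring(String s) {
        int n = s.length(), ans = 0;
        // next[c] = (index of the latest occurrence of letter c) + 1, or 0 if unseen
        int[] next = new int[26];
        for (int j = 0, i = 0; j < n; j++) {
            int c = s.charAt(j) - 'a'; // map 'a'..'z' to 0..25
            i = Math.max(next[c], i);  // skip past the previous occurrence, if any
            ans = Math.max(ans, j - i + 1);
            next[c] = j + 1;
        }
        return ans;
    }
}
```

The only change from the ASCII version is the offset by 'a' when indexing the table, which shrinks it from 128 entries to 26.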
Complexity Analysis
- Time complexity: O(n). Index j will iterate n times.
- Space complexity (HashMap): O(min(m, n)). Same as the previous approach.
- Space complexity (Table): O(m), where m is the size of the charset.