String type of questions were asked in interviews frequently. You will most likely encounter one during your interviews.
We recommend:
총 8문제.
투포인터는 이렇게 쓰는 것이다. 이런 느낌의 문제.
leetcode.com/problems/reverse-string/solution/
Easy
Write a function that reverses a string. The input string is given as an array of characters char[].
Do not allocate extra space for another array, you must do this by modifying the input array in-place with O(1) extra memory.
You may assume all the characters consist of printable ascii characters.
Example 1:
Input: ["h","e","l","l","o"] Output: ["o","l","l","e","h"]
Example 2:
Input: ["H","a","n","n","a","h"] Output: ["h","a","n","n","a","H"]
Speaking seriously, let's use this problem to discuss two things:
Does in-place mean constant space complexity?
Two pointers approach.
Does in-place mean constant space complexity?
No. By definition, an in-place algorithm is an algorithm which transforms input using no auxiliary data structure.
The tricky part is that space is used by many actors, not only by data structures. The classical example is to use recursive function without any auxiliary data structures.
Is it in-place? Yes.
Is it constant space? No, because of recursion stack.
Algorithm
Here is an example. Let's implement recursive function helper which receives two pointers, left and right, as arguments.
Base case: if left >= right, do nothing.
Otherwise, swap s[left] and s[right] and call helper(left + 1, right - 1).
To solve the problem, call helper function passing the head and tail indexes as arguments: return helper(0, len(s) - 1).
Implementation
class Solution {
public void helper(char[] s, int left, int right) {
if (left >= right) return;
char tmp = s[left];
s[left++] = s[right];
s[right--] = tmp;
helper(s, left, right);
}
public void reverseString(char[] s) {
helper(s, 0, s.length - 1);
}
}
Complexity Analysis
Time complexity : \mathcal{O}(N)O(N) time to perform N/2N/2 swaps.
Space complexity : \mathcal{O}(N)O(N) to keep the recursion stack.
Two Pointers Approach
In this approach, two pointers are used to process two array elements at the same time. Usual implementation is to set one pointer in the beginning and one at the end and then to move them until they both meet.
Sometimes one needs to generalize this approach in order to use three pointers, like for classical Sort Colors problem.
Algorithm
Set pointer left at index 0, and pointer right at index n - 1, where n is a number of elements in the array.
While left < right:
Swap s[left] and s[right].
Move left pointer one step right, and right pointer one step left.
Implementation
class Solution {
public void reverseString(char[] s) {
int left = 0, right = s.length - 1;
while (left < right) {
char tmp = s[left];
s[left++] = s[right];
s[right--] = tmp;
}
}
}
Complexity Analysis
Time complexity : \mathcal{O}(N)O(N) to swap N/2N/2 element.
Space complexity : \mathcal{O}(1)O(1), it's a constant space solution.
leetcode.com/problems/reverse-integer/solution/
Given a 32-bit signed integer, reverse digits of an integer.
Example 1:
Input: 123 Output: 321
Example 2:
Input: -123 Output: -321
Example 3:
Input: 120 Output: 21
Note:
Assume we are dealing with an environment which could only store integers within the 32-bit signed integer range: [−231, 231 − 1]. For the purpose of this problem, assume that your function returns 0 when the reversed integer overflows.
Intuition
We can build up the reverse integer one digit at a time. While doing so, we can check beforehand whether or not appending another digit would cause overflow.
Algorithm
Reversing an integer can be done similarly to reversing a string.
We want to repeatedly "pop" the last digit off of xx and "push" it to the back of the \text{rev}rev. In the end, \text{rev}rev will be the reverse of the xx.
To "pop" and "push" digits without the help of some auxiliary stack/array, we can use math.
//pop operation: pop = x % 10; x /= 10; //push operation: temp = rev * 10 + pop; rev = temp;
However, this approach is dangerous, because the statement \text{temp} = \text{rev} \cdot 10 + \text{pop}temp=rev⋅10+pop can cause overflow.
Luckily, it is easy to check beforehand whether or this statement would cause an overflow.
To explain, lets assume that \text{rev}rev is positive.
Similar logic can be applied when \text{rev}rev is negative.
Complexity Analysis
접근 방식 1 : 숫자 팝 및 푸시 및 오버플로 전에 확인
직관
한 번에 한 자리 씩 역 정수를 만들 수 있습니다. 이렇게하는 동안 다른 숫자를 추가하면 오버플로가 발생하는지 미리 확인할 수 있습니다.
연산
정수 반전은 문자열 반전과 유사하게 수행 할 수 있습니다.
xx의 마지막 숫자를 반복해서 "팝"하고 \ text {rev} rev의 뒤쪽으로 "밀어 넣고"싶습니다. 결국 \ text {rev} rev는 xx의 반대가됩니다.
보조 스택 / 배열의 도움없이 숫자를 "팝"및 "푸시"하려면 수학을 사용할 수 있습니다.
// 팝 연산 : pop = x % 10; x / = 10; // 푸시 작업 : temp = rev * 10 + pop; rev = 임시;
그러나이 방법은 위험합니다. \ text {temp} = \ text {rev} \ cdot 10 + \ text {pop} temp = rev⋅10 + pop 문이 오버플로를 일으킬 수 있기 때문입니다.
운 좋게도이 문이 오버플로를 유발하는지 여부를 미리 확인하는 것은 쉽습니다.
설명하기 위해 \ text {rev} rev가 양수라고 가정합니다.
temp = \ text {rev} \ cdot 10 + \ text {pop} temp = rev⋅10 + pop으로 인해 오버플로가 발생하면 \ text {rev} \ geq \ frac {INTMAX} {10} rev≥10INTMAX 여야합니다.
\ text {rev}> \ frac {INTMAX} {10} rev> 10INTMAX이면 temp = \ text {rev} \ cdot 10 + \ text {pop} temp = rev⋅10 + pop은 오버플로가 보장됩니다.
\ text {rev} == \ frac {INTMAX} {10} rev == 10INTMAX이면 temp = \ text {rev} \ cdot 10 + \ text {pop} temp = rev⋅10 + pop은 다음과 같은 경우 오버플로됩니다. \ text {pop}> 7pop> 7 인 경우에만
\ text {rev} rev가 음수 일 때도 유사한 논리를 적용 할 수 있습니다.
복잡성 분석
시간 복잡성 : O (\ log (x)) O (log (x)). xx에는 대략 \ log_ {10} (x) log10 (x) 자리가 있습니다.
공간 복잡성 : O (1) O (1).
댓글 : 679
class Solution {
public int reverse(int x) {
int rev = 0;
while(x != 0){
int pop = x % 10;
x /= 10;
// this is to mitigate the largest, and smallest numbers in java.
if(rev > Integer.MAX_VALUE/10 || (rev == Integer.MAX_VALUE / 10 && pop > 7)) return 0;
if(rev < Integer.MIN_VALUE/10 || (rev == Integer.MIN_VALUE / 10 && pop < -8)) return 0;
rev = (rev * 10) + pop;
}
return rev;
}
}
mitigate
동사
1. [격식]완화[경감]시키다alleviate
action to mitigate poverty
빈곤 완화 조치
해쉬는 이렇게 쓰는 것이다. 이런 느낌의 문제.
leetcode.com/problems/first-unique-character-in-a-string/
Approach 1: Linear time solution
The best possible solution here could be of a linear time because to ensure that the character is unique you have to check the whole string anyway.
The idea is to go through the string and save in a hash map the number of times each character appears in the string. That would take \mathcal{O}(N)O(N) time, where N is a number of characters in the string.
And then we go through the string the second time, this time we use the hash map as a reference to check if a character is unique or not.
If the character is unique, one could just return its index. The complexity of the second iteration is \mathcal{O}(N)O(N) as well.
8 / 15
class Solution {
public int firstUniqChar(String s) {
HashMap<Character, Integer> count = new HashMap<Character, Integer>();
int n = s.length();
// build hash map : character and how often it appears
for (int i = 0; i < n; i++) {
char c = s.charAt(i);
count.put(c, count.getOrDefault(c, 0) + 1);
}
// find the index
for (int i = 0; i < n; i++) {
if (count.get(s.charAt(i)) == 1)
return i;
}
return -1;
}
}
Complexity Analysis
알파벳 공간만큼만의 해쉬가 필요한경우 char [26] 을 사용.
leetcode.com/problems/valid-anagram/
Algorithm
An anagram is produced by rearranging the letters of ss into tt. Therefore, if tt is an anagram of ss, sorting both strings will result in two identical strings. Furthermore, if ss and tt have different lengths, tt must not be an anagram of ss and we can return early.
public boolean isAnagram(String s, String t) { if (s.length() != t.length()) { return false; } char[] str1 = s.toCharArray(); char[] str2 = t.toCharArray(); Arrays.sort(str1); Arrays.sort(str2); return Arrays.equals(str1, str2); }
Complexity analysis
Time complexity : O(n \log n)O(nlogn). Assume that nn is the length of ss, sorting costs O(n \log n)O(nlogn) and comparing two strings costs O(n)O(n). Sorting time dominates and the overall time complexity is O(n \log n)O(nlogn).
Space complexity : O(1)O(1). Space depends on the sorting implementation which, usually, costs O(1)O(1) auxiliary space if heapsort is used. Note that in Java, toCharArray() makes a copy of the string so it costs O(n)O(n) extra space, but we ignore this for complexity analysis because:
Algorithm
To examine if tt is a rearrangement of ss, we can count occurrences of each letter in the two strings and compare them. Since both ss and tt contain only letters from a-za−z, a simple counter table of size 26 is suffice.
Do we need two counter tables for comparison? Actually no, because we could increment the counter for each letter in ss and decrement the counter for each letter in tt, then check if the counter reaches back to zero.
public boolean isAnagram(String s, String t) { if (s.length() != t.length()) { return false; } int[] counter = new int[26]; for (int i = 0; i < s.length(); i++) { counter[s.charAt(i) - 'a']++; counter[t.charAt(i) - 'a']--; } for (int count : counter) { if (count != 0) { return false; } } return true; }
Or we could first increment the counter for ss, then decrement the counter for tt. If at any point the counter drops below zero, we know that tt contains an extra letter not in ss and return false immediately.
public boolean isAnagram(String s, String t) { if (s.length() != t.length()) { return false; } int[] table = new int[26]; for (int i = 0; i < s.length(); i++) { table[s.charAt(i) - 'a']++; } for (int i = 0; i < t.length(); i++) { table[t.charAt(i) - 'a']--; if (table[t.charAt(i) - 'a'] < 0) { return false; } } return true; }
class Solution {
public boolean isAnagram(String s, String t) {
if (s.length() != t.length()) {
return false;
}
int[] table = new int[26];
for (int i = 0; i < s.length(); i++) {
table[s.charAt(i) - 'a']++;
}
for (int i = 0; i < t.length(); i++) {
table[t.charAt(i) - 'a']--;
if (table[t.charAt(i) - 'a'] < 0) {
return false;
}
}
return true;
}
}
Complexity analysis
Time complexity : O(n)O(n). Time complexity is O(n)O(n) because accessing the counter table is a constant time operation.
Space complexity : O(1)O(1). Although we do use extra space, the space complexity is O(1)O(1) because the table's size stays constant no matter how large nn is.
Follow up
What if the inputs contain unicode characters? How would you adapt your solution to such case?
Answer
Use a hash table instead of a fixed size counter. Imagine allocating a large size array to fit the entire range of unicode characters, which could go up to more than 1 million. A hash table is a more generic solution and could adapt to any range of characters.
투포인터를 알고나면 투포인터가 가장쉽다. 어프로치2
leetcode.com/problems/valid-palindrome/
Approach 1: Compare with Reverse
Intuition
A palindrome is a word, phrase, or sequence that reads the same backwards as forwards. e.g. madam
A palindrome, and its reverse, are identical to each other.
Algorithm
We'll reverse the given string and compare it with the original. If those are equivalent, it's a palindrome.
Since only alphanumeric characters are considered, we'll filter out all other types of characters before we apply our algorithm.
Additionally, because we're treating letters as case-insensitive, we'll convert the remaining letters to lower case. The digits will be left the same.
Complexity Analysis
Time complexity : O(n)O(n), in length nn of the string.
We need to iterate thrice through the string:
Each iteration runs linear in time (since each character operation completes in constant time). Thus, the effective run-time complexity is linear.
Space complexity : O(n)O(n), in length nn of the string. We need O(n)O(n) additional space to stored the filtered string and the reversed string.
Approach 2: Two Pointers
Intuition
If you take any ordinary string, and concatenate its reverse to it, you'll get a palindrome. This leads to an interesting insight about the converse: every palindrome half is reverse of the other half.
Simply speaking, if one were to start in the middle of a palindrome, and traverse outwards, they'd encounter the same characters, in the exact same order, in both halves!
1 / 6
Algorithm
Since the input string contains characters that we need to ignore in our palindromic check, it becomes tedious to figure out the real middle point of our palindromic input.
Instead of going outwards from the middle, we could just go inwards towards the middle!
So, if we start traversing inwards, from both ends of the input string, we can expect to see the same characters, in the same order.
The resulting algorithm is simple:
class Solution {
public boolean isPalindrome(String s) {
for (int i = 0, j = s.length() - 1; i < j; i++, j--) {
while (i < j && !Character.isLetterOrDigit(s.charAt(i))) {
i++;
}
while (i < j && !Character.isLetterOrDigit(s.charAt(j))) {
j--;
}
if (i < j && Character.toLowerCase(s.charAt(i)) != Character.toLowerCase(s.charAt(j)))
return false;
}
return true;
}
}
Complexity Analysis
Time complexity : O(n)O(n), in length nn of the string. We traverse over each character at-most once, until the two pointers meet in the middle, or when we break and return early.
Space complexity : O(1)O(1). No extra space required, at all.
Such a property is formally known as a loop invariant. ↩︎
Such a property is often called a loop termination condition. It is one of several used in this solution. Can you identify the others? ↩︎
leetcode.com/problems/string-to-integer-atoi/
Intuition
The problem looks simple at first glance. However, there are a few edge conditions that we must handle. The basic idea is to scan the string from left to right and build the result for numerical characters. In this process, we must carefully handle the integer overflow and underflow conditions.
Algorithm
We have to build numerical value by repeatedly shifting the result to left by one digit and adding the next number at the unit place. Since numerical value is a decimal number represented as base 10 in the number system, each digit can be expressed as multiples of powers of 10.
Example - "142" can be represented as 1 * (10^2) + 4 * (10^1) + 2 * (10^0)1∗(102)+4∗(101)+2∗(100)
1 is at the hundredth place and left-shifted twice, 4 at tens place and shifted once, and so on.
This can be formulated as :
result * 10 + (str[i] - '0')
However, this could lead to integer overflow or underflow conditions. Since integer must be within the 32-bit signed integer range.
Handling overflow and underflow conditions
If the integer is positive, for 32 bit signed integer, \text{INT\_MAX}INT_MAX is 2147483647 (2^{31}-1)2147483647(231−1) To avoid integer overflow, we must ensure that it doesn't exceed this value. This condition needs to be handled when the result is greater than or equal to \frac{\text{INT\_MAX}}{10}10INT_MAX (214748364)
Case 1). If \text{result} = \frac{\text{INT\_MAX}}{10}result=10INT_MAX, it would result in integer overflow if next integer character is greater than 7. (7 in this case is last digit of \text{INT\_MAX} (2147483647)INT_MAX(2147483647)). We can use \text{INT\_MAX} \% 10INT_MAX%10 to generically represent the last digit.
Case 2). If \text{result} > \frac{\text{INT\_MAX}}{10}result>10INT_MAX, we are sure that adding next number would result in integer overflow.
This holds for negative integer as well. In the 32-bit integer, \text{INT\_MIN}INT_MIN value is -2147483648 (-2^{31})−2147483648(−231). As the last digit is greater than 7 (\text{INT\_MAX} \% 10INT_MAX%10), integer underflow can also be handled using the above approach.
We must return \text{INT\_MAX}INT_MAX in case of integer overflow and \text{INT\_MIN}INT_MIN in case of integer underflow.
Also, it must be noted that in some languages like Python, an integer value is not restricted by the number of bits. Handling of overflow and underflow won't be required in such cases. We could simply check if the value of an integer is out of specified range [{−2}^{31}−231, {2}^{31} − 1231−1]
Steps
The algorithm can be summarised as follows
Note: If there exists any non-integer character at step 3, we return 0 by default since it is not a valid integer. Also, we have to just discard all the characters after the first numerical value.
class Solution {
public int myAtoi(String str) {
int i = 0;
int sign = 1;
int result = 0;
if (str.length() == 0) return 0;
//Discard whitespaces in the beginning
while (i < str.length() && str.charAt(i) == ' ')
i++;
// Check if optional sign if it exists
if (i < str.length() && (str.charAt(i) == '+' || str.charAt(i) == '-'))
sign = (str.charAt(i++) == '-') ? -1 : 1;
// Build the result and check for overflow/underflow condition
while (i < str.length() && str.charAt(i) >= '0' && str.charAt(i) <= '9') {
if (result > Integer.MAX_VALUE / 10 ||
(result == Integer.MAX_VALUE / 10 && str.charAt(i) - '0' > Integer.MAX_VALUE % 10)) {
return (sign == 1) ? Integer.MAX_VALUE : Integer.MIN_VALUE;
}
result = result * 10 + (str.charAt(i++) - '0');
}
return result * sign;
}
}
Complexity Analysis
라빈카프를 꼭 알고 넘어가도록 하자.
leetcode.com/problems/implement-strstr/
Easy
Implement strStr().
Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
Example 1:
Input: haystack = "hello", needle = "ll" Output: 2
Example 2:
Input: haystack = "aaaaa", needle = "bba" Output: -1
Clarification:
What should we return when needle is an empty string? This is a great question to ask during an interview.
For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent to C's strstr() and Java's indexOf().
Constraints:
The problem is to find needle of length L in the haystack of length N.
Let's discuss three different ideas how to proceed. They are all based on sliding window idea. The key point is how to implement a window slice.
Linear time window slice \mathcal{O}(L)O(L) is quite easy, move the window of length L along the haystack and compare substring in the window with the needle. Overall that would result in \mathcal{O}((N - L)L)O((N−L)L) time complexity.
Could that be improved? Yes. Two pointers approach is still the case of linear time slice, though the comparison happens not for all substrings, and that improves the best time complexity up to \mathcal{O}(N)O(N). The worst time complexity is still \mathcal{O}((N - L)L)O((N−L)L) though.
Could that be improved to \mathcal{O}(N)O(N)? Yes, but one has to implement constant time slice \mathcal{O}(1)O(1). There are two ways to do it:
Rabin-Karp = constant-time slice using rolling hash algorithm.
Bit manipulation = constant-time slice using bitmasks.
Bit Manipulation approach in Java is more suitable for the short strings or strings with very limited number of characters, ex. Repeated DNA Sequences. That's a consequence of overflow issues in Java (in Python there is no such a problem). Here we deal with quite long strings and it's more simple to implement the basic version of Rabin Karp algorithm.
Quite straightforward approach - move sliding window along the string and compare substring in the window with the needle.
Implementation
Complexity Analysis
Time complexity: \mathcal{O}((N - L)L)O((N−L)L), where N is a length of haystack and L is a length of needle. We compute a substring of length L in a loop, which is executed (N - L) times.
Space complexity: \mathcal{O}(1)O(1).
Drawback of the previous algorithm is that one compares absolutely all substrings of length L with the needle. There is no need to that.
First, let's compare only substrings which starts from the first character in the needle substring.
Second, let's compare the characters one by one and stop immediately in the case of mismatch.
Here it was impossible to manage the full match up to the length of needle string, which is L = 5. Let's backtrack then. Note, that we move pn pointer back to the position pn = pn - curr_len + 1, and not to the position pn = pn - curr_len, since this last one was already investigated.
Let's try again. Here we've managed to get the full match during the second attempt, so let's return the start position of that match, pn - L.
Algorithm
Move pn till you'll find the first character of the needle string in the haystack.
Compute the max string match by increasing pn, pL and curr_len in the case of equal characters.
If you managed to get the full match, curr_len == L, return the start position of that match, pn - L.
If you didn't, backtrack: pn = pn - curr_len + 1, pL = 0, curr_len = 0.
Implementation
class Solution {
public int strStr(String haystack, String needle) {
int L = needle.length(), n = haystack.length();
if (L == 0) return 0;
int pn = 0;
while (pn < n - L + 1) {
// find the position of the first needle character
// in the haystack string
while (pn < n - L + 1 && haystack.charAt(pn) != needle.charAt(0)) ++pn;
// compute the max match string
int currLen = 0, pL = 0;
while (pL < L && pn < n && haystack.charAt(pn) == needle.charAt(pL)) {
++pn;
++pL;
++currLen;
}
// if the whole needle string is found,
// return its start position
if (currLen == L) return pn - L;
// otherwise, backtrack
pn = pn - currLen + 1;
}
return -1;
}
}
Complexity Analysis
Time complexity: \mathcal{O}((N - L)L)O((N−L)L) in the worst case of numerous almost complete false matches, and \mathcal{O}(N)O(N) in the best case of one single match.
Space complexity: \mathcal{O}(1)O(1).
Let's now design the algorithm with \mathcal{O}(N)O(N) time complexity even in the worst case. The idea is simple: move along the string, generate hash of substring in the sliding window and compare it with the reference hash of the needle string.
There are two technical problems:
How to implement a string slice in a constant time?
How to generate substring hash in a constant time?
String slice in a constant time
Strings are immutable in Java and Python, and to move sliding window in a constant time one has to convert string to another data structure, for example, to integer array of ascii-values.
Rolling hash: hash generation in a constant time
To generate hash of array of length L, one needs \mathcal{O}(L)O(L) time.
How to have constant time of hash generation? Use the advantage of slice: only one integer in, and only one - out.
That's the idea of rolling hash. Here we'll implement the simplest one, polynomial rolling hash. Beware that's polynomial rolling hash is NOT the Rabin fingerprint.
Since one deals here with lowercase English letters, all values in the integer array are between 0 and 25 : arr[i] = (int)S.charAt(i) - (int)'a'.
So one could consider string abcd -> [0, 1, 2, 3] as a number in a numeral system with the base 26. Hence abcd -> [0, 1, 2, 3] could be hashed as
h_0 = 0 \times 26^3 + 1 \times 26^2 + 2 \times 26^1 + 3 \times 26^0h0=0×263+1×262+2×261+3×260
Let's write the same formula in a generalised way, where c_ici is an integer array element and a = 26a=26 is a system base.
h_0 = c_0 a^{L - 1} + c_1 a^{L - 2} + ... + c_i a^{L - 1 - i} + ... + c_{L - 1} a^1 + c_L a^0h0=c0aL−1+c1aL−2+...+ciaL−1−i+...+cL−1a1+cLa0
h_0 = \sum_{i = 0}^{L - 1}{c_i a^{L - 1 - i}}h0=∑i=0L−1ciaL−1−i
Now let's consider the slice abcd -> bcde. For int arrays that means [0, 1, 2, 3] -> [1, 2, 3, 4], to remove number 0 and to add number 4.
h_1 = (h_0 - 0 \times 26^3) \times 26 + 4 \times 26^0h1=(h0−0×263)×26+4×260
In a generalised way
h_1 = (h_0 a - c_0 a^L) + c_{L + 1}h1=(h0a−c0aL)+cL+1
Now hash regeneration is perfect and fits in a constant time. There is one more issue to address: possible overflow problem.
How to avoid overflow
a^LaL could be a large number and hence the idea is to set limits to avoid the overflow. To set limits means to limit a hash by a given number called modulus and use everywhere not hash itself but h % modulus.
It's quite obvious that modulus should be large enough, but how large? Here one could read more about the topic, for the problem here 2^{31}231 is enough.
Algorithm
Compute the hash of substring haystack.substring(0, L) and reference hash of needle.substring(0, L).
Iterate over the start position of possible match: from 1 to N - L.
Compute rolling hash based on the previous hash value.
Return start position if the hash is equal to the reference one.
Return -1, that means that needle is not found.
Implementation
class Solution {
// function to convert character to integer
public int charToInt(int idx, String s) {
return (int)s.charAt(idx) - (int)'a';
}
public int strStr(String haystack, String needle) {
int L = needle.length(), n = haystack.length();
if (L > n) return -1;
// base value for the rolling hash function
int a = 26;
// modulus value for the rolling hash function to avoid overflow
long modulus = (long)Math.pow(2, 31);
// compute the hash of strings haystack[:L], needle[:L]
long h = 0, ref_h = 0;
for (int i = 0; i < L; ++i) {
h = (h * a + charToInt(i, haystack)) % modulus;
ref_h = (ref_h * a + charToInt(i, needle)) % modulus;
}
if (h == ref_h) return 0;
// const value to be used often : a**L % modulus
long aL = 1;
for (int i = 1; i <= L; ++i) aL = (aL * a) % modulus;
for (int start = 1; start < n - L + 1; ++start) {
// compute rolling hash in O(1) time
h = (h * a - charToInt(start - 1, haystack) * aL
+ charToInt(start + L - 1, haystack)) % modulus;
if (h == ref_h) return start;
}
return -1;
}
}
Complexity Analysis
Time complexity: \mathcal{O}(N)O(N), one computes the reference hash of the needle string in \mathcal{O}(L)O(L) time, and then runs a loop of (N - L)(N−L) steps with constant time operations in it.
Space complexity: \mathcal{O}(1)O(1).
라빈카프 알고리즘: inner-game.tistory.com/380
어프로치2에서 포인터 각 위치의 의미 생각해보기
leetcode.com/problems/count-and-say/
broute-force로 짰기 때문에 다시 공부해야함.
Overview
Before we proceed to the solutions, in this section, we would like to rephrase the problem.
Given two adjacent sequences of number, [S_{n}, S_{n+1}][Sn,Sn+1], there exists a pattern that one can produce the sequence S_{n+1}Sn+1 from its previous sequence S_{n}Sn.
More specifically, one can consider the sequence S_{n+1}Sn+1 as a sort of summary to its previous sequence S_{n}Sn, i.e. S_{n+1}Sn+1 contains a list of pairs as |\text{count}, \text{digit}|∣count,digit∣ which encodes all the information about its previous sequence S_{n}Sn.
Let us take the sequence S_{4} = 1211S4=1211 as an example, from left to right, we then can divide the sequence into three sub-groups where each sub-group contains a list of identical and adjacent digit, i.e. S_{4} = \{1\}\{2\}\{11\}S4={1}{2}{11} , as shown in the following:
We then count the number of digits in each sub-group, and then output the summary in the format of |\text{count}, \text{digit}|∣count,digit∣. As the end, we would obtain the exact sequence of S_5S5.
With the generated sequence S_5S5, we then recursively apply the above rule to generate the next sequence.
Now that the description of the problem is clear, one might dismiss it as yet another strange and artificial problem to solve. Well, it is not true in this case.
Actually, we could consider this problem as a naive compression algorithm for a sequence of numbers. Instead of storing repetitive adjacent digits as they are, we could summarize them a bit with the method presented in the problem, which could save us some space as long as there are indeed repetitive occurring patterns.
Intuition
Now that the problem has been clarified, the solution should be intuitive.
Following the rule as we described above, in order to generate the next sequence, we could scan the current sequence with a sort of sliding window which would hold the identical and adjacent digits. With the sliding window, we would divide the original sequence into a list of sub-sequences. We then count the number of digits within each sub-sequence and output the summary as pairs of |\text{count}, \text{digit}|∣count,digit∣.
Algorithm
Here we define a function nextSequence() to generate a following sequence from a previous sequence, and we recursively call this function to get the desired sequence that is located at a specific index.
class Solution {
public String countAndSay(int n) {
LinkedList<Integer> prevSeq = new LinkedList<Integer>();
prevSeq.add(1);
// Using -1 as the delimiter
prevSeq.add(-1);
List<Integer> finalSeq = this.nextSequence(n, prevSeq);
StringBuffer seqStr = new StringBuffer();
for (Integer digit : finalSeq) {
seqStr.append(String.valueOf(digit));
}
return seqStr.toString();
}
protected LinkedList<Integer> nextSequence(int n, LinkedList<Integer> prevSeq) {
if (n <= 1) {
// remove the delimiter before return
prevSeq.pollLast();
return prevSeq;
}
LinkedList<Integer> nextSeq = new LinkedList<Integer>();
Integer prevDigit = null;
Integer digitCnt = 0;
for (Integer digit : prevSeq) {
if (prevDigit == null) {
prevDigit = digit;
digitCnt += 1;
} else if (digit == prevDigit) {
// in the middle of the sub-sequence
digitCnt += 1;
} else {
// end of a sub-sequence
nextSeq.add(digitCnt);
nextSeq.add(prevDigit);
// reset for the next sub-sequence
prevDigit = digit;
digitCnt = 1;
}
}
// add the delimiter for the next recursion
nextSeq.add(-1);
return this.nextSequence(n - 1, nextSeq);
}
}
Complexity
Intuition
This problem could be a good exercise to apply pattern matching, where in our case we need to find out all those repetitive groups of digits.
A regular expression (a.k.a regex) is a sequence of characters that defines a search pattern. The regex serves as a common tool for many pattern matching problems. And many programming languages provides regex capacities either with build-in constructs or via some libraries.
Note that, although the syntax of regex is mostly universal across all programming languages, there might exist some subtile differences. Here we show two examples in Java and Python respectively.
Java: regex = "(.)\1*"
We could break down the above regex expression into three 3 parts:
"(.)": it defines a group that contains a single character that could be of anything.
"\1": this part refers to the defined group with the index of 1.
"*": this qualifier followed by the group reference \1, indicates that we would like to see the group repeats itself zero or more times.
Python: regex = "((.)*)"
Slightly different than the above regex in Java, here we define two groups instead of one.
"(.)": again, this is a group that contains a single character that could be of anything.
"": this part refers to the second group (i.e. (.)) that we define.
"((.)*)": the outer bracket defines the scope of the first group, which contains the repetitive appearance of the second group above.
Algorithm
Based on the above definitions of regex, we then find all the matches to the regex and then concatenate the results together.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Solution {
public String countAndSay(int n) {
String currSeq = "1";
// Pattern to match the repetitive digits
String regexPattern = "(.)\\1*";
Pattern pattern = Pattern.compile(regexPattern);
for (int i = 1; i < n; ++i) {
Matcher m = pattern.matcher(currSeq);
StringBuffer nextSeq = new StringBuffer();
// each group contains identical and adjacent digits
while (m.find()) {
nextSeq.append(m.group().length() + String.valueOf(m.group().charAt(0)));
}
// prepare for the next iteration
currSeq = nextSeq.toString();
}
return currSeq;
}
}
Complexity
leetcode.com/problems/longest-common-prefix/
가장 쉬운 1번 어프로치부터 마지막 트라이까지해서 별로 나아지는게 없는데 어떤 답이 가장 좋은건지..
머리가 맑을때 다시한번 들여다 봐야겠다.
Intuition
For a start we will describe a simple way of finding the longest prefix shared by a set of strings LCP(S_1 \ldots S_n)LCP(S1…Sn). We will use the observation that :
LCP(S_1 \ldots S_n) = LCP(LCP(LCP(S_1, S_2),S_3),\ldots S_n)LCP(S1…Sn)=LCP(LCP(LCP(S1,S2),S3),…Sn)
Algorithm
To employ this idea, the algorithm iterates through the strings [S_1 \ldots S_n][S1…Sn], finding at each iteration ii the longest common prefix of strings LCP(S_1 \ldots S_i)LCP(S1…Si) When LCP(S_1 \ldots S_i)LCP(S1…Si) is an empty string, the algorithm ends. Otherwise after nn iterations, the algorithm returns LCP(S_1 \ldots S_n)LCP(S1…Sn).
Figure 1. Finding the longest common prefix (Horizontal scanning)
public String longestCommonPrefix(String[] strs) {
if (strs.length == 0) return "";
String prefix = strs[0];
for (int i = 1; i < strs.length; i++)
while (strs[i].indexOf(prefix) != 0) {
prefix = prefix.substring(0, prefix.length() - 1);
if (prefix.isEmpty()) return "";
}
return prefix;
}
Complexity Analysis
Time complexity : O(S) , where S is the sum of all characters in all strings.
In the worst case all nn strings are the same. The algorithm compares the string S1S1 with the other strings [S_2 \ldots S_n][S2…Sn] There are SS character comparisons, where SS is the sum of all characters in the input array.
Space complexity : O(1)O(1). We only used constant extra space.
Algorithm
Imagine a very short string is at the end of the array. The above approach will still do SS comparisons. One way to optimize this case is to do vertical scanning. We compare characters from top to bottom on the same column (same character index of the strings) before moving on to the next column.
public String longestCommonPrefix(String[] strs) {
if (strs == null || strs.length == 0) return "";
for (int i = 0; i < strs[0].length() ; i++){
char c = strs[0].charAt(i);
for (int j = 1; j < strs.length; j ++) {
if (i == strs[j].length() || strs[j].charAt(i) != c)
return strs[0].substring(0, i);
}
}
return strs[0];
}
Complexity Analysis
Intuition
The idea of the algorithm comes from the associative property of LCP operation. We notice that : LCP(S_1 \ldots S_n) = LCP(LCP(S_1 \ldots S_k), LCP (S_{k+1} \ldots S_n))LCP(S1…Sn)=LCP(LCP(S1…Sk),LCP(Sk+1…Sn)) , where LCP(S_1 \ldots S_n)LCP(S1…Sn) is the longest common prefix in set of strings [S_1 \ldots S_n][S1…Sn] , 1 < k < n1<k<n
Algorithm
To apply the observation above, we use divide and conquer technique, where we split the LCP(S_i \ldots S_j)LCP(Si…Sj) problem into two subproblems LCP(S_i \ldots S_{mid})LCP(Si…Smid) and LCP(S_{mid+1} \ldots S_j)LCP(Smid+1…Sj), where mid is \frac{i + j}{2}2i+j. We use their solutions lcpLeft and lcpRight to construct the solution of the main problem LCP(S_i \ldots S_j)LCP(Si…Sj). To accomplish this we compare one by one the characters of lcpLeft and lcpRight till there is no character match. The found common prefix of lcpLeft and lcpRight is the solution of the LCP(S_i \ldots S_j)LCP(Si…Sj).
Figure 2. Finding the longest common prefix of strings using divide and conquer technique
public String longestCommonPrefix(String[] strs) {
if (strs == null || strs.length == 0) return "";
return longestCommonPrefix(strs, 0 , strs.length - 1);
}
private String longestCommonPrefix(String[] strs, int l, int r) {
if (l == r) {
return strs[l];
}
else {
int mid = (l + r)/2;
String lcpLeft = longestCommonPrefix(strs, l , mid);
String lcpRight = longestCommonPrefix(strs, mid + 1,r);
return commonPrefix(lcpLeft, lcpRight);
}
}
String commonPrefix(String left,String right) {
int min = Math.min(left.length(), right.length());
for (int i = 0; i < min; i++) {
if ( left.charAt(i) != right.charAt(i) )
return left.substring(0, i);
}
return left.substring(0, min);
}
Complexity Analysis
In the worst case we have nn equal strings with length mm
Time complexity : O(S)O(S), where SS is the number of all characters in the array, S = m \cdot nS=m⋅n Time complexity is 2 \cdot T\left ( \frac{n}{2} \right ) + O(m)2⋅T(2n)+O(m). Therefore time complexity is O(S)O(S). In the best case this algorithm performs O(minLen \cdot n)O(minLen⋅n) comparisons, where minLenminLen is the shortest string of the array
Space complexity : O(m \cdot \log n)O(m⋅logn)
There is a memory overhead since we store recursive calls in the execution stack. There are \log nlogn recursive calls, each store need mm space to store the result, so space complexity is O(m \cdot \log n)O(m⋅logn)
The idea is to apply binary search method to find the string with maximum value L, which is common prefix of all of the strings. The algorithm searches space is the interval (0 \ldots minLen)(0…minLen), where minLen is minimum string length and the maximum possible common prefix. Each time search space is divided in two equal parts, one of them is discarded, because it is sure that it doesn't contain the solution. There are two possible cases:
Figure 3. Finding the longest common prefix of strings using binary search technique
public String longestCommonPrefix(String[] strs) {
if (strs == null || strs.length == 0)
return "";
int minLen = Integer.MAX_VALUE;
for (String str : strs)
minLen = Math.min(minLen, str.length());
int low = 1;
int high = minLen;
while (low <= high) {
int middle = (low + high) / 2;
if (isCommonPrefix(strs, middle))
low = middle + 1;
else
high = middle - 1;
}
return strs[0].substring(0, (low + high) / 2);
}
private boolean isCommonPrefix(String[] strs, int len){
String str1 = strs[0].substring(0,len);
for (int i = 1; i < strs.length; i++)
if (!strs[i].startsWith(str1))
return false;
return true;
}
Complexity Analysis
In the worst case we have nn equal strings with length mm
Time complexity : O(S \cdot \log m)O(S⋅logm), where SS is the sum of all characters in all strings.
The algorithm makes \log mlogm iterations, for each of them there are S = m \cdot nS=m⋅n comparisons, which gives in total O(S \cdot \log m)O(S⋅logm) time complexity.
Space complexity : O(1)O(1). We only used constant extra space.
Let's take a look at a slightly different problem:
Given a set of keys S = [S_1,S_2 \ldots S_n][S1,S2…Sn], find the longest common prefix among a string q and S. This LCP query will be called frequently.
We could optimize LCP queries by storing the set of keys S in a Trie. For more information about Trie, please see this article Implement a trie (Prefix trie). In a Trie, each node descending from the root represents a common prefix of some keys. But we need to find the longest common prefix of a string q and all key strings. This means that we have to find the deepest path from the root, which satisfies the following conditions:
Algorithm
The only question left, is how to find the deepest path in the Trie, that fulfills the requirements above. The most effective way is to build a trie from [S_1 \ldots S_n][S1…Sn] strings. Then find the prefix of query string q in the Trie. We traverse the Trie from the root, till it is impossible to continue the path in the Trie because one of the conditions above is not satisfied.
Figure 4. Finding the longest common prefix of strings using Trie
public String longestCommonPrefix(String q, String[] strs) {
if (strs == null || strs.length == 0)
return "";
if (strs.length == 1)
return strs[0];
Trie trie = new Trie();
for (int i = 1; i < strs.length ; i++) {
trie.insert(strs[i]);
}
return trie.searchLongestPrefix(q);
}
class TrieNode {
// R links to node children
private TrieNode[] links;
private final int R = 26;
private boolean isEnd;
// number of children non null links
private int size;
public void put(char ch, TrieNode node) {
links[ch -'a'] = node;
size++;
}
public int getLinks() {
return size;
}
//assume methods containsKey, isEnd, get, put are implemented as it is described
//in https://leetcode.com/articles/implement-trie-prefix-tree/)
}
public class Trie {
private TrieNode root;
public Trie() {
root = new TrieNode();
}
//assume methods insert, search, searchPrefix are implemented as it is described
//in https://leetcode.com/articles/implement-trie-prefix-tree/)
private String searchLongestPrefix(String word) {
TrieNode node = root;
StringBuilder prefix = new StringBuilder();
for (int i = 0; i < word.length(); i++) {
char curLetter = word.charAt(i);
if (node.containsKey(curLetter) && (node.getLinks() == 1) && (!node.isEnd())) {
prefix.append(curLetter);
node = node.get(curLetter);
}
else
return prefix.toString();
}
return prefix.toString();
}
}
Complexity Analysis
In the worst case query qq has length mm and it is equal to all nn strings of the array.
Time complexity : preprocessing O(S), where SS is the number of all characters in the array, LCP query O(m)O(m).
Trie build has O(S)O(S) time complexity. To find the common prefix of qq in the Trie takes in the worst case O(m)O(m).
Space complexity : O(S). We only used additional SS extra space for the Trie.
Array - Top Interview Questions[EASY] (1/9) : Java
String - Top Interview Questions[EASY] (2/9) : Java
Linked List - Top Interview Questions[EASY] (3/9) : Java
Trees - Top Interview Questions[EASY] (4/9) : Java
Sorting and Searching - Top Interview Questions[EASY] (5/9) : Java
Dynamic Programming - Top Interview Questions[EASY] (6/9) : Java
Design - Top Interview Questions[EASY] (7/9) - Java
Array - Top Interview Questions[EASY] (1/9) : Python
String - Top Interview Questions[EASY] (2/9) : Python
Linked List - Top Interview Questions[EASY] (3/9) : Python
Trees - Top Interview Questions[EASY] (4/9) : Python
Sorting and Searching - Top Interview Questions[EASY] (5/9) : Python
Dynamic Programming - Top Interview Questions[EASY] (6/9) : Python
Design - Top Interview Questions[EASY] (7/9) - Python