Welcome to another problem-solving adventure! Today, we will delve into the Word Break problem, a common coding challenge that appears frequently in technical interviews. We will explore the problem, understand it in depth, and then tackle it with efficient algorithms. By the end of this article, you will have a solid understanding of how to approach and solve this problem using dynamic programming and memoization techniques in Python.
If you’re just joining us, it might be helpful to catch up on the previous entries. Want to see more problem-solving techniques? Follow me here on my blog!
Understanding the Problem
The Word Break problem is defined as follows: Given a string s and a dictionary of strings wordDict, determine if s can be segmented into a space-separated sequence of one or more dictionary words. A word in the dictionary can be reused multiple times in the segmentation.
Let’s look at some examples to make it clearer:
Example 1:
Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".
Example 2:
Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
Note that you are allowed to reuse a dictionary word.
Example 3:
Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false
Using Efficient Approach (Dynamic Programming)
Dynamic programming is a method used to solve problems by breaking them down into simpler subproblems and storing the results of these subproblems to avoid redundant calculations. In the context of the Word Break problem, we can use a boolean array dp where dp[i]
indicates whether the substring s[0:i]
can be segmented into dictionary words.
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
n = len(s)
dp = [False] * (n + 1)
dp[0] = True
for i in range(1, n + 1):
for j in range(i):
if dp[j] and s[j:i] in wordDict:
dp[i] = True
break
return dp[n]
Explanation
- Initialization: We initialize a boolean array
dp
of lengthn + 1
(wheren
is the length of the strings
) with all values set toFalse
, exceptdp[0]
which is set toTrue
. This is because an empty string can always be segmented trivially. - Nested Loops: We iterate through the string using two nested loops. The outer loop variable
i
represents the end index of the substring, and the inner loop variablej
represents the start index. - Checking Substrings: For each pair
(i, j)
, we check if the substrings[j:i]
is in the word dictionary and ifdp[j]
isTrue
. If both conditions are met, it means the substrings[0:i]
can be segmented, so we setdp[i]
toTrue
and break out of the inner loop. - Result: Finally, we return
dp[n]
which indicates whether the entire strings
can be segmented into dictionary words.
Time Complexity: O(n^2 * k)
, where n
is the length of the string s
and k
is the maximum length of words in the dictionary. The nested loops contribute O(n^2)
, and checking if a substring is in the dictionary takes O(k)
.
Space Complexity: O(n)
, due to the space required for the dp
array.
Using Efficient Approach 2 (Memoization)
Memoization is another optimization technique where we store the results of expensive function calls and reuse them when the same inputs occur again. For the Word Break problem, we can use a dictionary to memoize the results of subproblems.
def wordBreak(self, s: str, wordDict: List[str]) -> bool:
memo = {}
def rec(s):
if s in memo:
return memo[s]
if not s:
return True
for word in wordDict:
if s.startswith(word):
memo[s] = rec(s[len(word):])
if memo[s]:
return True
if s not in memo:
memo[s] = False
return memo[s]
return rec(s)
Explanation
- Memo Initialization: We initialize an empty dictionary
memo
to store the results of subproblems. - Recursive Function: We define a recursive function
rec(s)
that checks if the strings
can be segmented. - Base Case: If the string
s
is empty, we returnTrue
because an empty string can be segmented trivially. - Memo Check: Before processing, we check if the result for the current string
s
is already memoized. If it is, we return the memoized result. - Word Checking: We iterate through each word in the dictionary and check if the string
s
starts with that word. If it does, we recursively check the remaining substrings[len(word):]
. - Memoization: If a valid segmentation is found, we memoize the result as
True
. If no valid segmentation is found after checking all words, we memoize the result asFalse
. - Result: Finally, we return the result of the recursive function for the entire string
s
.
Time Complexity: O(n^2 * k)
, where n
is the length of the string s
and k
is the number of words in the dictionary. The recursion with memoization ensures that each substring is processed at most once.
Space Complexity: O(n)
, due to the space required for the memoization dictionary and the recursion stack.
Conclusion In this article, we explored the Word Break problem and solved it using dynamic programming and memoization techniques. Both approaches optimize the process of checking substrings and avoid redundant calculations, making them efficient solutions for this problem.
Each approach has its trade-offs in terms of time and space complexity, and choosing the right approach depends on the constraints of the problem and the size of the input.
Found this helpful? Follow me for more leetcode adventures! Questions? React out via email.