Leetcode - 30. 串联所有单词的子串
给定一个字符串
s和一些长度相同的单词words。找出s中恰好可以由words中所有单词串联形成的子串的起始位置。
注意子串要与words中的单词完全匹配,中间不能有其他字符,但不需要考虑words中单词串联的顺序。
示例 1:
输入:s = "barfoothefoobarman", words = ["foo","bar"]
输出:[0,9]
解释:
从索引 0 和 9 开始的子串分别是 "barfoo" 和 "foobar" 。
输出的顺序不重要, [9,0] 也是有效答案。
示例 2:
输入:s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
输出:[]
示例 3:
输入:s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
输出:[6,9,12]
提示:
- 1 <= s.length <= 104
- s 由小写英文字母组成
- 1 <= words.length <= 5000
- 1 <= words[i].length <= 30
- words[i] 由小写英文字母组成
来源:力扣(LeetCode)
链接:https://leetcode-cn.com/problems/substring-with-concatenation-of-all-words
著作权归领扣网络所有。商业转载请联系官方授权,非商业转载请注明出处。
(超时)解1 2021/8/30 O(?)
import re
from datetime import datetime
from itertools import permutations
def findSubstring(s: str, words: list) -> list:
# 用permutations得到所有组合情况,保存到set,避免words中有重复str导致permutations得到的内容重复
# 然后用find依次找set中的每个元素
needles=set()
res=[]
### 错误 - 2
#len_needle=0
### 错误 - 2
len=s.__len__()
for p in permutations(words):
tmp=''
for x in p:
tmp+=x
needles.add(tmp)
### 错误 - 2
#if len_needle==0: len_needle=tmp.__len__()
### 错误 - 2
'''
for x in needles:
i=0
while i<len:
idx=s[i:].find(x)
if idx!=-1:
### 错误 - 1
#res.append(idx)
res.append(idx+i)
### 错误 - 1
### 错误 - 2
#i+=idx+len_needle
i+=idx+1
### 错误 - 2
else: break
'''
return res
if __name__ == '__main__':
'''
print(findSubstring('barfoothefoobarman',["foo","bar"]))
print(findSubstring('wordgoodgoodgoodbestword',["word","good","best","word"]))
print(findSubstring('barfoofoobarthefoobarman',["bar","foo","the"]))
print(findSubstring('bar',["bar"]))
print(findSubstring('foo',["bar"]))
### 错误
# 1, [0,3,6]
print(findSubstring("foobarfoobar",["foo","bar"]))
# 2, [0,1]
# 这样一来,就不能一步跳一个needle的长度了,只能1步1步走
print(findSubstring("aaa",["a","a"]))
'''
### 超时
# 1,
# 内存跑掉了8个G,跑了1分钟还没出结果
# 其中permutation耗时就已经巨大,指数级上升
t1=datetime.today()
print(findSubstring("pjzkrkevzztxductzzxmxsvwjkxpvukmfjywwetvfnujhweiybwvvsrfequzkhossmootkmyxgjgfordrpapjuunmqnxxdrqrfgkrsjqbszgiqlcfnrpjlcwdrvbumtotzylshdvccdmsqoadfrpsvnwpizlwszrtyclhgilklydbmfhuywotjmktnwrfvizvnmfvvqfiokkdprznnnjycttprkxpuykhmpchiksyucbmtabiqkisgbhxngmhezrrqvayfsxauampdpxtafniiwfvdufhtwajrbkxtjzqjnfocdhekumttuqwovfjrgulhekcpjszyynadxhnttgmnxkduqmmyhzfnjhducesctufqbumxbamalqudeibljgbspeotkgvddcwgxidaiqcvgwykhbysjzlzfbupkqunuqtraxrlptivshhbihtsigtpipguhbhctcvubnhqipncyxfjebdnjyetnlnvmuxhzsdahkrscewabejifmxombiamxvauuitoltyymsarqcuuoezcbqpdaprxmsrickwpgwpsoplhugbikbkotzrtqkscekkgwjycfnvwfgdzogjzjvpcvixnsqsxacfwndzvrwrycwxrcismdhqapoojegggkocyrdtkzmiekhxoppctytvphjynrhtcvxcobxbcjjivtfjiwmduhzjokkbctweqtigwfhzorjlkpuuliaipbtfldinyetoybvugevwvhhhweejogrghllsouipabfafcxnhukcbtmxzshoyyufjhzadhrelweszbfgwpkzlwxkogyogutscvuhcllphshivnoteztpxsaoaacgxyaztuixhunrowzljqfqrahosheukhahhbiaxqzfmmwcjxountkevsvpbzjnilwpoermxrtlfroqoclexxisrdhvfsindffslyekrzwzqkpeocilatftymodgztjgybtyheqgcpwogdcjlnlesefgvimwbxcbzvaibspdjnrpqtyeilkcspknyylbwndvkffmzuriilxagyerjptbgeqgebiaqnvdubrtxibhvakcyotkfonmseszhczapxdlauexehhaireihxsplgdgmxfvaevrbadbwjbdrkfbbjjkgcztkcbwagtcnrtqryuqixtzhaakjlurnumzyovawrcjiwabuwretmdamfkxrgqgcdgbrdbnugzecbgyxxdqmisaqcyjkqrntxqmdrczxbebemcblftxplafnyoxqimkhcykwamvdsxjezkpgdpvopddptdfbprjustquhlazkjfluxrzopqdstulybnqvyknrchbphcarknnhhovweaqawdyxsqsqahkepluypwrzjegqtdoxfgzdkydeoxvrfhxusrujnmjzqrrlxglcmkiykldbiasnhrjbjekystzilrwkzhontwmehrfsrzfaqrbbxncphbzuuxeteshyrveamjsfiaharkcqxefghgceeixkdgkuboupxnwhnfigpkwnqdvzlydpidcljmflbccarbiegsmweklwngvygbqpescpeichmfidgsjmkvkofvkuehsmkkbocgejoiqcnafvuokelwuqsgkyoekaroptuvekfvmtxtqshcwsztkrzwrpabqrrhnlerxjojemcxel",
["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg", "ybty", "heqg", "cpwo", "gdcj","lnle", "sefg", "vimw", "bxcb"]))
#["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila"])) # 5微秒
#["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg"])) # 5.3s
#["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg", "ybty", "heqg", "cpwo", "gdcj","lnle", "sefg", "vimw", "bxcb"]))
t2=datetime.today()
print(t2-t1)
'''
t1=datetime.today()
# 除了permutation,在查找这块,
# 换个思路,其实每个word,在s中的idxs都可以找出来,即使像上面的找法,找这些idxs就可以了
# step=1的步进的找应该是耗时的主要原因
# 像下面的找,几乎没有耗时
# 另外,总的解数,一定是所有word idxs中的最小值,比如’dhvf‘idxs只有1个,其他的word的idxs>=1,那么解最多只有1个,那么s[dhvf's idx:]是就是,不是就不是
# 所以,我们不妨:
# 1)找出每个word的idxs,同时计算最小个数(这个最小个数也是解的可能最大个数)
# 2)只在这些idxs切片中找答案
stest="pjzkrkevzztxductzzxmxsvwjkxpvukmfjywwetvfnujhweiybwvvsrfequzkhossmootkmyxgjgfordrpapjuunmqnxxdrqrfgkrsjqbszgiqlcfnrpjlcwdrvbumtotzylshdvccdmsqoadfrpsvnwpizlwszrtyclhgilklydbmfhuywotjmktnwrfvizvnmfvvqfiokkdprznnnjycttprkxpuykhmpchiksyucbmtabiqkisgbhxngmhezrrqvayfsxauampdpxtafniiwfvdufhtwajrbkxtjzqjnfocdhekumttuqwovfjrgulhekcpjszyynadxhnttgmnxkduqmmyhzfnjhducesctufqbumxbamalqudeibljgbspeotkgvddcwgxidaiqcvgwykhbysjzlzfbupkqunuqtraxrlptivshhbihtsigtpipguhbhctcvubnhqipncyxfjebdnjyetnlnvmuxhzsdahkrscewabejifmxombiamxvauuitoltyymsarqcuuoezcbqpdaprxmsrickwpgwpsoplhugbikbkotzrtqkscekkgwjycfnvwfgdzogjzjvpcvixnsqsxacfwndzvrwrycwxrcismdhqapoojegggkocyrdtkzmiekhxoppctytvphjynrhtcvxcobxbcjjivtfjiwmduhzjokkbctweqtigwfhzorjlkpuuliaipbtfldinyetoybvugevwvhhhweejogrghllsouipabfafcxnhukcbtmxzshoyyufjhzadhrelweszbfgwpkzlwxkogyogutscvuhcllphshivnoteztpxsaoaacgxyaztuixhunrowzljqfqrahosheukhahhbiaxqzfmmwcjxountkevsvpbzjnilwpoermxrtlfroqoclexxisrdhvfsindffslyekrzwzqkpeocilatftymodgztjgybtyheqgcpwogdcjlnlesefgvimwbxcbzvaibspdjnrpqtyeilkcspknyylbwndvkffmzuriilxagyerjptbgeqgebiaqnvdubrtxibhvakcyotkfonmseszhczapxdlauexehhaireihxsplgdgmxfvaevrbadbwjbdrkfbbjjkgcztkcbwagtcnrtqryuqixtzhaakjlurnumzyovawrcjiwabuwretmdamfkxrgqgcdgbrdbnugzecbgyxxdqmisaqcyjkqrntxqmdrczxbebemcblftxplafnyoxqimkhcykwamvdsxjezkpgdpvopddptdfbprjustquhlazkjfluxrzopqdstulybnqvyknrchbphcarknnhhovweaqawdyxsqsqahkepluypwrzjegqtdoxfgzdkydeoxvrfhxusrujnmjzqrrlxglcmkiykldbiasnhrjbjekystzilrwkzhontwmehrfsrzfaqrbbxncphbzuuxeteshyrveamjsfiaharkcqxefghgceeixkdgkuboupxnwhnfigpkwnqdvzlydpidcljmflbccarbiegsmweklwngvygbqpescpeichmfidgsjmkvkofvkuehsmkkbocgejoiqcnafvuokelwuqsgkyoekaroptuvekfvmtxtqshcwsztkrzwrpabqrrhnlerxjojemcxel"
pat=re.compile("dhvf")
print(pat.findall(stest))
print(stest.find('dhvf'))
t2=datetime.today()
print(t2-t1)
# 另外,有一个资源被浪费了,
# —— 每个word是等长的,上面的解法是不针对word长度的,这个等长一定是关键一环
'''


浙公网安备 33010602011771号