Loading

Leetcode - 30. 串联所有单词的子串

给定一个字符串s和一些长度相同的单词words。找出s中恰好可以由words中所有单词串联形成的子串的起始位置。
注意子串要与words中的单词完全匹配,中间不能有其他字符,但不需要考虑words中单词串联的顺序。

示例 1:

输入:s = "barfoothefoobarman", words = ["foo","bar"]
输出:[0,9]
解释:
从索引 0 和 9 开始的子串分别是 "barfoo" 和 "foobar" 。
输出的顺序不重要, [9,0] 也是有效答案。

示例 2:

输入:s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
输出:[]

示例 3:

输入:s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
输出:[6,9,12]

提示:

  • 1 <= s.length <= 104
  • s 由小写英文字母组成
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • words[i] 由小写英文字母组成

来源:力扣(LeetCode)
链接:https://leetcode-cn.com/problems/substring-with-concatenation-of-all-words
著作权归领扣网络所有。商业转载请联系官方授权,非商业转载请注明出处。

(超时)解1 2021/8/30 O(?)

import re
from datetime import datetime
from itertools import permutations

def findSubstring(s: str, words: list) -> list:
    # 用permutations得到所有组合情况,保存到set,避免words中有重复str导致permutations得到的内容重复
    # 然后用find依次找set中的每个元素
    needles=set()
    res=[]
    ### 错误 - 2
    #len_needle=0
    ### 错误 - 2
    len=s.__len__()
    for p in permutations(words):
        tmp=''
        for x in p:
            tmp+=x
        needles.add(tmp)
        ### 错误 - 2
        #if len_needle==0: len_needle=tmp.__len__()
        ### 错误 - 2
    '''
    for x in needles:
        i=0
        while i<len:
            idx=s[i:].find(x)
            if idx!=-1:
                ### 错误 - 1
                #res.append(idx)
                res.append(idx+i)
                ### 错误 - 1
                ### 错误 - 2
                #i+=idx+len_needle
                i+=idx+1
                ### 错误 - 2
            else: break
    '''
    return res

if __name__ == '__main__':
    '''
    print(findSubstring('barfoothefoobarman',["foo","bar"]))
    print(findSubstring('wordgoodgoodgoodbestword',["word","good","best","word"]))
    print(findSubstring('barfoofoobarthefoobarman',["bar","foo","the"]))
    print(findSubstring('bar',["bar"]))
    print(findSubstring('foo',["bar"]))
    ### 错误
    # 1, [0,3,6]
    print(findSubstring("foobarfoobar",["foo","bar"]))
    # 2, [0,1]
    # 这样一来,就不能一步跳一个needle的长度了,只能1步1步走
    print(findSubstring("aaa",["a","a"]))
    '''
    ### 超时
    # 1,
    # 内存跑掉了8个G,跑了1分钟还没出结果
    # 其中permutation耗时就已经巨大,指数级上升
    t1=datetime.today()
    print(findSubstring("pjzkrkevzztxductzzxmxsvwjkxpvukmfjywwetvfnujhweiybwvvsrfequzkhossmootkmyxgjgfordrpapjuunmqnxxdrqrfgkrsjqbszgiqlcfnrpjlcwdrvbumtotzylshdvccdmsqoadfrpsvnwpizlwszrtyclhgilklydbmfhuywotjmktnwrfvizvnmfvvqfiokkdprznnnjycttprkxpuykhmpchiksyucbmtabiqkisgbhxngmhezrrqvayfsxauampdpxtafniiwfvdufhtwajrbkxtjzqjnfocdhekumttuqwovfjrgulhekcpjszyynadxhnttgmnxkduqmmyhzfnjhducesctufqbumxbamalqudeibljgbspeotkgvddcwgxidaiqcvgwykhbysjzlzfbupkqunuqtraxrlptivshhbihtsigtpipguhbhctcvubnhqipncyxfjebdnjyetnlnvmuxhzsdahkrscewabejifmxombiamxvauuitoltyymsarqcuuoezcbqpdaprxmsrickwpgwpsoplhugbikbkotzrtqkscekkgwjycfnvwfgdzogjzjvpcvixnsqsxacfwndzvrwrycwxrcismdhqapoojegggkocyrdtkzmiekhxoppctytvphjynrhtcvxcobxbcjjivtfjiwmduhzjokkbctweqtigwfhzorjlkpuuliaipbtfldinyetoybvugevwvhhhweejogrghllsouipabfafcxnhukcbtmxzshoyyufjhzadhrelweszbfgwpkzlwxkogyogutscvuhcllphshivnoteztpxsaoaacgxyaztuixhunrowzljqfqrahosheukhahhbiaxqzfmmwcjxountkevsvpbzjnilwpoermxrtlfroqoclexxisrdhvfsindffslyekrzwzqkpeocilatftymodgztjgybtyheqgcpwogdcjlnlesefgvimwbxcbzvaibspdjnrpqtyeilkcspknyylbwndvkffmzuriilxagyerjptbgeqgebiaqnvdubrtxibhvakcyotkfonmseszhczapxdlauexehhaireihxsplgdgmxfvaevrbadbwjbdrkfbbjjkgcztkcbwagtcnrtqryuqixtzhaakjlurnumzyovawrcjiwabuwretmdamfkxrgqgcdgbrdbnugzecbgyxxdqmisaqcyjkqrntxqmdrczxbebemcblftxplafnyoxqimkhcykwamvdsxjezkpgdpvopddptdfbprjustquhlazkjfluxrzopqdstulybnqvyknrchbphcarknnhhovweaqawdyxsqsqahkepluypwrzjegqtdoxfgzdkydeoxvrfhxusrujnmjzqrrlxglcmkiykldbiasnhrjbjekystzilrwkzhontwmehrfsrzfaqrbbxncphbzuuxeteshyrveamjsfiaharkcqxefghgceeixkdgkuboupxnwhnfigpkwnqdvzlydpidcljmflbccarbiegsmweklwngvygbqpescpeichmfidgsjmkvkofvkuehsmkkbocgejoiqcnafvuokelwuqsgkyoekaroptuvekfvmtxtqshcwsztkrzwrpabqrrhnlerxjojemcxel",
                        ["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg", "ybty", "heqg", "cpwo", "gdcj","lnle", "sefg", "vimw", "bxcb"]))
    #["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila"])) # 5微秒
    #["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg"])) # 5.3s
    #["dhvf", "sind", "ffsl", "yekr", "zwzq", "kpeo", "cila", "tfty", "modg", "ztjg", "ybty", "heqg", "cpwo", "gdcj","lnle", "sefg", "vimw", "bxcb"]))
    t2=datetime.today()
    print(t2-t1)
    '''
    t1=datetime.today()
    # 除了permutation,在查找这块,
    # 换个思路,其实每个word,在s中的idxs都可以找出来,即使像上面的找法,找这些idxs就可以了
    # step=1的步进的找应该是耗时的主要原因
    # 像下面的找,几乎没有耗时
    # 另外,总的解数,一定是所有word idxs中的最小值,比如’dhvf‘idxs只有1个,其他的word的idxs>=1,那么解最多只有1个,那么s[dhvf's idx:]是就是,不是就不是
    # 所以,我们不妨:
    #       1)找出每个word的idxs,同时计算最小个数(这个最小个数也是解的可能最大个数)
    #       2)只在这些idxs切片中找答案
    stest="pjzkrkevzztxductzzxmxsvwjkxpvukmfjywwetvfnujhweiybwvvsrfequzkhossmootkmyxgjgfordrpapjuunmqnxxdrqrfgkrsjqbszgiqlcfnrpjlcwdrvbumtotzylshdvccdmsqoadfrpsvnwpizlwszrtyclhgilklydbmfhuywotjmktnwrfvizvnmfvvqfiokkdprznnnjycttprkxpuykhmpchiksyucbmtabiqkisgbhxngmhezrrqvayfsxauampdpxtafniiwfvdufhtwajrbkxtjzqjnfocdhekumttuqwovfjrgulhekcpjszyynadxhnttgmnxkduqmmyhzfnjhducesctufqbumxbamalqudeibljgbspeotkgvddcwgxidaiqcvgwykhbysjzlzfbupkqunuqtraxrlptivshhbihtsigtpipguhbhctcvubnhqipncyxfjebdnjyetnlnvmuxhzsdahkrscewabejifmxombiamxvauuitoltyymsarqcuuoezcbqpdaprxmsrickwpgwpsoplhugbikbkotzrtqkscekkgwjycfnvwfgdzogjzjvpcvixnsqsxacfwndzvrwrycwxrcismdhqapoojegggkocyrdtkzmiekhxoppctytvphjynrhtcvxcobxbcjjivtfjiwmduhzjokkbctweqtigwfhzorjlkpuuliaipbtfldinyetoybvugevwvhhhweejogrghllsouipabfafcxnhukcbtmxzshoyyufjhzadhrelweszbfgwpkzlwxkogyogutscvuhcllphshivnoteztpxsaoaacgxyaztuixhunrowzljqfqrahosheukhahhbiaxqzfmmwcjxountkevsvpbzjnilwpoermxrtlfroqoclexxisrdhvfsindffslyekrzwzqkpeocilatftymodgztjgybtyheqgcpwogdcjlnlesefgvimwbxcbzvaibspdjnrpqtyeilkcspknyylbwndvkffmzuriilxagyerjptbgeqgebiaqnvdubrtxibhvakcyotkfonmseszhczapxdlauexehhaireihxsplgdgmxfvaevrbadbwjbdrkfbbjjkgcztkcbwagtcnrtqryuqixtzhaakjlurnumzyovawrcjiwabuwretmdamfkxrgqgcdgbrdbnugzecbgyxxdqmisaqcyjkqrntxqmdrczxbebemcblftxplafnyoxqimkhcykwamvdsxjezkpgdpvopddptdfbprjustquhlazkjfluxrzopqdstulybnqvyknrchbphcarknnhhovweaqawdyxsqsqahkepluypwrzjegqtdoxfgzdkydeoxvrfhxusrujnmjzqrrlxglcmkiykldbiasnhrjbjekystzilrwkzhontwmehrfsrzfaqrbbxncphbzuuxeteshyrveamjsfiaharkcqxefghgceeixkdgkuboupxnwhnfigpkwnqdvzlydpidcljmflbccarbiegsmweklwngvygbqpescpeichmfidgsjmkvkofvkuehsmkkbocgejoiqcnafvuokelwuqsgkyoekaroptuvekfvmtxtqshcwsztkrzwrpabqrrhnlerxjojemcxel"
    pat=re.compile("dhvf")
    print(pat.findall(stest))
    print(stest.find('dhvf'))
    t2=datetime.today()
    print(t2-t1)
    # 另外,有一个资源被浪费了,
    #   —— 每个word是等长的,上面的解法是不针对word长度的,这个等长一定是关键一环
    '''

posted @ 2021-08-30 09:05  wwcg2235  阅读(96)  评论(0)    收藏  举报