Understanding Why an Array Can Outperform a Map

 

Introduction

As discussed in a previous article, in most common situations where we would have to keep two Arrays consistent with each other, we are better off using a Map instead. That is the reason Maps were introduced in the first place. But there are also cases where an Array can outperform a Map.

The first question is, why do we care? The answer is that an Array lookup is usually faster than a Map lookup. If we want to improve speed, in some situations we can go back to using Arrays. In this article we will see that when we are working with the lowercase English alphabet (a~z), an Array or a Slice is a good choice. Of course, a Map still works correctly; the only difference is efficiency. Since there are only 26 English letters, with a little arithmetic we can represent them as the indices 0 ~ 25 of an Array or a Slice, rather than storing the actual "a" ~ "z" as keys of a Map. Based on this idea, we can use an Array to make our application more efficient, because an Array is stored in contiguous memory and indexing it needs no hashing.

Let's look at a question from LeetCode to make the above idea more concrete.
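
The index arithmetic mentioned above can be sketched as follows. This is only a minimal illustration: the helper names letterIndex and countLetters are made up for this example, and the code assumes the input contains nothing but lowercase a~z.

```go
package main

import "fmt"

// letterIndex maps a lowercase English letter to its position 0..25.
// Assumption: c is in 'a'..'z'; any other rune would give an index
// outside the 26-element array.
func letterIndex(c rune) int {
	return int(c - 'a')
}

// countLetters tallies how often each lowercase letter occurs in s,
// using the letter's position as the array index instead of a Map key.
func countLetters(s string) [26]int {
	var counts [26]int
	for _, c := range s {
		counts[letterIndex(c)]++
	}
	return counts
}

func main() {
	counts := countLetters("atach")
	fmt.Println(counts[letterIndex('a')], counts[letterIndex('c')], counts[letterIndex('z')])
	// prints: 2 1 0
}
```

Here 'a' lands on index 0 and 'z' on index 25, so the whole alphabet fits in a fixed [26]int.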

 

LeetCode 1160

"Find Words That Can Be Formed by Characters"

You are given an array of strings words and a string chars.

A string is good if it can be formed by characters from chars (each character can only be used once).

Return the sum of lengths of all good strings in words.

Example 1:

Input: words = ["cat","bt","hat","tree"], chars = "atach"
Output: 6
Explanation: 
The strings that can be formed are "cat" and "hat" so the answer is 3 + 3 = 6.

Example 2:

Input: words = ["hello","world","leetcode"], chars = "welldonehoneyr"
Output: 10
Explanation: 
The strings that can be formed are "hello" and "world" so the answer is 5 + 5 = 10.

 

The idea for solving the question is to store each available character and its count in "charAvailable". Then we traverse all the strings in "words". If a string is "good", we add its length to the total.

If we use Maps:

func countCharacters(words []string, chars string) int {
    lengths := 0

    for _, word := range words {
        // rebuild charAvailable for every word
        charAvailable := make(map[string]int)
        for _, c := range chars {
            charAvailable[string(c)]++
        }
        
        fail := false
        for _, c := range word {
            // a missing key reads as 0, so no separate existence check is needed
            if charAvailable[string(c)] > 0 {
                charAvailable[string(c)]--
            } else {
                fail = true
                break
            }
        }
        if !fail {
            lengths += len(word)
        }
    }
    
    return lengths
}

This solution is correct. The website shows a runtime of around 100ms, faster than only 5% of Go submissions.

We can see that we rebuild the "charAvailable" Map on every iteration, which is very time-consuming.

With a little help from the size argument of make(), we can change it to charAvailable := make(map[string]int, len(chars)/2). This reduces the runtime by about 20ms, but there is still room for optimization.

Now we want to answer two questions to improve our speed:

1. Can we find a data structure that is faster than a Map?

2. Can we initialize "charAvailable" only once, and copy it in each iteration?

First, here comes a Slice backed by an Array. Everything else is exactly the same as above.

func countCharacters(words []string, chars string) int {
    lengths := 0

    for _, word := range words {
        // rebuild charAvailable for every word
        charAvailable := make([]int, 26)
        for _, c := range chars {
            charAvailable[c-'a']++
        }
        
        fail := false
        for _, c := range word {
            if charAvailable[c-'a'] > 0 {
                charAvailable[c-'a']--
            } else {
                fail = true
                break
            }
        }
        if !fail {
            lengths += len(word)
        }
    }
    
    return lengths
}

The website shows a runtime of around 10ms, faster than 100% of Go submissions. That is already good enough.

Still, we can go further: what if we want to initialize "charAvailable" only once, instead of rebuilding it in each iteration?

For this we need a builtin function called copy(). It copies the elements from one Slice to another, whereas "=" between two Slices only copies the Slice header, so both variables would share the same underlying Array.

The website shows that the runtime of this version is also around 10ms.

func countCharacters(words []string, chars string) int {
    lengths := 0
    
    charAvailable := make([]int, 26)
    for _, c := range chars {
        charAvailable[c-'a']++
    }
    
    for _, word := range words {
        // work on a fresh copy so the original counts stay intact for the next word
        charAv := make([]int, 26)
        copy(charAv, charAvailable)
        
        fail := false
        for _, c := range word {
            if charAv[c-'a'] > 0 {
                charAv[c-'a']--
            } else {
                fail = true
                break
            }
        }
        if !fail {
            lengths += len(word)
        }
    }
    
    return lengths
}
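
The reason the code above needs copy() rather than "=" can be seen in isolation. The following is a small sketch, with function names chosen only for this illustration, showing that assigning a Slice shares the underlying Array while copy() duplicates the elements.

```go
package main

import "fmt"

// assignThenModify uses "=" on a Slice. Only the Slice header is copied,
// so the write through alias is visible through original as well.
func assignThenModify() []int {
	original := []int{2, 1, 1}
	alias := original
	alias[0]--
	return original
}

// copyThenModify uses the builtin copy(). The elements are duplicated
// into a separate Slice, so original is left untouched.
func copyThenModify() []int {
	original := []int{2, 1, 1}
	duplicate := make([]int, len(original))
	copy(duplicate, original)
	duplicate[0]--
	return original
}

func main() {
	fmt.Println(assignThenModify()) // prints: [1 1 1]
	fmt.Println(copyThenModify())   // prints: [2 1 1]
}
```

With plain assignment, the counts consumed by one word would leak into the check for the next word, which is why the solution copies the Slice per word.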

Finally, as we already know, "=" between two Arrays copies the values by default, so we can try the same thing with an Array. The website shows that the runtime is also around 10ms.

func countCharacters(words []string, chars string) int {
    lengths := 0
    
    charAvailable := [26]int{}
    for _, c := range chars {
        charAvailable[c-'a']++
    }
    
    for _, word := range words {
        // assigning an Array copies all 26 elements, so charAvailable is never modified
        charAv := charAvailable
        
        fail := false
        for _, c := range word {
            if charAv[c-'a'] > 0 {
                charAv[c-'a']--
            } else {
                fail = true
                break
            }
        }
        if !fail {
            lengths += len(word)
        }
    }
    
    return lengths
}  
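
The value-copy behaviour of Array assignment that this last version relies on can be checked with a short sketch. The function name and the 3-element Array are chosen only to keep the illustration small.

```go
package main

import "fmt"

// assignArray assigns one Array to another with "=" and then modifies
// the new variable. Because Arrays are values in Go, the assignment
// copies every element and the original keeps its contents.
func assignArray() (original, working [3]int) {
	original = [3]int{2, 1, 1}
	working = original
	working[0]--
	return original, working
}

func main() {
	original, working := assignArray()
	fmt.Println(original) // prints: [2 1 1]
	fmt.Println(working)  // prints: [1 1 1]
}
```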

 

Summary

Things to take away from this article:

1. Although a Map helps us keep two series of data consistent, it is considerably slower than an Array or a Slice. If a Map feels more natural in the design, we can start with it and later try to improve performance with Arrays/Slices.

2. When we are dealing with the alphabet, we don't need a Map. Using an Array/Slice is more efficient.

3. We can use the builtin function copy() to value-copy a Slice. Its speed is about the same as copying manually with a for-loop, but the code is clearer.
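
The timings quoted in this article come from the LeetCode website. For readers who want to compare the two approaches locally, here is a rough sketch. The names countWithMap, countWithArray and elapsed are made up for this example, the run count is arbitrary, and the measured numbers will vary by machine, so no expected timings are given.

```go
package main

import (
	"fmt"
	"time"
)

// countWithMap follows the Map-based solution: the Map is rebuilt for every word.
func countWithMap(words []string, chars string) int {
	total := 0
	for _, word := range words {
		available := make(map[string]int)
		for _, c := range chars {
			available[string(c)]++
		}
		ok := true
		for _, c := range word {
			if available[string(c)] == 0 {
				ok = false
				break
			}
			available[string(c)]--
		}
		if ok {
			total += len(word)
		}
	}
	return total
}

// countWithArray follows the Array-based solution: counts are built once
// and value-copied for every word.
func countWithArray(words []string, chars string) int {
	var available [26]int
	for _, c := range chars {
		available[c-'a']++
	}
	total := 0
	for _, word := range words {
		working := available
		ok := true
		for _, c := range word {
			if working[c-'a'] == 0 {
				ok = false
				break
			}
			working[c-'a']--
		}
		if ok {
			total += len(word)
		}
	}
	return total
}

// elapsed calls f the given number of times and returns the total wall-clock time.
func elapsed(f func() int, runs int) time.Duration {
	start := time.Now()
	sink := 0 // accumulate results so the calls are not trivially discarded
	for i := 0; i < runs; i++ {
		sink += f()
	}
	_ = sink
	return time.Since(start)
}

func main() {
	words := []string{"hello", "world", "leetcode"}
	chars := "welldonehoneyr"
	const runs = 100000 // arbitrary; adjust as needed
	fmt.Println("map:  ", elapsed(func() int { return countWithMap(words, chars) }, runs))
	fmt.Println("array:", elapsed(func() int { return countWithArray(words, chars) }, runs))
}
```

This is a simple wall-clock comparison rather than a rigorous benchmark; Go's testing package would be the more careful tool for that.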

 

posted @ 2020-01-24 12:55  DrVonGoosewing