doble_bern


 

Sets

set - an unordered collection of unique elements

Creating a set / an empty set

# create a new set
set1 = {1, 2, 3}
print(type(set1))  # result => <class 'set'>

# create an empty set
set2 = set()
print(type(set2))  # result => <class 'set'>

# set() accepts any iterable, so we can build a set from a list, tuple, dictionary, etc.
list1 = [1, 2, 3]
set2 = set(list1)
print(set2, type(set2))  
# {1, 2, 3} <class 'set'>

tuple1 = (1, 2, 3)
set2 = set(tuple1)
print(set2, type(set2))
# {1, 2, 3} <class 'set'>

dict1 = {'k1':1, 'k2':2}
set2 = set(dict1)
print(set2, type(set2))
# {'k1', 'k2'} <class 'set'>

# check whether {} creates an empty set
set3 = {}
print(type(set3))
# result => <class 'dict'>, so this does not work
# variable_name = {} always creates an empty dictionary; use set() for an empty set
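Sets are widely used in web crawlers. For example, when there are many pages waiting to be crawled, each newly crawled page is added to a set; if a page is already in the set, it is not crawled again.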
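The crawler deduplication idea can be sketched roughly as follows. This is only a minimal sketch: the `crawl` function, the `links` dictionary, and the example URLs are hypothetical stand-ins for real page fetching. The part that matters is the set membership test (`in`) and `add()`.

```python
from collections import deque

# Hypothetical link graph standing in for real pages: page -> links found on it
links = {
    'http://example.com/': ['http://example.com/a', 'http://example.com/b'],
    'http://example.com/a': ['http://example.com/', 'http://example.com/b'],
    'http://example.com/b': ['http://example.com/a'],
}


def crawl(start):
    """Visit every reachable page once, using a set to skip pages already seen."""
    visited = set()          # pages that have already been crawled
    order = []               # crawl order, kept only for demonstration
    queue = deque([start])   # pages waiting to be crawled
    while queue:
        url = queue.popleft()
        if url in visited:   # already crawled, so skip it
            continue
        visited.add(url)
        order.append(url)
        queue.extend(links.get(url, []))
    return order


print(crawl('http://example.com/'))
# ['http://example.com/', 'http://example.com/a', 'http://example.com/b']
```

Membership tests on a set take constant time on average, which is why a set is usually preferred over a list for the "already crawled" collection.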
Set methods:
# add() adds a new element
# a set holds only unique elements, so adding the same element several times keeps just one copy

set1 = {1, 2, 3, }

set1.add(4)
set1.add(4)
set1.add(4)
print(set1)
# result => {1, 2, 3, 4}

# to clear a set
set1.clear()
print(set1)
# => set()

# a.difference(b), elements in a but not in b
set1 = {1, 2, 3, 5, 8, }
set2 = {0, 1, 2, 3, }
ret1 = set1.difference(set2)
print(ret1)
# => {8, 5}
ret2 = set2.difference(set1)
print(ret2)
# => {0}

# difference() does not change the set; to modify the set in place, use difference_update()
set0 = {-1, }
set0.difference_update(set2)
print(set0)
# => {-1}, because -1 is not in set2

# discard() to remove an element, if not a member, do nothing
set0 = {1, 2, 3, }
set0.discard(1)
print(set0)
set0.discard(-1)
print(set0)
# both print out: {2, 3}, no error msg

# remove() raises an error when the target element is not a member
# set0.remove(-1)
# print(set0)
'''
Traceback (most recent call last):
  File "D:/NaomiPyer/1016/1017_set_functions.py", line 44, in <module>
    set0.remove(-1)
KeyError: -1
'''

# intersection() returns the elements shared by both sets
set1 = {1, 2, 3, 5, 8, }
set2 = {0, 1, 2, 3, }
ret3 = set1.intersection(set2)
print(ret3)
# => {1, 2, 3}

# a.intersection_update(b) updates a with the intersection of a and b
set1 = {1, 2, 3, 5, 8, }
set2 = {0, 1, 2, 3, }
set1.intersection_update(set2)
print(set1)
# => {1, 2, 3}

# isdisjoint() returns True if two sets have no elements in common
set1 = {1, 2, 3, }
set2 = {0, }
ret1 = set1.isdisjoint(set2)
print(ret1)
# => True

set1 = {1, 2, 3, }
set2 = {1, }
ret1 = set1.isdisjoint(set2)
print(ret1)
# => False

# issubset() reports whether another set contains this set
set1 = {1, }
set2 = {0, 1, 2, 3, }
ret1 = set1.issubset(set2)
print(ret1)
# => True, set1 is a subset of set2
# issuperset() is the reverse: it reports whether this set contains another set
ret1 = set1.issuperset(set2)
ret2 = set2.issuperset(set1)
print(ret1)  # False
print(ret2)  # True

# pop() removes and returns an arbitrary set element
# raises KeyError if the set is empty
set1 = {1, 2, 3, 4, 6, }
ret1 = set1.pop()
print(set1)  # e.g. => {2, 3, 4, 6}
print(ret1)  # e.g. => 1 (which element is removed is arbitrary)
set1 = set()
try:
    ret1 = set1.pop()
except KeyError as err:
    print(err)  # => 'pop from an empty set' - like remove(), pop() raises KeyError

# symmetric_difference()
# returns the symmetric difference of two sets as a new one
set1 = {0, 1, 2, }
set2 = {-1, 0, 3}
ret1 = set1.symmetric_difference(set2)
print(ret1)
# => {1, 2, 3, -1}

# symmetric_difference_update()
# update set1 with the result
set1 = {0, 1, 2, }
set2 = {-1, 0, 3}
set1.symmetric_difference_update(set2)
print(set1)
# => {1, 2, 3, -1}

# union() returns the union of sets as a new set
set1 = {0, 1, 2, }
set2 = {-1, 0, 3}
ret1 = set1.union(set2)
print(ret1)
# => {0, 1, 2, 3, -1}

# update() updates the set with the union of itself and the other set
set1 = {0, 1, 2, }
set2 = {-1, 0, 3}
set1.update(set2)
print(set1)
# => {0, 1, 2, 3, -1}

 

 

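About the set methods: intersection() gives the intersection of two sets, difference() gives the set difference (the elements of one set that are not in the other), and union() gives the union.

Methods with update in their name modify the original set in place.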
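The distinction between the plain methods and their update counterparts can be seen in a short sketch. The variable names `a`, `b`, `common`, and `result` are chosen here for illustration only.

```python
a = {1, 2, 3, 5, 8}
b = {0, 1, 2, 3}

# Plain methods return a new set and leave `a` untouched
common = a.intersection(b)
print(common == {1, 2, 3})    # True
print(a == {1, 2, 3, 5, 8})   # True, a is unchanged

# *_update methods modify the set in place and return None
result = a.intersection_update(b)
print(result)                 # None
print(a == {1, 2, 3})         # True, a itself was changed
```

Because the update methods return None, writing `a = a.intersection_update(b)` would replace the set with None, which is a common mistake.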

 

 

Ternary operation (conditional expression)

if condition:
    block1
else:
    block2

name = value1 if condition else value2

In C-style languages the same expression is written as follows (this form is not valid Python):

name = condition ? value1 : value2
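As a small illustration of Python's conditional expression, `value1 if condition else value2` (with no colon after `else`), the two functions below do the same thing. The names `describe` and `describe_long` are hypothetical and exist only for this sketch.

```python
def describe_long(n):
    """Classify n with an ordinary if/else statement."""
    if n % 2 == 0:
        label = 'even'
    else:
        label = 'odd'
    return label


def describe(n):
    """Same result, written as a single conditional expression."""
    return 'even' if n % 2 == 0 else 'odd'


print(describe(4), describe_long(4))  # even even
print(describe(7), describe_long(7))  # odd odd
```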

 

 

Shallow copy vs. deep copy

A str is created once and cannot be modified afterwards; "modifying" it actually creates a new string.

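A list, by contrast, can be modified in place. In CPython it is stored as a dynamic array of references to its elements rather than as a linked list.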

For int and str, plain assignment, copy, and deepcopy all end up pointing to the same memory address.

import copy

n1 = 123
n2 = n1
n3 = copy.copy(n1)
n4 = copy.deepcopy(n2)

# all four names refer to the same object
print(id(n1) == id(n2) == id(n3) == id(n4))  # => True

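For mutable containers such as list, dict, and set, copying produces a new object at a different memory address. (An immutable tuple is an exception: copy.copy() returns the same object.)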

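1. Assignment

Creates a new variable that points to the original memory address.

2. Shallow copy and deep copy

A shallow copy creates only the first (outermost) layer of data anew in memory; nested objects are still shared with the original.

deepcopy creates every layer anew except the last one (the innermost immutable values such as int and str).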
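The practical difference between the three shows up as soon as the original is modified. The following is a minimal sketch; the names `original`, `alias`, `shallow`, and `deep` are illustrative.

```python
import copy

original = [1, 2, ['a', 'b']]

alias = original                 # assignment: another name for the same object
shallow = copy.copy(original)    # new outer list, nested list still shared
deep = copy.deepcopy(original)   # new outer list and new nested list

original.append(3)               # changes the outer list only
original[2].append('c')          # changes the nested list

print(alias)    # [1, 2, ['a', 'b', 'c'], 3]
print(shallow)  # [1, 2, ['a', 'b', 'c']]
print(deep)     # [1, 2, ['a', 'b']]
```

The shallow copy misses the appended 3 because its outer list is separate, but it sees 'c' because the nested list is the same object. The deep copy is unaffected by both changes.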

 

 

import copy


n1 = {
    'language': 'python',
    'IDE': 'PyCharm',
    'operating system': ['Linux', 'Windows']
}

n2 = copy.copy(n1)
n3 = copy.deepcopy(n1)

print(id(n1))  # 1195332519496
print(id(n2))  # 1195332981832
print(id(n3))  # 1195332937352
# all three id() values are different (the exact numbers vary from run to run)

 

Comparing what copy and deepcopy point to
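One way to make the comparison concrete is to check object identity with `is` on the same dictionary used in the example. This is a sketch rather than the original illustration; the helper variable `key` is introduced here for brevity.

```python
import copy

n1 = {
    'language': 'python',
    'IDE': 'PyCharm',
    'operating system': ['Linux', 'Windows'],
}
n2 = copy.copy(n1)
n3 = copy.deepcopy(n1)

key = 'operating system'

# The outer dictionaries are three distinct objects
print(n1 is n2, n1 is n3)                # False False

# Shallow copy: the nested list is shared with the original
print(n1[key] is n2[key])                # True

# Deep copy: the nested list was recreated
print(n1[key] is n3[key])                # False

# The innermost immutable values are shared even by deepcopy
print(n1['language'] is n3['language'])  # True
```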

 

 

 

 

posted on 2016-10-17 21:27  doble_bern