Peewee 入门

安装

pip install peewee

链接数据库

以mysql 为例（Peewee提供mysql,postgresql,sqllite）的支持

import peewee
settings = {'host': 'localhost', 'password': '', 'port': 3306, 'user': 'root'}
db = peewee.MySQLDatabase("test",**settings)

db.is_closed()  # 判断数据库是不是链接
db.connect()   # 数据库链接

db.create_tables([Person,]) #建库
#   如果数据表已经存在，执行create_table的时候，将会抛出异常。

建表

3.1 表关系

class Person(Model):
    name = CharField(verbose_name='姓名', max_length=10, null=False, index=True)
    passwd = CharField(verbose_name='密码', max_length=20, null=False, default='123456')
    email = CharField(verbose_name='邮件', max_length=50, null=True, unique=True)
    gender = IntegerField(verbose_name='姓别', null=False, default=1)
    birthday = DateField(verbose_name='生日', null=True, default=None)
    is_admin = BooleanField(verbose_name='是否是管理员', default=True)
    class Meta:
        database = db  # 这里是数据库链接，为了方便建立多个表，可以把这个部分提炼出来形成一个新的类
　　　　 table_name = 'persons'  # 这里可以自定义表名


# 例如：
　　　　class BaseModel(Model):
    　　   class Meta:
        　　  database = db
       class Person(BaseModel):
    　　    name = CharField()

全部数据类型

Field Type	Sqlite	Postgresql	MySQL
`IntegerField`	integer	integer	integer
`BigIntegerField`	integer	bigint	bigint
`SmallIntegerField`	integer	smallint	smallint
`AutoField`	integer	serial	integer
`FloatField`	real	real	real
`DoubleField`	real	double precision	double precision
`DecimalField`	decimal	numeric	numeric
`CharField`	varchar	varchar	varchar
`FixedCharField`	char	char	char
`TextField`	text	text	longtext
`BlobField`	blob	bytea	blob
`BitField`	integer	bigint	bigint
`BigBitField`	blob	bytea	blob
`UUIDField`	text	uuid	varchar(40)
`DateTimeField`	datetime	timestamp	datetime
`DateField`	date	date	date
`TimeField`	time	time	time
`TimestampField`	integer	integer	integer
`IPField`	integer	bigint	bigint
`BooleanField`	integer	boolean	bool
`BareField`	untyped	not supported	not supported
`ForeignKeyField`	integer	integer	integer

所有属性

null = False – 可否为空
index = False – index索引
unique = False – unique索引
column_name = None – string representing the underlying column to use if different, useful for legacy databases
default = None – 默认值，如果callable, 会调用生成！
primary_key = False – 主键
constraints = None - a list of one or more constraints, e.g. [Check('price > 0')]
sequence = None – sequence to populate field (if backend supports it)
collation = None – collation to use for ordering the field / index
unindexed = False – indicate field on virtual table should be unindexed (SQLite-only)
choices = None – an optional iterable containing 2-tuples of value, display
help_text = None – string representing any helpful text for this field
verbose_name = None – string representing the “user-friendly” name of this field

主键和约束

# 主键和约束
class Person(Model):
    first = CharField()
    last = CharField()

    class Meta:
        primary_key = CompositeKey('first', 'last')

class Pet(Model):
    owner_first = CharField()
    owner_last = CharField()
    pet_name = CharField()

    class Meta:
        constraints = [SQL('FOREIGN KEY(owner_first, owner_last) REFERENCES person(first, last)')]
        

# 复合主键
class BlogToTag(Model):
    """A simple "through" table for many-to-many relationship."""
    blog = ForeignKeyField(Blog)
    tag = ForeignKeyField(Tag)

    class Meta:
        primary_key = CompositeKey('blog', 'tag')
        
# 关闭主键自增
User._meta.auto_increment = False # turn off auto incrementing IDs

一、插入数据

插入数据，我们可以实例化一个Model，然后再使用save()的方法插入到数据库中。如下：

# 插入一条数据
p = Person(name='liuchungui', birthday=date(1990, 12, 20), is_relative=True)
p.save()

除了上面，我最常用的是insert()方法直接插入数据，它会返回新插入数据的主键给我们。

p_id = Person.insert({'name': 'liuchungui'}).execute() # 插入一条数据，返回主键
print(p_id) # 打印出新插入数据的id

有几种快速插入的方法。最直接的方法就是循环调用Model.create()

data_source = [
    {'field1': 'val1-1', 'field2': 'val1-2'},
    {'field1': 'val2-1', 'field2': 'val2-2'},
    # ...
]

for data_dict in data_source:
    Model.create(**data_dict)

以上方法较慢，有如下原因：

如果你用默认的自动提交，每当调用create方法时时，都会提交事务，所以这样很慢。
这种方法有少量的Python逻辑，而且每次都会有InsertQuery生成并且被解析为ＳＱＬ语句。
发往数据库有许多数据而不是ＳＱＬ原生字节需要数据库解析
我们得到了最后插入的id在某些情况下会引起额外的查询。

可以简单的在一个原子操作中包装而得到显著的速度提升。还不够快，可以使用insert_many()。接受字典list作参数。

# This is much faster.
with db.atomic():
    for data_dict in data_source:
        Model.create(**data_dict)

根据你的数据源的行数。可以将批量操作分割成块。

# Insert rows 1000 at a time.
with db.atomic():
    for idx in range(0, len(data_source), 1000):
        Model.insert_many(data_source[idx:idx+1000]).execute()

如果你想要插入的数据存储在另一张表中，你也可以创建一个INSERT查询，它的插入数据源是一个SELECT查询。用Model.insert_from()方法

query = (TweetArchive
         .insert_from(
             fields=[Tweet.user, Tweet.message],
             query=Tweet.select(Tweet.user, Tweet.message))
         .execute())

上面都是插入一条数据，若是有很多数据需要插入，例如几万条数据，为了性能，这时就需要使用insert_many()，如下：

data = [
    {'facid': 9, 'name': 'Spa', 'membercost': 20, 'guestcost': 30,'initialoutlay': 100000, 'monthlymaintenance': 800},
    {'facid': 10, 'name': 'Squash Court 2', 'membercost': 3.5,'guestcost': 17.5, 'initialoutlay': 5000, 'monthlymaintenance': 80}]
query = Facility.insert_many(data) # 插入了多个

with db.atomic():  # 一次链接
    for data_dict in data_source:
        MyModel.create(**data_dict)

data = [{'name': '123'} for i in range(10000)]
with database.atomic():
    for i in range(0, NUM, 100):
        # 每次批量插入100条，分成多次插入
        Person.insert_many(data[i:i + 100]).execute()

至于为啥要使用insert_many()，可以看看我前面写的Peewee批量插入数据。

grandma = Person.create(name='Grandma', birthday=date(1935, 3, 1))
herb = Person.create(name='Herb', birthday=date(1950, 5, 5))

注意：

插入的时候，如果字段设置了default值，则会按照default指定的值插入，如果没有指定，同时字段可以为null，则数据库自动初始化值为null，如果字段不能为null，则数据库自动初始化为其零值。如果字段为非Null，最好设置default值，同时数据库schema也设置其default值，如果字段为可以为null，那么初始值就设置为null即可。

删除数据

删除单个模型实例，可以使用Model.delete_instance()方法。delete_instance()将删除模型实例，选择性的递归删除依赖的对象（recursive=True）

>>> user = User.get(User.id == 1)
>>> user.delete_instance()  # Returns the number of rows deleted.
1

>>> User.get(User.id == 1)
UserDoesNotExist: instance matching query does not exist:
SQL: SELECT t1."id", t1."username" FROM "user" AS t1 WHERE t1."id" = ?
PARAMS: [1]

删除满足条件的行，用DELETE查询。下列例子删除超过1年的所有Tweet对象

>>> query = Tweet.delete().where(Tweet.creation_date < one_year_ago)
>>> query.execute()  # Returns the number of rows deleted.
7

p.delete_instance()
query = Member.delete().where(Member.memid == 37)


# DELETE FROM members WHERE NOT EXISTS (SELECT * FROM bookings WHERE bookings.memid = members.memid);
subq = Booking.select().where(Booking.member == Member.memid)
query = Member.delete().where(~fn.EXISTS(subq)) # 同样这样的操作是被推荐的

二、查询数据

1、查询单条数据

我们可以直接使用get()获取单条数据，在参数中传递查询条件。

User.get(User.id == 1)

User.get_by_id(1)  # Same as above.

User[1]  # Also same as above.
g = Person.select().where(Person.name == 'Grandma L.').get()   # where是查询一个集合, select是查询字段

g = Person.get(Person.name == 'fff.')   # get是得到第一个

g = Person.select().where(Person.age > 23).get()
# select 代表sql语句中select后面的语句表示要展示的字段 # where 代表where条件语句 得到一个数据集合，用for循环遍历 # get()代表找第一个


person, created = Person.get_or_create(
    first_name=first_name,
    last_name=last_name,
    defaults={'dob': dob, 'favorite_color': 'green'}
)

# defaults dictionary, which will be used to populate values on newly-created instances.

复合条件

query1 = Person.select().where((Person.name == "fff0") | (Person.name == "sss1"))
query2 = Person.select().where((Person.name == "fff") & (Person.is_relative == True))

==   x equals y
<    x is less than y
<=   x is less than or equal to y
>    x is greater than y
>=   x is greater than or equal to y
!=   x is not equal to y
<<   x IN y, where y is a list or query
>>   x IS y, where y is None/NULL
%    x LIKE y where y may contain wildcards
**   x ILIKE y where y may contain wildcards
^    x XOR y
~    Unary negation (e.g., NOT x)

模糊查询

#SELECT * FROM person WHERE name ILIKE '%tennis%';

Person.select().where(Person.name ** "%fff%").get()
######### OR ##############
query = Facility.select().where(Facility.name.contains('tennis'))

In 查询

# SELECT * FROM facilities WHERE facid IN (1, 5);
query = Facility.select().where(Facility.facid.in_([1, 5]))

字典展示(还可以tuples(),namedtuples(),objects())

query = User.select().dicts()
for row in query:
    print(row)
# row 是什么字典

Order by; Limit; Distinct; Group by; Having

query = (Person.select(Person.name).order_by(Person.name).limit(10).distinct())  # 几乎和sql一模一样
Person.select().order_by(Person.birthday.desc())  # 日期排序
query = (Booking
         .select(Booking.facid, fn.SUM(Booking.slots)) 
         .group_by(Booking.facid)   # group_by
         .order_by(Booking.facid))
query = (Booking
         .select(Booking.facility, fn.SUM(Booking.slots))
         .group_by(Booking.facility)
         .having(fn.SUM(Booking.slots) > 1000)   # having
         .order_by(Booking.facility))
 
Tweet.select().order_by(-Tweet.created_date) 
# Similarly you can use "+" to indicate ascending order, though ascending

聚合函数

# SELECT MAX(birthday) FROM person;
query = Person.select(fn.MAX(Person.birthday))

# SELECT name, is_relative FROM person WHERE birthday = (SELECT MAX(birthday) FROM person);
MemberAlias = Member.alias()  # 如果一个查询中用了两个表，需要这个Alias作为影子
subq = MemberAlias.select(fn.MAX(MemberAlias.joindate))
query = (Member.select(Person.is_relative, Person.name, ).where(Person.birthday == subq))

分页&计数

# paginate两个参数：page_number 和 items_per_page
for tweet in Tweet.select().order_by(Tweet.id).paginate(2, 10):
    print(tweet.message)

# 返回查到了多少条记录
Tweet.select().where(Tweet.id > 50).count()

原生SQL

query = MyModel.raw('SELECT * FROM my_table WHERE data = %s', user_data)
query = MyModel.select().where(SQL('Some SQL expression %s' % user_data))
# sql 注入: ensure that any user-defined data is passed in as a query parameter and not part of the actual SQL query

2、查询多条数据

使用select()查询，后面不添加where()是查询整个表的内容。

# 查询Person整张表的数据
persons = Person.select()
# 遍历数据
for p in persons:
    print(p.name, p.birthday, p.is_relative)

我们可以在select()后面添加where()当做查询条件

# 获取is_relative为True的数据
persons = Person.select().where(Person.is_relative == True)
for p in persons:
    print(p.name, p.birthday, p.is_relative)

我们可以通过sql()方法转换为SQL语句进行查看理解

persons = Person.select().where(Person.is_relative == True)
# 打印出的结果为：('SELECT `t1`.`id`, `t1`.`name`, `t1`.`is_relative` FROM `Person` AS `t1` WHERE (`t1`.`is_relative` = %s)', [True])
print(persons.sql())

3、查询数据条数、排序、Limit

查询数据条数，直接在后面加上count()就行了

# 查询整张表的数据条数
total_num = Person.select().count()

# 查询name为liuchungui的Person数量, 返回数量为1
num = Person.select().where(Person.name == 'liuchungui').count()

排序，使用的是order_by()，参数内加上按对应字段进行排序

# 按照创建时间降序排序
persons = Person.select().order_by(Person.create_time.desc())

# 按照创建时间升序排序
persons = Person.select().order_by(Person.create_time.asc())

Limit是使用limit()，传递一个数字，例如2就是获取前两条数据，它可以搭配offset()一起使用

# 相当于sql语句: select * from person order by create_time desc limit 5
persons = Person.select().order_by(Person.create_time.asc()).limit(5)

# 相当于sql语句中：select * from person order by create_time desc limit 2, 5
persons = Person.select().order_by(Person.create_time.asc()).limit(5).offset(2)

随机取得记录

如果你想随机得到记录。可以通过random 或rand函数去排序：

Postgresql和Sqlite通过Random函数：

# Pick 5 lucky winners:
LotteryNumber.select().order_by(fn.Random()).limit(5)

MySQL用Rand：

# Pick 5 lucky winners:
LotterNumber.select().order_by(fn.Rand()).limit(5)

聚合记录

假如你想要得到每个用户的tweets数量，annotate提供了简单的这种查询的方法：

query = User.select().annotate(Tweet)

以上查询相当于：

query = (User
         .select(User, fn.Count(Tweet.id).alias('count'))
         .join(Tweet)
         .group_by(User))

查询结果返回带有自身属性和额外包含每个用户tweets数量的count属性的User实例数组。默认如果外键不为空时使用内连接，

也就是没有tweets的用户将不在返回结果中。为了避免，手动指名连接类型去返回0 tweets的用户。

query = (User
         .select()
         .join(Tweet, JOIN.LEFT_OUTER)
         .switch(User)
         .annotate(Tweet))

也可以指定自定义的聚合器，比如MIN 或者MAX：

query = (User
         .select()
         .annotate(
             Tweet,
             fn.Max(Tweet.created_date).alias('latest_tweet_date')))

假如标签应用中想找到有某数量相关联对象的标签。这个例子需要不同模型多对多配置：

class Photo(Model):
    image = CharField()

class Tag(Model):
    name = CharField()

class PhotoTag(Model):
    photo = ForeignKeyField(Photo)
    tag = ForeignKeyField(Tag)

比如我们现在想找到有至少5张照片关联的标签：

query = (Tag
         .select()
         .join(PhotoTag)
         .join(Photo)
         .group_by(Tag)
         .having(fn.Count(Photo.id) > 5))

上面查询等价于下面的SQL语句：

SELECT t1."id", t1."name"
FROM "tag" AS t1
INNER JOIN "phototag" AS t2 ON t1."id" = t2."tag_id"
INNER JOIN "photo" AS t3 ON t2."photo_id" = t3."id"
GROUP BY t1."id", t1."name"
HAVING Count(t3."id") > 5

假如我们想得到上面结果每个标签关联照片的数量：

query = (Tag
         .select(Tag, fn.Count(Photo.id).alias('count'))
         .join(PhotoTag)
         .join(Photo)
         .group_by(Tag)
         .having(fn.Count(Photo.id) > 5))

得到标量返回值

你可以通过调用Query.scalar来返回标量值：

>>> PageView.select(fn.Count(fn.Distinct(PageView.url))).scalar()
100

传入as_tuple=True来返回多个标量值：

>>> Employee.select(
...     fn.Min(Employee.salary), fn.Max(Employee.salary)
... ).scalar(as_tuple=True)
(30000, 50000)

三、更新数据

Peewee可以让你使用原子性更新。比如我们需要更新一些计数器。直接的方法是：

>>> for stat in Stat.select().where(Stat.url == request.url):
...     stat.counter += 1
...     stat.save()

不要这样做！不仅速度很慢，而且容易冲突如果多进程同时在更新这个计数器时。

应该用update方法原子性的更新计数器

>>> query = Stat.update(counter=Stat.counter + 1).where(Stat.url == request.url)
>>> query.execute()

可以尽可能复杂的查询语句。我们将我们雇员的奖金修改为旧奖金加上工资的1成：

>>> query = Employee.update(bonus=(Employee.bonus + (Employee.salary * .1)))
>>> query.execute()  # Give everyone a bonus!

我们可以使用子查询去更新某列的值。假如我们有一个存储User模型某个user tweets数量的字段，我们想要阶段性的更新它的值：

>>> subquery = Tweet.select(fn.COUNT(Tweet.id)).where(Tweet.user == User.id)
>>> update = User.update(num_tweets=subquery)
>>> update.execute()

当一个Model实例拥有主键时，此时使用save()就是修改数据

# 已经实例化的数据,指定了id这个primary key,则此时保存就是更新数据
p = Person(name='liuchungui', birthday=date(1990, 12, 20), is_relative=False)
p.id = 1
p.save()

也可以使用update()来更新数据，一般都会搭配where()使用

query = Facility.update(membercost=6, guestcost=30).where(Facility.name.startswith('Tennis')))
query.execute()  # 元子操作

# 更新birthday数据
q = Person.update({Person.height: 1.75}).where(Person.name == 'Jack')
q.execute()

当然，除了使用Model的属性，我们可以直接使用字典结构来更新数据

q = Person.update({
    'height': 1.75
}).where(Person.name == 'Jack')
q.execute()

四、查询操作符

在查询、更新、删除数据的时候，经常会带有Where条件语句。而Peewee支持以下类型比较符：

查询操作符

peewee支持一下类型的比较符：


Comparison	Meaning
`==`	x equals y
`<`	x is less than y
`<=`	x is less than or equal to y
`>`	x is greater than y
`>=`	x is greater than or equal to y
`!=`	x is not equal to y
`<<`	x IN y, where y is a list or query
`>>`	x IS y, where y is None/NULL
`%`	x LIKE y where y may conta

因为懒得重载所有操作符，所以还有当作方法的些查询操作符：

Method	Meaning
`.contains(substr)`	Wild-card search for substring.
`.startswith(prefix)`	Search for values beginning with `prefix`.
`.endswith(suffix)`	Search for values ending with `suffix`.
`.between(low, high)`	Search for values between `low` and `high`.
`.regexp(exp)`	Regular expression match.
`.bin_and(value)`	Binary AND.
`.bin_or(value)`	Binary OR.
`.in_(value)`	IN lookup (identical to `<<`).
`.not_in(value)`	NOT IN lookup.
`.is_null(is_null)`	IS NULL or IS NOT NULL. Accepts boolean param.
`.concat(other)`	Concatenate two strings using `\|\|`.

用逻辑运算符去连接子句：

Operator	Meaning	Example
`&`	AND	`(User.is_active == True) & (User.is_admin == True)`
`\|` (pipe)	OR	`(User.is_admin) \| (User.is_superuser)`
`~`	NOT (unary negation)	`~(User.username << ['foo', 'bar', 'baz'])`

演示如何使用这些操作符：

其中，==、<、<=、>、>=、!=是很容易理解的，重点提下<<、>>和%。用示例说明：

# <<使用，查询省份属于湖北和湖南的，对应sql语句：select * from person where province in ('湖南', '湖北')
persons = Person.select().where(Person.province << ['湖南', '湖北'])

# >>使用，查询省份为空的，sql语句: select * from person where province is Null
persons = Person.select().where(Person.province >> None)

# %使用，查询省份中含有 湖 字，sql语句：select * from person where province like '%湖%'
persons = Person.select().where(Person.province % '%湖%')

有时，我们查询条件不止一个，需要使用逻辑运算符连接，而Python中的and、or在Peewee是不支持的，此时我们需要使用Peewee封装好的运算符，如下：

使用示例如下：

# 查询湖南和湖北的, 注意需要用()将Person.province == '湖南'包一层
persons = Person.select().where((Person.province == '湖南') | (Person.province == '湖北'))

# 查询湖南和身高1.75
persons = Person.select().where((Person.province == '湖南') & (Person.height == 1.75))

注意：使用的时候，需要内部还使用()将Person.province == '湖南'包起来，否则不会生效。示例：persons = Person.select().where((Person.province == '湖南') | (Person.province == '湖北'))

虽然你在试图用python 的in,and,or 和not操作符，但是它们不起作用。in 表达式总是返回boolean 值。

同样的，and，or和not将把参数当作boolean值，不能被重载。

所以注意：

用<<代替in
用&代替and
用!代替or
用~代替not
不要忘记你的比较表达式用逻辑运算符连接时使用括号包装。

SQLite中的LIKE 和ILIKE

因为SSQLite的LIKE操作符默认是大小写不敏感，peewee用SQLite GLOB操作来实现大小写敏感的查询。

glob操作符用*代替%作为通配符。如果你使用SQLite想要大小写敏感的字符匹配，记住用*来作为通配符。

三种值化的逻辑

SQL处理NULL的方法不同，可有下列表达式中的操作：

IS NULL
IS NOT NULL
IN
NOT IN

虽然你可以使用IS NULL 和IN前加上否定运算符~来实现，但是最好显式的用IS NOT NULL和NOT IN

来使得语义更明确。

最简单用IS NULL 和IN就是使用操作符重载：

# Get all User objects whose last login is NULL.
User.select().where(User.last_login >> None)

# Get users whose username is in the given list.
usernames = ['charlie', 'huey', 'mickey']
User.select().where(User.username << usernames)

如果不喜欢操作符重载，可以调用字段方法：

# Get all User objects whose last login is NULL.
User.select().where(User.last_login.is_null(True))

# Get users whose username is in the given list.
usernames = ['charlie', 'huey', 'mickey']
User.select().where(User.username.in_(usernames))

取反上面的查询，你可以使用一元运算符~，最好使用ＩＳ　ＮＯＴ和ＮＯＴ　ＩＮ来消除歧义。

# Get all User objects whose last login is *NOT* NULL.
User.select().where(User.last_login.is_null(False))

# Using unary negation instead.
User.select().where(~(User.last_login >> None))

# Get users whose username is *NOT* in the given list.
usernames = ['charlie', 'huey', 'mickey']
User.select().where(User.username.not_in(usernames))

# Using unary negation instead.
usernames = ['charlie', 'huey', 'mickey']
User.select().where(~(User.username << usernames))

自定义的操作符

因为懒得操作符重载，作者减去了一些操作比如，module。如果你你在上表中没发现你想要的操作符，ｎｉ

可以很容易的加上你自己定义的

在SQLite中增加module支持

from peewee import *
from peewee import Expression # the building block for expressions

OP['MOD'] = 'mod'

def mod(lhs, rhs):
    return Expression(lhs, OP.MOD, rhs)

SqliteDatabase.register_ops({OP.MOD: '%'})

现在使用自定义的操作构建丰富的查询：

# Users with even ids.
User.select().where(mod(User.id, 2) == 0)

表达式

Peewee 设计成提供简单的，印象深刻的，pythonic 来构造SQL查询。这部分给出一些表达式的用法。

主要两种对象类型来创建表达式：

Field 实例
SQL聚合和fn函数

我们假如简单User模型，拥有username和其他事情：

class User(Model):
    username = CharField()
    is_admin = BooleanField()
    is_active = BooleanField()
    last_login = DateTimeField()
    login_count = IntegerField()
    failed_logins = IntegerField()

用查询表达式的比较：

# username is equal to 'charlie'
User.username == 'charlie'

# user has logged in less than 5 times
User.login_count < 5

比较可以用and 和 or的位符号来连接。操作符可以被python控制优先级，同时可以嵌套任意深度：

# User is both and admin and has logged in today
(User.is_admin == True) & (User.last_login >= today)

# User's username is either charlie or charles
(User.username == 'charlie') | (User.username == 'charles')

也可以用函数来比较：

# user's username starts with a 'g' or a 'G':
fn.Lower(fn.Substr(User.username, 1, 1)) == 'g'

我们可以做些有意思的事情，表达式可以与其他表达式比较。表达式之间也支持算术操作符：

# users who entered the incorrect more than half the time and have logged
# in at least 10 times
(User.failed_logins > (User.login_count * .5)) & (User.login_count > 10)

表达式也支持原子性更新：

# when a user logs in we want to increment their login count:
User.update(login_count=User.login_count + 1).where(User.id == user_id)

表达式可以用在查询的任意部分，所以自己去尝试一下！

五、联表查询

有时，我们需要查询两个表中的数据，在Peewee中也可以实现，官方示例如下：

query = (Tweet
         .select(Tweet.content, Tweet.timestamp, User.username)
         .join(User, on=(User.id == Tweet.user_id))
         .order_by(Tweet.timestamp.desc()))

上面查询的结果，会在Tweet的Model中添加一个属性user，此时我们可以通过user来访问到查询到的User表信息，如下：

for tweet in query:
    print(tweet.content, tweet.timestamp, tweet.user.username)

外键

表关系一对多

class Pet(peewee.Model):
    name = peewee.CharField()
    owner = peewee.ForeignKeyField(Person,related_name="pets",backref="petties")   
　　# backref是反查的字段，如果有related_name用related_name反查，如果没有直接用petties反查 e.g. [i.name for i in Person.get(name="aaa").petties] 

    class Meta: 
        database = db

class Category(Model):
    name = CharField()
    parent = ForeignKeyField('self', null=True, backref='children')
   # 注意自关联永远是null = True

# 插入数据
g2 = Person.get(tablesinfo.Person.is_relative == False)
d2 = Pet.create(name="dog2",owner=g2)

#正查
dog1 = Pet.get(name="dog1")
dog1.owner.name

# 反查
aaa = Person.get(name="aaa").pets  # pets为related_name字段，如果没写用backref字段
for a in aaa:
    print(i.name)
    
g1 = Person.select().join(Pet).where(Pet.name == "dog2")

# SELECT DISTINCT m.firstname, m.surname FROM members AS m2 INNER JOIN members AS m ON (m.memid = m2.recommendedby) ORDER BY m.surname, m.firstname;

MA = Member.alias()
query = (Member
         .select(Member.firstname, Member.surname)
         .join(MA, on=(MA.recommendedby == Member.memid))  # join中用on表示链接方法
         .order_by(Member.surname, Member.firstname)
)
         
User.select().join(Tweet).join(Comment)
Artist.select().join(Album).switch(Artist).join(Genre) # 如果连一表多次

多对多

Peewee没提供一个字段去解决像django 那样的多对多关系映射-这是因为事实上这个字段是隐藏在一张中间表中。

用peewee去实现多对多映射，你必须亲自实现中间表，并且用它作媒介来查询：

class Student(Model):
    name = CharField()

class Course(Model):
    name = CharField()

class StudentCourse(Model):
    student = ForeignKeyField(Student)
    course = ForeignKeyField(Course)

去查询选修数学的所有学生：

query = (Student
         .select()
         .join(StudentCourse)
         .join(Course)
         .where(Course.name == 'math'))
for student in query:
    print student.name

去查询某学生选修的所有课程：

courses = (Course
    .select()
    .join(StudentCourse)
    .join(Student)
    .where(Student.name == 'da vinci'))

for course in courses:
    print course.name

有效的去列举多对多关系，比如：列举所有学生和他们的选修课程，我们通过StudentCourse来查询，并且预计算Student和

Course：

query = (StudentCourse
         .select(StudentCourse, Student, Course)
         .join(Course)
         .switch(StudentCourse)
         .join(Student)
         .order_by(Student.name))

下面代码打印学生们和他们的选修课：

last = None
for student_course in query:
    student = student_course.student
    if student != last:
        last = student
        print 'Student: %s' % student.name
    print '    - %s' % student_course.course.name

因为我们在select子句中包含了Student和Course的所有字段，所以这些外键穿越是免费的，我们在一次查询中完成了整个列举。

ManyToManyField

ManyToManyField提供了在字段中表示多对多映射。你最好使用标准的peewee API去处理大多数情况，除非，如果你的模型是非常简单

并且你的查询不是太复杂，你可以使用ManyToManyField去得到快速的查询提升体验。详细使用说明请看FIelds扩展部分。

用ManyToManyField的studen和course模型：

from peewee import *
from playhouse.fields import ManyToManyField

db = SqliteDatabase('school.db')

class BaseModel(Model):
    class Meta:
        database = db

class Student(BaseModel):
    name = CharField()

class Course(BaseModel):
    name = CharField()
    students = ManyToManyField(Student, related_name='courses')

StudentCourse = Course.students.get_through_model()

db.create_tables([
    Student,
    Course,
    StudentCourse])

# Get all classes that "huey" is enrolled in:
huey = Student.get(Student.name == 'Huey')
for course in huey.courses.order_by(Course.name):
    print course.name

# Get all students in "English 101":
engl_101 = Course.get(Course.name == 'English 101')
for student in engl_101.students:
    print student.name

# When adding objects to a many-to-many relationship, we can pass
# in either a single model instance, a list of models, or even a
# query of models:
huey.courses.add(Course.select().where(Course.name.contains('English')))

engl_101.students.add(Student.get(Student.name == 'Mickey'))
engl_101.students.add([
    Student.get(Student.name == 'Charlie'),
    Student.get(Student.name == 'Zaizee')])

# The same rules apply for removing items from a many-to-many:
huey.courses.remove(Course.select().where(Course.name.startswith('CS')))

engl_101.students.remove(huey)

# Calling .clear() will remove all associated objects:
cs_150.students.clear()

详细使用：

ManyToManyField.add()
ManyToManyField.remove()
ManyToManyField.clear()
ManyToManyField.get_through_model()

六、事务

atomic和rollback

with db.atomic() as transaction:  # Opens new transaction.
    try:
        save_some_objects()
    except ErrorSavingData:
        # Because this block of code is wrapped with "atomic", a
        # new transaction will begin automatically after the call
        # to rollback().
        transaction.rollback()
        error_saving = True

atomic甚至可以多层嵌套

db = SqliteDatabase(':memory:')

with db.atomic() as txn:
    # This is the outer-most level, so this block corresponds to
    # a transaction.
    User.create(username='charlie')

    with db.atomic() as nested_txn:
        # This block corresponds to a savepoint.
        User.create(username='huey')

        # This will roll back the above create() query.
        nested_txn.rollback()

    User.create(username='mickey')

# When the block ends, the transaction is committed (assuming no error
# occurs). At that point there will be two users, "charlie" and "mickey".

装饰器

db = SqliteDatabase(':memory:')

with db.atomic() as txn:
    # This is the outer-most level, so this block corresponds to
    # a transaction.
    User.create(username='charlie')

    with db.atomic() as nested_txn:
        # This block corresponds to a savepoint.
        User.create(username='huey')

        # This will roll back the above create() query.
        nested_txn.rollback()

    User.create(username='mickey')

# When the block ends, the transaction is committed (assuming no error
# occurs). At that point there will be two users, "charlie" and "mickey".

手动半自动

# 纯手动用commit()提交，用rollback回滚
with db.transaction() as txn:
    User.create(username='mickey')
    txn.commit()  # Changes are saved and a new transaction begins.
    User.create(username='huey')

    # Roll back. "huey" will not be saved, but since "mickey" was already
    # committed, that row will remain in the database.
    txn.rollback()



# 半自动用savepoint上下文管理
with db.transaction() as txn:
    with db.savepoint() as sp:
        User.create(username='mickey')

    with db.savepoint() as sp2:
        User.create(username='zaizee')
        sp2.rollback()  # "zaizee" will not be saved, but "mickey" will be.

Peewee实现事务最常用的方法是Database.atomic()方法，使用起来非常简单，如下：

for tweet in query:
    print(tweet.content, tweet.timestamp, tweet.user.username)

更多可参考我前面写的Peewee使用之事务

其他

自定义类型

import uuid

class UUIDField(Field):
    field_type = 'uuid'

    def db_value(self, value):
        return str(value) # convert UUID to str

    def python_value(self, value):
        return uuid.UUID(value) # convert str to UUID

数据表信息的访问

Person._meta.fields   # 用访问_meta访问定义的meta
Person._meta.primary_key   
Person._meta.database

# 属性有：database， table_name， table_function， indexes， primary_key， constraints， schema， only_save_dirty， options， table_alias， depends_on， without_rowid

索引增加

class Article(Model):
    name = TextField()
    timestamp = TimestampField()
    status = IntegerField()
    flags = IntegerField()

# Add an index on "name" and "timestamp" columns.
Article.add_index(Article.name, Article.timestamp)

# Add a partial index on name and timestamp where status = 1.
Article.add_index(Article.name, Article.timestamp,
                  where=(Article.status == 1))

# Create a unique index on timestamp desc, status & 4.
idx = Article.index(
    Article.timestamp.desc(),
    Article.flags.bin_and(4),
    unique=True)
Article.add_index(idx)

事务autocommit模式(不推荐)　

scalar

Employee.select(fn.Min(Employee.salary), fn.Max(Employee.salary) ).scalar(as_tuple=True)
# You can retrieve multiple scalar values by passing as_tuple=True


# scalar 表示结果的第一行第一列

安全和SQL注入

默认peewee将参数化查询，所以所有用户传入的参数将被转义。唯一的例外就是写原始SQL查询或者传递

包含未被信任的数据的SQL对象时。为避免此事发生，确保用户定义的数据作为查询参数而不是实际的SQL查询：

# Bad!
query = MyModel.raw('SELECT * FROM my_table WHERE data = %s' % (user_data,))

# Good. `user_data` will be treated as a parameter to the query.
query = MyModel.raw('SELECT * FROM my_table WHERE data = %s', user_data)

# Bad!
query = MyModel.select().where(SQL('Some SQL expression %s' % user_data))

# Good. `user_data` will be treated as a parameter.
query = MyModel.select().where(SQL('Some SQL expression %s', user_data))

MySQL和Postgresql使用'%s'可以来参数化。SQLite，用'?'。确保使用你的数据库适合的字符集。你也能

通过Database.interpolation来查找这个参数。

窗口函数

peewee提供基础的SQL窗口函数功能，可以通过调用fn.over()来创建传递到你的分区和排序参数中。

# Get the list of employees and the average salary for their dept.
query = (Employee
         .select(
             Employee.name,
             Employee.department,
             Employee.salary,
             fn.Avg(Employee.salary).over(
                 partition_by=[Employee.department]))
         .order_by(Employee.name))

# Rank employees by salary.
query = (Employee
         .select(
             Employee.name,
             Employee.salary,
             fn.rank().over(
                 order_by=[Employee.salary]))

得到原始的元组/字典

有时你想要简单的列举行数据元组，而不需要创建结果对象。可以用SelectQuery().tuples()或RawQuery.tuples():

stats = Stat.select(Stat.url, fn.Count(Stat.url)).group_by(Stat.url).tuples()

# iterate over a list of 2-tuples containing the url and count
for stat_url, stat_count in stats:
    print stat_url, stat_count

简单的，用SelectQuery.dicts()或RawQuery.dicts()可以返回行数据作为字典

stats = Stat.select(Stat.url, fn.Count(Stat.url).alias('ct')).group_by(Stat.url).dicts()

# iterate over a list of 2-tuples containing the url and count
for stat in stats:
    print stat['url'], stat['ct']

返回子句

PostgresqlDatabase 支持在Update，INSERT，DELETE查询里用RETURNING子句。

用RETURNING子句可以将查询语句影响的行数对应的模型对象返回。

例如，你用UpdateQuery去将注册过期的用户冻结。冻结后，你需要向每个用户发送邮件然他们知道情况。而不是用两次

查询，SELECT和UPDATE,通过带有RETURNING 的UPDATE的查询来一次完成。

query = (User
         .update(is_active=False)
         .where(User.registration_expired == True)
         .returning(User))

# Send an email to every user that was deactivated.
for deactivate_user in query.execute():
    send_deactivation_email(deactivated_user)

RETURNING也可用在InsertQUery和DeleteQuery上。当用在INSERT时，新增的行返回。用在DELETE时，删除的行

返回。

RETURNING子句唯一限制是，它仅包含查询FROM子句中列的表列。用模型类可以返回所有的列字段。

数据库反向生成

文档地址：http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#pwiz-a-model-generator

# Introspect a Sqlite database.
python -m pwiz -e sqlite path/to/sqlite_database.db

# Introspect a MySQL database, logging in as root. You will be prompted
# for a password ("-P").
python -m pwiz -e mysql -u root -P mysql_db_name

# Introspect a Postgresql database on a remote server.
python -m pwiz -e postgres -u postgres -H 10.1.0.3 pg_db_name

性能技巧

下面讨论一下用peewee的些许提高性能的方法。

避免N+1查询

N+1查询指的是当应用提交一次查询获取结果，然后在取得结果数据集的每一行时，应用至少再次查询一次（也可以看做是嵌套循环）。

大多数情况下，n 查询可以通过使用SQL join或子查询来避免。数据库本身可能做了嵌套循环，但是它比在你的应用代码本身里做这些n查询更高效，后者通常会导致与数据库再次潜在通讯，没有利用数据库本身关联和执行子查询时会进行切片等优化工作。

Peewee提供了几种API去减轻N+1查询的行为。再看看贯串我们这篇文档的模型，User和Tweet，这部分我们重点讲一下一些N+1场景，说明peewee怎么帮助我们避免N+1查询。

在一些场景里，N+1查询不会明显表现为显著地或可测量的性能瓶颈点。它也由你要查询的数据，使用的数据库本身，以及执行查询获取结果的潜在因素。优化前后可以测试性能，确保和你预测的变化相同。

列出最近的tweets

tweets时间轴显示最近用户的tweets。除了tweet的内容，还要显示tweet作者的用户名。N+1场景描述为：

获取最近的10条tweets
每个tweet，查询作者信息（10次查询）

通过用join选择两个表，peewee使得在一次查询里完成任务：

query = (Tweet
         .select(Tweet, User)  # Note that we are selecting both models.
         .join(User)  # Use an INNER join because every tweet has an author.
         .order_by(Tweet.id.desc())  # Get the most recent tweets.
         .limit(10))

for tweet in query:
    print tweet.user.username, '-', tweet.message

没有用join时，得到tweet.user.username会触发一次查询去解析外键tweet.user从而得到相关联的user。

由于我们在User上关联并选择，peewee自动为我们解析外键。

列出所有用户和他们的tweets

你想要显示若干用户和他们所有的tweets的页面。N+1场景为：

取得些许用户。
每个用户取到他们的tweets。

虽然和上个例子相似，但是重要区别是：我们选择tweets时，每个tweet只有一个关联的用户，所以可以直接赋值到外键，

反过来不对，因为一个用户可有任意数量tweets或者没有。

Peewee提供两两种途径去避免O(n)查询：

1.首先取到用户，然后取到关联这些用户的所有tweets。一旦peewee取到tweets，将它们与合适的用户匹配。

这种方法通常很快，但是会在所选择的每个表上执行一次查询。

2.在一个查询里得到用户和tweets。用户数据将复制，所以peewee将在列举结果集时减少重复和聚合tweets。

这种方法导致有许多数据需要传输，并且要有许多python逻辑去减少行重复。

每种方案根据查询数据的大小和结构都会可能比另一种更好。

使用prefetch

peewee使用子查询可以预获取数据。这种方法需要prefetch特殊API使用。Pre-fetch，像其名字本身，

用子查询去急切加载给定用户的相应的tweets。意味着我们用O(k)查询K张表而不是O(n)查询n行纪录。

下面演示我们如何得到若干用户和他们最近一周的tweets：

week_ago = datetime.date.today() - datetime.timedelta(days=7)
users = User.select()
tweets = (Tweet
          .select()
          .where(
              (Tweet.is_published == True) &
              (Tweet.created_date >= week_ago)))

# This will perform two queries.
users_with_tweets = prefetch(users, tweets)

for user in users_with_tweets:
    print user.username
    for tweet in user.tweets_prefetch:
        print '  ', tweet.message

注意User 查询和Tweet查询都没有JOIN子句，当我们使用prefetch时不必指名join

prefetch可以用于任意数量的表。可以查看API文档看其他例子。

用prefetch时应考虑的事情：

预查询的模型必须存在外键
通常它比aggregate_rows方法更高效
因为数据没有重复的所以传输的数据更少
因为不用减重复所以python逻辑更少
当你想要在最外的查询里使用LIMIT没问题，但是可能正确的实现限制子查询的返回结果大小有些困难。

使用aggregate_rows

aggregeate_rows一次在内存中减少重复，选择所有的数据。它和prefetch都可以完成任意复杂的查询。

使用这个特性需要当创建查询时用到特殊的标志aggregate_rows。它告诉peewee减少那些根据JOIN的结构可能会重复的行。

因为在减少重复聚合数据时有许多计算，所以可能使用aggregate_rows可能在一些查询中会比用prefetch性能低，即使面对的是

O(n)简单的查询时，所以你不确定使用哪种方法时测试检查你的代码。

query = (User
         .select(User, Tweet)  # As in the previous example, we select both tables.
         .join(Tweet, JOIN.LEFT_OUTER)
         .order_by(User.username)  # We need to specify an ordering here.
         .aggregate_rows())  # Tell peewee to de-dupe and aggregate results.

for user in query:
    print user.username
    for tweet in user.tweets:
        print '  ', tweet.message

query = (User
         .select(User, Tweet)  # As in the previous example, we select both tables.
         .join(Tweet, JOIN.LEFT_OUTER)
         .order_by(User.username)  # We need to specify an ordering here.
         .aggregate_rows())  # Tell peewee to de-dupe and aggregate results.

for user in query:
    print user.username
    for tweet in user.tweets:
        print '  ', tweet.message

一般情况下，user.tweets返回SelectQuery，迭代它将触发额外的查询。使用aggregate_rows，user.tweets将是一个Python list，没有

额外的查询发生。

我们用LEFT OUTER关联确保没有tweets的用户也在结果数据中。

下面我们获取若干用户以及他们过去一周发表的tweets。因为我们过滤tweets时用户可以没有任何tweets，所以我们需要WHERE

子句允许有NULL ID的tweet。

week_ago = datetime.date.today() - datetime.timedelta(days=7)
query = (User
         .select(User, Tweet)
         .join(Tweet, JOIN.LEFT_OUTER)
         .where(
             (Tweet.id >> None) | (
                 (Tweet.is_published == True) &
                 (Tweet.created_date >= week_ago)))
         .order_by(User.username, Tweet.created_date.desc())
         .aggregate_rows())

for user in query:
    print user.username
    for tweet in user.tweets:
        print '  ', tweet.message

使用aggregate_rows需考虑的事情：

必须指定每个关联表的排序，以便行数据可以正确的聚合，有点像itertools.groupby
不要和LIMIT和OFFSET谓词混用，或者get（应用LIMIT 1 SQL 子句）。因为聚合结果集可能由于行的重复问题导致

多个记录返回，限制返回数量可能导致错误。假设你有三个用户，每个有10个tweets，如果你在查询后加上

LIMIT 5，你只返回了第一个用户和他的前5个tweets

通常python去减少数据重复是的这种方法性能比prefetch低，有时甚至比简单的O(n)查询性能还低！有疑问时。
因为每张表的每一列都包含在游标返回的行元组内，所以这种方法比prefetch需要更多带宽。

迭代大量行

默认在迭代SelectQuery时peewee缓存结果。这样使得多次迭代和索引切片容易，但是当你计划迭代大数量的行时可能有问题。

当迭代查询结果时为了减少peewee使用的内存，可用iterator方法。这个方法让你用更少的内存去迭代大量的结果集，不用缓存返回的每个实例。

# Let's assume we've got 10 million stat objects to dump to a csv file.
stats = Stat.select()

# Our imaginary serializer class
serializer = CSVSerializer()

# Loop over all the stats and serialize.
for stat in stats.iterator():
    serializer.serialize_object(stat)

可以使用native方法来为简单的查询进一步提升速度。这个方法加速peewee从原始的游标数据构造为peewee模型实例。

可看native文档了解其详细优化信息。

你也可以通过使用dicts和tuples来得到性能提升。

当迭代大量包含多表的列的行时，peewee为每个返回行重建模型图，这种操作在复杂的图上时很费时。为提速模型创建，你可以：

调用native方法，它不会构建图，直接将行的属性分配到模型实例上
使用dicts或tuples

peewee-asycn

获取数据总条数

total_count = await manager.count(model.select())

参考

Peewee querying
peewee 查询

原生sql，推荐 https://pypi.python.org/pypi/records/

PEEWEE基本使用:https://www.cnblogs.com/yxi-liu/p/8514763.html

https://www.jianshu.com/p/ba8a27cf7da1

https://www.cnblogs.com/miaojiyao/articles/5217757.html

posted @ 2019-08-21 18:40 逐梦客！阅读(1163) 评论(0) 收藏举报

刷新页面返回顶部