Common SQL Statements for Data Analysis

1. Create a new table
create table tabname(col1 type1 [not null] [primary key],col2 type2 [not null],..)
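As a minimal sketch of the template above, the snippet below creates a table in an in-memory SQLite database through Python's sqlite3 module. The table name persons and its columns are hypothetical, and SQLite's type names are an assumption; other databases use their own types.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table persons ("
    " id integer not null primary key,"
    " name text not null,"
    " city text)"
)

# pragma table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[3], row[5])
    for row in conn.execute("pragma table_info(persons)")
]
print(columns)  # (column name, not-null flag, primary-key flag)
```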
 
2. Create a new table from an existing table
A: create table tab_new like tab_old (creates the new table with the structure of the old one)
B: create table tab_new as select col1,col2… from tab_old definition only

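3. Add a column
Alter table tabname add column col type
Note: in older DB2 versions a column cannot be dropped once it has been added, and its data type cannot be changed; the only permitted change is increasing the length of a varchar column. Most other databases allow dropping a column with alter table tabname drop column col.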

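4. Add a primary key: Alter table tabname add primary key(col)
Note: drop a primary key: Alter table tabname drop primary key (MySQL and DB2 syntax, with no column list; SQL Server and PostgreSQL use drop constraint with the constraint name instead)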

5. Create an index: create [unique] index idxname on tabname(col….)
Drop an index: drop index idxname
Note: an index cannot be modified in place; to change it, drop it and create it again.
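Items 3 and 5 can be tried out with a short sketch like the one below, assuming SQLite through Python's sqlite3 module. The column names col1, col2 and col3 are made up for illustration; the index is "changed" by dropping and recreating it, as the note above describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tabname (col1 integer, col2 text)")

# Item 3: add a column to an existing table.
conn.execute("alter table tabname add column col3 text")
column_names = [row[1] for row in conn.execute("pragma table_info(tabname)")]

# Item 5: create a unique index, then change it by dropping and recreating it.
conn.execute("create unique index idxname on tabname(col1)")
conn.execute("drop index idxname")
conn.execute("create unique index idxname on tabname(col1, col2)")

# pragma index_info returns (seqno, cid, name) for each indexed column.
indexed_columns = [row[2] for row in conn.execute("pragma index_info(idxname)")]
print(column_names, indexed_columns)
```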

6. A few simple, basic SQL statements
Select (by far the most commonly used):
select * from table1 where condition
SELECT * FROM Persons WHERE City='Beijing'
SELECT * FROM Persons WHERE Year>1965
SELECT * FROM Persons WHERE City='Beijing' AND Year>1965

Insert: insert into table1(field1,field2) values(value1,value2)
Delete: delete from table1 where condition
Update: update table1 set field1=value1 where condition
Search: select * from table1 where field1 like '%value1%'
Sort: select * from table1 order by field1,field2 [desc]
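The basic statements above can be run end to end in a sketch like this one, assuming SQLite through Python's sqlite3 module. The Persons rows are invented sample data, and the ? placeholders are sqlite3's way of passing values separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Persons (Name text, City text, Year integer)")

# Insert: parameter placeholders (?) keep values separate from the SQL text.
conn.executemany(
    "insert into Persons(Name, City, Year) values (?, ?, ?)",
    [("Ann", "Beijing", 1960), ("Bob", "Beijing", 1970), ("Cid", "Shanghai", 1980)],
)

# Update and delete, each restricted by a where clause.
conn.execute("update Persons set City = 'Shenzhen' where Name = 'Cid'")
conn.execute("delete from Persons where Year < 1965")

# Select with like and order by; '%e%' matches any City containing the letter e.
rows = conn.execute(
    "select Name, City from Persons where City like '%e%' order by Name desc"
).fetchall()
print(rows)
```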

7. Common aggregate SQL statements
Count: select count(*) as totalcount from table1
Sum: select sum(field1) as sumvalue from table1
Average: select avg(field1) as avgvalue from table1
Maximum: select max(field1) as maxvalue from table1
Minimum: select min(field1) as minvalue from table1
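A small sketch of these aggregates, assuming SQLite through Python's sqlite3 module and made-up values. It also shows one detail worth knowing: count(*) counts every row, while count(field1) and the other aggregates skip NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (field1 integer)")
# One NULL value is included on purpose.
conn.executemany(
    "insert into table1(field1) values (?)", [(10,), (20,), (None,), (30,)]
)

totalcount, nonnullcount, sumvalue, avgvalue, maxvalue, minvalue = conn.execute(
    "select count(*), count(field1), sum(field1),"
    " avg(field1), max(field1), min(field1) from table1"
).fetchone()
print(totalcount, nonnullcount, sumvalue, avgvalue, maxvalue, minvalue)
```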

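8. Joins
A: inner join:
Inner join: the result set contains only the rows that match in both tables.
select a.a, a.b, a.c, b.c, b.d, b.f
from a INNER JOIN b ON a.a = b.c
B: left (outer) join:
Left outer join (left join): the result set contains the matching rows plus all rows of the left table.
select a.a, a.b, a.c, b.c, b.d, b.f
from a LEFT OUTER JOIN b ON a.a = b.c
C: right (outer) join:
Right outer join (right join): the result set contains the matching rows plus all rows of the right table.
select a.a, a.b, a.c, b.c, b.d, b.f
from a RIGHT OUTER JOIN b ON a.a = b.c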
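The difference between the join types is easiest to see on a few rows. The sketch below assumes SQLite through Python's sqlite3 module and invented data with fewer columns than the statements above. Because older SQLite versions do not support right joins, the right join is written here as a left join with the tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table a (a integer, b text)")
conn.execute("create table b (c integer, d text)")
conn.executemany("insert into a values (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
conn.executemany("insert into b values (?, ?)", [(2, "b2"), (3, "b3"), (4, "b4")])

# Inner join: only keys present in both tables (2 and 3).
inner_rows = conn.execute(
    "select a.a, a.b, b.d from a inner join b on a.a = b.c order by a.a"
).fetchall()

# Left outer join: every row of a; unmatched b columns come back as NULL (None).
left_rows = conn.execute(
    "select a.a, a.b, b.d from a left outer join b on a.a = b.c order by a.a"
).fetchall()

# Equivalent of "a right outer join b": every row of b, with a on the optional side.
right_rows = conn.execute(
    "select a.a, a.b, b.c, b.d from b left outer join a on a.a = b.c order by b.c"
).fetchall()
print(inner_rows, left_rows, right_rows, sep="\n")
```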

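9. Group by
Grouping is normally combined with aggregate functions; without an aggregate function, group by simply removes duplicates.
select field1,sum(field2) as sumvalue from table1 group by field1
Common aggregate functions: count, sum, avg, max, min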
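A minimal sketch of grouping with aggregates, assuming SQLite through Python's sqlite3 module; the rows are invented, and the order by clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (field1 text, field2 integer)")
conn.executemany(
    "insert into table1 values (?, ?)", [("x", 1), ("x", 2), ("y", 5)]
)

# One output row per distinct field1 value, with a sum and a row count.
grouped = conn.execute(
    "select field1, sum(field2) as sumvalue, count(*) as rowcount"
    " from table1 group by field1 order by field1"
).fetchall()
print(grouped)
```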
 
10. Copy a table (structure only; source table: a, new table: b) (works in Access)
Method 1: select * into b from a where 1<>1 (SQL Server only)
Method 2: select top 0 * into b from a

11. Copy table data (source table: a, target table: b) (works in Access)
insert into b(a, b, c) select d,e,f from a;
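Items 10 and 11 can be combined into one sketch. SQLite, assumed here through Python's sqlite3 module, has no select ... into, so the structure copy uses create table ... as select with a condition that is never true. Note that in this sketch b inherits the column names d, e, f from a, unlike the statement above, where b has its own column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table a (d integer, e text, f real)")
conn.executemany("insert into a values (?, ?, ?)", [(1, "one", 1.5), (2, "two", 2.5)])

# Structure only: 1<>1 is never true, so no rows are copied.
conn.execute("create table b as select * from a where 1<>1")
b_columns = [row[1] for row in conn.execute("pragma table_info(b)")]
rows_before = conn.execute("select count(*) from b").fetchone()[0]

# Data copy: insert ... select reads from the source table a.
conn.execute("insert into b(d, e, f) select d, e, f from a")
rows_after = conn.execute("select d, e, f from b order by d").fetchall()
print(b_columns, rows_before, rows_after)
```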
 
12. Subqueries (table 1: a, table 2: b)
Method 1: select a,b,c from a where a IN (select d from b )
Method 2: select a,b,c from a where a IN (1,2,3)
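Both forms can be compared on sample rows in a sketch like this, assuming SQLite through Python's sqlite3 module. The data is invented, and the literal list is shortened to (1,2) so the two results differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table a (a integer, b text, c text)")
conn.execute("create table b (d integer)")
conn.executemany(
    "insert into a values (?, ?, ?)", [(1, "p", "q"), (2, "r", "s"), (3, "t", "u")]
)
conn.executemany("insert into b values (?)", [(2,), (3,), (9,)])

# Method 1: the allowed values come from another table.
by_subquery = conn.execute(
    "select a, b, c from a where a in (select d from b) order by a"
).fetchall()

# Method 2: the allowed values are listed literally.
by_list = conn.execute(
    "select a, b, c from a where a in (1, 2) order by a"
).fetchall()
print(by_subquery, by_list)
```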

13. Using between
between includes the boundary values when restricting a query range; not between excludes them.
select * from table1 where time between time1 and time2
select a,b,c from table1 where a not between value1 and value2
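The boundary behaviour is easy to check with a sketch such as the following, assuming SQLite through Python's sqlite3 module and a column a holding the integers 1 to 6 as sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (a integer)")
conn.executemany("insert into table1 values (?)", [(i,) for i in range(1, 7)])

# between keeps both boundary values (2 and 4).
inside = [row[0] for row in conn.execute(
    "select a from table1 where a between 2 and 4 order by a")]

# not between drops the boundaries along with everything in between.
outside = [row[0] for row in conn.execute(
    "select a from table1 where a not between 2 and 4 order by a")]
print(inside, outside)
```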
 
14. Using in
Included: select * from table1 where a in ('value1','value2','value4','value6')
Not included: select * from table1 where a not in ('value1','value2','value4','value6')
15. First 10 records
select top 10 * from table1 where condition
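The top keyword is SQL Server and Access syntax. As a sketch of the same idea in SQLite (assumed here through Python's sqlite3 module), the row limit goes at the end of the statement as limit 10; the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (a integer)")
conn.executemany("insert into table1 values (?)", [(i,) for i in range(1, 26)])

# Without order by, "first 10" is not well defined, so the sort is explicit.
first_ten = [row[0] for row in conn.execute(
    "select a from table1 where a > 5 order by a limit 10")]
print(first_ten)
```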
 
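16. Within each group of rows sharing the same b value, select the full record with the largest a
(This pattern is useful for monthly forum leaderboards, monthly best-selling product analysis, ranking scores by subject, and so on.)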
select a,b,c from tablename ta where a=(select max(a) from tablename tb where tb.b=ta.b) 
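A sketch of this correlated subquery on sample rows, assuming SQLite through Python's sqlite3 module. The data is invented, and an order by is appended only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tablename (a integer, b text, c text)")
conn.executemany(
    "insert into tablename values (?, ?, ?)",
    [(1, "x", "r1"), (5, "x", "r2"), (3, "y", "r3"), (2, "y", "r4")],
)

# For each outer row ta, the inner query finds the largest a among rows
# with the same b; the outer row is kept only if it holds that maximum.
top_per_group = conn.execute(
    "select a, b, c from tablename ta"
    " where a = (select max(a) from tablename tb where tb.b = ta.b)"
    " order by b"
).fetchall()
print(top_per_group)
```

If two rows in a group tie for the largest a, both are returned.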
 
17. Pick 10 rows at random (SQL Server syntax)
select top 10 * from tablename order by newid()
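newid() and top are SQL Server features. A rough SQLite counterpart, assumed here through Python's sqlite3 module, sorts by the built-in random() function and applies limit 10; the table and its 100 rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tablename (id integer)")
conn.executemany("insert into tablename values (?)", [(i,) for i in range(100)])

# random() is evaluated once per row, so sorting by it shuffles the table.
sample = [row[0] for row in conn.execute(
    "select id from tablename order by random() limit 10")]
print(sample)  # a different set of 10 ids on most runs
```

Sorting the whole table just to keep 10 rows can be slow on large tables, so this is best suited to small data sets.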
 
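18. Select records at random
select newid()
Note: in SQL Server, newid() returns a new unique identifier on every call, which is why ordering by it (as in item 17) shuffles the rows.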

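19. DISTINCT: removing duplicates
The DISTINCT keyword returns only unique values, i.e. it removes duplicates; GROUP BY can also be used to remove duplicates.
SELECT DISTINCT column_name FROM table_name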
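The two deduplication approaches can be compared side by side in a sketch like this, assuming SQLite through Python's sqlite3 module and a handful of invented values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table_name (column_name text)")
conn.executemany(
    "insert into table_name values (?)", [("a",), ("b",), ("a",), ("c",), ("b",)]
)

# DISTINCT and GROUP BY (without aggregates) both yield one row per value.
with_distinct = [row[0] for row in conn.execute(
    "select distinct column_name from table_name order by column_name")]
with_group_by = [row[0] for row in conn.execute(
    "select column_name from table_name group by column_name order by column_name")]
print(with_distinct, with_group_by)
```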

20. FORMAT() formatting function
SELECT ProductName, FORMAT(Now(),'YYYY-MM-DD') as Date
FROM Products

21. HAVING clause
The HAVING clause filters the results after aggregation, whereas WHERE filters ordinary columns before aggregation.
SELECT Customer,SUM(OrderPrice) FROM Orders
WHERE Customer='Bush'
GROUP BY Customer
HAVING SUM(OrderPrice)>1500
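To see where and having act at different stages, the sketch below runs the query above and a variant without the where clause. It assumes SQLite through Python's sqlite3 module; the Orders rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Orders (Customer text, OrderPrice integer)")
conn.executemany(
    "insert into Orders values (?, ?)",
    [("Bush", 1000), ("Bush", 700), ("Adams", 2000), ("Carter", 100)],
)

# where removes rows first; having then filters the aggregated groups.
bush_only = conn.execute(
    "select Customer, sum(OrderPrice) from Orders"
    " where Customer = 'Bush'"
    " group by Customer having sum(OrderPrice) > 1500"
).fetchall()

# Without where, every customer is grouped and having drops the small totals.
big_spenders = conn.execute(
    "select Customer, sum(OrderPrice) from Orders"
    " group by Customer having sum(OrderPrice) > 1500 order by Customer"
).fetchall()
print(bush_only, big_spenders)
```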

22. UNION operator
The UNION operator derives a result table by combining two other result tables (for example TABLE1 and TABLE2) and eliminating any duplicate rows.
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2
When ALL is used together with UNION (i.e. UNION ALL), duplicate rows are not eliminated.
SELECT column_name(s) FROM table1
UNION ALL
SELECT column_name(s) FROM table2
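The effect of ALL shows up clearly on overlapping data. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the column name x and the values are made up, and the order by applies to the combined result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (x integer)")
conn.execute("create table table2 (x integer)")
conn.executemany("insert into table1 values (?)", [(1,), (2,), (2,)])
conn.executemany("insert into table2 values (?)", [(2,), (3,)])

# union removes duplicates, including those that exist within one table.
union_rows = [row[0] for row in conn.execute(
    "select x from table1 union select x from table2 order by x")]

# union all keeps every row from both inputs.
union_all_rows = [row[0] for row in conn.execute(
    "select x from table1 union all select x from table2 order by x")]
print(union_rows, union_all_rows)
```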
posted on 2019-12-19 10:07 zl666张良