Common SQL Statements for Data Analysis

1. Create a new table
create table tabname(col1 type1 [not null] [primary key],col2 type2 [not null],...)
 
2. Create a new table from an existing table
A: create table tab_new like tab_old (creates the new table with the old table's structure)
B: create table tab_new as select col1,col2… from tab_old definition only

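3. Add a column
Alter table tabname add column col type
Note: on older DB2 versions a column cannot be dropped once it has been added, and its data type cannot be changed afterwards; the only change allowed is increasing the length of a varchar column. Most other databases, and newer DB2 releases, support Alter table ... drop column.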

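4. Add a primary key: Alter table tabname add primary key(col)
Drop the primary key: Alter table tabname drop primary key (no column list is needed; SQL Server instead uses drop constraint with the constraint name)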

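5. Create an index: create [unique] index idxname on tabname(col...)
Drop an index: drop index idxname
Note: an index definition cannot be modified in place; to change it, drop the index and create it again.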

6. A few simple, basic SQL statements
Select (by far the most commonly used operation)
select * from table1 where condition
SELECT * FROM Persons WHERE City='Beijing'
SELECT * FROM Persons WHERE Year>1965
SELECT * FROM Persons WHERE City='Beijing' AND Year>1965

Insert: insert into table1(field1,field2) values(value1,value2)
Delete: delete from table1 where condition
Update: update table1 set field1=value1 where condition
Search: select * from table1 where field1 like '%value1%'
Sort: select * from table1 order by field1,field2 [desc]
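
The basic statements above can be tried end to end with a small sketch. It assumes SQLite through Python's built-in sqlite3 module, which is not necessarily the database the statements were written for, and the Persons rows are invented sample data.

```python
import sqlite3

# In-memory SQLite database; the table contents are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (Name TEXT, City TEXT, Year INTEGER)")
conn.executemany(
    "INSERT INTO Persons (Name, City, Year) VALUES (?, ?, ?)",
    [("Adams", "London", 1970), ("Bush", "Beijing", 1960), ("Carter", "Beijing", 1980)],
)

# Select with a combined condition.
beijing_after_1965 = conn.execute(
    "SELECT Name FROM Persons WHERE City = 'Beijing' AND Year > 1965"
).fetchall()

# Update and delete, each restricted by a WHERE clause.
conn.execute("UPDATE Persons SET City = 'Shanghai' WHERE Name = 'Adams'")
conn.execute("DELETE FROM Persons WHERE Year < 1965")

# Pattern search combined with a descending sort.
remaining = conn.execute(
    "SELECT Name, City FROM Persons WHERE City LIKE '%ng%' ORDER BY Year DESC"
).fetchall()
print(beijing_after_1965, remaining)
```

Leaving out the WHERE clause on update or delete would affect every row in the table, which is why each statement above carries one.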

7. A few common aggregate SQL statements
Count: select count(*) as totalcount from table1
Sum: select sum(field1) as sumvalue from table1
Average: select avg(field1) as avgvalue from table1
Maximum: select max(field1) as maxvalue from table1
Minimum: select min(field1) as minvalue from table1
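
As a quick check of how these aggregates behave, here is a minimal sketch assuming SQLite via Python's sqlite3 module and invented values. It also shows that count(*) counts every row, while count(field1) and the other aggregates skip NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 INTEGER)")
# Three numbers and one NULL (sample data, not from the original post).
conn.executemany("INSERT INTO table1 (field1) VALUES (?)", [(10,), (20,), (30,), (None,)])

# All aggregates in one pass over the table.
row = conn.execute(
    "SELECT COUNT(*), COUNT(field1), SUM(field1), AVG(field1), MAX(field1), MIN(field1) "
    "FROM table1"
).fetchone()
print(row)
```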

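8. Joins
A. inner join:
Inner join: the result set contains only the rows that match in both tables.
select a.a, a.b, a.c, b.c, b.d, b.f
from a INNER JOIN b ON a.a = b.c
B. left (outer) join:
Left outer join (left join): the result set contains the matching rows plus all rows of the left table.
select a.a, a.b, a.c, b.c, b.d, b.f
from a LEFT OUTER JOIN b ON a.a = b.c
C. right (outer) join:
Right outer join (right join): the result set contains the matching rows plus all rows of the right table.
select a.a, a.b, a.c, b.c, b.d, b.f
from a RIGHT OUTER JOIN b ON a.a = b.c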
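
The difference between an inner join and a left outer join is easiest to see on a tiny data set. The sketch below assumes SQLite via Python's sqlite3 module; the reduced column lists and the rows are invented for illustration. The right outer join is left out because older SQLite releases do not support it; swapping the two tables in a left join gives the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (a INTEGER, b TEXT)")
conn.execute("CREATE TABLE b (c INTEGER, d TEXT)")
# Keys 2 and 3 exist in both tables; 1 only in a, 4 only in b.
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(2, "b2"), (3, "b3"), (4, "b4")])

# Inner join: only the matching rows.
inner_rows = conn.execute(
    "SELECT a.a, a.b, b.d FROM a INNER JOIN b ON a.a = b.c ORDER BY a.a"
).fetchall()

# Left outer join: every row of a, with NULL where b has no match.
left_rows = conn.execute(
    "SELECT a.a, a.b, b.d FROM a LEFT OUTER JOIN b ON a.a = b.c ORDER BY a.a"
).fetchall()
print(inner_rows, left_rows)
```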

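9. Group by
Grouping is normally combined with aggregate functions; without an aggregate function, group by simply removes duplicates.
select field1,sum(field2) as sumvalue from table1 group by field1
Common aggregate functions: count,sum,avg,max,min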
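
A short sketch of both uses of group by, with and without an aggregate, assuming SQLite via Python's sqlite3 module and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 TEXT, field2 INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [("x", 1), ("x", 2), ("y", 5)])

# Grouping with an aggregate: one summed row per field1 value.
sums = conn.execute(
    "SELECT field1, SUM(field2) AS sumvalue FROM table1 GROUP BY field1 ORDER BY field1"
).fetchall()

# Grouping without an aggregate: only de-duplicates field1.
dedup = conn.execute(
    "SELECT field1 FROM table1 GROUP BY field1 ORDER BY field1"
).fetchall()
print(sums, dedup)
```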
 
10. Copy a table (structure only; source table: a, new table: b) (works in Access)
Method 1: select * into b from a where 1<>1 (SQL Server only)
Method 2: select top 0 * into b from a

11. Copy table data (source table: a, target table: b) (works in Access)
insert into b(a, b, c) select d,e,f from a;
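
The two copy steps can be sketched together. SQLite has no select ... into, so this example substitutes create table ... as select with an always-false condition for the structure copy; that swap, the column names d, e, f, and the rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (d INTEGER, e TEXT, f REAL)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", [(1, "x", 1.5), (2, "y", 2.5)])

# Structure only: the always-false condition selects no rows.
conn.execute("CREATE TABLE b AS SELECT * FROM a WHERE 1 <> 1")
empty_count = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]

# Data copy: the source of the SELECT is table a, the target is table b.
conn.execute("INSERT INTO b (d, e, f) SELECT d, e, f FROM a")
copied = conn.execute("SELECT d, e, f FROM b ORDER BY d").fetchall()
print(empty_count, copied)
```

Note that in SQLite this form copies column names but not constraints such as primary keys or indexes.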
 
12. Subquery (table 1: a, table 2: b)
Method 1: select a,b,c from a where a IN (select d from b)
Method 2: select a,b,c from a where a IN (1,2,3)

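13. Using between
between restricts a query to a range and includes both boundary values; not between returns the rows outside that range, so the boundary values are excluded.
select * from table1 where time between time1 and time2
select a,b,c from table1 where a not between value1 and value2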
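
The boundary behaviour is easy to confirm with a minimal sketch, assuming SQLite via Python's sqlite3 module and the integers 1 to 6 as sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(n,) for n in range(1, 7)])

# BETWEEN keeps both boundary values (2 and 4).
inside = [r[0] for r in conn.execute(
    "SELECT a FROM table1 WHERE a BETWEEN 2 AND 4 ORDER BY a")]

# NOT BETWEEN returns everything else, so 2 and 4 are excluded.
outside = [r[0] for r in conn.execute(
    "SELECT a FROM table1 WHERE a NOT BETWEEN 2 AND 4 ORDER BY a")]
print(inside, outside)
```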
 
14. Using in
Included: select * from table1 where a in ('value1','value2','value4','value6')
Not included: select * from table1 where a not in ('value1','value2','value4','value6')

15. First 10 records
select top 10 * from table1 where condition
 
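16. For each group of rows sharing the same b value, select the full record with the largest a
(This pattern is useful for monthly forum leaderboards, monthly best-selling product analysis, ranking scores by subject, and so on.)
select a,b,c from tablename ta where a=(select max(a) from tablename tb where tb.b=ta.b)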
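
Here is a small sketch of the per-group maximum query, assuming SQLite via Python's sqlite3 module. The score, subject, and student values are invented; a is the score, b the subject, c the student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (a INTEGER, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(90, "math", "Ann"), (75, "math", "Bob"), (60, "art", "Cid"), (88, "art", "Dee")],
)

# The correlated subquery finds the largest a within the current row's b group.
top_per_group = conn.execute(
    "SELECT a, b, c FROM tablename ta "
    "WHERE a = (SELECT MAX(a) FROM tablename tb WHERE tb.b = ta.b) "
    "ORDER BY b"
).fetchall()
print(top_per_group)
```

If two rows in a group tie for the maximum, both are returned.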
 
17. Fetch 10 random rows
select top 10 * from tablename order by newid()
 
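18. Random record selection
select newid()
(In SQL Server, newid() returns a new GUID on every call, which is why ordering by it yields a random order.)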
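
top and newid() are SQL Server syntax. As a rough equivalent on another engine, the sketch below assumes SQLite via Python's sqlite3 module, where order by random() with limit plays the same role; the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER)")
conn.executemany("INSERT INTO tablename VALUES (?)", [(n,) for n in range(1, 21)])

# SQLite analogue of "select top 3 * ... order by newid()":
# shuffle with random(), then keep the first 3 rows.
sample = [r[0] for r in conn.execute(
    "SELECT id FROM tablename ORDER BY random() LIMIT 3")]
print(sample)
```

Sorting the whole table by a random value is fine for small tables but becomes expensive on large ones.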

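19. DISTINCT for de-duplication
The DISTINCT keyword returns only unique values, that is, it removes duplicates; GROUP BY can also be used for de-duplication.
SELECT DISTINCT column_name FROM table_name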

20. The FORMAT() formatting function
SELECT ProductName, FORMAT(Now(),'YYYY-MM-DD') as Date
FROM Products

21. The HAVING clause
HAVING filters on the results of aggregation, whereas WHERE filters on ordinary column values before grouping.
SELECT Customer,SUM(OrderPrice) FROM Orders
WHERE Customer='Bush'
GROUP BY Customer
HAVING SUM(OrderPrice)>1500
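
To make the WHERE versus HAVING distinction concrete, here is a minimal sketch assuming SQLite via Python's sqlite3 module, with invented order amounts:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Customer TEXT, OrderPrice INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Bush", 1000), ("Bush", 700), ("Adams", 2000), ("Carter", 100)],
)

# HAVING alone: keep only the groups whose total exceeds 1500.
big_customers = conn.execute(
    "SELECT Customer, SUM(OrderPrice) FROM Orders "
    "GROUP BY Customer HAVING SUM(OrderPrice) > 1500 ORDER BY Customer"
).fetchall()

# WHERE filters rows before grouping; HAVING then filters the grouped totals.
bush_only = conn.execute(
    "SELECT Customer, SUM(OrderPrice) FROM Orders "
    "WHERE Customer = 'Bush' GROUP BY Customer HAVING SUM(OrderPrice) > 1500"
).fetchall()
print(big_customers, bush_only)
```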

22. The UNION operator
The UNION operator combines two result tables (for example TABLE1 and TABLE2) into a single result table and removes any duplicate rows.
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2
When ALL is used with UNION (that is, UNION ALL), duplicate rows are not removed.
SELECT column_name(s) FROM table1
UNION ALL
SELECT column_name(s) FROM table2
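
The effect of ALL can be seen in a short sketch, assuming SQLite via Python's sqlite3 module; the single column x and its values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (x INTEGER)")
conn.execute("CREATE TABLE table2 (x INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(2,), (3,)])

# UNION removes duplicates, including those within a single input.
union_rows = [r[0] for r in conn.execute(
    "SELECT x FROM table1 UNION SELECT x FROM table2 ORDER BY x")]

# UNION ALL keeps every row from both inputs.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT x FROM table1 UNION ALL SELECT x FROM table2 ORDER BY x")]
print(union_rows, union_all_rows)
```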
posted on 2019-12-19 10:07 zl666张良