递归 CTE 是 Greenplum 中一个非常强大的功能,它使得 Greenplum 具有了处理层级数据和图数据的能力。
什么是递归 CTE
CTE 的全称是 Common Table Expression,也被称为 WITH 语句。CTE 更加侧重于 SQL 的易读性,可以将其看作是一种生命周期更短的临时表,在父查询中可以像引用临时表一样,引用 CTE。
递归 CTE 使 SQL 增加了递归的语义,可以基于一个初始集合和变换的规则,不断递归扩大初始集合,直到满足设定的条件位置。递归 CTE 非常适合处理层级关系的数据,比如行政关系省、市、区和街道是典型的嵌套关系。基于初始集合省份名称,寻找出该省所有的市、区和街道名称的集合就可以使用递归 CTE 方便的实现。
基于公司层级的组织架构图示:
在上面的例子中,由于公司的层级关系使用关系模型表达,从数据表中只能直接查询到直接上下级关系。如何计算间接的上下级关系,并把所有数据进行汇总,就需要使用到递归 CTE。
以获取总监王爽手下的员工总数为例,初始非递归查询可以是查询王爽自己。递归查询项是查询当前结果的所有下属员工。SQL 语句如下所示:
with recursive myreport as ( select * from department where name = '王爽' union all select t.* from department t, myreport p where t.parent_department = p.id ) select * from myreport;
创建测试数据
- 创建表
-- drop table if exists emp cascade; create table if not exists emp ( empno int, ename varchar(10), job varchar(10), mgr int, deptno int ) distributed by (empno);
- 写入数据
insert into emp (empno, ename, job, mgr, deptno) values (7369, 'SMITH' , 'CLERK' , 7902, 20); insert into emp (empno, ename, job, mgr, deptno) values (7499, 'ALLEN' , 'SALESMAN' , 7698, 30); insert into emp (empno, ename, job, mgr, deptno) values (7521, 'WARD' , 'SALESMAN' , 7698, 30); insert into emp (empno, ename, job, mgr, deptno) values (7566, 'JONES' , 'MANAGER' , 7839, 20); insert into emp (empno, ename, job, mgr, deptno) values (7654, 'MARTIN', 'SALESMAN' , 7698, 30); insert into emp (empno, ename, job, mgr, deptno) values (7698, 'BLAKE' , 'MANAGER' , 7839, 30); insert into emp (empno, ename, job, mgr, deptno) values (7782, 'CLARK' , 'MANAGER' , 7839, 10); insert into emp (empno, ename, job, mgr, deptno) values (7788, 'SCOTT' , 'ANALYST' , 7566, 20); insert into emp (empno, ename, job, mgr, deptno) values (7839, 'KING' , 'PRESIDENT', null, 10); insert into emp (empno, ename, job, mgr, deptno) values (7844, 'TURNER', 'SALESMAN' , 7698, 30); insert into emp (empno, ename, job, mgr, deptno) values (7876, 'ADAMS' , 'CLERK' , 7788, 20); insert into emp (empno, ename, job, mgr, deptno) values (7900, 'JAMES' , 'CLERK' , 7698, 30); insert into emp (empno, ename, job, mgr, deptno) values (7902, 'FORD' , 'ANALYST' , 7566, 20); insert into emp (empno, ename, job, mgr, deptno) values (7934, 'MILLER', 'CLERK' , 7782, 10);
- 查询数据
testzw=> select * from emp; empno | ename | job | mgr | deptno -------+--------+-----------+------+-------- 7369 | SMITH | CLERK | 7902 | 20 7499 | ALLEN | SALESMAN | 7698 | 30 7521 | WARD | SALESMAN | 7698 | 30 7566 | JONES | MANAGER | 7839 | 20 7654 | MARTIN | SALESMAN | 7698 | 30 7698 | BLAKE | MANAGER | 7839 | 30 7782 | CLARK | MANAGER | 7839 | 10 7788 | SCOTT | ANALYST | 7566 | 20 7876 | ADAMS | CLERK | 7788 | 20 7839 | KING | PRESIDENT | | 10 7844 | TURNER | SALESMAN | 7698 | 30 7900 | JAMES | CLERK | 7698 | 30 7902 | FORD | ANALYST | 7566 | 20 7934 | MILLER | CLERK | 7782 | 10 (14 rows)
通过父级节点递归查询子节点
查询 empno = 7566
下的所有员工信息。
- 根据 empno 查询:
with recursive temp as ( select 1 as level, a.* from emp a where a.empno = 7566 union all select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno ) select * from temp; level empno ename job mgr deptno 1 7,566 JONES MANAGER 7,839 20 2 7,788 SCOTT ANALYST 7,566 20 2 7,902 FORD ANALYST 7,566 20 3 7,876 ADAMS CLERK 7,788 20 3 7,369 SMITH CLERK 7,902 20
注:该查询会检索出 empno = 7566
下的所有子节点,即 empno = 7902/7788
及 empno = 7369/7876
。
- 根据 ename 查询:
with recursive temp as ( select 1 as level, a.* from emp a where a.ename = 'KING' union all select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno ) select * from temp; level empno ename job mgr deptno 1 7,839 KING PRESIDENT [NULL] 10 2 7,698 BLAKE MANAGER 7,839 30 2 7,782 CLARK MANAGER 7,839 10 2 7,566 JONES MANAGER 7,839 20 3 7,844 TURNER SALESMAN 7,698 30 3 7,521 WARD SALESMAN 7,698 30 3 7,499 ALLEN SALESMAN 7,698 30 3 7,654 MARTIN SALESMAN 7,698 30 3 7,900 JAMES CLERK 7,698 30 3 7,934 MILLER CLERK 7,782 10 3 7,788 SCOTT ANALYST 7,566 20 3 7,902 FORD ANALYST 7,566 20 4 7,876 ADAMS CLERK 7,788 20 4 7,369 SMITH CLERK 7,902 20
- 根据 empno 查询,显示层级关系
with recursive temp as ( select 1 as level, a.* from emp a where a.empno = 7566 union all select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno ) select level ,empno ,ename ,lpad(' ', 2 * (level - 1)) || ename as ename_show ,job ,mgr ,deptno from temp; level empno ename ename_show job mgr deptno 1 7,566 JONES JONES MANAGER 7,839 20 2 7,788 SCOTT SCOTT ANALYST 7,566 20 2 7,902 FORD FORD ANALYST 7,566 20 3 7,876 ADAMS ADAMS CLERK 7,788 20 3 7,369 SMITH SMITH CLERK 7,902 20
通过子节点递归查询父级节点
查询 empno = 7900
上的所有员工信息。
- 根据 empno 查询:
with recursive temp as ( select 1 as level, a.* from emp a where a.empno = 7900 union all select c.level + 1 as level, b.* from emp b, temp c where b.empno = c.mgr ) select * from temp; level empno ename job mgr deptno 1 7,900 JAMES CLERK 7,698 30 2 7,698 BLAKE MANAGER 7,839 30 3 7,839 KING PRESIDENT [NULL] 10
注:该查询会检索出 empno = 7900
上的所有父节点,即 empno = 7698
及 empno = 7839
。
树深度及路径
- 通过父级节点查询树深度及路径
with recursive temp(empno, ename, job, mgr, depth, path) as ( select e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] from emp e where empno = 7566 union all select t.empno, t.ename, t.job, t.mgr, p.depth + 1, path||t.empno from emp t, temp p where t.mgr = p.empno ) select * from temp; empno ename job mgr depth path 7,566 JONES MANAGER 7,839 1 {7566} 7,788 SCOTT ANALYST 7,566 2 {7566,7788} 7,902 FORD ANALYST 7,566 2 {7566,7902} 7,876 ADAMS CLERK 7,788 3 {7566,7788,7876} 7,369 SMITH CLERK 7,902 3 {7566,7902,7369}
- 通过子节点查询树深度及路径
with recursive temp(empno, ename, job, mgr, depth, path) as ( select e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] from emp e where empno = 7900 union all select t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno from emp t, temp p where t.empno = p.mgr ) select * from temp; empno | ename | job | mgr | depth | path -------+-------+-----------+------+-------+------------------ 7900 | JAMES | CLERK | 7698 | 1 | {7900} 7698 | BLAKE | MANAGER | 7839 | 2 | {7900,7698} 7839 | KING | PRESIDENT | | 3 | {7900,7698,7839} (3 rows)
通过父级节点递归查询子节点(指定深度)
查询 empno = 7566
下的员工信息,且仅获取一级子节点:
with recursive temp(empno, ename, job, mgr, depth, path) as ( select e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] from emp e where empno = 7566 union all select t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno from emp t, temp p where t.mgr = p.empno ) select * from temp where depth = 2; empno | ename | job | mgr | depth | path -------+-------+---------+------+-------+------------- 7902 | FORD | ANALYST | 7566 | 2 | {7566,7902} 7788 | SCOTT | ANALYST | 7566 | 2 | {7566,7788} (2 rows)
通过子节点递归查询父级节点(指定深度)
查询 empno = 7900
上的员工信息,且仅获取一级父节点:
with recursive temp(empno, ename, job, mgr, depth, path) as ( select e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] from emp e where empno = 7900 union all select t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno from emp t, temp p where t.empno = p.mgr ) select * from temp where depth = 2; empno | ename | job | mgr | depth | path -------+-------+---------+------+-------+------------- 7698 | BLAKE | MANAGER | 7839 | 2 | {7900,7698} (1 row)
参考资料
https://blog.csdn.net/gp_community/article/details/109490234
https://sunbc.blog.csdn.net/article/details/79290736
原创文章,转载请注明出处:http://www.opcoder.cn/article/57/