​递归 CTE 是 Greenplum 中一个非常强大的功能,它使得 Greenplum 具有了处理层级数据和图数据的能力。

什么是递归 CTE

CTE 的全称是 Common Table Expression,也被称为 WITH 语句。CTE 更加侧重于 SQL 的易读性,可以将其看作是一种生命周期更短的临时表,在父查询中可以像引用临时表一样,引用 CTE。

递归 CTE 使 SQL 增加了递归的语义,可以基于一个初始集合和变换的规则,不断递归扩大初始集合,直到满足设定的条件位置。递归 CTE 非常适合处理层级关系的数据,比如行政关系省、市、区和街道是典型的嵌套关系。基于初始集合省份名称,寻找出该省所有的市、区和街道名称的集合就可以使用递归 CTE 方便的实现。

基于公司层级的组织架构图示:

greenplum-recursive01

在上面的例子中,由于公司的层级关系使用关系模型表达,从数据表中只能直接查询到直接上下级关系。如何计算间接的上下级关系,并把所有数据进行汇总,就需要使用到递归 CTE。

以获取总监王爽手下的员工总数为例,初始非递归查询可以是查询王爽自己。递归查询项是查询当前结果的所有下属员工。SQL 语句如下所示:

with recursive myreport as (
    select * from department where name = '王爽' 
    union all
    select t.* from department t, myreport p where t.parent_department = p.id
)
select * from myreport;

创建测试数据

  • 创建表
-- drop table if exists emp cascade;
create table if not exists emp (
    empno    int,
    ename    varchar(10),
    job      varchar(10),
    mgr      int,
    deptno   int
)
distributed by (empno);
  • 写入数据
insert into emp (empno, ename, job, mgr, deptno) values (7369, 'SMITH' , 'CLERK'    , 7902, 20);
insert into emp (empno, ename, job, mgr, deptno) values (7499, 'ALLEN' , 'SALESMAN' , 7698, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7521, 'WARD'  , 'SALESMAN' , 7698, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7566, 'JONES' , 'MANAGER'  , 7839, 20);
insert into emp (empno, ename, job, mgr, deptno) values (7654, 'MARTIN', 'SALESMAN' , 7698, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7698, 'BLAKE' , 'MANAGER'  , 7839, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7782, 'CLARK' , 'MANAGER'  , 7839, 10);
insert into emp (empno, ename, job, mgr, deptno) values (7788, 'SCOTT' , 'ANALYST'  , 7566, 20);
insert into emp (empno, ename, job, mgr, deptno) values (7839, 'KING'  , 'PRESIDENT', null, 10);
insert into emp (empno, ename, job, mgr, deptno) values (7844, 'TURNER', 'SALESMAN' , 7698, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7876, 'ADAMS' , 'CLERK'    , 7788, 20);
insert into emp (empno, ename, job, mgr, deptno) values (7900, 'JAMES' , 'CLERK'    , 7698, 30);
insert into emp (empno, ename, job, mgr, deptno) values (7902, 'FORD'  , 'ANALYST'  , 7566, 20);
insert into emp (empno, ename, job, mgr, deptno) values (7934, 'MILLER', 'CLERK'    , 7782, 10);
  • 查询数据
testzw=> select * from emp;
 empno | ename  |    job    | mgr  | deptno
-------+--------+-----------+------+--------
  7369 | SMITH  | CLERK     | 7902 |     20
  7499 | ALLEN  | SALESMAN  | 7698 |     30
  7521 | WARD   | SALESMAN  | 7698 |     30
  7566 | JONES  | MANAGER   | 7839 |     20
  7654 | MARTIN | SALESMAN  | 7698 |     30
  7698 | BLAKE  | MANAGER   | 7839 |     30
  7782 | CLARK  | MANAGER   | 7839 |     10
  7788 | SCOTT  | ANALYST   | 7566 |     20
  7876 | ADAMS  | CLERK     | 7788 |     20
  7839 | KING   | PRESIDENT |      |     10
  7844 | TURNER | SALESMAN  | 7698 |     30
  7900 | JAMES  | CLERK     | 7698 |     30
  7902 | FORD   | ANALYST   | 7566 |     20
  7934 | MILLER | CLERK     | 7782 |     10
(14 rows)

通过父级节点递归查询子节点

查询 empno = 7566 下的所有员工信息。

  • 根据 empno 查询:
with recursive temp as (
    select 1 as level, a.* from emp a where a.empno = 7566
    union all 
    select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno
)
select * from temp;

level   empno   ename   job     mgr     deptno
1       7,566   JONES   MANAGER 7,839   20
2       7,788   SCOTT   ANALYST 7,566   20
2       7,902   FORD    ANALYST 7,566   20
3       7,876   ADAMS   CLERK   7,788   20
3       7,369   SMITH   CLERK   7,902   20

注:该查询会检索出 empno = 7566 下的所有子节点,即 empno = 7902/7788empno = 7369/7876

  • 根据 ename 查询:
with recursive temp as (
    select 1 as level, a.* from emp a where a.ename = 'KING'
    union all 
    select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno
)
select * from temp;

level   empno   ename   job         mgr     deptno
1       7,839   KING    PRESIDENT   [NULL]  10
2       7,698   BLAKE   MANAGER     7,839   30
2       7,782   CLARK   MANAGER     7,839   10
2       7,566   JONES   MANAGER     7,839   20
3       7,844   TURNER  SALESMAN    7,698   30
3       7,521   WARD    SALESMAN    7,698   30
3       7,499   ALLEN   SALESMAN    7,698   30
3       7,654   MARTIN  SALESMAN    7,698   30
3       7,900   JAMES   CLERK       7,698   30
3       7,934   MILLER  CLERK       7,782   10
3       7,788   SCOTT   ANALYST     7,566   20
3       7,902   FORD    ANALYST     7,566   20
4       7,876   ADAMS   CLERK       7,788   20
4       7,369   SMITH   CLERK       7,902   20
  • 根据 empno 查询,显示层级关系
with recursive temp as (
    select 1 as level, a.* from emp a where a.empno = 7566
    union all 
    select c.level + 1 as level, b.* from emp b, temp c where b.mgr = c.empno
)
select 
    level
    ,empno
    ,ename 
    ,lpad(' ', 2 * (level - 1)) || ename as ename_show
    ,job 
    ,mgr 
    ,deptno
from temp;

level   empno   ename   ename_show  job     mgr     deptno
1       7,566   JONES   JONES       MANAGER 7,839   20
2       7,788   SCOTT     SCOTT     ANALYST 7,566   20
2       7,902   FORD      FORD      ANALYST 7,566   20
3       7,876   ADAMS       ADAMS   CLERK   7,788   20
3       7,369   SMITH       SMITH   CLERK   7,902   20

通过子节点递归查询父级节点

查询 empno = 7900 上的所有员工信息。

  • 根据 empno 查询:
with recursive temp as (
    select 1 as level, a.* from emp a where a.empno = 7900
    union all 
    select c.level + 1 as level, b.* from emp b, temp c where b.empno = c.mgr
)
select * from temp;

level   empno   ename   job         mgr     deptno
1       7,900   JAMES   CLERK       7,698   30
2       7,698   BLAKE   MANAGER     7,839   30
3       7,839   KING    PRESIDENT   [NULL]  10

注:该查询会检索出 empno = 7900 上的所有父节点,即 empno = 7698empno = 7839

树深度及路径

  • 通过父级节点查询树深度及路径
with recursive temp(empno, ename, job, mgr, depth, path) 
as (
    select 
        e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] 
    from emp e where empno = 7566
    union all 
    select 
        t.empno, t.ename, t.job, t.mgr, p.depth + 1, path||t.empno 
    from emp t, temp p where t.mgr = p.empno
)
select * from temp;

empno   ename   job     mgr     depth   path
7,566   JONES   MANAGER 7,839   1       {7566}
7,788   SCOTT   ANALYST 7,566   2       {7566,7788}
7,902   FORD    ANALYST 7,566   2       {7566,7902}
7,876   ADAMS   CLERK   7,788   3       {7566,7788,7876}
7,369   SMITH   CLERK   7,902   3       {7566,7902,7369}
  • 通过子节点查询树深度及路径
with recursive temp(empno, ename, job, mgr, depth, path) 
as (
    select 
        e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] 
    from emp e where empno = 7900
    union all 
    select 
        t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno 
    from emp t, temp p where t.empno = p.mgr
)
select * from temp;

 empno | ename |    job    | mgr  | depth |       path
-------+-------+-----------+------+-------+------------------
  7900 | JAMES | CLERK     | 7698 |     1 | {7900}
  7698 | BLAKE | MANAGER   | 7839 |     2 | {7900,7698}
  7839 | KING  | PRESIDENT |      |     3 | {7900,7698,7839}
(3 rows)

通过父级节点递归查询子节点(指定深度)

查询 empno = 7566 下的员工信息,且仅获取一级子节点:

with recursive temp(empno, ename, job, mgr, depth, path) 
as (
    select
        e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] 
    from emp e where empno = 7566
    union all 
    select 
        t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno 
    from emp t, temp p where t.mgr = p.empno
)
select * from temp where depth = 2;

 empno | ename |   job   | mgr  | depth |    path
-------+-------+---------+------+-------+-------------
  7902 | FORD  | ANALYST | 7566 |     2 | {7566,7902}
  7788 | SCOTT | ANALYST | 7566 |     2 | {7566,7788}
(2 rows)

通过子节点递归查询父级节点(指定深度)

查询 empno = 7900 上的员工信息,且仅获取一级父节点:

with recursive temp(empno, ename, job, mgr, depth, path) 
as (
    select 
        e.empno, e.ename, e.job, e.mgr, 1 depth, array[e.empno] 
    from emp e where empno = 7900
    union all 
    select 
        t.empno, t.ename, t.job, t.mgr, p.depth+1, path||t.empno 
    from emp t, temp p where t.empno = p.mgr
)
select * from temp where depth = 2;

 empno | ename |   job   | mgr  | depth |    path
-------+-------+---------+------+-------+-------------
  7698 | BLAKE | MANAGER | 7839 |     2 | {7900,7698}
(1 row)

参考资料

https://blog.csdn.net/gp_community/article/details/109490234

https://sunbc.blog.csdn.net/article/details/79290736

原创文章,转载请注明出处:http://www.opcoder.cn/article/57/