现有一张股票价格表 stock_data
有三个字段分别是股票代码(stock_code)、日期(trade_date)、收盘价格(closing_price) ,请找出满足连续5天以上(含)每天上涨超过5%的股票,并给出连续满足天数及开始和结束日期。 备注:不考虑停牌或其他情况,仅仅关注每天连续5天上涨超过5%的股票。
样例数据
分析
首先应该计算每天相对上一天的涨幅,这个使用 lag()
函数; 然后判断是否符合涨幅超过5%; 然后处理连续问题,使用 row_number()
函数; 最后计算天数、开始日期、结束日期,返回结果。
SQL
使用 lag() 函数计算每天的涨幅
- 执行 SQL
select stock_code ,trade_date ,closing_price ,closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 as daily_return from stock_data order by stock_code, trade_date;
- 返回结果
将涨幅换算为是否符合涨幅>=5%
- 执行 SQL
select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date;
- 返回结果
按照 stock_code 分组,使用 row_number() 函数,获取日期编号
- 执行 SQL
with a as ( select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date ) select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a order by stock_code, trade_date;
- 返回结果
按照 stock_code、flag 分组,使用 row_number() 函数,获取日期编号
- 执行 SQL
with a as ( select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date ) select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a where flag = 'T' order by stock_code, trade_date;
- 返回结果
将上述两个结果集 left join,获取日期间隔
- 执行 SQL
with a as ( select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date ), b as ( select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a order by stock_code, trade_date ) ,c as ( select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a where flag = 'T' order by stock_code, trade_date ) select b.stock_code ,b.trade_date ,b.closing_price ,b.flag ,b.rn as rnb ,c.rn as rnc ,b.rn - c.rn as rn_diff from b left outer join c on b.stock_code = c.stock_code and b.trade_date = c.trade_date order by b.stock_code, b.trade_date;
- 返回结果
从上述返回的结果集可知,当 rn_diff=1
时,表示当日涨幅超过5%,且与上一条满足条件的日期为连续日期。
合并脚本
with a as ( select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date ), b as ( select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a order by stock_code, trade_date ) ,c as ( select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn from a where flag = 'T' order by stock_code, trade_date ) ,d as ( select b.stock_code ,b.trade_date ,b.closing_price ,b.flag ,b.rn as rnb ,c.rn as rnc ,b.rn - c.rn as rn_diff from b left outer join c on b.stock_code = c.stock_code and b.trade_date = c.trade_date order by b.stock_code, b.trade_date ) select stock_code ,min(trade_date) ,max(trade_date) ,count(*) from d where rn_diff = 1 group by stock_code having count(*) >= 5;
简化脚本,将 row_number() 函数放在一起
with a as ( select stock_code ,trade_date ,closing_price ,case when closing_price / lag(closing_price) over (partition by stock_code order by trade_date) - 1 >= 0.05 then 'T' else 'F' end as flag from stock_data order by stock_code, trade_date ), b as ( select stock_code ,trade_date ,closing_price ,flag ,row_number() over(partition by stock_code order by trade_date) as rn1 ,row_number() over(partition by stock_code, flag order by trade_date) as rn2 ,row_number() over(partition by stock_code order by trade_date) - row_number() over(partition by stock_code, flag order by trade_date) as date_diff from a order by stock_code, trade_date ) select stock_code ,min(trade_date) as start_date ,max(trade_date) as end_date ,count(*) as cnt from b Where flag = 'T' group by stock_code, date_diff having count(*) >= 5;
建表语句
-- drop table stock_data; set odps.sql.decimal.odps2=true; create table if not exists stock_data ( stock_code string, trade_date date, closing_price decimal(10, 2) ) comment '股票价格表';
示例数据
-- 样例数据 set odps.sql.decimal.odps2=true; insert into table stock_data values ('AAPL', date'2023-02-26', 100.00), ('AAPL', date'2023-02-27', 105.00), ('AAPL', date'2023-02-28', 110.25), ('AAPL', date'2023-03-01', 115.78), ('AAPL', date'2023-03-02', 121.59), ('AAPL', date'2023-03-03', 128.73), ('AAPL', date'2023-03-04', 137.00), ('AAPL', date'2023-03-05', 144.67), ('AAPL', date'2023-03-06', 147.64), ('GOOG', date'2023-02-26', 2000.00), ('GOOG', date'2023-02-27', 2100.00), ('GOOG', date'2023-02-28', 2205.00), ('GOOG', date'2023-03-01', 2313.25), ('GOOG', date'2023-03-02', 2431.01), ('GOOG', date'2023-03-03', 2547.56), ('GOOG', date'2023-03-04', 2680.19), ('GOOG', date'2023-03-05', 2814.20), ('GOOG', date'2023-03-06', 2955.91);
参考资料
原创文章,转载请注明出处:http://www.opcoder.cn/article/72/