我正在处理一些当前以1分钟为间隔存储的数据,如下所示:
CREATE TABLE #MinuteData
(
[Id] INT ,
[MinuteBar] DATETIME ,
[Open] NUMERIC(12, 6) ,
[High] NUMERIC(12, 6) ,
[Low] NUMERIC(12, 6) ,
[Close] NUMERIC(12, 6)
);
INSERT INTO #MinuteData
( [Id], [MinuteBar], [Open], [High], [Low], [Close] )
VALUES ( 1, '2015-01-01 17:00:00', 1.557870, 1.557880, 1.557870, 1.557880 ),
( 2, '2015-01-01 17:01:00', 1.557900, 1.557900, 1.557880, 1.557880 ),
( 3, '2015-01-01 17:02:00', 1.557960, 1.558070, 1.557960, 1.558040 ),
( 4, '2015-01-01 17:03:00', 1.558080, 1.558100, 1.558040, 1.558050 ),
( 5, '2015-01-01 17:04:00', 1.558050, 1.558100, 1.558020, 1.558030 ),
( 6, '2015-01-01 17:05:00', 1.558580, 1.558710, 1.557870, 1.557950 ),
( 7, '2015-01-01 17:06:00', 1.557910, 1.558120, 1.557910, 1.557990 ),
( 8, '2015-01-01 17:07:00', 1.557940, 1.558250, 1.557940, 1.558170 ),
( 9, '2015-01-01 17:08:00', 1.558140, 1.558200, 1.558080, 1.558120 ),
( 10, '2015-01-01 17:09:00', 1.558110, 1.558140, 1.557970, 1.557970 );
SELECT *
FROM #MinuteData;
DROP TABLE #MinuteData;
这些值跟踪货币汇率,因此对于每分钟间隔(条形),分钟开始时有未平仓价格,分钟结束时有收盘价格.高值和低值表示每个分钟期间的最高和最低速率.
期望的输出
我想要将这些数据重新格式化为5分钟,以产生以下输出:
MinuteBar Open Close Low High
2015-01-01 17:00:00.000 1.557870 1.558030 1.557870 1.558100
2015-01-01 17:05:00.000 1.558580 1.557970 1.557870 1.558710
这取5的第一分钟的开放值,即5的最后一分钟的关闭值.高和低值表示5分钟时段内的最高和最低低速率.
当前解决方案
我有一个解决方案,这样做(下面),但它感觉不优雅,因为它依赖于id值和自连接.此外,我打算在更大的数据集上运行它,所以我希望在可能的情况下以更有效的方式执行它:
-- Create a column to allow grouping in 5 minute Intervals
SELECT Id, MinuteBar, [Open], High, Low, [Close],
DATEDIFF(MINUTE, '2015-01-01T00:00:00', MinuteBar)/5 AS Interval
INTO #5MinuteData
FROM #MinuteData
ORDER BY minutebar
-- Group by inteval and aggregate prior to self join
SELECT Interval ,
MIN(MinuteBar) AS MinuteBar ,
MIN(Id) AS OpenId ,
MAX(Id) AS CloseId ,
MIN(Low) AS Low ,
MAX(High) AS High
INTO #DataMinMax
FROM #5MinuteData
GROUP BY Interval;
-- Self join to get the Open and Close values
SELECT t1.Interval ,
t1.MinuteBar ,
tOpen.[Open] ,
tClose.[Close] ,
t1.Low ,
t1.High
FROM #DataMinMax t1
INNER JOIN #5MinuteData tOpen ON tOpen.Id = OpenId
INNER JOIN #5MinuteData tClose ON tClose.Id = CloseId;
DROP TABLE #DataMinMax
DROP TABLE #5MinuteData
返工尝试
而不是上面的查询,我一直在寻找使用FIRST_VALUE和LAST_VALUE,因为它似乎是我所追求的,但我无法让它与我正在进行的分组工作.可能有比我正在尝试做的更好的解决方案,所以我愿意接受建议.目前我正在尝试这样做:
SELECT MIN(MinuteBar) MinuteBar5 ,
FIRST_VALUE([Open]) OVER (ORDER BY MinuteBar) AS Opening,
MAX(High) AS High ,
MIN(Low) AS Low ,
LAST_VALUE([Close]) OVER (ORDER BY MinuteBar) AS Closing ,
DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5 AS Interval
FROM #MinuteData
GROUP BY DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5
这给了我以下错误,如果删除这些行,则查询运行时会出现FIRST_VALUE和LAST_VALUE:
Column ‘#MinuteData.MinuteBar’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.