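I have two tables. One holds stock prices, the other holds the number of shares for each stock. I want to join the two tables and calculate the market capitalization of each stock.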
Here is a sample data set, with just 3 stocks created, that reproduces the issue.
CREATE TABLE stock_prices (country_exchange_code VARCHAR(2), stock_code VARCHAR(4), date DATE, close FLOAT, PRIMARY KEY (country_exchange_code,stock_code,date));
INSERT INTO stock_prices VALUES
('T', '1301', '2019-10-29', 75.2),
('T', '1301', '2019-10-30', 76.6),
('T', '1301', '2019-10-31', 77.6),
('T', '1301', '2019-11-01', 77.2),
('T', '1332', '2019-10-29', 52.5),
('T', '1332', '2019-10-30', 49.7),
('T', '1332', '2019-10-31', 50.8),
('T', '1332', '2019-11-01', 50.4),
('T', '1333', '2019-10-29', 13.9),
('T', '1333', '2019-10-30', 13.8),
('T', '1333', '2019-10-31', 14.3),
('T', '1333', '2019-11-01', 14.4);
CREATE TABLE stock_shares (country_exchange_code VARCHAR(2), stock_code VARCHAR(4), Num_Shares INT, PRIMARY KEY (country_exchange_code,stock_code));
INSERT INTO stock_shares VALUES
('T', '1301', 241587962),
('T', '1332', 369875187),
('T', '1333', 958621587);
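The following query joins the two tables on country code and stock code, then lists the number of shares and the last closing price, which are the inputs to the calculated market cap. I use the last_value window function to get the most recent closing price.
SELECT Stock_Code, Date, Num_Shares,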
last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Last_Close,
Num_Shares * last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Mkt_Cap
FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
WHERE Country_Exchange_Code = 'T' AND Date >= '2019-10-29'
ORDER BY Stock_Code, Date;
This works as expected and produces the following result:
Result 1:
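Next I want to use DISTINCT to get one row per stock. But first I need to remove every column except Stock_Code and Mkt_Cap. This is where the problem occurs. When I remove the Last_Close column from the SELECT statement:
SELECT Stock_Code, Date, Num_Shares,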
Num_Shares * last_value(Close) OVER (PARTITION BY Stock_Code ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Mkt_Cap
FROM stock_prices LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
WHERE Country_Exchange_Code = 'T' AND Date >= '2019-10-29'
ORDER BY Stock_Code, Date;
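I get unexpected NULLs in Mkt_Cap in the first row of each stock code partition.
Result 2:
Why does this happen? There are no NULLs in my tables, and the first result shows that all the data needed to compute Mkt_Cap is there.
Additional info: when I remove Date and/or Num_Shares from the SELECT statement, there is no problem. Only removing the standalone last_value column (Last_Close) triggers it.
Interestingly, the problem goes away when I remove the WHERE clause. I don't understand how that could affect the result, because in my small sample the WHERE clause does not filter anything out: all of my rows have Country_Exchange_Code = 'T' and Date >= '2019-10-29'. But in my actual data set, with millions of rows, this WHERE clause is essential, so removing it is not a solution.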
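For context, this is roughly the query I am ultimately aiming for. It is only a sketch: it keeps just Stock_Code and Mkt_Cap and applies DISTINCT so that each stock collapses to a single row. It assumes Mkt_Cap comes out identical on every row of a partition, which is what the full-partition frame should give. On the sample data I would then expect three rows, with 1301 at about 241587962 × 77.2, i.e. roughly 1.87e10.

```sql
-- Intended final shape: one row per stock with its latest market cap.
-- Sketch only; relies on Mkt_Cap being the same on every row of a partition,
-- so that DISTINCT can collapse each stock to a single row.
SELECT DISTINCT
    Stock_Code,
    Num_Shares * last_value(Close) OVER (
        PARTITION BY Stock_Code
        ORDER BY Date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS Mkt_Cap
FROM stock_prices
LEFT JOIN stock_shares USING (Country_Exchange_Code, Stock_Code)
WHERE Country_Exchange_Code = 'T' AND Date >= '2019-10-29'
ORDER BY Stock_Code;
```

With the NULLs described above, DISTINCT would instead return an extra NULL row per stock, which is why I need to understand the behavior first.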