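A previous article covered how, for an INSERT statement, the interpreter's execute() method builds the block streams. The actual writing of data happens in TCPHandler::processInsertQuery(), which this post walks through.
First, the annotated code: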
void TCPHandler::processInsertQuery(const Settings &global_settings) {
    /** Call writePrefix() before anything else, so that if it throws an exception,
      * the client receives the exception before it starts sending data.
      */
    state.io.out->writePrefix();
    /// Send the ColumnsDescription of the table being inserted into.
    if (client_revision >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA) {
        const auto &db_and_table = query_context->getInsertionTable();
        if (query_context->getSettingsRef().input_format_defaults_for_omitted_fields)
            sendTableColumns(query_context->getTable(db_and_table.first, db_and_table.second)->getColumns());
    }
    /// Send a block describing the table structure to the client; the client sends its data back in this format.
    /// On the client side it is received in Client.cpp by processInsertQuery() -> receiveSampleBlock(),
    /// which then formats the data and sends it to the server.
    sendData(state.io.out->getHeader());
    /// The key step. Main call chain: readData() -> receivePacket() -> receiveData() -> write().
    readData(global_settings);
    state.io.out->writeSuffix();
    state.io.onFinish();
}
The key call here is readData(global_settings). Its main call chain is readData() -> receivePacket() -> receiveData() -> write().
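To make the ordering concrete, here is a minimal sketch of the same lifecycle outside ClickHouse. RecordingOutput and processInsert are hypothetical stand-ins, not ClickHouse APIs, and the sketch assumes the receive loop stops at the first empty block, which is what the return value of receiveData() suggests.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for IBlockOutputStream: it only records the calls it receives.
struct RecordingOutput
{
    std::vector<std::string> calls;

    void writePrefix() { calls.push_back("writePrefix"); }

    void write(const std::string & block)
    {
        // write() is only valid once writePrefix() has run.
        assert(!calls.empty() && calls.front() == "writePrefix");
        calls.push_back("write:" + block);
    }

    void writeSuffix() { calls.push_back("writeSuffix"); }
};

// Mirrors the order of steps in processInsertQuery(): prefix, one write() per
// received block, then suffix. `incoming` stands in for the packets that
// readData() would pull off the socket (an assumption of this sketch).
inline void processInsert(RecordingOutput & out, const std::vector<std::string> & incoming)
{
    out.writePrefix();          // may throw before the client has sent any data
    for (const auto & block : incoming)
    {
        if (block.empty())      // assumed: an empty block ends the data
            break;
        out.write(block);
    }
    out.writeSuffix();
}
```

Calling processInsert() with a few blocks followed by an empty one leaves the recorded calls in the order prefix, writes, suffix; anything after the empty block is never written.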
Next, look at receiveData():
bool TCPHandler::receiveData() {
    initBlockInput();
    /// The name of the temporary table for writing data, default to empty string
    String external_table_name;
    readStringBinary(external_table_name, *in);
    /// Read one block from the network and write it to the underlying stream or table.
    Block block = state.block_in->read();
    if (block) {
        /// For an INSERT (need_receive_data_for_insert = true), the data is written directly to `state.io.out`.
        /// Otherwise (need_receive_data_for_insert = false, e.g. INSERT SELECT), the block is first written
        /// to the temporary table named `external_table_name`.
        if (!state.need_receive_data_for_insert) {
            StoragePtr storage;
            /// If such a table does not exist, create it.
            if (!(storage = query_context->tryGetExternalTable(external_table_name))) {
                NamesAndTypesList columns = block.getNamesAndTypesList();
                /// The temporary table external_table_name uses the Memory engine.
                storage = StorageMemory::create(external_table_name, ColumnsDescription{columns});
                storage->startup();
                query_context->addExternalTable(external_table_name, storage);
            }
            /// The data will be written directly to the table.
            state.io.out = storage->write(ASTPtr(), *query_context);
        }
        if (block)
            state.io.out->write(block); /// The data is written directly to the table.
        return true;
    } else
        return false;
}
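The logic is fairly clear:
1. After a block is read, it is handled in one of two ways depending on state.need_receive_data_for_insert: true for a plain INSERT, false otherwise (for example INSERT SELECT).
2. For a plain INSERT, IBlockOutputStream::write() is called directly on state.io.out to write the data into the target table.
3. Otherwise, the first block for a given external_table_name creates a temporary Memory-engine table with that name and calls its startup(). The block is then written through the stream returned by that table's write() method (which again ends in IBlockOutputStream::write()).
(A note on startup(): each table engine can provide its own implementation; engines that do not override it do nothing.)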
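The routing described in the list can be sketched in a few lines. This is only an illustration: Session, Table, and the int-vector Block are hypothetical simplifications, and the real code switches state.io.out to the temporary table's stream rather than using a raw pointer.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

using Block = std::vector<int>;                 // hypothetical: a block is just a list of values

struct Table { std::vector<Block> blocks; };    // hypothetical in-memory table

struct Session
{
    bool need_receive_data_for_insert = false;
    Table insert_target;                        // stands in for the table behind state.io.out
    std::map<std::string, std::shared_ptr<Table>> external_tables;

    // Returns false for an empty block, like receiveData().
    bool receive(const std::string & external_table_name, const Block & block)
    {
        if (block.empty())
            return false;

        Table * out = &insert_target;
        if (!need_receive_data_for_insert)
        {
            // The first block for this name creates the temporary table; later blocks reuse it.
            std::shared_ptr<Table> & slot = external_tables[external_table_name];
            if (!slot)
                slot = std::make_shared<Table>();
            out = slot.get();
        }
        assert(out != nullptr);
        out->blocks.push_back(block);
        return true;
    }
};
```

With need_receive_data_for_insert set, every block lands in insert_target; without it, blocks accumulate in a table that is created on first use and looked up by name afterwards.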
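Looking at the implementations of IBlockOutputStream::write(), some of them forward the data to an underlying stream, while others write it into an actual table.
Let's go through a few concrete write() implementations.
1. MemoryBlockOutputStream. This corresponds to tables using the Memory engine, whose data is kept in memory as a list of blocks.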
class MemoryBlockOutputStream : public IBlockOutputStream
{
public:
    explicit MemoryBlockOutputStream(StorageMemory & storage_) : storage(storage_) {}
    Block getHeader() const override { return storage.getSampleBlock(); }
    void write(const Block & block) override
    {
        storage.check(block, true);
        std::lock_guard lock(storage.mutex);
        storage.data.push_back(block);
    }
private:
    StorageMemory & storage;
};
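StorageMemory has a member variable BlocksList data; that holds all the blocks. Writing to a Memory-engine table is therefore just a call to storage.data.push_back(block); made while holding storage.mutex.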
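The same pattern, a mutex-guarded list of blocks with a structure check in front, can be shown in isolation. MemoryTable and the two-field Block below are hypothetical simplifications for illustration; they are not the StorageMemory API.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <stdexcept>

// Hypothetical block: only its shape matters for this sketch.
struct Block
{
    std::size_t columns;
    std::size_t rows;
};

class MemoryTable
{
public:
    explicit MemoryTable(std::size_t columns_) : columns(columns_) {}

    void write(const Block & block)
    {
        // Stand-in for storage.check(): reject blocks whose structure does not match.
        if (block.columns != columns)
            throw std::invalid_argument("block structure does not match table");
        std::lock_guard<std::mutex> lock(mutex);
        data.push_back(block);
    }

    std::size_t totalRows() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        std::size_t total = 0;
        for (const Block & block : data)
            total += block.rows;
        return total;
    }

private:
    const std::size_t columns;
    mutable std::mutex mutex;       // guards data, like storage.mutex
    std::list<Block> data;          // like BlocksList data
};
```

The structure check runs before the lock is taken, so a rejected block never touches the shared list.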
2. MergeTreeBlockOutputStream, which corresponds to tables using the MergeTree engine.
void MergeTreeBlockOutputStream::write(const Block &block) {
    /// If a partition has too many parts (merges may be falling behind), delay the insert or throw.
    storage.delayInsertOrThrowIfNeeded();
    /// Split the block into several blocks, one per part. All rows in a part share the same
    /// partition value, and the parts with the same partition value make up a partition.
    auto part_blocks = storage.writer.splitBlockIntoParts(block, max_parts_per_block);
    for (auto &current_block : part_blocks) {
        Stopwatch watch;
        /// Write the temporary part.
        MergeTreeData::MutableDataPartPtr part = storage.writer.writeTempPart(current_block);
        /// Rename the temporary part and add it to the table.
        storage.renameTempPartAndAdd(part, &storage.increment);
        /// Record the new part in the part log.
        PartLog::addNewPart(storage.global_context, part, watch.elapsed());
        /// Initiate an async merge. It runs if it is a good time to merge and there is space in 'background_pool'.
        storage.background_task_handle->wake(); /// Wake the merge thread in background_pool.
    }
}
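Every step is commented above, so I won't go through them one by one, but the methods called here are worth reading in their own right.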
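As a rough illustration of the split, write-temp, rename sequence, the sketch below groups rows by partition value and turns each group into one part. Row, Part, Table, and the part naming scheme are hypothetical; the real splitBlockIntoParts() and writeTempPart() work on columnar blocks and write files to disk.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Row { std::string partition; int value; };
using Block = std::vector<Row>;
struct Part { std::string name; Block rows; };

// Group the rows of one block by partition value: one group per future part.
inline std::map<std::string, Block> splitBlockIntoParts(const Block & block)
{
    std::map<std::string, Block> by_partition;
    for (const Row & row : block)
        by_partition[row.partition].push_back(row);
    return by_partition;
}

struct Table
{
    std::vector<Part> active_parts;
    int increment = 0;

    void write(const Block & block)
    {
        const std::map<std::string, Block> part_blocks = splitBlockIntoParts(block);
        for (const auto & entry : part_blocks)
        {
            assert(!entry.second.empty());      // every group holds at least one row
            // "Write" the part under a temporary name first...
            Part part{"tmp_insert_" + entry.first, entry.second};
            // ...then rename it with the next increment value and make it visible.
            part.name = entry.first + "_" + std::to_string(++increment);
            active_parts.push_back(part);
        }
    }
};
```

A block whose rows span two partition values therefore produces two parts from a single write() call, which is why many small inserts across many partitions create parts quickly.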
3. DistributedBlockOutputStream, which corresponds to tables using the Distributed engine.
void DistributedBlockOutputStream::write(const Block & block)
{
    if (insert_sync)
        writeSync(block);
    else
        writeAsync(block);
}
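Depending on whether the insert is synchronous or asynchronous (the insert_sync flag), it runs writeSync() or writeAsync().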
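The snippet shows only the dispatch. As a rough illustration of the difference, the sketch below assumes the synchronous path delivers the block before write() returns, while the asynchronous path only queues it for later delivery. Shard, DistributedWriter, and flush() are hypothetical names; the real writeSync() and writeAsync() are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

using Block = std::vector<int>;                     // hypothetical block

struct Shard { std::vector<Block> received; };      // hypothetical remote shard

class DistributedWriter
{
public:
    DistributedWriter(Shard & shard_, bool insert_sync_)
        : shard(shard_), insert_sync(insert_sync_) {}

    void write(const Block & block)
    {
        if (insert_sync)
            shard.received.push_back(block);        // sync: delivered before write() returns
        else
            pending.push_back(block);               // async: queued for later delivery
    }

    // Stand-in for the background sender that drains the queue (an assumption of this sketch).
    void flush()
    {
        while (!pending.empty())
        {
            shard.received.push_back(pending.front());
            pending.pop_front();
        }
        assert(pending.empty());
    }

    std::size_t pendingCount() const { return pending.size(); }

private:
    Shard & shard;
    bool insert_sync;
    std::deque<Block> pending;
};
```

The practical consequence is visibility: after a synchronous write the shard already holds the block, whereas after an asynchronous write it only appears once the queue has been drained.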