webget
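Approach:
Create a TCP socket:
Use a system call such as socket() to create a TCP socket.
Set the server address:
Build the address structure from the server's IP address and port number.
Connect to the server:
Call connect() with the server address to establish the TCP connection.
Send the HTTP request:
Build a well-formed HTTP request string.
Send it to the server through the socket's file descriptor with write() or a similar call.
Receive the response:
Call read() in a loop to pull data from the socket into a buffer. TCP is a byte stream, so a single read() may return only part of the response, or data from several separate sends merged together.
Parse the HTTP response and handle it as needed (save it, print it, and so on).
Close the connection:
Once all data has been read, call close() to close the socket and release its resources.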
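The steps above can be sketched with plain POSIX calls. This is only a minimal sketch, not the minnow implementation: the helper names build_request, write_all, read_all and fetch are made up for illustration, errors are reduced to exceptions, and interrupted calls (EINTR) are not retried.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <stdexcept>
#include <string>

// Build a minimal GET request. "Connection: close" asks the server to close
// the connection after the response, so EOF marks the end of the reply.
std::string build_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// write() may accept fewer bytes than requested, so loop until all are sent.
void write_all(int fd, const std::string& data) {
    size_t sent = 0;
    while (sent < data.size()) {
        const ssize_t n = ::write(fd, data.data() + sent, data.size() - sent);
        if (n < 0) throw std::runtime_error("write failed");
        sent += static_cast<size_t>(n);
    }
}

// Read until read() returns 0, which means the peer closed the connection.
std::string read_all(int fd) {
    std::string response;
    char buf[4096];
    while (true) {
        const ssize_t n = ::read(fd, buf, sizeof buf);
        if (n < 0) throw std::runtime_error("read failed");
        if (n == 0) break;
        response.append(buf, static_cast<size_t>(n));
    }
    return response;
}

// Resolve host/service, connect, send the request, read the whole response.
std::string fetch(const std::string& host, const std::string& service,
                  const std::string& path) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP
    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), service.c_str(), &hints, &results) != 0) {
        throw std::runtime_error("getaddrinfo failed");
    }

    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = ::socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (::connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        ::close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd < 0) throw std::runtime_error("connect failed");

    std::string response;
    try {
        write_all(fd, build_request(host, path));
        response = read_all(fd);
    } catch (...) {
        ::close(fd);
        throw;
    }
    ::close(fd);
    return response;
}
```

A call such as fetch(host, "http", path) covers roughly the same ground as the TCPSocket and Address wrappers used in the lab code, only without their RAII cleanup.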
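Notes:
(1) The "service" here normally identifies the port (for example 80 for HTTP, 443 for HTTPS); a service name such as "http" resolves to the same port.
(2) Every HTTP header line must end with "\r\n", and one extra "\r\n" (an empty line) after the last header terminates the request.
(3) TCP is a stream-oriented protocol. EOF is normally detected when read() returns 0, which means the peer has closed the connection.
(4) A file descriptor is similar to an I/O stream: it is an abstract handle through which a program asks the operating system to operate on files and other input/output devices.
(5) Ideally, the whole HTTP response would fit into one sufficiently large buffer in a single read.
(6) If the response is large, of unknown size, or subject to a limit (such as memory), it has to be read in batches, that is, in a loop.
(7) When the position has to be tracked by hand in such a loop, you initialize a variable (e.g. read_offset = 0), read some amount of data (e.g. 100 bytes), then advance the variable by the number of bytes actually read (e.g. read_offset += 100), so that the next read starts at the right place and the data stays contiguous and complete.
(8) Here the file descriptor (the value behind int fd_num()) already tracks the read position implicitly: each read() continues where the previous one stopped, so it does the job of read_offset and no explicit offset is needed.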
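Notes (3), (6), (7) and (8) can be seen together in a small experiment. The sketch below reads from a descriptor in fixed-size chunks without keeping any offset of its own; read_in_chunks is a hypothetical helper name, and a pipe stands in for the socket so the example runs without a network.

```cpp
#include <unistd.h>

#include <stdexcept>
#include <string>

// Read everything from fd, at most chunk_size bytes per read() call.
// No read_offset is kept: the kernel remembers how much has been consumed.
std::string read_in_chunks(int fd, size_t chunk_size) {
    std::string result;
    std::string chunk(chunk_size, '\0');
    while (true) {
        const ssize_t n = ::read(fd, chunk.data(), chunk.size());
        if (n < 0) throw std::runtime_error("read failed");
        if (n == 0) break;  // EOF: the writing side has been closed
        // Append only the bytes actually read; n may be less than chunk_size.
        result.append(chunk, 0, static_cast<size_t>(n));
    }
    return result;
}
```

Writing 250 bytes into a pipe, closing the write end, and calling read_in_chunks(fd, 100) returns all 250 bytes in order, even though no single read() can return more than 100.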
Latest code:
void get_URL(const std::string& host, const std::string& path) {
    TCPSocket client;
    Address server(host, "http"); // Port should be 80 by default for HTTP
    client.connect(server);

    // Properly formatted HTTP GET request
    client.write("GET " + path + " HTTP/1.1\r\n");
    client.write("Host: " + host + "\r\n");
    client.write("Connection: close\r\n\r\n"); // Note the double \r\n

    std::string response;
    std::string buffer;
    while (!client.eof()) {
        buffer.clear(); // Clear the buffer to avoid accumulating old data
        client.read(buffer);
        response += buffer; // Append new data to the complete response
    }

    std::cout << response;
    client.close();
}
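get_URL prints the raw response, status line and headers included. If the "parse the response" step from the approach is wanted, one possible sketch is to split at the first empty line. The names HttpResponse and split_response are hypothetical, and the sketch assumes a three-digit status code and does not decode chunked transfer encoding.

```cpp
#include <cstddef>
#include <optional>
#include <string>

struct HttpResponse {
    int status;           // e.g. 200
    std::string headers;  // status line and header lines, without the blank line
    std::string body;     // everything after the blank line
};

// Split a raw HTTP response into status code, header block and body.
// Returns std::nullopt if the input does not look like an HTTP response.
std::optional<HttpResponse> split_response(const std::string& raw) {
    // The header block ends at the first empty line.
    const std::size_t header_end = raw.find("\r\n\r\n");
    if (header_end == std::string::npos) return std::nullopt;

    // Status line looks like "HTTP/1.1 200 OK": the code follows the first space.
    const std::size_t first_space = raw.find(' ');
    if (first_space == std::string::npos || first_space + 4 > header_end) {
        return std::nullopt;
    }
    int status = 0;
    for (std::size_t i = first_space + 1; i < first_space + 4; ++i) {
        if (raw[i] < '0' || raw[i] > '9') return std::nullopt;
        status = status * 10 + (raw[i] - '0');
    }

    return HttpResponse{status, raw.substr(0, header_end), raw.substr(header_end + 4)};
}
```

With this, printing only split_response(response)->body would drop the headers, provided the optional is checked first.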
Test results:
ubun22@DESKTOP-1VEP92H:/mnt/e/Web/minnow$ cmake --build build --target check_webget
Test project /mnt/e/Web/minnow/build
Start 1: compile with bug-checkers
1/2 Test #1: compile with bug-checkers ........ Passed 4.88 sec
Start 2: t_webget
2/2 Test #2: t_webget ......................... Passed 1.66 sec
100% tests passed, 0 tests failed out of 2
Total Test time (real) = 6.69 sec
Built target check_webget
ubun22@DESKTOP-1VEP92H:/mnt/e/Web/minnow$
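An in-memory reliable byte stream:
Notes:
(1) After the byte stream is closed, the buffer does not need to be cleared; the reader can still drain what is left.
(2) Only pop can change the value of position.
(3) peek() simply returns a view of the buffer from position to the end.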
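These notes can be checked on a stripped-down stand-in. MiniStream below is a hypothetical type written only for this illustration, not minnow's ByteStream: it merges writer and reader into one struct and erases popped bytes, so the front of the buffer plays the role of position.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <string_view>

struct MiniStream {
    explicit MiniStream(std::uint64_t capacity) : capacity_(capacity) {}

    // Accept only as many bytes as still fit; ignore pushes after close().
    void push(std::string_view data) {
        if (closed_) return;
        const std::uint64_t n =
            std::min<std::uint64_t>(data.size(), capacity_ - buffer_.size());
        buffer_.append(data.substr(0, n));
    }

    // Closing only sets a flag; buffered bytes stay readable.
    void close() { closed_ = true; }

    // View of everything not yet popped.
    std::string_view peek() const { return buffer_; }

    // Only pop moves the read position (here: removes bytes from the front).
    void pop(std::uint64_t len) {
        buffer_.erase(0, std::min<std::uint64_t>(len, buffer_.size()));
    }

    // Finished means closed AND fully drained.
    bool is_finished() const { return closed_ && buffer_.empty(); }

private:
    std::uint64_t capacity_;
    std::string buffer_;
    bool closed_ = false;
};
```

Pushing "hello world" into a MiniStream of capacity 5 keeps "hello"; after close() the stream is not yet finished, peek() still returns "hello", and only popping all five bytes makes is_finished() true.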
Latest code:
byte_stream.cc:
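#include "byte_stream.hh"

#include <algorithm> // std::min

using namespace std;

ByteStream::ByteStream( uint64_t capacity ) : capacity_( capacity ) {}

bool Writer::is_closed() const
{
  return close_;
}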
void Writer::push( string data )
{
  if ( is_closed() ) {
    return;
  }
  // Accept only as many bytes as currently fit; the rest is dropped.
  uint64_t available_space = available_capacity();
  uint64_t data_size = data.size();
  uint64_t push_size = std::min( available_space, data_size );
  bytes_pushed_ += push_size;
  // Copy from the start of `data`: position_ indexes buffer_, not `data`.
  buffer_ += data.substr( 0, push_size );
}
void Writer::close()
{
  close_ = true;
}
uint64_t Writer::available_capacity() const
{
  return capacity_ - buffer_.size();
}
uint64_t Writer::bytes_pushed() const
{
  return bytes_pushed_;
}
bool Reader::is_finished() const
{
  // Finished means closed and fully drained.
  return close_ && buffer_.empty();
}
uint64_t Reader::bytes_popped() const
{
  return bytes_popped_;
}
std::string_view Reader::peek() const
{
  if ( position_ < buffer_.size() ) {
    // Unread bytes remain: return a view from position_ to the end of the buffer
    return std::string_view( &buffer_[position_], buffer_.size() - position_ );
  } else {
    // Everything in the buffer has been read: return an empty string_view
    return std::string_view();
  }
}
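void Reader::pop( uint64_t len )
{
  if ( is_finished() ) {
    return;
  }
  // Never pop more than is buffered; substr() past the end would throw std::out_of_range.
  len = std::min( len, bytes_buffered() );
  buffer_.erase( 0, len );
  bytes_popped_ += len;
  // Keep position_ on the same byte, or move it to the new front if that byte was popped.
  position_ = position_ >= len ? position_ - len : 0;
}
uint64_t Reader::bytes_buffered() const
{
  return buffer_.size();
}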
byte_stream.hh:
#pragma once
#include <cstdint>
#include <string>
#include <string_view>
class Reader;
class Writer;
class ByteStream
{
public:
  explicit ByteStream( uint64_t capacity );

  // Helper functions (provided) to access the ByteStream's Reader and Writer interfaces
  Reader& reader();
  const Reader& reader() const;
  Writer& writer();
  const Writer& writer() const;

  void set_error() { error_ = true; };       // Signal that the stream suffered an error.
  bool has_error() const { return error_; }; // Has the stream had an error?

protected:
  // Please add any additional state to the ByteStream here, and not to the Writer and Reader interfaces.
  uint64_t capacity_;
  std::string buffer_ = "";
  uint64_t position_ = 0;
  uint64_t bytes_popped_ = 0; // Initialize to zero
  uint64_t bytes_pushed_ = 0; // Initialize to zero
  bool close_ = false;        // Initialize to false
  bool error_ = false;        // Initialize to false
};
class Writer : public ByteStream
{
public:
  void push( std::string data ); // Push data to stream, but only as much as available capacity allows.
  void close();                  // Signal that the stream has reached its ending. Nothing more will be written.

  bool is_closed() const;              // Has the stream been closed?
  uint64_t available_capacity() const; // How many bytes can be pushed to the stream right now?
  uint64_t bytes_pushed() const;       // Total number of bytes cumulatively pushed to the stream
};
class Reader : public ByteStream
{
public:
  std::string_view peek() const; // Peek at the next bytes in the buffer
  void pop( uint64_t len );      // Remove `len` bytes from the buffer

  bool is_finished() const;        // Is the stream finished (closed and fully popped)?
  uint64_t bytes_buffered() const; // Number of bytes currently buffered (pushed and not popped)
  uint64_t bytes_popped() const;   // Total number of bytes cumulatively popped from stream
};
/*
 * read: A (provided) helper function that peeks and pops up to `len` bytes
 * from a ByteStream Reader into a string;
 */
void read( Reader& reader, uint64_t len, std::string& out );
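To show how a helper like this can be built from peek() and pop() alone, here is a minimal sketch. It is not the provided minnow helper: read_up_to and FakeReader are hypothetical names, and FakeReader deliberately returns at most three bytes per peek() so the loop has to run more than once.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in exposing the same three calls as Reader.
struct FakeReader {
    std::string data;

    // Return at most 3 bytes, to mimic a peek() that exposes only part of the data.
    std::string_view peek() const { return std::string_view(data).substr(0, 3); }
    void pop(std::uint64_t len) { data.erase(0, len); }
    std::uint64_t bytes_buffered() const { return data.size(); }
};

// Peek and pop repeatedly until `len` bytes are collected or nothing is buffered.
template <typename ReaderT>
void read_up_to(ReaderT& reader, std::uint64_t len, std::string& out) {
    out.clear();
    while (reader.bytes_buffered() > 0 && out.size() < len) {
        std::string_view view = reader.peek();
        if (view.empty()) break;                  // avoid spinning on an empty view
        view = view.substr(0, len - out.size());  // do not take more than requested
        out += view;                              // copy BEFORE pop invalidates the view
        reader.pop(view.size());
    }
}
```

The order inside the loop matters: the string_view points into the reader's buffer, so the bytes are copied into out before pop() modifies that buffer.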
Test results:
ubun22@DESKTOP-1VEP92H:/mnt/e/Web/minnow$ cmake --build build --target check0
Test project /mnt/e/Web/minnow/build
Start 1: compile with bug-checkers
1/10 Test #1: compile with bug-checkers ........ Passed 2.38 sec
Start 2: t_webget
2/10 Test #2: t_webget ......................... Passed 1.65 sec
Start 3: byte_stream_basics
3/10 Test #3: byte_stream_basics ............... Passed 0.20 sec
Start 4: byte_stream_capacity
4/10 Test #4: byte_stream_capacity ............. Passed 0.26 sec
Start 5: byte_stream_one_write
5/10 Test #5: byte_stream_one_write ............ Passed 0.18 sec
Start 6: byte_stream_two_writes
6/10 Test #6: byte_stream_two_writes ........... Passed 0.21 sec
Start 7: byte_stream_many_writes
7/10 Test #7: byte_stream_many_writes .......... Passed 0.23 sec
Start 8: byte_stream_stress_test
8/10 Test #8: byte_stream_stress_test .......... Passed 0.20 sec
Start 37: compile with optimization
9/10 Test #37: compile with optimization ........ Passed 1.95 sec
Start 38: byte_stream_speed_test
ByteStream throughput: 1.02 Gbit/s
10/10 Test #38: byte_stream_speed_test ........... Passed 0.18 sec
100% tests passed, 0 tests failed out of 10
Total Test time (real) = 7.72 sec
Built target check0