[Project] Cloud Backup

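In this article, cloud backup means automatically uploading the files that need backing up from a designated folder on the local machine to a server. The backed-up files can be viewed and downloaded through a browser at any time, and downloads can be resumed from a breakpoint: if a download is interrupted by a network failure, a server shutdown, or similar, it continues from the last downloaded position and fetches only the remaining part. The server also performs hot-file management on uploaded files (a hot file is one that is accessed frequently) and stores non-hot files in compressed form to save disk space.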

[Figure: overall flow of the cloud backup process]

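II. Implementation Goals

We need to implement two programs, a client and a server. The client is deployed on the user's machine and uploads the files to be backed up. The server runs on the server host and stores and manages the backed-up files. Together they deliver the automatic cloud backup feature.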

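III. Server Responsibilities and Module Breakdown

Features:

Accept file uploads from the client
Let the client view the list of backed-up files in a browser
Serve file downloads to the client (with resumable downloads)
Manage hot files: non-hot files (files that have not been accessed for a long time) are stored compressed to save disk space

Modules:

Data management module: manages the metadata of the backed-up files on the server
Network communication module: sets up the network server and handles communication with the client
Business processing module: handles each client request (upload, list view, download) and returns the result
Hot-file management module: decides whether a file is hot and compresses non-hot files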

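IV. Client Responsibilities and Module Breakdown

Goal: automatically back up the files in a designated folder.

Features:

Detect the files in the designated folder (find out which files it contains)
Decide whether each file needs to be backed up (new files, and files that were backed up but have since been modified)
Upload the files that need backing up to the server

Modules:

Data management module: manages the client's record of backed-up files, which is used to decide whether a file needs backing up
File checking module: walks the designated folder and collects the path names of all files
Network communication module: uploads the files that need backing up to the server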

V. Environment Setup

(I) gcc 7.3

Check whether the gcc version on CentOS 7 is 7.3. If it is not, upgrade it:

sudo yum install centos-release-scl-rh centos-release-scl
sudo yum install devtoolset-7-gcc devtoolset-7-gcc-c++
source /opt/rh/devtoolset-7/enable 
echo "source /opt/rh/devtoolset-7/enable" >> ~/.bashrc

(II) Installing the jsoncpp library

Install this third-party library for serialization and deserialization:

sudo yum install epel-release
sudo yum install jsoncpp-devel

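Note that different CentOS versions may install different jsoncpp versions, so the header files may end up in different locations.

After installation, confirm that the headers exist. The examples in this article include them as <jsoncpp/json/json.h>.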

(III) Downloading the bundle compression library

GitHub repository:

git clone https://github.com/r-lyeh-archived/bundle.git

(IV) Downloading the httplib library

GitHub repository:

git clone https://github.com/yhirose/cpp-httplib.git

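VI. Getting to Know the Third-Party Libraries

(I) json

json is a data interchange format that stores and represents data as text, completely independent of any programming language.

For example, the student record of Xiao Ming (小明):

const char *name = "小明";
int age = 18;
float score[3] = {88.5, 99, 58};

json organizes data objects like these into a single string (here, an array holding two student records):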

[
   {
        "姓名" : "小明",
        "年龄" : 18,
        "成绩" : [88.5, 99, 58]
   },
   {
        "姓名" : "小黑",
        "年龄" : 18,
        "成绩" : [88.5, 99, 58]
   }
]

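json data types used here: object, array, string, number (json also defines booleans and null)
Object: anything enclosed in curly braces {} is an object.
Array: anything enclosed in square brackets [] is an array.
String: anything enclosed in regular double quotes "" is a string.
Number: integers and floating-point values, written directly.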

(II) jsoncpp

The jsoncpp library implements json serialization and deserialization: it organizes multiple data objects into a json-formatted string, and parses a json-formatted string back into multiple data objects.

This is done mainly with three kinds of classes and a small number of their member functions:

// The Json data object class
class Json::Value
{
  Value &operator=(const Value &other);
  // Value overloads [] and =, so reading and assigning fields is straightforward,
  // e.g. val["姓名"] = "小明";
  Value& operator[](const std::string& key);
  Value& operator[](const char* key);
  Value removeMember(const char* key); // remove a member
  const Value& operator[](ArrayIndex index) const; // val["成绩"][0] val["成绩"][1]
  Value& append(const Value& value); // append an array element: val["成绩"].append(90);
  ArrayIndex size() const; // number of array elements: val["成绩"].size();
  std::string asString() const;  // to string: std::string name = val["name"].asString();
  const char* asCString() const; // to C string: const char* name = val["name"].asCString();
  Int asInt() const; // to int
  float asFloat() const; // to float
  bool asBool() const; // to bool
};

// json serialization classes; simpler to use with older versions
class JSON_API Writer {
  virtual std::string write(const Value& root) = 0;
};
class JSON_API FastWriter : public Writer {
  virtual std::string write(const Value& root);
};
class JSON_API StyledWriter : public Writer {
  virtual std::string write(const Value& root);
};

// json serialization classes recommended for newer versions; the older interface may produce warnings
class JSON_API StreamWriter {
  virtual int write(Value const& root, std::ostream* sout) = 0;
};
class JSON_API StreamWriterBuilder : public StreamWriter::Factory {
  virtual StreamWriter* newStreamWriter() const;
};

// json deserialization class; simpler to use with older versions
class JSON_API Reader {
 bool parse(const std::string& document, Value& root, bool collectComments = true);
};
// json deserialization classes recommended for newer versions
class JSON_API CharReader {
  virtual bool parse(char const* beginDoc, char const* endDoc, Value* root, std::string* errs) = 0;
};
class JSON_API CharReaderBuilder : public CharReader::Factory {
  virtual CharReader* newCharReader() const;
};

Each of these classes is declared in a corresponding header file that you can inspect.

1. Demo

(1) Serialization

#include <iostream>
#include <string>
#include <sstream>
#include <memory>
#include <jsoncpp/json/json.h>

int main()
{
  const char *name = "小明";
  int age = 18;
  float score[] = {80, 90, 33.3};

  Json::Value root;
  root["姓名"] = name;
  root["年龄"] = age;
  root["成绩"].append(score[0]);
  root["成绩"].append(score[1]);
  root["成绩"].append(score[2]);

  Json::StreamWriterBuilder swb;
  std::unique_ptr<Json::StreamWriter> sw(swb.newStreamWriter());
  std::stringstream ss;
  sw->write(root, &ss);
  std::cout << ss.str() << std::endl;

  return 0;
}

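(2) Deserialization

R"(raw_string)" is syntax introduced in C++11: raw_string is a raw string literal in which quotes and backslashes need no escaping, which makes it convenient for embedding json text.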

#include <iostream>
#include <string>
#include <memory>
#include <jsoncpp/json/json.h>

int main()
{
  std::string str = R"({"姓名":"小黑", "年龄":19, "成绩":[60, 80, 90.5]})";
  Json::Value root;
  Json::CharReaderBuilder crb;
  std::unique_ptr<Json::CharReader> cr(crb.newCharReader());
  std::string err;
  
  bool ret = cr->parse(str.c_str(), str.c_str() + str.size(), &root, &err);
  if(ret == false)
  {
    std::cout << "parse error: " << err << std::endl;
    return -1;
  }

  std::cout << root["姓名"].asString() << std::endl;
  std::cout << root["年龄"].asInt() << std::endl;
  int sz = root["成绩"].size();
  for(int i = 0; i < sz; i++) std::cout << root["成绩"][i] << std::endl;

  return 0;
}

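(III) The bundle compression library

Bundle is an embeddable compression library that supports 23 compression algorithms and 2 archive formats. To use it, just add two files, bundle.h and bundle.cpp, to your project.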

namespace bundle
{
  // low level API (raw pointers)
  bool is_packed( *ptr, len );
  bool is_unpacked( *ptr, len );
  unsigned type_of( *ptr, len );
  size_t len( *ptr, len );
  size_t zlen( *ptr, len );
  const void *zptr( *ptr, len );
  bool pack( unsigned Q, *in, len, *out, &zlen );
  bool unpack( unsigned Q, *in, len, *out, &zlen );
  // medium level API, templates (in-place)
  bool is_packed( T );
  bool is_unpacked( T );
  unsigned type_of( T );
  size_t len( T );
  size_t zlen( T );
  const void *zptr( T );
  bool unpack( T &, T );
  bool pack( unsigned Q, T &, T );
  // high level API, templates (copy)
  T pack( unsigned Q, T );
  T unpack( T );
}

1. Demo

(1) Compressing a file with bundle

#include <iostream>
#include <string>
#include <fstream>
#include "bundle.h"

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        std::cout << "argv[1]: path of the original file\n";
        std::cout << "argv[2]: name of the compressed file\n";
        return -1;
    }

    std::string ifilename = argv[1];
    std::string ofilename = argv[2];

    std::ifstream ifs;
    // The file content is unknown, so open the original file in binary mode
    ifs.open(ifilename, std::ios::binary);
    if (ifs.is_open() == false) return -1;
    ifs.seekg(0, std::ios::end); // move the read position to the end
    size_t fsize = ifs.tellg(); // offset of the end, i.e. the file length
    ifs.seekg(0, std::ios::beg); // move back to the beginning of the file

    std::string body;
    body.resize(fsize); // resize body to the file size
    ifs.read(&body[0], fsize); // read the whole file into body
    ifs.close();

    std::string packed = bundle::pack(bundle::LZIP, body); // compress the data with LZIP

    std::ofstream ofs;
    ofs.open(ofilename, std::ios::binary); // open the output file in binary mode
    ofs.write(&packed[0], packed.size()); // write the compressed data to the output file
    ofs.close();

    return 0;
}

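The library source file must be compiled along with the demo. It is roughly 110,000 lines long, so compilation takes a little while: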

g++ bundle_demo.cc bundle.cpp -o bundle_demo -lpthread

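A file name without a path refers to the current directory. Running the program produces the compressed file, so compression succeeded.

(2) Decompressing a file with bundle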

#include <iostream>
#include <string>
#include <fstream>
#include "bundle.h"

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        std::cout << "argv[1]: name of the compressed file\n";
        std::cout << "argv[2]: name of the decompressed file\n";
        return -1;
    }

    std::string ifilename = argv[1]; // compressed file name
    std::string ofilename = argv[2]; // decompressed file name

    std::ifstream ifs;
    // The file content is unknown, so open the compressed file in binary mode
    ifs.open(ifilename, std::ios::binary);
    if (ifs.is_open() == false) return -1;
    ifs.seekg(0, std::ios::end); // move the read position to the end
    size_t fsize = ifs.tellg();  // offset of the end, i.e. the file length
    ifs.seekg(0, std::ios::beg); // move back to the beginning of the file

    std::string body;
    body.resize(fsize);        // resize body to the file size
    ifs.read(&body[0], fsize); // read the whole file into body
    ifs.close();

    std::string unpacked = bundle::unpack(body); // decompress the data

    std::ofstream ofs;
    ofs.open(ofilename, std::ios::binary); // open the output file in binary mode
    ofs.write(&unpacked[0], unpacked.size());  // write the decompressed data to the output file
    ofs.close();

    return 0;
}

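A file name without a path refers to the current directory. Running the program produces the decompressed file, so decompression succeeded.

We can compare the md5sum values of two files to check whether they are identical. If the two values match, the content was not modified or corrupted along the way:

md5sum <filename>

The checksums of the original file and the decompressed file are identical.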

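(IV) The httplib library

httplib is a cross-platform, single-header C++11 HTTP/HTTPS library. It is very easy to set up: just include httplib.h in your code.

httplib is used to build a simple HTTP server or client. A third-party network library like this saves us the time of building the server or client ourselves, so more effort can go into the actual business logic and development is faster.

The following are only the fields and interfaces of the few classes used in this project: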

namespace httplib{
    struct MultipartFormData {
        std::string name;
        std::string content;
        std::string filename;
        std::string content_type;
   };
    using MultipartFormDataItems = std::vector<MultipartFormData>;
    struct Request {
        std::string method;
        std::string path;
        Headers headers;
        std::string body;
        // for server
        std::string version;
        Params params;
        MultipartFormDataMap files;
        Ranges ranges;
        bool has_header(const char *key) const;
        std::string get_header_value(const char *key, size_t id = 0) const;
        void set_header(const char *key, const char *val);
        bool has_file(const char *key) const;
        MultipartFormData get_file_value(const char *key) const;
   };
    struct Response {
        std::string version;
        int status = -1;
        std::string reason;
        Headers headers;
        std::string body;
        std::string location; // Redirect location
        void set_header(const char *key, const char *val);
        void set_content(const std::string &s, const char *content_type);
    };
    class Server {
        using Handler = std::function<void(const Request &, Response &)>;
        using Handlers = std::vector<std::pair<std::regex, Handler>>;
        std::function<TaskQueue *(void)> new_task_queue;
        Server &Get(const std::string &pattern, Handler handler);
        Server &Post(const std::string &pattern, Handler handler);
        Server &Put(const std::string &pattern, Handler handler);
        Server &Patch(const std::string &pattern, Handler handler);  
        Server &Delete(const std::string &pattern, Handler handler);
        Server &Options(const std::string &pattern, Handler handler);
        bool listen(const char *host, int port, int socket_flags = 0);
    };
    class Client {
        Client(const std::string &host, int port);
        Result Get(const char *path, const Headers &headers);
        Result Post(const char *path, const char *body, size_t content_length,
              const char *content_type);
        Result Post(const char *path, const MultipartFormDataItems &items);
    };
}

1. The Request struct

    struct MultipartFormData {
        std::string name;         // field name
        std::string content;      // file content
        std::string filename;     // file name
        std::string content_type; // content type of the body
    };
    using MultipartFormDataItems = std::vector<MultipartFormData>;
    struct Request {
        std::string method;         // request method
        std::string path;           // resource path
        Headers headers;            // header fields
        std::string body;           // body
        // for server
        std::string version;        // protocol version
        Params params;              // query string
        MultipartFormDataMap files; // information about the files uploaded by the client
        Ranges ranges;              // requested byte ranges, used for resumable downloads
        bool has_header(const char *key) const;
        std::string get_header_value(const char *key, size_t id = 0) const;
        void set_header(const char *key, const char *val);
        bool has_file(const char *key) const;
        MultipartFormData get_file_value(const char *key) const;
    };

What the Request struct is for:

On the client, it holds all the information for an HTTP request, which is finally assembled into an HTTP request and sent to the server
On the server, a received HTTP request is parsed and the parsed data is stored in a Request struct for later processing

2. The Response struct

struct Response {
    std::string version;  // protocol version
    int status = -1;      // response status code
    std::string reason;   // not used here
    Headers headers;      // header fields
    std::string body;     // body returned to the client
    std::string location; // Redirect location, not used here
    void set_header(const char *key, const char *val); // set a header field
    void set_content(const std::string &s, const char *content_type); // set the body
};

What the Response struct is for:

The user puts the response data into this struct, and httplib assembles it into a properly formatted HTTP response and sends it to the client.

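3. The Server class

class Server {
    using Handler = std::function<void(const Request &, Response &)>;
    using Handlers = std::vector<std::pair<std::regex, Handler>>;
    std::function<TaskQueue *(void)> new_task_queue;

    // Register a handler function for a given request method and request path
    Server &Get(const std::string &pattern, Handler handler);
    Server &Post(const std::string &pattern, Handler handler);
    Server &Put(const std::string &pattern, Handler handler);
    Server &Patch(const std::string &pattern, Handler handler);  
    Server &Delete(const std::string &pattern, Handler handler);
    Server &Options(const std::string &pattern, Handler handler);

    // Set up and start the HTTP server
    bool listen(const char *host, int port, int socket_flags = 0);
};

What the Server class is for: building an HTTP server.

using Handler = std::function<void(const Request &, Response &)>;

Handler: a callable type (std::function) that defines the signature of an HTTP request handling callback.
When a server built with httplib receives a request, it parses it into a Request struct that contains the request data. We process the request based on that data, and the function that does so must have the Handler signature.

Request parameter: holds the request data so that the user can run the business logic based on it.
Response parameter: filled in by the user during business processing and finally sent back to the client.

using Handlers = std::vector<std::pair<std::regex, Handler>>;

Handlers: can be understood as a table that maps requests to handler functions. Each entry maps a resource path requested by the client to a handler function defined by the user.
regex: a regular expression used to match the resource path of the HTTP request
Handler: the request handler callback

When the server has parsed a request into a Request, it looks up the table for the request method using the resource path. If a matching handler exists, it is called to process the request. Otherwise the server responds with 404.

In short, the Handlers table decides which function handles which request.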

std::function<TaskQueue *(void)> new_task_queue;

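new_task_queue: a factory function that creates the task queue (a thread pool) used to process HTTP requests

What a thread in the pool does:

Receive and parse the request to obtain a Request struct, that is, the request data.
Look up the handler function in the Handlers table based on the request information.
When the handler returns, assemble an HTTP response from the data the handler put into the Response struct and send it to the client.

4. The Client class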

struct MultipartFormData {
    std::string name;
    std::string content;
    std::string filename;
    std::string content_type;
};
using MultipartFormDataItems = std::vector<MultipartFormData>;
class Client {
    // Takes the server IP address and port
    Client(const std::string &host, int port);
    
    // Send a GET request to the server
    Result Get(const char *path, const Headers &headers);

    // Send a POST request to the server
    Result Post(const char *path, const char *body, size_t content_length, const char *content_type);
    
    // Send a POST request with multipart form data, commonly used to upload multiple files
    Result Post(const char *path, const MultipartFormDataItems &items);
};

5. Building a simple server with httplib

#include <iostream>
#include "httplib.h"

void Hello(const httplib::Request &req, httplib::Response &rsp)
{
    rsp.set_content("Hello World!", "text/plain");
    rsp.status = 200;
}

void Numbers(const httplib::Request &req, httplib::Response &rsp)
{
    auto num = req.matches[1]; // index 0 holds the whole path, later indexes hold the captured groups
    rsp.set_content(num, "text/plain");
    rsp.status = 200;
}

void Multipart(const httplib::Request &req, httplib::Response &rsp)
{
    auto ret = req.has_file("file");
    if (ret == false)
    {
        std::cout << "not file upload\n";
        rsp.status = 400;
        return;
    }

    const auto &file = req.get_file_value("file");
    rsp.body.clear();
    rsp.body = file.filename; // file name
    rsp.body += "\n";
    rsp.body += file.content; // file content
    rsp.set_header("Content-Type", "text/plain"); // header names use '-', not '_'
    rsp.status = 200;
    return;
}

int main()
{
    httplib::Server server; // create a Server object to build the server

    server.Get("/hi", Hello); // register the handler for GET requests to /hi
    server.Get(R"(/numbers/(\d+))", Numbers);
    server.Post("/multipart", Multipart);
    server.listen("0.0.0.0", 8080);
    return 0;
}

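If the connection fails, the firewall is probably still blocking the port. On a test machine you can stop and disable firewalld: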

sudo systemctl stop firewalld
sudo systemctl disable firewalld

6. Building a simple client with httplib

#include <iostream>
#include "httplib.h"

#define SERVER_IP "10.0.12.13"
#define SERVER_PORT 8080

int main()
{
    httplib::Client client(SERVER_IP, SERVER_PORT);

    httplib::MultipartFormData item;
    item.name = "file";
    item.filename = "hello.txt";
    item.content = "Hello World!"; // content of the uploaded file
    item.content_type = "text/plain";

    httplib::MultipartFormDataItems items;
    items.push_back(item);

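    auto res = client.Post("/multipart", items);
    if (!res) // the request failed, e.g. the server is unreachable
    {
        std::cout << "request failed" << std::endl;
        return -1;
    }
    std::cout << res->status << std::endl;
    std::cout << res->body << std::endl;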

    return 0;
}

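Start the server first, then run the client.

VII. Designing the File Utility Class

Both the client and the server read and write files when transferring and backing up data, and the same is true for persisting the data management information. So the first step is to design a class that wraps file operations. Once it is in place, working with files in any module becomes much simpler.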

class FileUtil
{
private:
    std::string _filename;
public:
    bool Remove(); // delete the file
    int64_t FileSize(); // get the file size
    time_t LastMTime(); // get the last modification time
    time_t LastATime(); // get the last access time
    std::string FileName(); // get the file name part of the path
    bool SetContent(const std::string &body); // write data to the file
    bool GetContent(std::string *body);       // read the whole file
    bool GetPosLen(std::string *body, size_t pos, size_t len); // read len bytes starting at pos
    bool Exists(); // check whether the file exists
    bool CreateDirectory(); // create the directory
    bool ScanDirectory(std::vector<std::string> *array); // collect the path names of all files in the directory
    bool Compress(const std::string &packname); // compress the file
    bool UnCompress(const std::string &filename); // decompress the file
};

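(I) Reading file attributes

A stat-style file attribute interface is also available on Windows, so this part of the code stays reasonably portable across platforms: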

#include <sys/stat.h>

int stat(const char *path, struct stat *buf);

#include <cstdint>
#include <cstdio>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>

namespace cloud
{
    class FileUtil
    {
    private:
        std::string _filename;

    public:
        FileUtil(const std::string &filename) : _filename(filename) {}

        // Delete the file (Exists() is implemented in the directory operations part below)
        bool Remove()
        {
            if (this->Exists() == false) return true;
            return remove(_filename.c_str()) == 0;
        }

        int64_t FileSize() // get the file size
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file size failed!" << std::endl;
                return -1;
            }
            return st.st_size;
        }

        time_t LastMTime() // get the last modification time
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file modification time failed!" << std::endl;
                return -1;
            }
            return st.st_mtime;
        }

        time_t LastATime() // get the last access time
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file access time failed!" << std::endl;
                return -1;
            }
            return st.st_atime;
        }

        std::string FileName() // get the file name part of the path
        {
            // ./abc/abcd.txt --> abcd.txt
            size_t pos = _filename.find_last_of("/");
            if (pos == std::string::npos) return _filename;
            return _filename.substr(pos+1);
        }
    };
}

Write some test code and try compiling it:

#include "FileUtil.hpp"

void FileUtilTest(const std::string &filename)
{
    cloud::FileUtil fu(filename);
    std::cout << fu.FileSize() << std::endl;
    std::cout << fu.LastMTime() << std::endl;
    std::cout << fu.LastATime() << std::endl;
    std::cout << fu.FileName() << std::endl;
}

int main(int argc, char *argv[])
{
    if (argc < 2) return -1; // a file path is required
    FileUtilTest(argv[1]);

    return 0;
}

The reported file size matches the real one, the times are printed as timestamps and match as well, and so does the file name.

(II) Reading and writing

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>

namespace cloud
{
    class FileUtil
    {
    private:
        std::string _filename;

    public:
        // Read len bytes starting at offset pos
        bool GetPosLen(std::string *body, size_t pos, size_t len) 
        {
            int64_t fsize = this->FileSize(); // file size, -1 on failure
            if (fsize < 0) return false;
            if (pos + len > static_cast<size_t>(fsize)) // reading past the end of the file is an error
            {
                std::cout << "pos+len longer than file size\n";
                return false;
            }

            std::ifstream ifs;
            ifs.open(_filename, std::ios::binary); // open the file in binary mode
            if (ifs.is_open() == false)
            {
                std::cout << "read open file failed!\n";
                return false;
            }

            ifs.seekg(pos, std::ios::beg); // seek from the beginning to pos
            body->resize(len);
            ifs.read(&(*body)[0], len); // read len bytes into body
            if (ifs.good() == false)
            {
                std::cout << "get file content failed\n";
                ifs.close();
                return false;
            }
            ifs.close(); // close the stream
            return true;
        }
        
        // Read the whole file
        bool GetContent(std::string *body) 
        {
            int64_t fsize = this->FileSize();
            if (fsize < 0) return false;
            return GetPosLen(body, 0, fsize);
        }

        // Write data to the file
        bool SetContent(const std::string &body) 
        {
            std::ofstream ofs;
            ofs.open(_filename, std::ios::binary); // open in binary mode
            if (ofs.is_open() == false)
            {
                std::cout << "write open file failed!\n";
                return false;
            }
            ofs.write(&body[0], body.size()); // write body to the file
            if (ofs.good() == false)
            {
                std::cout << "write file content failed!\n";
                ofs.close();
                return false;
            }
            ofs.close();
            return true;
        }
    };
}

Write test code that reads the contents of cloud.cc in the current directory and writes them to hello.txt:

#include "FileUtil.hpp"

void FileUtilTest(const std::string &filename)
{
    cloud::FileUtil fu(filename);
    std::string body;
    fu.GetContent(&body);
    
    cloud::FileUtil nfu("./hello.txt");
    nfu.SetContent(body);

    return;
}

int main(int argc, char *argv[])
{
    if (argc < 2) return -1; // a file path is required
    FileUtilTest(argv[1]);

    return 0;
}

The file hello.txt is created and its content is identical to the source file, so the write succeeded.

(III) Compression and decompression

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>
#include "bundle.h"

namespace cloud
{
    class FileUtil
    {
    private:
        std::string _filename;

    public:
        // Compress the file
        bool Compress(const std::string &packname)
        {
            // 1. Read the source file
            std::string body;
            if (this->GetContent(&body) == false)
            {
                std::cout << "compress get file content failed!" << std::endl;
                return false;
            }

            // 2. Compress the data
            std::string packed = bundle::pack(bundle::LZIP, body);

            // 3. Store the compressed data in the compressed file
            FileUtil pafu(packname);
            if (pafu.SetContent(packed) == false)
            {
                std::cout << "compress write packed data failed!" << std::endl;
                return false;
            }
            return true;
        }

        // Decompress the file
        bool UnCompress(const std::string &filename)     
        {
            // 1. Read the data of the current compressed file
            std::string body;
            if (this->GetContent(&body) == false)
            {
                std::cout << "uncompress get file content failed!" << std::endl;
                return false;
            }

            // 2. Decompress the data
            std::string unpacked = bundle::unpack(body);

            // 3. Write the decompressed data to the new file
            FileUtil fu(filename);
            if (fu.SetContent(unpacked) == false)
            {
                std::cout << "uncompress write unpacked data failed!" << std::endl;
                return false;
            }
            return true;
        }     
    };
}

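Write test code that compresses bundle.cpp. It produces the compressed file bundle.cpp.lz and also the decompressed file uncompress.txt. The decompressed content is identical to the original, so compression and decompression both work.

(IV) Directory operations

These operations use several library interfaces: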

#ifndef __MY_UTIL__
#define __MY_UTIL__

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>
#include <experimental/filesystem>
#include "bundle.h"

namespace cloud
{
    namespace fs = std::experimental::filesystem;
    class FileUtil
    {
    private:
        std::string _filename;

    public:       
        // Check whether the file exists
        bool Exists()    
        {
            return fs::exists(_filename);
        }

        // Create the directory, including missing parent directories
        bool CreateDirectory() 
        {
            if (this->Exists()) return true;
            return fs::create_directories(_filename);
        }           

        // Collect the path names of all files in the directory
        bool ScanDirectory(std::vector<std::string> *array)
        {
            for(auto& p: fs::directory_iterator(_filename))
            {
                // skip subdirectories and keep scanning files
                if (fs::is_directory(p) == true)
                    continue;
                // relative_path gives the file name with its path; p is not a string, so convert it
                array->push_back(fs::path(p).relative_path().string());
            }   
            return true;
        }              
    };
}
#endif

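Notes:

When scanning a directory we only need the path names of the files in it, not the path names of its subdirectories.
p is not a string, so it has to be converted to one.
The file path names we need are relative paths, not absolute ones.

We use the filesystem library here, so it must be linked at build time (the following is the command line of a Makefile rule):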

g++ -o $@ $^ -std=c++11 -lpthread -lstdc++fs

Write the test code:

#include "FileUtil.hpp"

void FileUtilTest(const std::string &filename)
{
    cloud::FileUtil fu(filename);
    fu.CreateDirectory();
    std::vector<std::string> arry;
    fu.ScanDirectory(&arry);
    for (auto &a : arry) std::cout << a << std::endl;

    return;
}

int main(int argc, char *argv[])
{
    if (argc < 2) return -1; // a directory path is required
    FileUtilTest(argv[1]);

    return 0;
}

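The first run creates a test directory. After creating a few files inside test and running the program again with the test directory as its argument, it prints the relative paths of the files in test.

VIII. Designing the Json Utility Class

We already wrote a demo for this earlier, and the utility class is much the same: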

    // Json utility class (requires <sstream>, <memory> and <jsoncpp/json/json.h>)
    class JsonUtil
    {
    public:
        // Serialize
        static bool Serialize(const Json::Value &root, std::string *str)
        {
            Json::StreamWriterBuilder swb;
            std::unique_ptr<Json::StreamWriter> sw(swb.newStreamWriter());
            std::stringstream ss;
            if(sw->write(root, &ss) != 0)
            {
                std::cout << "json write failed!" << std::endl;
                return false;
            }
            *str = ss.str();
            return true;
        }

        // Deserialize
        static bool UnSerialize(const std::string &str, Json::Value *root)
        {
            Json::CharReaderBuilder crb;
            std::unique_ptr<Json::CharReader> cr(crb.newCharReader());
            std::string err;
            bool ret = cr->parse(str.c_str(), str.c_str()+str.size(), root, &err);
            if (ret == false)
            {
                std::cout << "parse error: " << err << std::endl;
                return false;
            }
            return true;
        }
    };

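IX. Implementing the Server Configuration Module

(I) The configuration file

Loading key runtime settings from a configuration file makes the program more flexible.

Main idea: the key settings the server needs at run time are recorded in a configuration file, and the program reads them from that file at startup. The settings can be changed at any time without rebuilding the program. Restarting the server so that it reloads the configuration is enough.

Configuration items:

Hot-file threshold: for hot-file management, how long a file must go without being accessed before it counts as a non-hot file.
URL prefix for downloads: marks a client request as a download request.
For example, suppose /listshow is a request to view the backup list. How do we tell it apart from a request to download a file named listshow? We simply add a /download prefix. For example, http://192.168.133.144:8080/listshow is a list view request, while http://192.168.133.144:8080/download/listshow is a request to download the file listshow.
Compressed file suffix: the naming rule for compressed files, which is the original file name followed by a suffix, here ".lz".
Upload directory: where uploaded files are actually stored on the server.
Compressed file directory: where non-hot files are stored after compression.
Backup information file: the file in which the server persists its records of backed-up files.
Server listening IP address and port: when the program has to run on another host, only the configuration file needs to change, not the program.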

cloud.conf

{
    "hot_time" : 30,
    "server_port" : 8080,
    "server_ip" : "0.0.0.0",
    "download_prefix" : "/download/",
    "packfile_suffix" : ".lz",
    "pack_dir" : "./packdir/",
    "back_dir" : "./backdir/",
    "backup_file" : "./cloud.dat"
}

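(II) Designing the singleton configuration class

Managing the system configuration through a singleton keeps access to the settings unified and flexible.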

class Config
{
private:
    // private constructor
    Config() {}

    static std::mutex _mutex;
    static Config *_instance;  

private:
    int _hot_time;                // hot-file threshold
    std::string _download_prefix; // URL prefix for downloads
    std::string _packfile_suffix; // compressed file suffix
    std::string _back_dir;        // directory for backed-up files
    std::string _pack_dir;        // directory for compressed files
    std::string _backup_file;     // file that stores the backup information
    std::string _server_ip;       // server IP address
    int _server_port;             // server listening port

public:
    static Config *GetInstance();
    int GetHotTime();
    std::string GetDownloadPrefix();
    std::string GetPackFileSuffix();
    std::string GetBackDir();
    std::string GetPackDir();
    std::string GetBackupFile();
    std::string GetServerIP();
    int GetServerPort();
};

config.hpp

A ReadConfigFile() function is added to read the configuration file (in json format), deserialize it, and assign the values to the member variables:

#ifndef __MY_CONFIG__
#define __MY_CONFIG__

#include "Util.hpp"
#include <mutex>

namespace cloud
{
    #define CONFIG_FILE "./cloud.conf"
    class Config
    {
    private:
        // private constructor
        Config() { ReadConfigFile(); }

        static std::mutex _mutex;
        static Config *_instance;

    private:
        int _hot_time;                // hot-file threshold
        int _server_port;             // server listening port
        std::string _server_ip;       // server IP address
        std::string _download_prefix; // URL prefix for downloads
        std::string _packfile_suffix; // compressed file suffix
        std::string _back_dir;        // directory for backed-up files
        std::string _pack_dir;        // directory for compressed files
        std::string _backup_file;     // file that stores the backup information
        bool ReadConfigFile()
        {
            FileUtil fu(CONFIG_FILE);
            std::string body;
            if (fu.GetContent(&body) == false)
            {
                std::cout << "load config file failed!\n";
                return false;
            }
            Json::Value root;
            if (JsonUtil::UnSerialize(body, &root) == false)
            {
                std::cout << "parse config file failed!\n";
                return false;
            }
            _hot_time = root["hot_time"].asInt();
            _server_port = root["server_port"].asInt();
            _server_ip = root["server_ip"].asString();
            _download_prefix = root["download_prefix"].asString();
            _packfile_suffix = root["packfile_suffix"].asString();
            _back_dir = root["back_dir"].asString();
            _pack_dir = root["pack_dir"].asString();
            _backup_file = root["backup_file"].asString();
            return true;
        }

    public:
        static Config *GetInstance()
        {
            if (_instance == NULL)
            {
                _mutex.lock();
                if (_instance == NULL)
                {
                    _instance = new Config();
                }
                _mutex.unlock();
            }
            return _instance;
        }
        int GetHotTime()
        {
            return _hot_time;
        }
        int GetServerPort()
        {
            return _server_port;
        }
        std::string GetServerIP()
        {
            return _server_ip;
        }
        std::string GetDownloadPrefix()
        {
            return _download_prefix;
        }
        std::string GetPackFileSuffix()
        {
            return _packfile_suffix;
        }
        std::string GetBackDir()
        {
            return _back_dir;
        }
        std::string GetPackDir()
        {
            return _pack_dir;
        }
        std::string GetBackupFile()
        {
            return _backup_file;
        }
    };
    Config *Config::_instance = NULL;
    std::mutex Config::_mutex;
}

#endif

Write the test code:

#include "Util.hpp"
#include "config.hpp"

void ConfigTest()
{
    cloud::Config *config = cloud::Config::GetInstance();
    std::cout << config->GetHotTime() << std::endl;
    std::cout << config->GetServerPort() << std::endl;
    std::cout << config->GetServerIP() << std::endl;
    std::cout << config->GetDownloadPrefix() << std::endl;
    std::cout << config->GetPackFileSuffix() << std::endl;
    std::cout << config->GetPackDir() << std::endl;
    std::cout << config->GetBackDir() << std::endl;
    std::cout << config->GetBackupFile() << std::endl;
}

int main(int argc, char *argv[])
{
    ConfigTest();
    return 0;
}

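The values printed match the contents of the configuration file.

X. Implementing the Server Data Management Module

(I) The data being managed

Which data we manage is determined by which data will be needed later:

Actual storage path of the file: when a client downloads the file, the data for the response is read from this file.
Path of the compressed file: a non-hot file gets compressed, and this is the path name of the compressed file. If a client wants to download such a file, it must first be decompressed and then the decompressed data is read.
Compression flag: indicates whether the file is currently compressed.
File size
Last modification time of the file
Last access time of the file
Resource path (path) in the URL used to access the file

(II) How the data is managed

For data access: the data is held in memory in a hash table keyed by the URL path, which gives fast lookups, O(1) on average.
For persistent storage: all the records are serialized to json and saved to a file.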

(III) Designing the data management class

The data management class manages the data used by the server system.

typedef struct BackupInfo
{
    bool pack_flag; // whether the file is compressed
    size_t fsize; // file size
    time_t atime; // last access time
    time_t mtime; // last modification time
    std::string real_path; // actual storage path of the file
    std::string pack_path; // storage path of the compressed file
    std::string url; // resource path in the download URL
    bool NewBackupInfo(const std::string &realpath);
}BackupInfo;

class DataManager
{
private:
    std::string _backup_file; // file used for persistent storage
    std::unordered_map <std::string, BackupInfo> _table; // in-memory hash table
    pthread_rwlock_t _rwlock; // read-write lock: shared for readers, exclusive for writers
public:
    DataManager();

    // Persist again after every insert or update to avoid losing data
    bool Storage();

    // Initial load: earlier data is loaded each time the system restarts
    bool InitLoad();

    // Insert
    bool Insert(const BackupInfo &info);
    
    // Update
    bool Update(const BackupInfo &info);

    // Look up one record by URL
    bool GetOneByURL(const std::string &url, BackupInfo *info);

    // Look up one record by actual storage path
    bool GetOneRealPath(const std::string &realpath, BackupInfo *info);

    // Get all records
    bool GetAll(std::vector<BackupInfo> *arry);    
};

1. Implementing BackupInfo

#ifndef __MY_DATA__
#define __MY_DATA__

#include <pthread.h>
#include <unordered_map>
#include "util.hpp"
#include "config.hpp"

namespace cloud
{
    typedef struct BackupInfo
    {
        bool pack_flag;        // whether the file is currently compressed
        size_t fsize;          // file size
        time_t atime;          // last access time
        time_t mtime;          // last modification time
        std::string real_path; // actual storage path of the file
        std::string pack_path; // storage path of the compressed package
        std::string url;       // resource path in the download URL
        bool NewBackupInfo(const std::string &realpath)
        {
            FileUtil fu(realpath);
            if (fu.Exists() == false)
            {
                std::cout << "new backupinfo: file not exists!" << std::endl;
                return false;
            }
            Config *config = Config::GetInstance();
            std::string packdir = config->GetPackDir();
            std::string packsuffix = config->GetPackFileSuffix();
            std::string download_prefix = config->GetDownloadPrefix();

            this->pack_flag = false; // a newly added file is not compressed
            this->fsize = fu.FileSize();
            this->mtime = fu.LastMTime();
            this->atime = fu.LastATime();
            this->real_path = realpath;
            // ./backdir/a.txt --> ./packdir/a.txt.lz
            this->pack_path = packdir + fu.FileName() + packsuffix; 
            // ./backdir/a.txt --> /download/a.txt
            this->url = download_prefix + fu.FileName();
            return true;
        }
    } BackupInfo;
}

#endif

Because bundle.cpp is very large and slow to compile, we build it into a static library once and link against it (-lbundle in the Makefile):

cloud:cloud.cc util.hpp 
	g++ -o $@ $^ -L./lib -std=c++11 -lpthread -lstdc++fs -ljsoncpp -lbundle

.PHONY:clean
clean:
	rm -rf cloud

Write the test code:

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"

void DataTest(const std::string &filename)
{
    cloud::BackupInfo info;
    info.NewBackupInfo(filename);
    std::cout << info.pack_flag << std::endl;
    std::cout << info.fsize << std::endl;
    std::cout << info.mtime << std::endl;
    std::cout << info.atime << std::endl;
    std::cout << info.real_path << std::endl;
    std::cout << info.pack_path << std::endl;
    std::cout << info.url << std::endl;
}

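int main(int argc, char *argv[])
{
    if (argc < 2) // a file name is required
    {
        std::cout << "usage: ./cloud <filename>" << std::endl;
        return 1;
    }
    DataTest(argv[1]);

    return 0;
}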

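The information for the file bundle.h is retrieved successfully:

2. Implementing the Data Management Class

Persistent storage and initial loading are implemented later: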

#ifndef __MY_DATA__
#define __MY_DATA__

#include <pthread.h>
#include <unordered_map>
#include "util.hpp"
#include "config.hpp"

namespace cloud
{
    class DataManager
    {
    private:
        std::string _backup_file;                           // persistent storage file
        std::unordered_map<std::string, BackupInfo> _table; // in-memory hash table
        pthread_rwlock_t _rwlock;                           // read-write lock: shared for readers, exclusive for writers
    public:
        DataManager()
        {
            _backup_file = Config::GetInstance()->GetBackupFile();
            pthread_rwlock_init(&_rwlock, NULL); // initialize the read-write lock
        }

        ~DataManager()
        {
            pthread_rwlock_destroy(&_rwlock); // destroy the read-write lock
        }

        // Persist after every insert or update so that no data is lost
        bool Storage()
        {
            return true; // placeholder, implemented later
        }

        // Initial load: restore the saved data each time the system restarts
        bool InitLoad()
        {
            return true; // placeholder, implemented later
        }

        // Insert
        bool Insert(const BackupInfo &info)
        {
            pthread_rwlock_wrlock(&_rwlock);
            _table[info.url] = info;
            pthread_rwlock_unlock(&_rwlock); // release the lock
            return true;
        }

        // Update: same as Insert. Keys in the hash table are unique, so
        // assigning to an existing key overwrites its value
        bool Update(const BackupInfo &info)
        {
            pthread_rwlock_wrlock(&_rwlock);
            _table[info.url] = info;
            pthread_rwlock_unlock(&_rwlock); 
            return true;
        }

        bool GetOneByURL(const std::string &url, BackupInfo *info)
        {
            // Read-only access, so a shared read lock is enough
            pthread_rwlock_rdlock(&_rwlock);
            // url is the key, so find() locates the record directly
            auto it = _table.find(url);
            if (it == _table.end()) 
            {
                pthread_rwlock_unlock(&_rwlock); 
                return false;
            }
            *info = it->second;
            pthread_rwlock_unlock(&_rwlock); 
            return true;
        }

        bool GetOneRealPath(const std::string &realpath, BackupInfo *info)
        {
            pthread_rwlock_rdlock(&_rwlock);
            // real_path is not the key, so every record has to be checked
            auto it = _table.begin();
            for (; it != _table.end(); ++it) 
            {
                if(it->second.real_path == realpath)
                {
                    *info = it->second;
                    pthread_rwlock_unlock(&_rwlock); 
                    return true;
                }
            }
            pthread_rwlock_unlock(&_rwlock); 
            return false;
        }

        bool GetAll(std::vector<BackupInfo> *arry)
        {
            pthread_rwlock_rdlock(&_rwlock);
            auto it = _table.begin();
            for (; it != _table.end(); ++it) 
            {
                arry->push_back(it->second);
            }
            pthread_rwlock_unlock(&_rwlock); 
            return true;
        }
    };
}

#endif
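Every return path in the class above has to unlock the read-write lock by hand, which is easy to forget when a function returns early. One possible way to avoid that is a small RAII guard. The following is only a sketch: ReadGuard, WriteGuard and the Counter demo class are hypothetical names introduced for illustration and are not part of the project.

```cpp
#include <pthread.h>

// Takes a shared (read) lock on construction and releases it on destruction.
class ReadGuard {
public:
    explicit ReadGuard(pthread_rwlock_t *lock) : _lock(lock) { pthread_rwlock_rdlock(_lock); }
    ~ReadGuard() { pthread_rwlock_unlock(_lock); }
    ReadGuard(const ReadGuard &) = delete;
    ReadGuard &operator=(const ReadGuard &) = delete;
private:
    pthread_rwlock_t *_lock;
};

// Takes an exclusive (write) lock on construction and releases it on destruction.
class WriteGuard {
public:
    explicit WriteGuard(pthread_rwlock_t *lock) : _lock(lock) { pthread_rwlock_wrlock(_lock); }
    ~WriteGuard() { pthread_rwlock_unlock(_lock); }
    WriteGuard(const WriteGuard &) = delete;
    WriteGuard &operator=(const WriteGuard &) = delete;
private:
    pthread_rwlock_t *_lock;
};

// Tiny demo class that uses the guards the way DataManager could.
class Counter {
public:
    Counter() : _value(0) { pthread_rwlock_init(&_rwlock, nullptr); }
    ~Counter() { pthread_rwlock_destroy(&_rwlock); }
    void Set(int v) {
        WriteGuard guard(&_rwlock);
        _value = v;
    }
    bool GetIfPositive(int *out) {
        ReadGuard guard(&_rwlock);
        if (_value <= 0) return false; // the guard unlocks on this early return too
        *out = _value;
        return true;
    }
private:
    int _value;
    pthread_rwlock_t _rwlock;
};
```

With guards like these, a function such as GetOneByURL would need no explicit unlock calls, at the cost of one extra helper type per lock mode.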

Write the test code:

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"

void DataTest(const std::string &filename)
{
    cloud::BackupInfo info;
    info.NewBackupInfo(filename);

    cloud::DataManager data;
    std::cout << "----------Insert and GetOneByURL----------\n";

    data.Insert(info);

    cloud::BackupInfo tmp;
    data.GetOneByURL("/download/bundle.h", &tmp);
    std::cout << tmp.pack_flag << std::endl;
    std::cout << tmp.fsize << std::endl;
    std::cout << tmp.mtime << std::endl;
    std::cout << tmp.atime << std::endl;
    std::cout << tmp.real_path << std::endl;
    std::cout << tmp.pack_path << std::endl;
    std::cout << tmp.url << std::endl;

    std::cout << "----------Update and GetAll----------\n";
    info.pack_flag = true;
    data.Update(info);
    std::vector<cloud::BackupInfo> arry;
    data.GetAll(&arry);
    for (auto &a : arry)
    {
        std::cout << a.pack_flag << std::endl;
        std::cout << a.fsize << std::endl;
        std::cout << a.mtime << std::endl;
        std::cout << a.atime << std::endl;
        std::cout << a.real_path << std::endl;
        std::cout << a.pack_path << std::endl;
        std::cout << a.url << std::endl;
    }
    std::cout << "----------GetOneRealPath----------\n";
    data.GetOneRealPath(filename, &tmp);
    std::cout << tmp.pack_flag << std::endl;
    std::cout << tmp.fsize << std::endl;
    std::cout << tmp.mtime << std::endl;
    std::cout << tmp.atime << std::endl;
    std::cout << tmp.real_path << std::endl;
    std::cout << tmp.pack_path << std::endl;
    std::cout << tmp.url << std::endl;
}

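int main(int argc, char *argv[])
{
    if (argc < 2) // a file name is required
    {
        std::cout << "usage: ./cloud <filename>" << std::endl;
        return 1;
    }
    DataTest(argv[1]);

    return 0;
}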

Persistent Storage

Persist the data whenever a record is inserted or updated:

        // Insert
        bool Insert(const BackupInfo &info)
        {
            pthread_rwlock_wrlock(&_rwlock);
            _table[info.url] = info;
            pthread_rwlock_unlock(&_rwlock); 
            Storage(); // persist after every insert
            return true;
        }

        // Update: same as Insert. Keys in the hash table are unique, so
        // assigning to an existing key overwrites its value
        bool Update(const BackupInfo &info)
        {
            pthread_rwlock_wrlock(&_rwlock);
            _table[info.url] = info;
            pthread_rwlock_unlock(&_rwlock);
            Storage(); // persist after every update
            return true;
        }

        // Persist after every insert or update so that no data is lost
        bool Storage()
        {
            // 1. Get all records
            std::vector<BackupInfo> arry;
            this->GetAll(&arry);
            // 2. Add them to a Json::Value
            Json::Value root;
            for (size_t i = 0; i < arry.size(); i++)
            {
                Json::Value item;
                item["pack_flag"] = arry[i].pack_flag;
                // Json::Value has no overload for size_t or time_t,
                // so these fields are cast to Json::Int64
                item["fsize"] = (Json::Int64)arry[i].fsize;
                item["atime"] = (Json::Int64)arry[i].atime;
                item["mtime"] = (Json::Int64)arry[i].mtime;
                item["real_path"] = arry[i].real_path;
                item["pack_path"] = arry[i].pack_path;
                item["url"] = arry[i].url;
                root.append(item); // append as an array element
            }
            // 3. Serialize the Json::Value
            std::string body;
            JsonUtil::Serialize(root, &body);

            // 4. Write the file
            FileUtil fu(_backup_file);
            fu.SetContent(body);
            
            return true;
        }

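The test code is unchanged. Run the program:

The storage file cloud.dat has been generated. Its contents:

Initial Load

The saved information is loaded when the object is constructed: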

        DataManager()
        {
            _backup_file = Config::GetInstance()->GetBackupFile();
            pthread_rwlock_init(&_rwlock, NULL); // initialize the read-write lock
            InitLoad(); // load the saved data when the object is constructed
        }

        // Initial load: read the saved data from the file when the program starts
        bool InitLoad()
        {
            // 1. Read the contents of the data file
            FileUtil fu(_backup_file);
            if (fu.Exists() == false) return true; // nothing saved yet
            std::string body;
            fu.GetContent(&body);

            // 2. Deserialize
            Json::Value root;
            JsonUtil::UnSerialize(body, &root); 

            // 3. Add the records from the Json::Value to the table
            for (int i = 0; i < static_cast<int>(root.size()); i++)
            {
                BackupInfo info;
                info.pack_flag = root[i]["pack_flag"].asBool();
                info.fsize = root[i]["fsize"].asInt64();
                info.atime = root[i]["atime"].asInt64();
                info.mtime = root[i]["mtime"].asInt64();
                info.pack_path = root[i]["pack_path"].asString();
                info.real_path = root[i]["real_path"].asString();
                info.url = root[i]["url"].asString();
                // Write to the table directly: calling Insert() here would rewrite
                // the whole storage file once per record while loading
                pthread_rwlock_wrlock(&_rwlock);
                _table[info.url] = info;
                pthread_rwlock_unlock(&_rwlock);
            }
            return true;    
        }

Write the test code:

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"

void DataTest(const std::string &filename)
{
    cloud::DataManager data;
    std::vector<cloud::BackupInfo> arry;
    data.GetAll(&arry);
    for (auto &a : arry)
    {
        std::cout << a.pack_flag << std::endl;
        std::cout << a.fsize << std::endl;
        std::cout << a.mtime << std::endl;
        std::cout << a.atime << std::endl;
        std::cout << a.real_path << std::endl;
        std::cout << a.pack_path << std::endl;
        std::cout << a.url << std::endl; 
    }
}

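int main(int argc, char *argv[])
{
    // DataTest does not use the file name here, so fall back to an
    // empty string when no argument is given
    DataTest(argc > 1 ? argv[1] : "");

    return 0;
}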

The information for bundle.h is loaded from the storage file successfully:

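11. Implementing the Server-Side Hot Management Module

Hot file management on the server checks the backed-up files. A file that has not been accessed for a long time is treated as a non-hot file and is stored compressed to save disk space.

Approach:

Traverse all files and subtract each file's last access time from the current time. If the difference is greater than the configured non-hot threshold, the file is non-hot: it is compressed into the compression directory and the source file is then deleted.

Note: uploaded files and compressed non-hot files are stored in separate directories, so scanning the upload directory never picks up a file that has already been compressed.

Flow:

Traverse the backup directory and get all file path names.
For each file, compare its last access time with the current system time.
Compress each non-hot file and delete the source file.
Update the file's record in the data management module (compression flag --> true).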
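The time comparison in the second step can be sketched in isolation as follows. FileAccess, IsNonHot and SelectNonHot are hypothetical names used only for this illustration; the access times are passed in as plain values, so nothing here touches the file system.

```cpp
#include <ctime>
#include <string>
#include <vector>

// Hypothetical record: a file path and its last access time.
struct FileAccess {
    std::string path;
    std::time_t atime;
};

// True when the file has not been accessed for more than hot_time seconds.
inline bool IsNonHot(std::time_t atime, std::time_t now, int hot_time) {
    return now - atime > hot_time;
}

// Returns the paths of all non-hot files, i.e. the candidates for compression.
inline std::vector<std::string> SelectNonHot(const std::vector<FileAccess> &files,
                                             std::time_t now, int hot_time) {
    std::vector<std::string> result;
    for (const auto &f : files) {
        if (IsNonHot(f.atime, now, hot_time)) result.push_back(f.path);
    }
    return result;
}
```

Note that a file accessed exactly hot_time seconds ago still counts as hot in this sketch, because the comparison is strict.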

(1) Design of the Hot Management Class

#ifndef __MY_HOT__
#define __MY_HOT__

#include "data.hpp"
#include <unistd.h>

// The data manager is accessed from several modules, so it is defined as a global object elsewhere and only declared here
extern cloud::DataManager *_data;
namespace cloud
{
    class HotManager
    {
    private:
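        std::string _back_dir;    // backup directory
        std::string _pack_dir;    // compressed-package directory
        std::string _pack_suffix; // suffix of compressed packages
        int _hot_time;            // hot-file threshold in seconds
    private:
        // Returns true for a non-hot file and false for a hot file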
        bool HotJudge(const std::string &filename)
        {
            FileUtil fu(filename);
            time_t last_time = fu.LastATime();
            time_t cur_time = time(NULL);
            if (cur_time - last_time > _hot_time)
                return true;
            return false;
        }

    public:
        HotManager()
        {
            Config *config = Config::GetInstance();
            _back_dir = config->GetBackDir();
            _pack_dir = config->GetPackDir();
            _pack_suffix = config->GetPackFileSuffix();
            _hot_time = config->GetHotTime();
            FileUtil tmp1(_back_dir);
            FileUtil tmp2(_pack_dir);
            tmp1.CreateDirectory();
            tmp2.CreateDirectory();
        }
        bool RunModule()
        {
            while (1)
            {
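                // 1. Traverse the backup directory and get all file names
                FileUtil fu(_back_dir);
                std::vector<std::string> arry;
                fu.ScanDirectory(&arry);
                // 2. Check each file to see whether it is non-hot
                for (auto &a : arry)
                {
                    if (HotJudge(a) == false)
                        continue; // hot files need no special handling
                    // Get the backup record of the file
                    BackupInfo bi;
                    if (_data->GetOneRealPath(a, &bi) == false)
                    {
                        // The file exists but has no backup record,
                        // so create a new record for it
                        bi.NewBackupInfo(a);
                    }
                    // 3. Compress the non-hot file
                    FileUtil tmp(a);
                    tmp.Compress(bi.pack_path);
                    // 4. Delete the source file and update the backup record
                    tmp.Remove();
                    bi.pack_flag = true;
                    _data->Update(bi);
                }
                // Pause briefly so that rescanning an empty directory in a tight
                // loop does not burn CPU (usleep takes microseconds: this is 1 ms)
                usleep(1000);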
            }
            return true;
        }
    };
}

#endif

Write the test code:

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"
#include "hot.hpp"

cloud::DataManager *_data;
void HotTest()
{
    _data = new cloud::DataManager();
    cloud::HotManager hot;
    hot.RunModule();
}

int main(int argc, char *argv[])
{
    HotTest();

    return 0;
}

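Run the program: the two directories are created. Copy a file into the backup directory backdir. Because the hot time is set to 30 seconds, after 30 seconds the source file is gone and its compressed package appears in the compression directory packdir:

The new file's information has also been added to cloud.dat:

12. Implementing the Server-Side Business Processing Module

The network communication module and the business processing module are merged:

An HTTP server is built with the httplib library to communicate with clients;

each request received is dispatched to the corresponding handler, which then sends the response (file upload, list view, and file download, including resumable download):

File upload request: store the file uploaded by the client and respond that the upload succeeded.
File list request: the client's browser asks for a page that lists the backed-up files; respond with that page.
File download request: the user clicks a download link on the list page; respond with the data of the requested file.

(1) Design of the Business Processing Class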

class Service
{
private:
    int _server_port;             // server port
    std::string _server_ip;       // server IP address
    std::string _download_prefix; // download path prefix, e.g. /download/
    httplib::Server _server;      // HTTP server
public:
    Service()
    {
        Config *config = Config::GetInstance();
        _server_port = config->GetServerPort();
        _server_ip = config->GetServerIP();
        _download_prefix = config->GetDownloadPrefix();
    }
    bool RunModule()
    {
        _server.Post("/upload", Upload);    // upload request
        _server.Get("/listshow", ListShow); // list request
        _server.Get("/", ListShow);         // the root path is also treated as a list request
        std::string download_url = _download_prefix + "(.*)"; // regular expression matching any path under the prefix
        _server.Get(download_url, Download); // download request
        return _server.listen(_server_ip, _server_port); // blocks until the server stops
    }
private:
    static void Upload(const httplib::Request &req, httplib::Response &rsp);
    static void ListShow(const httplib::Request &req, httplib::Response &rsp);
    static void Download(const httplib::Request &req,httplib::Response &rsp);
};

1. Handling File Upload Requests

        // Upload a file
        static void Upload(const httplib::Request &req, httplib::Response &rsp)
        {
            // POST /upload: the file data is carried in the body
            // (the body is multipart, so it is not only file data)
            auto ret = req.has_file("file"); // is there a form field named "file"?
            if (ret == false)
            {
                rsp.status = 400;
                return;
            }
            const auto &file = req.get_file_value("file");
            // file.filename: file name      file.content: file data
            std::string back_dir = Config::GetInstance()->GetBackDir();
            std::string realpath = back_dir + FileUtil(file.filename).FileName();
            FileUtil fu(realpath);
            if (fu.SetContent(file.content) == false) // write the uploaded data to the backup file
            {
                rsp.status = 500;
                return;
            }
            BackupInfo info;
            info.NewBackupInfo(realpath); // build the backup record for the file
            _data->Insert(info); // add the record to the data management module
            return;
        }

Write the test code:

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"
#include "hot.hpp"
#include "service.hpp"

cloud::DataManager *_data;
void ServiceTest()
{
    cloud::Service srv;
    srv.RunModule();
}

int main(int argc, char *argv[])
{
    _data = new cloud::DataManager();
    ServiceTest();
    return 0;
}

Write a front-end upload page:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <form action="http://43.139.55.38:8080/upload" method="post"
    enctype="multipart/form-data">
        <div>
            <input type="file" name="file">
        </div>
        <div><input type="submit" value="上传"></div>
    </form>
</body>
</html>

Choose a file and click upload:

The uploaded file appears in the backup directory, and its information has been added to cloud.dat:

2. Handling List Page Requests

        static std::string TimetoStr(time_t t)
        {
            // std::ctime returns a human-readable string that ends with '\n'
            std::string tmp = std::ctime(&t);
            return tmp;
        }
        // Show the file list
        static void ListShow(const httplib::Request &req, httplib::Response &rsp)
        {
            // 1. Get the backup records of all files
            std::vector<BackupInfo> arry;
            _data->GetAll(&arry);
            // 2. Build the HTML page from the backup records
            std::stringstream ss;
            ss << "<html><head><meta charset='UTF-8'><title>Download</title></head>";
            ss << "<body><h1>Download</h1><table>";
            for (auto &a : arry)
            {
                // Show the entry if the backup still exists on disk, either as the
                // original file or, for a compressed non-hot file, as its package
                FileUtil fu(a.pack_flag ? a.pack_path : a.real_path);
                if (fu.Exists())
                {
                    ss << "<tr>";
                    std::string filename = FileUtil(a.real_path).FileName();
                    ss << "<td><a href='" << a.url << "'>" << filename << "</a></td>";
                    ss << "<td align='right'>" << TimetoStr(a.mtime) << "</td>";
                    ss << "<td align='right'>" << a.fsize / 1024 << "k</td>";
                    ss << "</tr>";
                }
            }
            ss << "</table></body></html>";
            rsp.body = ss.str();
            rsp.set_header("Content-Type", "text/html");
            rsp.status = 200;
            return;
        }
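The list page writes file names and URLs into the HTML as they are. A file name that contains characters such as < or ' could break the markup, so one option is to escape such values before writing them. The helper below is a hedged sketch; HtmlEscape is a hypothetical name and is not part of the project or of httplib.

```cpp
#include <string>

// Replaces the characters that have a special meaning in HTML text and in
// quoted attribute values with their entity equivalents.
inline std::string HtmlEscape(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&#39;";  break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

In the handler this would be applied where the name is written into the stream, for example ss << HtmlEscape(filename). The single quote matters here because the href attribute in the page is delimited by single quotes.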

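Upload a few more files:

A request whose resource path is "/" or "/listshow" is a file-list request:

3. Handling Download Requests

The HTTP ETag header field carries a unique identifier of a resource.

The client receives this value in the response the first time it downloads a file. On the next request it sends the value back so that the server can tell from the identifier whether the resource has been modified. If it has not, the client uses its cached copy instead of downloading again.

HTTP does not care what the ETag contains, as long as the server can identify the resource with it. Here the ETag is built as "file name-file size-last modification time". The ETag is used not only for caching but also for the resumable download implemented later, which likewise has to make sure the file has not been modified.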

        static std::string GetETag(const BackupInfo &info)
        {
            // etag: filename-fsize-mtime
            FileUtil fu(info.real_path);
            std::string etag = fu.FileName();
            etag += "-";
            etag += std::to_string(info.fsize);
            etag += "-";
            etag += std::to_string(info.mtime);
            return etag;
        }

        // Download a file
        static void Download(const httplib::Request &req, httplib::Response &rsp)
        {
            //1. The resource path requested by the client is req.path
            BackupInfo info;
            //2. Get the backup record for that resource path
            if (_data->GetOneByURL(req.path, &info) == false)
            {
                rsp.status = 404; // no backup record for this path
                return;
            }

            //3. Check whether the file is compressed.
            //If it is, decompress it first, delete the package and
            //update the backup record (no longer compressed)
            if (info.pack_flag == true)
            {
                FileUtil fu(info.pack_path);
                fu.UnCompress(info.real_path);
                fu.Remove();
                info.pack_flag = false;
                _data->Update(info);
            }

            //4. Read the file data into rsp.body
            FileUtil fu(info.real_path);
            fu.GetContent(&rsp.body);
            //5. Set the response headers: ETag, Accept-Ranges: bytes
            rsp.set_header("Accept-Ranges", "bytes");
            rsp.set_header("ETag", GetETag(info));
            // The response body is a binary data stream
            rsp.set_header("Content-Type", "application/octet-stream"); 
            rsp.status = 200;
        }

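The HTTP header Accept-Ranges: bytes tells the client that the server supports range requests (and therefore resumable download), with the byte as the range unit.

The test code is unchanged. Click a file in the list to download it; the download succeeds:

Go to the source directory and the download directory and compare the md5 values of the two files. They are identical, so the contents match:

4. Implementing Resumable Download

Function: if a download is interrupted by some failure, starting over from the beginning is inefficient, because the data already transferred has to be sent again. A resumable download continues from the position where the previous download stopped, so the data already transferred is not sent again.

Purpose: make re-transferring a file more efficient.

Idea:

While downloading, the client records how much data it has received each time it writes received data to the file. If the download is interrupted, the next request sends the server the range still needed (start position and end position), and the server returns only that range.

One issue to consider: if the file has been modified on the server since the previous download, the download must not be resumed; the whole file has to be downloaded again.

Resumable download in HTTP hinges on two things:

The client can tell the server which byte range it wants.
The server can detect whether the file has been modified since the previous download.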
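These two points can be sketched as a pair of small functions: one builds the headers a client would send to resume, the other is the check a server would make before honoring the range. MakeETag, BuildResumeHeaders and CanResume are hypothetical names introduced for illustration only; the ETag layout follows the "file name-file size-last modification time" format used in this article.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Builds an identifier in the article's format: name-size-mtime.
inline std::string MakeETag(const std::string &name, std::size_t fsize, long long mtime) {
    return name + "-" + std::to_string(fsize) + "-" + std::to_string(mtime);
}

// Client side: headers for resuming after `downloaded` bytes have been saved.
// "bytes=N-" asks for everything from offset N to the end of the file.
inline std::map<std::string, std::string> BuildResumeHeaders(const std::string &saved_etag,
                                                             std::size_t downloaded) {
    return {
        {"If-Range", saved_etag},
        {"Range", "bytes=" + std::to_string(downloaded) + "-"},
    };
}

// Server side: resume only if the client's identifier still matches the file.
inline bool CanResume(const std::string &if_range, const std::string &current_etag) {
    return !if_range.empty() && if_range == current_etag;
}
```

If the file was modified after the first download, its size or modification time changes, the identifiers no longer match, and the server falls back to sending the whole file.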

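File download request:

GET /download/test.txt HTTP/1.1

...

Corresponding response:

HTTP/1.1 200 OK

Accept-Ranges: bytes --- tells the client that the server supports resumable download

ETag: "ksahjfdfhghdj-unique file identifier" --- the client saves this value when it receives the response

...

File data (body)

Resumable download request:

GET /download/test.txt HTTP/1.1

If-Range: "the ETag the server returned with the original download" --- lets the server check whether the file is still the same as the one downloaded before.

Range: bytes=start-end --- tells the server which byte range the client needs. For example, Range: bytes=100-10000 requests bytes 100 through 10000 (offsets start at 0 and both ends are inclusive); Range: bytes=100- requests everything from byte 100 to the end of the file.

...

Corresponding response:

HTTP/1.1 206 Partial Content

ETag: "gsvblseuat"
Content-Range: bytes 100-10000/file size

...

File data (body)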
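httplib parses the Range header internally, but it can help to see what that parsing involves. The sketch below handles only the single-range forms bytes=start-end and bytes=start- (in HTTP the range unit and the range are joined by "="). ParseRange and MakeContentRange are hypothetical helper names; suffix ranges such as bytes=-500 and multi-range requests are deliberately rejected to keep the example short.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parses "bytes=START-END" or "bytes=START-" for a file of fsize bytes.
// On success stores the inclusive offsets in *start and *end and returns true.
inline bool ParseRange(const std::string &value, std::size_t fsize,
                       std::size_t *start, std::size_t *end) {
    const std::string prefix = "bytes=";
    const std::string digits = "0123456789";
    if (fsize == 0 || value.compare(0, prefix.size(), prefix) != 0)
        return false;
    std::size_t dash = value.find('-', prefix.size());
    if (dash == std::string::npos || dash == prefix.size())
        return false; // no dash, or a suffix range such as "bytes=-500"
    std::string first = value.substr(prefix.size(), dash - prefix.size());
    std::string last = value.substr(dash + 1);
    if (first.find_first_not_of(digits) != std::string::npos ||
        last.find_first_not_of(digits) != std::string::npos)
        return false; // also rejects multi-range values like "0-10,20-30"
    try {
        unsigned long long s = std::stoull(first);
        unsigned long long e = last.empty() ? fsize - 1 : std::stoull(last);
        if (e >= fsize) e = fsize - 1; // an end past the file is clamped
        if (s > e) return false;       // start beyond the end of the file
        *start = static_cast<std::size_t>(s);
        *end = static_cast<std::size_t>(e);
        return true;
    } catch (const std::exception &) {
        return false; // number too large to represent
    }
}

// Builds the Content-Range value for a 206 response.
inline std::string MakeContentRange(std::size_t start, std::size_t end, std::size_t fsize) {
    return "bytes " + std::to_string(start) + "-" + std::to_string(end) + "/" +
           std::to_string(fsize);
}
```

The number of bytes to send for a parsed range is end - start + 1, because both offsets are inclusive.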

        // Download a file
        static void Download(const httplib::Request &req, httplib::Response &rsp)
        {
            //1. The resource path requested by the client is req.path
            BackupInfo info;
            //2. Get the backup record for that resource path
            if (_data->GetOneByURL(req.path, &info) == false)
            {
                rsp.status = 404; // no backup record for this path
                return;
            }

            //3. Check whether the file is compressed.
            //If it is, decompress it first, delete the package and
            //update the backup record (no longer compressed)
            if (info.pack_flag == true)
            {
                FileUtil fu(info.pack_path);
                fu.UnCompress(info.real_path);
                fu.Remove();
                info.pack_flag = false;
                _data->Update(info);
            }

            bool retrans = false; // resumable-download flag
            std::string old_etag;
            if (req.has_header("If-Range"))
            {
                old_etag = req.get_header_value("If-Range");
                // Resume only if If-Range is present and its value matches
                // the current ETag of the requested file
                if (old_etag == GetETag(info))
                {
                    retrans = true;
                }
            }

            //4. Read the file data into rsp.body
            FileUtil fu(info.real_path);

            if (retrans == false) // false: normal full download
            {
                fu.GetContent(&rsp.body);
                //5. Set the response headers: ETag, Accept-Ranges: bytes
                rsp.set_header("Accept-Ranges", "bytes");
                rsp.set_header("ETag", GetETag(info));
                // The response body is a binary data stream
                rsp.set_header("Content-Type", "application/octet-stream"); 
                rsp.status = 200;
            }
            else                  // true: resumable download
            {
                // httplib handles range requests (resumable download) internally.
                // We only need to read the whole file into rsp.body; the library
                // then takes the requested range out of the body for the response
                fu.GetContent(&rsp.body);
                rsp.set_header("Accept-Ranges", "bytes");
                rsp.set_header("ETag", GetETag(info));
                rsp.set_header("Content-Type", "application/octet-stream"); 
                // rsp.set_header("Content-Range", "bytes start-end/fsize");
                rsp.status = 206;
            }
        }

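During a download, disconnect the network or stop the server to test resumable download:

Restart the server: the download continues from the position where it was interrupted:

Go to the source directory and the download directory and compare the md5 values of the two files. They are identical, so the contents match:

(2) Introducing Multithreading

Hot management and the HTTP service each run inside their own loop, so we use two threads to run them independently: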

#include "util.hpp"
#include "config.hpp"
#include "data.hpp"
#include "hot.hpp"
#include "service.hpp"
#include <thread>

cloud::DataManager *_data;
void HotTest()
{
    cloud::HotManager hot;
    hot.RunModule();
}

void ServiceTest()
{
    cloud::Service srv;
    srv.RunModule();
}

int main(int argc, char *argv[])
{
    _data = new cloud::DataManager();

    std::thread thread_hot_manager(HotTest);
    std::thread thread_service(ServiceTest);
    thread_hot_manager.join();
    thread_service.join();
    return 0;
}

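Because these files have not been accessed for more than 30 seconds, they become non-hot files and are compressed into the package directory:

If a file has already been downloaded, delete the downloaded copy first, then go back to the file list page and download it again to test hot management:

After the download, the file is hot again and has been decompressed into the backup directory; once the hot time passes, it becomes non-hot again and is compressed back into the package directory:

13. Overall Logic of the Client Modules

Data management module: manages information about backed-up files. Whether a file needs to be backed up is decided from this data.

Directory traversal module: gets the path names of all files in the specified folder.

File backup module: uploads the files that need backup to the server.

The information kept by the data management module is used to decide whether a file needs to be backed up again:

Is the file new?
If it is not new, has it been modified since the last backup?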
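These two questions map directly onto a lookup in the table of recorded identifiers. The following is a minimal sketch; NeedsBackup is a hypothetical function name, and the identifier strings are assumed to follow the "file name-file size-last modification time" idea used elsewhere in this article.

```cpp
#include <string>
#include <unordered_map>

// table maps a file path to the identifier recorded at its last backup.
// current_id is the identifier computed from the file as it is now.
inline bool NeedsBackup(const std::unordered_map<std::string, std::string> &table,
                        const std::string &path, const std::string &current_id) {
    auto it = table.find(path);
    if (it == table.end()) return true; // new file: never backed up
    return it->second != current_id;    // backed up before: changed since then?
}
```

A file is skipped only when it is already in the table and its identifier is unchanged.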

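Design of the data management module:

In-memory storage: fast access, using a hash table (unordered_map).
Persistent storage: a file.

Managed data: the file path name plus the file's unique identifier. Storing it in a file requires serialization. Because installing jsoncpp for VS is rather troublesome, we define our own serialization format:

key + val: key is the file path name, val is the file's unique identifier.
key and val are separated by a space, and entries are separated by a newline: key val\nkey val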
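As a rough illustration of this format, the sketch below writes a table out and reads it back using standard string streams. Serialize and Parse are hypothetical names chosen for this example. The sketch assumes that neither the path nor the identifier contains a space or a newline; a path with a space in it would not survive the round trip with this format.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>

using Table = std::unordered_map<std::string, std::string>;

// One entry per line: key, a single space, value, newline.
inline std::string Serialize(const Table &table) {
    std::ostringstream ss;
    for (const auto &kv : table) {
        ss << kv.first << " " << kv.second << "\n";
    }
    return ss.str();
}

// Reads the format back. Lines without a usable "key value" pair are skipped.
inline Table Parse(const std::string &body) {
    Table table;
    std::istringstream in(body);
    std::string line;
    while (std::getline(in, line)) {
        std::size_t pos = line.find(' ');
        if (pos == std::string::npos || pos == 0 || pos + 1 >= line.size())
            continue; // empty line, missing separator, or empty key/value
        table[line.substr(0, pos)] = line.substr(pos + 1);
    }
    return table;
}
```

Choosing a separator that cannot appear in a path, such as a tab, would be one way to relax the no-space assumption.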

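Note: the client is developed on Windows with VS2017 or later (C++17 support is required).

(1) Client Utility Class

Copy the server's utility header util.hpp, remove the compression/decompression interfaces and the serialization class, and check that the resulting util.hpp still compiles and runs: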

#ifndef __MY_UTIL__
#define __MY_UTIL__

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>
#include <experimental/filesystem>

namespace cloud
{
    namespace fs = std::experimental::filesystem;
    class FileUtil
    {
    private:
        std::string _filename;

    public:
        FileUtil(const std::string& filename) : _filename(filename) {}

        // 删除文件
        bool Remove()
        {
            if (this->Exists() == false) return true;
            remove(_filename.c_str());
            return true;
        }

        // 判断文件是否存在
        bool Exists()
        {
            return fs::exists(_filename);
        }

        // 获取文件大小
        size_t FileSize()
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file size failed!\n" << std::endl;
                return 0;
            }
            return st.st_size;
        }
        //获取文件最后一次修改时间
        time_t LastMtime()
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file modification time failed!\n" << std::endl;
                return -1;
            }
            return st.st_mtime;
        }

        // 获取文件最后一次访问时间
        time_t LastATime()
        {
            struct stat st;
            if (stat(_filename.c_str(), &st) < 0)
            {
                std::cout << "get file access time failed!\n" << std::endl;
                return -1;
            }
            return st.st_atime;
        }

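        // Get the file name component of the path
        std::string FileName()
        {
            // ./abc/abcd.txt --> abcd.txt
            // The client runs on Windows, where paths may use '\' as well as '/',
            // so look for the last occurrence of either separator
            size_t pos = _filename.find_last_of("/\\");
            if (pos == std::string::npos) return _filename;
            return _filename.substr(pos + 1);
        }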

        // 获取文件指定位置，指定长度的数据
        bool GetPosLen(std::string* body, size_t pos, size_t len)
        {
            size_t fsize = this->FileSize();// 文件大小
            if (pos + len > fsize)// 超过文件最大边界则错误
            {
                std::cout << "pos+len longer than file size\n" << std::endl;
                return false;
            }

            std::ifstream ifs;
            ifs.open(_filename, std::ios::binary);// 以二进制格式打开文件
            if (ifs.is_open() == false)
            {
                std::cout << "read open file failed!\n" << std::endl;
                return false;
            }

            ifs.seekg(pos, std::ios::beg);// 从起始位置跳到pos位置
            body->resize(len);
            ifs.read(&(*body)[0], len);// 读取指定长度数据到body
            if (ifs.good() == false)
            {
                std::cout << "get file content failed\n" << std::endl;
                ifs.close();
                return false;
            }
            ifs.close();// 关闭流
            return true;
        }

        bool GetContent(std::string* body)// 获取文件内容
        {
            size_t fsize = this->FileSize();
            return GetPosLen(body, 0, fsize);
        }

        bool SetContent(const std::string& body)// 向文件写入数据
        {
            std::ofstream ofs;
            ofs.open(_filename, std::ios::binary);// 以二进制格式打开
            if (ofs.is_open() == false)
            {
                std::cout << "write open file failed!\n" << std::endl;
                return false;
            }
            ofs.write(&body[0], body.size());// 写入body数据到文件
            if (ofs.good() == false)
            {
                std::cout << "write file content failed!\n" << std::endl;
                ofs.close();
                return false;
            }
            ofs.close();
            return true;
        }

        // 创建多级文件夹
        bool CreateDirectory()
        {
            if (this->Exists())
                return true;
            return fs::create_directories(_filename);
        }

        // 扫描指定文件夹的所有文件路径名称 
        bool ScanDirectory(std::vector<std::string>* array)
        {
            this->CreateDirectory();
            // 为目录就跳过继续扫描文件
            for (auto& p : fs::directory_iterator(_filename))
            {

                if (fs::is_directory(p) == true)
                {
                    continue;
                }
                // relative_path 带有路径的文件名，p本身不是string类型，需经过一系列转换
                array->push_back(fs::path(p).relative_path().string());
            }
            return true;
        }
    };
}

#endif

Write test code that prints the paths of all files in the current directory:

#include "util.hpp"

int main()
{
	cloud::FileUtil fu("./");
	std::vector<std::string> arry;
	fu.ScanDirectory(&arry);
	for (auto& a : arry)
	{
		std::cout << a << std::endl;
	}
	return 0;
}

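It runs successfully:

(2) Implementing the Data Management Class

Managed data: the file path name plus the file's unique identifier. Storage format: file path name + " " + unique identifier + "\n"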

#ifndef __MY_DATA__
#define __MY_DATA__

#include <unordered_map>
#include <sstream>
#include "util.hpp"

namespace cloud
{
	class DataManager
	{
	private:
		std::string _backup_file; // 备份信息的持久化存储文件
		std::unordered_map<std::string, std::string> _table;
	public:
		DataManager(const std::string& backup_file) : _backup_file(backup_file)
		{
			InitLoad();
		}

		bool Storage()
		{
			//1.获取所有的备份信息
			std::stringstream ss;
			auto it = _table.begin();
			for (; it != _table.end(); ++it)
			{
				//2. 将所有信息进行指定持久化格式的组织 
				// 这里采取  key + " " + val + "\n" 格式存储
				ss << it->first << " " << it->second << "\n";
			}

			//3.持久化存储
			FileUtil fu(_backup_file);
			fu.SetContent(ss.str());
			return true;
		}

		int	Split(const std::string& str, const std::string& sep, std::vector<std::string>* arry)
		{
			int count = 0;
			size_t pos = 0, idx = 0;
			while (1)
			{
				// abc bcd def
				// find(要查找的字符，从哪里开始查找的偏移量)
				pos = str.find(sep, idx);
				if (pos == std::string::npos)
					break;
				if (pos == idx)
				{
					idx = pos + sep.size();
					continue;
				}
				// substr(截取起始位置，长度)
				std::string tmp = str.substr(idx, pos - idx);
				arry->push_back(tmp);
				count++;
				idx = pos + sep.size();
			}
			if (idx < str.size())
			{
				arry->push_back(str.substr(idx));
				count++;
			}
			return count;
		}

		// 数据初始化加载
		bool InitLoad()
		{
			//1. 从文件中读取所有数据
			FileUtil fu(_backup_file);
			std::string body;
			fu.GetContent(&body);
			//2.进行数据解析，添加至表中
			std::vector<std::string> arry;
			Split(body, "\n", &arry);
			for (auto& a : arry)
			{ 
				// key + " " + val + "\n" 
				// a.txt a.txt-34567-345636
				std::vector <std::string> tmp;
				Split(a, " ", &tmp);
				if (tmp.size() != 2)
					continue;
				_table[tmp[0]] = tmp[1];
			}
			return true;
		}
		
		bool Insert(const std::string& key, const std::string& val)
		{
			_table[key] = val;
			Storage();
			return true;
		}

		bool Update(const std::string& key, const std::string& val) 
		{
			_table[key] = val;
			Storage();
			return true;
		}
		
		bool GetOneByKey(const std::string& key, std::string* val) 
		{
			auto it = _table.find(key);
			if (it == _table.end())
				return false;
			*val = it->second;
			return true;
		}
	};
}

#endif

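First test whether the information of every file in the current directory (treated as files to back up) can be stored in the specified format: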

#define _CRT_SECURE_NO_WARNINGS 1
#include "util.hpp"
#include "data.hpp"

#define BACKUP_FILE "./backup.dat"

int main()
{
	cloud::FileUtil fu("./");
	std::vector<std::string> arry;
	fu.ScanDirectory(&arry);
	cloud::DataManager data(BACKUP_FILE);
	for (auto& a : arry)
	{
		data.Insert(a, "nauighahb"); // arbitrary value, for testing only
	} 

	return 0;
}

因为当前目录之前没有backup.dat这个文件，所以就有提示错误，这个并不影响我们测试。程序运行后就会生成 backup.dat 这个文件，下次客户端再运行程序时因为已经存在 backup.dat，所以就不会提示错误信息：

查看 backup.dat， 你可以看到当前目录下的所有文件(除backup.dat)都以 文件路径 + " " + 文件唯一标识(随便设置的) + "\n" 格斯添加进了backup.dat文件。

Next, test whether the unique identifier of a given file can be looked up in the backup-info file (backup.dat):

#define _CRT_SECURE_NO_WARNINGS 1
#include "util.hpp"
#include "data.hpp"

#define BACKUP_FILE "./backup.dat"

int main()
{
	//cloud::FileUtil fu("./");
	//std::vector<std::string> arry;
	//fu.ScanDirectory(&arry);
	//cloud::DataManager data(BACKUP_FILE);
	//for (auto& a : arry)
	//{
	//	data.Insert(a, "nauighahb"); // an arbitrary string, used only for this test
	//} 

	// Look up the file's unique identifier
	cloud::DataManager data(BACKUP_FILE);
	std::string str;
	data.GetOneByKey(".\\cloud.cpp", &str); 
	std::cout << str << std::endl;
	/* The path refers to a file in a Windows directory, where \ is the separator,
	   so each backslash must be escaped as \\ in the string literal */
	return 0;
}

The identifier returned for cloud.cpp matches the arbitrary one we stored:

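（三）Client backup class

Purpose: automatically back up the files in a specified folder to the server.

Workflow:

Scan the specified folder and collect file information
Decide whether each file needs to be backed up
Upload the files that need to be backed up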

#ifndef __MY_CLOUD__
#define __MY_CLOUD__

#include "data.hpp"
#include <sstream>
#include <string>
#include <vector>
#include <Windows.h>

namespace cloud
{
	class Backup
	{
	private:
		std::string _back_dir;
		DataManager* _data;
	public:
		Backup(const std::string& back_dir, const std::string& back_file)
			: _back_dir(back_dir)
		{
			_data = new DataManager(back_file);
		}

		// Build the file's unique identifier
		std::string GetFileIdentifier(const std::string& filename)
		{
			// Format: file name-file size-last modification time
			FileUtil fu(filename);
			std::stringstream ss;
			ss << fu.FileName() << "-" << fu.FileSize() << "-" << fu.LastMtime();
			return ss.str();
		}

		bool Runmodule()
		{
			// Loop forever
			while (1)
			{
				FileUtil fu(_back_dir);
				std::vector<std::string> arry;
				fu.ScanDirectory(&arry);
				for (auto& a : arry)
				{
					std::string id = GetFileIdentifier(a);
					_data->Insert(a, id);
				}
				// Win32 Sleep() takes milliseconds; wait one second between scans
				Sleep(1000);
			}
		}
	};
}

#endif

Write test code and run the program:

#define _CRT_SECURE_NO_WARNINGS 1
#include "util.hpp"
#include "data.hpp"
#include "cloud.hpp"

#define BACKUP_FILE "./backup.dat"
#define BACKUP_DIR "./backup"

int main()
{
	cloud::Backup backup(BACKUP_DIR, BACKUP_FILE);
	backup.Runmodule();
	return 0;
}

A backup directory named backup is created under the code directory. Copy data.hpp and util.hpp into it:

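Check backup.dat again: as the two files are added to the backup directory, their records are added to the backup-info file backup.dat as well, which shows that the process is dynamic: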
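
The identifier written for each file follows the pattern file name-file size-last modification time. As a rough sketch of how the three parts combine, assuming the values have already been read from the file system, consider the helper below. MakeIdentifier is a hypothetical name, not a function from cloud.hpp:

```cpp
#include <cstdint>
#include <ctime>
#include <sstream>
#include <string>

// Hypothetical helper: join the three parts into "name-size-mtime".
// The article's GetFileIdentifier obtains these values through FileUtil;
// here they are passed in directly so the sketch has no file-system dependency.
std::string MakeIdentifier(const std::string& name, std::uintmax_t size, std::time_t mtime)
{
    std::ostringstream ss;
    ss << name << "-" << size << "-" << static_cast<long long>(mtime);
    return ss.str();
}
```

Any change to the size or to the modification time produces a different string, which is what lets the client notice that an already backed-up file has been edited.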

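（四）Implementing file upload

Before uploading, we must decide whether a file needs to be uploaded at all, so we first implement an interface that makes this decision: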

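		bool IsNeedUpload(const std::string& filename)
		{
			// A file needs uploading if: 1. it is new, or 2. it is not new but has been modified
			// 1. New file: no backup record exists for it
			// 2. Modified file: a record exists, but the stored identifier differs from the current one

			std::string id;
			if (_data->GetOneByKey(filename, &id) != false)
			{
				// A backup record exists
				std::string new_id = GetFileIdentifier(filename);
				if (new_id == id)
					return false; // No upload needed: unchanged since the last upload
			}

			// A large file may still be in the middle of being copied into the backup directory.
			// Its identifier would then differ on every scan, so a file of tens of GB
			// could end up being uploaded hundreds of times.
			// Therefore upload a file only if it has not been modified within a recent time window.
			FileUtil fu(filename);
			if (time(NULL) - fu.LastMtime() < 3)
			{
				// Modified within the last 3 seconds: treat it as not ready for upload yet
				return false;
			}
			std::cout << filename << " need upload!\n";
			return true;
		}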


		// Upload a file
		bool Upload(const std::string& filename)
		{
			// 1. Read the file data
			FileUtil fu(filename);
			std::string body;
			fu.GetContent(&body);

			// 2. Create an HTTP client and upload the file data
			httplib::Client client(SERVER_ADDR, SERVER_PORT);
			httplib::MultipartFormData item;
			item.content = body;
			item.filename = fu.FileName();
			item.name = "file";
			item.content_type = "application/octet-stream";
			httplib::MultipartFormDataItems items;
			items.push_back(item);
			auto res = client.Post("/upload", items);
			if (!res || res->status != 200)
				return false;
			return true;
		}


		bool Runmodule()
		{
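			while (1) // Loop forever
			{
				// 1. Scan the specified folder and collect all files
				FileUtil fu(_back_dir);
				std::vector<std::string> arry;
				fu.ScanDirectory(&arry);
				// 2. Check each file to see whether it needs uploading
				for (auto& a : arry)
				{
					if (IsNeedUpload(a) == false)
						continue;
					// 3. Upload the file if needed
					if (Upload(a) == true)
					{
						_data->Insert(a, GetFileIdentifier(a)); // Record the new backup information
						std::cout << a << " upload success!\n";
					}
				}
				// Win32 Sleep() takes milliseconds; wait one second between scans
				Sleep(1000);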
			}
		}
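
The upload decision boils down to two checks, which can be isolated from the file system for easier reasoning and testing. The sketch below is one way to express it under that assumption; NeedsUpload and its parameters are hypothetical and are not part of the Backup class:

```cpp
#include <ctime>
#include <string>

// Hypothetical pure version of the upload decision.
//   has_record  - whether a backup record exists for the file
//   stored_id   - identifier saved at the last successful upload (ignored if !has_record)
//   current_id  - identifier computed from the file as it is now
//   last_mtime  - the file's last modification time
//   now         - the current time
//   quiet_secs  - how long the file must have been left untouched before uploading
bool NeedsUpload(bool has_record,
                 const std::string& stored_id,
                 const std::string& current_id,
                 std::time_t last_mtime,
                 std::time_t now,
                 std::time_t quiet_secs = 3)
{
    // Already backed up and unchanged since then
    if (has_record && stored_id == current_id)
        return false;
    // New or modified, but possibly still being written: wait for a later scan
    if (now - last_mtime < quiet_secs)
        return false;
    return true;
}
```

The quiet period trades a few seconds of delay for not re-uploading a large file while it is still being copied into the backup directory.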

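十四、Client and server integration testing

Before testing, the file-name interface of the client utility class needs adjusting. The client is implemented on Windows, where file paths use "\" as the separator (which must be escaped in string literals), whereas the server is implemented on Linux, where file paths use "/":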

        // Extract the file name from a file path
        std::string FileName()
        {
            // .\abc\abcd.txt --> abcd.txt
            size_t pos = _filename.find_last_of("\\"); // find the last "\"
            if (pos == std::string::npos) return _filename;

            // return fs::path(_filename).filename().string(); // alternative: get the bare file name
            return _filename.substr(pos + 1);
        }
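
If the same utility code had to serve both platforms, one option is to treat either separator as a path boundary. The following is a sketch of that idea, not the article's approach; BaseName is a hypothetical free function:

```cpp
#include <string>

// Hypothetical helper: return the part of `path` after the last "/" or "\".
// Assumption: a backslash is never part of a file name. That holds on Windows,
// but on Linux a backslash is a legal file-name character, so this is a simplification.
std::string BaseName(const std::string& path)
{
    std::string::size_type pos = path.find_last_of("/\\");
    if (pos == std::string::npos)
        return path; // no separator: the whole string is the file name
    return path.substr(pos + 1);
}
```

For example, both BaseName(".\\abc\\abcd.txt") and BaseName("./abc/abcd.txt") give abcd.txt.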

Write the test code:

#define _CRT_SECURE_NO_WARNINGS 1
#include "util.hpp"
#include "data.hpp"
#include "cloud.hpp"

#define BACKUP_FILE "./backup.dat"
#define BACKUP_DIR "./backup"

int main()
{
	cloud::Backup backup(BACKUP_DIR, BACKUP_FILE);
	backup.Runmodule();
	return 0;
}

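Before running the server, delete the files left over from earlier tests:

Do the same on the client: delete backup.dat and the files in the backup directory.

Run the server and client programs. The reason the client prints an error was explained above and is not repeated here:

You can see that there is no backup.dat file yet. Now copy httplib.h, data.hpp, and util.hpp into the backup directory:

The three files are uploaded to the server successfully:

Once the hot-file time (30 seconds) has passed, the files are automatically compressed into the compression directory:

They can also be downloaded from the listing page: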

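Comparing the md5 values of the source files and the downloaded files shows that all three match, which means the file contents are identical: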
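
Where no md5 tool is at hand, a byte-by-byte comparison gives the same yes/no answer for two local files. Note that this is a different technique from the md5 check used above, and SameContent is a hypothetical helper written for illustration:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: true only if both files open and hold exactly the same bytes.
bool SameContent(const std::string& path_a, const std::string& path_b)
{
    std::ifstream fa(path_a, std::ios::binary);
    std::ifstream fb(path_b, std::ios::binary);
    if (!fa || !fb)
        return false; // a file that cannot be opened is treated as a mismatch

    std::istreambuf_iterator<char> ia(fa), ib(fb), end;
    while (ia != end && ib != end)
    {
        if (*ia != *ib)
            return false; // first differing byte
        ++ia;
        ++ib;
    }
    // Equal only if both files ended at the same point
    return ia == end && ib == end;
}
```

Unlike a digest, this needs both files on the same machine, so it suits checking a downloaded copy against the original in the client's backup directory.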

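十五、Project summary

Project name: Cloud backup system

Project features: Build a cloud backup server and client. The client runs on the user's machine and automatically backs up the files in a specified directory to the server. The backups can be viewed and downloaded through a browser, and downloads support resuming from a breakpoint. The server performs hot-file management on the backed-up files and stores files that have not been accessed for a long time in compressed form.

Development environment: CentOS 7.6 with vscode, g++, gdb, and makefile, plus Windows 11 with VS2019

Technical highlights: HTTP client/server setup, JSON serialization, file compression, hot-file management, resumable download, thread pool, read-write lock, singleton pattern

Project modules:

Server:

Configuration module: uses the singleton pattern to manage the server's configuration file data
Data management module: uses a hash table in memory for fast access and a file for persistent storage of the backup data
Business processing module: runs an HTTP server that communicates with the client, handles upload, download, and listing requests, and supports resumable download
Hot-file management module: performs hot-file management on the backups and compresses files that have not been accessed for a long time to save disk space

Client:

Data management module: uses a hash table in memory for fast access and a file for persistent storage of the backup data
File scanning module: built on the C++17 filesystem library; traverses the specified folder and collects all files in it
File backup module: runs an HTTP client that uploads the files to be backed up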
