tpproxy: TCP transparent proxy


tcp transparent proxy

Background

Recently I had a requirement to intercept packets on a router-like device and relay them. Using the topology below as an illustration: we need to intercept, on host h2, the TCP packets that h1 sends to h3. The first version was based on a tun device; after the packets were intercepted, they were relayed over UDP, similar to how openvpn works.

h1----s1----h2------h3 

The tun approach requires configuring h1's default gateway. Our requirement is special, though: we have no permission to modify h1. A junior labmate had implemented a version based on a tap device, bridging h2-eth0 with the tap device; see blog [1] for that solution.
Bridging indeed avoids touching h1's routing table. But we had a further requirement: after h2 intercepts a TCP packet from h1, h2 should be able to reply with an ACK to h1. With bridging we only capture Ethernet frames, so replying with ACKs would mean porting a user-space TCP stack. I spent a few days looking and settled on mtcp [2], but it felt like too much work: pushing through something that cumbersome means a lot of time filling holes, and I wanted this done quickly. My own coding ability is limited, but I am good at borrowing other people's code; today's demo, part copied and part written, is under 500 lines.
Then I found TCP transparent proxying [3,4,5]: by modifying h2's routing table, the TCP packets can be intercepted and the application-layer data relayed.
However, this mode requires h1 to use h2 as its default gateway. In earlier tests, when h1's gateway was set to h2, h1 could ping h3; when h1's gateway was set to h3 or to a non-existent IP, the ping failed. I kept guessing that some handshake happened between h1 and h2 before the ICMP packets were sent, but I could not figure out what it was; my undergraduate networking course, taught straight from the textbook, did not leave me much. The textbook back then was a classic domestic one, though itself largely derived from Tanenbaum's book, a special product of a special era. There is a mischievous question on Zhihu: is Tan Haoqiang the father of C++? One answer goes: no, Tan is the father of C+++++. Anyone who studied his textbooks gets the joke: what is the final value of i after `i+++++i`? Tan reportedly cannot program, yet wrote a whole series of programming-language textbooks. Still, he made his contribution: in the years when computer education in China was just getting started, his books helped spread the spark.
Later, quite by accident, while capturing packets I noticed that before sending IP packets, h1 broadcasts ARP requests to resolve the MAC address of the next hop. Only once h1 has learned h2's MAC address, so that the destination MAC of its packets is h2-eth0's MAC, will h2 process the packets. So I turned to ARP spoofing [5]: without changing h1's routing configuration, h2 forges ARP replies that hand h2-eth0's MAC address back to h1, as shown below.
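
On the mininet topology this can be done with dsniff's arpspoof; the invocation below is the one quoted as a comment in the script (the last argument is the address that h2 pretends to own, so that the victim h1 resolves it to h2-eth0's MAC):

arpspoof -i h2-eth0 -t 10.0.1.1 10.0.1.3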

Code and experiment

 The topology on mininet:

#!/usr/bin/python
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.cli import CLI
from mininet.link import TCLink
import time
import datetime
import subprocess
import os,signal
import sys
#https://segmentfault.com/a/1190000009562333
# arpspoof -i h2-eth0 -t 10.0.1.1 10.0.1.3
# 
#    h1--s1----h2------h3             
#
bottleneckbw=6
nonbottlebw=20
buffer_size=int(bottleneckbw*1000*30*3/(1500*8)) # queue size in packets, derived from the bottleneck bandwidth
net = Mininet( cleanup=True )
h1 = net.addHost('h1',ip='10.0.1.1')
h2 = net.addHost('h2',ip='10.0.1.2')
h3 = net.addHost('h3',ip='10.0.2.2')
s1 = net.addSwitch( 's1' )
c0 = net.addController('c0')
net.addLink(h1,s1,intfName1='h1-eth0',intfName2='s1-eth0',cls=TCLink , bw=nonbottlebw, delay='10ms', max_queue_size=10*buffer_size)
net.addLink(s1,h2,intfName1='s1-eth1',intfName2='h2-eth0',cls=TCLink , bw=nonbottlebw, delay='10ms', max_queue_size=10*buffer_size) 
net.addLink(h2,h3,intfName1='h2-eth1',intfName2='h3-eth0',cls=TCLink , bw=bottleneckbw, delay='10ms', max_queue_size=buffer_size)
net.build()
h1.cmd("ifconfig h1-eth0 10.0.1.1/24")
h1.cmd("route add default gw 10.0.1.2 dev h1-eth0")
h1.cmd('sysctl net.ipv4.ip_forward=1')


#tproxy
h2.cmd("iptables -t nat -N MY_TCP")
h2.cmd("iptables -t nat -A PREROUTING -j MY_TCP")
h2.cmd("iptables -t nat -A MY_TCP -p tcp -d 10.0.2.2 -j REDIRECT --to-ports 2223")

h2.cmd("ifconfig h2-eth0 10.0.1.2/24")
h2.cmd("ifconfig h2-eth1 10.0.2.1/24")
h2.cmd("ip route add to 10.0.2.0/24 via 10.0.2.2")
h2.cmd("ip route add to 10.0.1.0/24 via 10.0.1.1")
h2.cmd('sysctl net.ipv4.ip_forward=1')

h3.cmd("ifconfig h3-eth0 10.0.2.2/24")
h3.cmd("route add default gw 10.0.2.1 dev h3-eth0")
h3.cmd('sysctl net.ipv4.ip_forward=1')

net.start()
time.sleep(1)
CLI(net)
net.stop()

These commands can also be used on a real host to enable TCP transparent proxying.

iptables -t nat -N MY_TCP
iptables -t nat -A PREROUTING -j MY_TCP
iptables -t nat -A MY_TCP -p tcp -d 10.0.2.2 -j REDIRECT --to-ports 2223
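
With REDIRECT, the kernel rewrites the destination of matching connections to local port 2223, and a proxy accepting on that port can recover the address the client originally dialed through the SO_ORIGINAL_DST socket option; TpProxyLeft below does exactly this. A minimal standalone sketch (the function name is only for illustration):

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <linux/netfilter_ipv4.h> // SO_ORIGINAL_DST
#include <cstdio>

// accepted_fd: a connection accepted on the proxy's listening port (2223).
bool PrintOriginalDst(int accepted_fd){
    struct sockaddr_in orig_dst;
    socklen_t len=sizeof(orig_dst);
    if(getsockopt(accepted_fd,SOL_IP,SO_ORIGINAL_DST,&orig_dst,&len)!=0){
        return false; // not a redirected IPv4 connection
    }
    char ip[INET_ADDRSTRLEN];
    inet_ntop(AF_INET,&orig_dst.sin_addr,ip,sizeof(ip));
    printf("original destination: %s:%u\n",ip,ntohs(orig_dst.sin_port));
    return true;
}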

The proxy server listens on port 2223 and runs on h2. iperf can then be used to probe the available bandwidth along the path from h1 to h3:

iperf client(h1)--------|TpProxyLeft|TpProxyRight|(h2)-----------iperf server(h3)
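
For example, with iperf2 installed on the hosts (a usage sketch, not part of the original scripts):

iperf -s                       # on h3: start the iperf server
iperf -c 10.0.2.2 -t 30 -i 1   # on h1: send TCP traffic to h3 for 30s, report every second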

In this early version (the code below), TpProxyLeft reads packets into a buffer as soon as the socket fd becomes readable, and TpProxyRight then relays them to h3. The obvious drawback is the lack of flow control: when the available bandwidth between h2 and h3 (6Mbps over the bottleneck link) is lower than the bandwidth between h1 and h2 (20Mbps), data cannot be forwarded to h3 as fast as it arrives and piles up in TpProxyRight's buffer, with the backlog growing at roughly the 14Mbps difference. In the extreme case the buffer on h2 is exhausted and the transparent proxy collapses. Testing the code below on the sample topology, the iperf client measures about 20Mbps towards h3; since the packets are intercepted by h2, that 20Mbps is really the bandwidth of the h1-h2 link, not of the end-to-end path. A traffic-control mechanism therefore has to be implemented.
tpproxy_server.h

#pragma once
#include <string>
#include <atomic>
#include "base/epoll_api.h"
#include "base/socket_address.h"
#include "tcp/tcp_server.h"
#include "tcp/tcp_types.h"
namespace basic{
class TpProxyRight;
class TpProxyBase{
public:
    TpProxyBase(basic::BaseContext *context,int fd);
    virtual ~TpProxyBase(){}
    virtual void Notify(uint8_t sig){}
    void SendData(const char *pv,size_t size);
    void set_peer(TpProxyBase *peer) {peer_=peer;}
protected:
    void FlushBuffer();
    void OnReadEvent(int fd);
    void OnWriteEvent(int fd);
    void CheckCloseFd();
    void CloseFd();
    void DeleteSelf();
    basic::BaseContext* context_=nullptr;
    int fd_=-1;
    std::string fd_write_buffer_;
    std::atomic<bool> destroyed_{false};
    int send_bytes_=0;
    int recv_bytes_=0;
    TcpConnectionStatus status_=TCP_DISCONNECT;
    uint8_t signal_=0;
    TpProxyBase *peer_=nullptr;
};
class TpProxyLeft:public TpProxyBase,
public EpollCallbackInterface{
public:
    TpProxyLeft(basic::BaseContext *context,int fd);
    ~TpProxyLeft();
    void Notify(uint8_t sig) override;
    // From EpollCallbackInterface
    void OnRegistration(basic::EpollServer* eps, int fd, int event_mask) override{}
    void OnModification(int fd, int event_mask) override {}
    void OnEvent(int fd, basic::EpollEvent* event) override;
    void OnUnregistration(int fd, bool replaced) override {}
    void OnShutdown(basic::EpollServer* eps, int fd) override;
    std::string Name() const override {return "TpProxyLeft";}
};
class TpProxyRight:public TpProxyBase,
public EpollCallbackInterface{
public:
    TpProxyRight(basic::BaseContext *context,int fd);
    ~TpProxyRight();
    void Notify(uint8_t sig) override;
    bool AsynConnect(SocketAddress &local,SocketAddress &remote);
    // From EpollCallbackInterface
    void OnRegistration(basic::EpollServer* eps, int fd, int event_mask) override {}
    void OnModification(int fd, int event_mask) override {}
    void OnEvent(int fd, basic::EpollEvent* event) override;
    void OnUnregistration(int fd, bool replaced) override {}
    void OnShutdown(basic::EpollServer* eps, int fd) override;
    std::string Name() const override {return "TpProxyRight";}
private:
    struct sockaddr_storage src_addr_;
    struct sockaddr_storage dst_addr_;
};
class TpProxyBackend:public Backend{
public:
    TpProxyBackend(){}
    void CreateEndpoint(basic::BaseContext *context,int fd) override;
};
class TpProxyFactory: public SocketServerFactory{
public:
    ~TpProxyFactory(){}
    PhysicalSocketServer* CreateSocketServer(BaseContext *context) override;
};     
}

tpproxy_server.cc

#include <memory.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <linux/netfilter_ipv4.h>
#include <iostream>
#include "tpproxy/tpproxy_server.h"
namespace basic{
const int kBufferSize=1500;
enum TPPROXY_SIGNAL: uint8_t{
    TPPROXY_MIN,
    TPPROXY_CONNECT_FAIL,
    TPPROXY_CONNECTED,
    TPPROXY_CLOSE,
    TPPROXY_MAX,
};
/*EPOLLHUP
Hang  up   happened   on   the   associated   file   descriptor.
epoll_wait(2)  will always wait for this event; it is not neces-
sary to set it in events.
*/
TpProxyBase::TpProxyBase(basic::BaseContext *context,int fd):context_(context),fd_(fd){}
void TpProxyBase::FlushBuffer(){
    if(fd_<0){
        return ;
    }
    int remain=fd_write_buffer_.size();
    const char *data=fd_write_buffer_.data();
    bool flushed=false;
    while(remain>0){
        int intend=std::min(kBufferSize,remain);
        int sent=write(fd_,data,intend);
        if(sent<=0){
            break;
        }
        send_bytes_+=sent;
        flushed=true;
        data+=sent;
        remain-=sent;
    }
    if(flushed){
        if(remain>0){
            std::string copy(data,remain);
            copy.swap(fd_write_buffer_);
            
        }else{
            std::string null_str;
            null_str.swap(fd_write_buffer_);
        }        
    }
}
void TpProxyBase::SendData(const char *pv,size_t size){
    if(fd_<0){
        return ;
    }
    if(status_!=TCP_CONNECTED){
        size_t old_size=fd_write_buffer_.size();
        fd_write_buffer_.resize(old_size+size);
        memcpy(&fd_write_buffer_[old_size],pv,size);
    }
    if(status_==TCP_CONNECTED){
        FlushBuffer();
        size_t old_size=fd_write_buffer_.size();
        if(old_size>0){
            fd_write_buffer_.resize(old_size+size);
            memcpy(&fd_write_buffer_[old_size],pv,size);
            return;
        }
        if(old_size==0){
            ssize_t sent=write(fd_,pv,size);
            if(sent<0){
                // EAGAIN/EWOULDBLOCK: nothing was written, buffer everything.
                sent=0;
            }
            send_bytes_+=sent;
            if((size_t)sent<size){
                const char *data=pv+sent;
                size_t remain=size-sent;
                fd_write_buffer_.resize(old_size+remain);
                memcpy(&fd_write_buffer_[old_size],data,remain);
                context_->epoll_server()->ModifyCallback(fd_,EPOLLIN|EPOLLOUT|EPOLLET|EPOLLRDHUP|EPOLLERR);
            }
        }    
    }
}
void TpProxyBase::OnReadEvent(int fd){
    char buffer[kBufferSize];
    while(true){
        ssize_t nbytes=read(fd,buffer,kBufferSize);
        if (nbytes == -1) {
            // EWOULDBLOCK/EAGAIN: the receive buffer is drained for now.
            break;
        }else if(nbytes==0){
            // Peer closed the connection.
            CloseFd();
            return;
        }else{
            recv_bytes_+=nbytes;
            if(peer_){
                peer_->SendData(buffer,nbytes);
            }
        }       
    }    
}
void TpProxyBase::OnWriteEvent(int fd){
    FlushBuffer();
    if(fd_write_buffer_.size()>0){
       context_->epoll_server()->ModifyCallback(fd_,EPOLLIN|EPOLLOUT|EPOLLET|EPOLLRDHUP|EPOLLERR); 
    }else{
        context_->epoll_server()->ModifyCallback(fd_,EPOLLIN|EPOLLET|EPOLLRDHUP|EPOLLERR);
    }
    CheckCloseFd();
}
void TpProxyBase::CheckCloseFd(){
    if((TPPROXY_CLOSE==signal_)&&(fd_write_buffer_.size()==0)){
        CloseFd();
    }
}
void TpProxyBase::CloseFd(){
    if(fd_>0){
        context_->epoll_server()->UnregisterFD(fd_);        
        close(fd_);
        fd_=-1;
        status_=TCP_DISCONNECT;
    }
    if(peer_){
        peer_->Notify(TPPROXY_CLOSE);
        peer_=nullptr;
    }
    DeleteSelf();
}
void TpProxyBase::DeleteSelf(){
    if(destroyed_){
        return;
    }
    destroyed_=true;
    context_->PostTask([this]{
        delete this;
    });
}

TpProxyLeft::TpProxyLeft(basic::BaseContext *context,int fd):TpProxyBase(context,fd){
    context_->epoll_server()->RegisterFD(fd_,this,EPOLLIN|EPOLLET|EPOLLRDHUP| EPOLLERR);
    struct sockaddr_storage remote_addr;
    /*IpAddress ip_addr;
    ip_addr.FromString("10.0.2.2");
    SocketAddress socket_addr(ip_addr,3333);
    remote_addr=socket_addr.generic_address();*/
    socklen_t n=sizeof(remote_addr);
    int ret =getsockopt(fd_, SOL_IP, SO_ORIGINAL_DST, &remote_addr, &n);
    if(ret!=0){
        CloseFd();
        return;
    }
    int right_fd=socket(AF_INET, SOCK_STREAM, 0);
    if(0>right_fd){
        CloseFd();
        return;        
    }
    SocketAddress local(IpAddress::Any4(),0);
    SocketAddress remote(remote_addr);
    std::cout<<remote.ToString()<<std::endl;
    peer_=new TpProxyRight(context,right_fd);
    bool success=((TpProxyRight*)peer_)->AsynConnect(local,remote);
    if(!success){
        CloseFd();
        std::cout<<"asyn failed"<<std::endl;
        return;         
    }else{
        peer_->set_peer(this);
    }
    status_=TCP_CONNECTED;
}
TpProxyLeft::~TpProxyLeft(){
    std::cout<<"left dtor "<<recv_bytes_<<" "<<send_bytes_<<std::endl;
}
void TpProxyLeft::Notify(uint8_t sig){
    if(TPPROXY_CLOSE==sig||TPPROXY_CONNECT_FAIL==sig){
        peer_=nullptr;
        signal_=TPPROXY_CLOSE;
        CheckCloseFd();
    }
}
void TpProxyLeft::OnEvent(int fd, basic::EpollEvent* event){
    if(event->in_events & EPOLLIN){
        OnReadEvent(fd);
    }
    if(event->in_events&EPOLLOUT){
        OnWriteEvent(fd);
    }
    if(event->in_events &(EPOLLRDHUP|EPOLLHUP)){
        CloseFd(); 
    }    
}
void TpProxyLeft::OnShutdown(basic::EpollServer* eps, int fd){
    if(fd_>0){
        close(fd_);
        fd_=-1;
    }
    DeleteSelf();
}

TpProxyRight::TpProxyRight(basic::BaseContext *context,int fd):TpProxyBase(context,fd){}
TpProxyRight::~TpProxyRight(){
    std::cout<<"right dtor "<<recv_bytes_<<" "<<send_bytes_<<std::endl;
}
void TpProxyRight::Notify(uint8_t sig){
    if(TPPROXY_CLOSE==sig){
        signal_=sig;
        peer_=nullptr;
        CheckCloseFd();
    }
}
bool TpProxyRight::AsynConnect(SocketAddress &local,SocketAddress &remote){
    src_addr_=local.generic_address();
    dst_addr_=remote.generic_address();
    int yes=1;
    bool success=false;
    size_t addr_size = sizeof(struct sockaddr_storage);
    if(bind(fd_, (struct sockaddr *)&src_addr_, addr_size)<0){
        CloseFd();
        return success;
    }
    if(setsockopt(fd_,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int))!=0){
        CloseFd();
        return success;        
    }
    context_->epoll_server()->RegisterFD(fd_, this,EPOLLIN|EPOLLOUT| EPOLLRDHUP | EPOLLERR | EPOLLET);
    if(connect(fd_,(struct sockaddr *)&dst_addr_,addr_size) == -1 && errno != EINPROGRESS){
        //connect failed outright (e.g. no route to host, no free local ports): destruct the socket
        CloseFd();
        return success;
    }
    status_=TCP_CONNECTING;
    return true;    
}
void TpProxyRight::OnEvent(int fd, basic::EpollEvent* event){
    if (event->in_events&(EPOLLERR|EPOLLRDHUP| EPOLLHUP)){
        CloseFd();       
    }   
    if(event->in_events&EPOLLOUT){
        if(status_==TCP_CONNECTING){
            status_=TCP_CONNECTED;
            std::cout<<"right connected"<<std::endl;
            context_->epoll_server()->ModifyCallback(fd_,EPOLLIN|EPOLLRDHUP|EPOLLERR | EPOLLET);
        }
        OnWriteEvent(fd);
    }
    if(event->in_events&EPOLLIN){
        OnReadEvent(fd);
    }    
}
void TpProxyRight::OnShutdown(basic::EpollServer* eps, int fd){
    if(fd_>0){
        close(fd_);
        fd_=-1;
    }
    DeleteSelf();
}
void TpProxyBackend::CreateEndpoint(basic::BaseContext *context,int fd){
    TpProxyLeft *endpoint=new TpProxyLeft(context,fd);
    UNUSED(endpoint);
}
PhysicalSocketServer* TpProxyFactory::CreateSocketServer(BaseContext *context){
    std::unique_ptr<TpProxyBackend> backend(new TpProxyBackend());
    return new PhysicalSocketServer(context,std::move(backend));
}
}

The latest version of the transparent-proxy code [7] on github implements this traffic-control function. When the size of fd_write_buffer_ in TpProxyRight exceeds a predefined threshold (1500*10 bytes), TpProxyLeft temporarily stops reading from the TCP stack. The data that then accumulates in the kernel receive buffer causes a smaller receive window (the Window Size field in the TCP header¹) to be advertised to the sender, and the sender reduces its sending rate accordingly. With this mechanism, the bandwidth probed between h1 and h3 is close to the bottleneck bandwidth (6Mbps).
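
The idea can be sketched as follows; this is an illustrative sketch of the mechanism, and kSendBufferThreshold, pending_bytes() and the method name are made-up identifiers, not the ones used in [7]:

const size_t kSendBufferThreshold=1500*10;

// Called when the left socket becomes readable. Only drain the kernel
// receive buffer while the peer (TpProxyRight) can keep up.
void TpProxyLeft::OnReadEventWithBackpressure(int fd){
    char buffer[kBufferSize];
    while(peer_&&peer_->pending_bytes()<kSendBufferThreshold){
        ssize_t nbytes=read(fd,buffer,kBufferSize);
        if(nbytes<=0){
            break; // drained (EAGAIN) or peer closed
        }
        recv_bytes_+=nbytes;
        peer_->SendData(buffer,nbytes);
    }
    // If the loop stopped because the peer's buffer is above the threshold,
    // the unread data stays in the kernel receive buffer, the advertised
    // receive window shrinks, and the sender at h1 slows down towards the
    // bottleneck rate. Once the peer drains its buffer, it re-enables
    // reading on this socket.
}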

[1] Transparent forwarding with a TAP device
[2] mtcp
[3] Implementing a proxy with the socket IP_TRANSPARENT option
[4] tproxy-example
[5] Linux transparent proxy: TCP transparent proxy with iptables
[6] test arp spoof on mininet
[7] tcp transparent proxy
[8] TCP flow control and congestion control
[9] Network Analysis: TCP Window Size


  1. The size of the receive window, which specifies the number of window size units that the sender of this segment is currently willing to receive. ↩︎
