[Android] (3) iptables in Detail

chendh1977

Last modified 2024-09-14 09:37:57

Category: Android. Tags: android, networking

First published 2024-09-13 22:13:26

Article link: https://blog.csdn.net/CHENDONGHAO1105/article/details/142229051

iptables source code analysis - Kernel Source - Chinaunix

[iptables discussion thread] Analysis of the iptables execution flow - Kernel Source - Chinaunix

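The iptables command

Rules

Rules define how IP packets are identified and handled. Each rule contains two elements: a match and an action. An action either modifies the packet or jumps. A jump can accept the packet, reject it, hand it to another chain for further matching, or return from the current chain to the calling chain.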

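Each rule is made up of the following two elements:

One or more match conditions (Match), which are compared against the IP packet. The condition types include:
1. Interface (e.g. eth0, eth1)
2. Protocol (e.g. ICMP, TCP, UDP)
3. Source IP / Destination IP
4. Source Port / Destination Port
One action (Action, called a target in iptables), executed once the packet has matched every condition. The action types include:
1. ACCEPT: let the packet through.
2. DROP: discard the packet silently.
3. REJECT: refuse the packet and send an error reply to the sender.
4. SNAT: source address translation.
5. DNAT: destination address translation.
6. LOG: write a log entry.
7. QUEUE: hand the packet over to user space.
8. RETURN: stop evaluating the remaining rules in the current chain and return to the calling chain.
9. REDIRECT: port redirection.
10. MARK: set a firewall mark on the packet.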
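
The "match plus action" structure can be sketched as a small data type. This is only an illustration of the idea, not how Netfilter stores rules: the Packet and Rule types and all their field names are hypothetical, and an empty string or a zero value is assumed to mean "condition not set".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified view of a packet: only the fields that the
// match conditions listed above look at.
struct Packet {
    std::string inInterface;    // e.g. "eth0"
    std::string protocol;       // e.g. "tcp"
    std::uint32_t srcIp = 0;
    std::uint32_t dstIp = 0;
    std::uint16_t srcPort = 0;
    std::uint16_t dstPort = 0;
};

enum class Action { Accept, Drop, Reject, Return };

// A rule is a set of match conditions plus exactly one action.
// Assumption for this sketch: an empty string or 0 means "not set",
// and a condition that is not set matches every packet.
struct Rule {
    std::string inInterface;
    std::string protocol;
    std::uint32_t srcIp = 0;
    std::uint32_t dstIp = 0;
    std::uint16_t srcPort = 0;
    std::uint16_t dstPort = 0;
    Action action = Action::Accept;

    // The action applies only if every condition that is set matches.
    bool matches(const Packet& p) const {
        return (inInterface.empty() || inInterface == p.inInterface) &&
               (protocol.empty() || protocol == p.protocol) &&
               (srcIp == 0 || srcIp == p.srcIp) &&
               (dstIp == 0 || dstIp == p.dstIp) &&
               (srcPort == 0 || srcPort == p.srcPort) &&
               (dstPort == 0 || dstPort == p.dstPort);
    }
};
```

A rule with protocol "tcp" and dstPort 22 matches a TCP packet to port 22 on any interface, and stops matching as soon as the destination port differs.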

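Chains

A chain is essentially an ordered list of rules. In a complex network environment, users can achieve many different effects simply by arranging the order of the rules in a chain.

Because the order of the rules in a chain is critical, rules are evaluated from top to bottom. The stricter (more specific) a rule is, the earlier it should be placed, while the default rule (the chain policy) always takes effect last.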
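
Why the order matters can be seen in a minimal sketch of chain evaluation. It assumes every rule carries a terminating verdict (real chains also contain non-terminating targets such as LOG and MARK), and the Packet, Rule and Chain types are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class Verdict { Accept, Drop };

// Hypothetical packet with a single field, enough to show ordering.
struct Packet {
    int dstPort;
};

struct Rule {
    std::function<bool(const Packet&)> match;  // the match condition
    Verdict verdict;                           // the (terminating) action
};

struct Chain {
    std::vector<Rule> rules;  // evaluated from top to bottom
    Verdict policy;           // default, used only when no rule matched
};

// Walk the rules in order; the first rule that matches decides.
Verdict evaluate(const Chain& chain, const Packet& pkt) {
    for (const Rule& rule : chain.rules) {
        if (rule.match(pkt)) {
            return rule.verdict;
        }
    }
    return chain.policy;
}
```

With "accept port 22" placed before "drop everything", a packet to port 22 is accepted. Swap the two rules and the same packet is dropped, because the catch-all rule now matches first.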

In addition, Netfilter provides five built-in chains, and users can also create their own chains.

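INPUT: packets destined for the local host pass through this chain and are checked against its rules, e.g. DDoS mitigation rules.
OUTPUT: packets sent from the local host pass through this chain and are checked against its rules.
FORWARD: packets forwarded by the local host pass through this chain and are checked against its rules, e.g. when the host acts as an IP router.
PREROUTING: packets pass through this chain before the IP routing decision is made, e.g. for DNAT.
POSTROUTING: packets pass through this chain after the IP routing decision has been made, e.g. for SNAT.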
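
Which of the five chains a packet actually visits depends on where it comes from and where it is going. The sketch below encodes the three standard paths; the Path enum and the function name are made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The three ways a packet can move through the host.
enum class Path { ToLocalProcess, Forwarded, FromLocalProcess };

// Returns the built-in chains a packet traverses, in order.
std::vector<std::string> chainsTraversed(Path path) {
    switch (path) {
        case Path::ToLocalProcess:
            // Routing decides the packet is for this host.
            return {"PREROUTING", "INPUT"};
        case Path::Forwarded:
            // Routing decides the packet must leave through another interface.
            return {"PREROUTING", "FORWARD", "POSTROUTING"};
        case Path::FromLocalProcess:
            // Locally generated traffic never sees PREROUTING or INPUT.
            return {"OUTPUT", "POSTROUTING"};
    }
    return {};
}
```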

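Tables

A table is a way of organizing rules by use case. Each table is assigned a different purpose, so it contains different chains and rules.

In practice, users of Netfilter usually start from a table and then define chains and rules inside it.

Netfilter has five built-in tables. The four most commonly used are listed below; the fifth, security, holds mandatory access control rules.

filter table (default): packet filtering, e.g. firewall rules.
nat table: NAT and NAPT, e.g. for a gateway router.
mangle table: packet modification, e.g. changing the TOS, DSCP or ECN bits in the IP header.
raw table: marks packets early so that they skip certain processing, e.g. packets for which no connection tracking entry should be created.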
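
Each table only contains some of the built-in chains. The lookup below reflects the chain sets documented in the iptables man page for recent kernels; the function names are hypothetical, and the exact sets are worth checking against the kernel in use (for example with iptables -t nat -L).

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Built-in chains per table, as documented in the iptables man page.
const std::map<std::string, std::vector<std::string>>& builtinChains() {
    static const std::map<std::string, std::vector<std::string>> kTables = {
        {"filter", {"INPUT", "FORWARD", "OUTPUT"}},
        {"nat", {"PREROUTING", "INPUT", "OUTPUT", "POSTROUTING"}},
        {"mangle", {"PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"}},
        {"raw", {"PREROUTING", "OUTPUT"}},
    };
    return kTables;
}

// True if the given table has the given built-in chain.
bool tableHasChain(const std::string& table, const std::string& chain) {
    const auto& tables = builtinChains();
    const auto it = tables.find(table);
    if (it == tables.end()) {
        return false;
    }
    return std::find(it->second.begin(), it->second.end(), chain) != it->second.end();
}
```

This is why a DNAT rule goes into PREROUTING of the nat table, while a plain firewall rule on incoming traffic goes into INPUT of the filter table.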

Usage: iptables -[ACD] chain rule-specification [options]
       iptables -I chain [rulenum] rule-specification [options]
       iptables -R chain rulenum rule-specification [options]
       iptables -D chain rulenum [options]
       iptables -[LS] [chain [rulenum]] [options]
       iptables -[FZ] [chain] [options]
       iptables -[NX] chain
       iptables -E old-chain-name new-chain-name
       iptables -P chain target [options]
       iptables -h (print this help information)

Commands:
Either long or short options are allowed.
  --append  -A chain    Append to chain
  --check   -C chain    Check for the existence of a rule
  --delete  -D chain    Delete matching rule from chain
  --delete  -D chain rulenum
        Delete rule rulenum (1 = first) from chain
  --insert  -I chain [rulenum]
        Insert in chain as rulenum (default 1=first)
  --replace -R chain rulenum
        Replace rule rulenum (1 = first) in chain
  --list    -L [chain [rulenum]]
        List the rules in a chain or all chains
  --list-rules -S [chain [rulenum]]
        Print the rules in a chain or all chains
  --flush   -F [chain]    Delete all rules in  chain or all chains
  --zero    -Z [chain [rulenum]]
        Zero counters in chain or all chains
  --new     -N chain    Create a new user-defined chain
  --delete-chain
            -X [chain]    Delete a user-defined chain
  --policy  -P chain target
        Change policy on chain to target
  --rename-chain
            -E old-chain new-chain
        Change chain name, (moving any references)
Options:
    --ipv4  -4    Nothing (line is ignored by ip6tables-restore)
    --ipv6  -6    Error (line is ignored by iptables-restore)
[!] --protocol  -p proto  protocol: by number or name, eg. `tcp'
[!] --source  -s address[/mask][...]
        source specification
[!] --destination -d address[/mask][...]
        destination specification
[!] --in-interface -i input name[+]
        network interface name ([+] for wildcard)
 --jump -j target
        target for rule (may load target extension)
  --goto      -g chain
                              jump to chain with no return
  --match -m match
        extended match (may load extension)
  --numeric -n    numeric output of addresses and ports
[!] --out-interface -o output name[+]
        network interface name ([+] for wildcard)
  --table -t table  table to manipulate (default: `filter')
  --verbose -v    verbose mode
  --wait  -w [seconds]  maximum wait to acquire xtables lock before give up
  --wait-interval -W [usecs]  wait time to try to acquire xtables lock
        default is 1 second
  --line-numbers    print line numbers when listing
  --exact -x    expand numbers (display exact values)
[!] --fragment  -f    match second or further fragments only
  --modprobe=<command>    try to insert modules using this command
  --set-counters PKTS BYTES set the counter during insert/append
[!] --version -V    print package version.

MARK target options:
  --set-xmark value[/mask]  Clear bits in mask and XOR value into nfmark
  --set-mark value[/mask]   Clear bits in mask and OR value into nfmark
  --and-mark bits           Binary AND the nfmark with bits
  --or-mark bits            Binary OR the nfmark with bits
  --xor-mark bits           Binary XOR the nfmark with bits
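
The MARK options are plain bit operations on the 32-bit nfmark. The functions below follow the wording of the help text above; the function names are invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// --set-xmark value/mask: clear the bits in mask, then XOR value in.
constexpr std::uint32_t setXmark(std::uint32_t mark, std::uint32_t value, std::uint32_t mask) {
    return (mark & ~mask) ^ value;
}

// --set-mark value/mask: clear the bits in mask, then OR value in.
constexpr std::uint32_t setMark(std::uint32_t mark, std::uint32_t value, std::uint32_t mask) {
    return (mark & ~mask) | value;
}

// --and-mark bits
constexpr std::uint32_t andMark(std::uint32_t mark, std::uint32_t bits) {
    return mark & bits;
}

// --or-mark bits
constexpr std::uint32_t orMark(std::uint32_t mark, std::uint32_t bits) {
    return mark | bits;
}

// --xor-mark bits
constexpr std::uint32_t xorMark(std::uint32_t mark, std::uint32_t bits) {
    return mark ^ bits;
}
```

The two "set" variants only differ when value has bits outside mask. Starting from mark 0xF0 with value 0x30 and mask 0x0F, setXmark yields 0xC0 (the two overlapping bits are flipped), while setMark yields 0xF0 (they stay set).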
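
Example: the following commands create a user-defined chain named 1000, send outgoing TCP traffic with destination ports 47000 to 48000 through it, accept packets owned by UID 1000 and reject everything else.

iptables -N 1000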

iptables -I OUTPUT -p tcp --dport 47000:48000 -j 1000

iptables -F 1000

iptables -A 1000 -m owner --uid-owner 1000 -j ACCEPT

iptables -A 1000 -j REJECT
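
Netd, covered further down, does not run iptables once per rule; it feeds batches of rules to iptables-restore. The sketch below builds such a batch for the commands above. The helper name is hypothetical, and it is an assumption that the ":1000 -" chain declaration takes the place of -N and -F when iptables-restore runs with --noflush, so verify that on the target device before relying on it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds input in iptables-restore format:
//   *<table>
//   :<user chain> -      (one line per user-defined chain)
//   <rule>               (same syntax as the iptables command line)
//   COMMIT
std::string buildRestoreInput(const std::string& table,
                              const std::vector<std::string>& userChains,
                              const std::vector<std::string>& rules) {
    std::string out = "*" + table + "\n";
    for (const std::string& chain : userChains) {
        out += ":" + chain + " -\n";
    }
    for (const std::string& rule : rules) {
        out += rule + "\n";
    }
    out += "COMMIT\n";
    return out;
}
```

Calling it with table "filter", the user chain "1000" and the three rule lines (-I OUTPUT ..., -A 1000 ... -j ACCEPT, -A 1000 -j REJECT) gives a single text block that can be applied in one step.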

external/iptables

iptables_main -> do_command4 -> iptc_commit

Netd

IptablesRestoreController

forkAndExec

//system/netd/server/IptablesRestoreController.cpp
IptablesProcess* IptablesRestoreController::forkAndExec(const IptablesProcessType type) {
    const char* const cmd = (type == IPTABLES_PROCESS) ?
        IPTABLES_RESTORE_PATH : IP6TABLES_RESTORE_PATH;

    // Create the pipes we'll use for communication with the child
    // process. One each for the child's in, out and err files.
    int stdin_pipe[2];
    int stdout_pipe[2];
    int stderr_pipe[2];

    if (pipe2(stdin_pipe,  O_CLOEXEC) == -1 ||
        pipe2(stdout_pipe, O_NONBLOCK | O_CLOEXEC) == -1 ||
        pipe2(stderr_pipe, O_NONBLOCK | O_CLOEXEC) == -1) {

        ALOGE("pipe2() failed: %s", strerror(errno));
        return nullptr;
    }

    const auto& sys = sSyscalls.get();
    StatusOr<pid_t> child_pid = sys.fork();
    if (!isOk(child_pid)) {
        ALOGE("fork() failed: %s", strerror(child_pid.status().code()));
        return nullptr;
    }

    if (child_pid.value() == 0) {
        // The child process. Reads from stdin, writes to stderr and stdout.

        // stdin_pipe[0] : The read end of the stdin pipe.
        // stdout_pipe[1] : The write end of the stdout pipe.
        // stderr_pipe[1] : The write end of the stderr pipe.
        if (dup2(stdin_pipe[0], 0) == -1 ||
            dup2(stdout_pipe[1], 1) == -1 ||
            dup2(stderr_pipe[1], 2) == -1) {
            ALOGE("dup2() failed: %s", strerror(errno));
            abort();
        }

        if (execl(cmd,
                  cmd,
                  "--noflush",  // Don't flush the whole table.
                  "-w",         // Wait instead of failing if the lock is held.
                  "-v",         // Verbose mode, to make sure our ping is echoed
                                // back to us.
                  nullptr) == -1) {
            ALOGE("execl(%s, ...) failed: %s", cmd, strerror(errno));
            abort();
        }

        // This statement is unreachable. We abort() upon error, and execl
        // if everything goes well.
        return nullptr;
    }

    // The parent process. Writes to stdout and stderr and reads from stdin.
    // stdin_pipe[0] : The read end of the stdin pipe.
    // stdout_pipe[1] : The write end of the stdout pipe.
    // stderr_pipe[1] : The write end of the stderr pipe.
    if (close(stdin_pipe[0]) == -1 ||
        close(stdout_pipe[1]) == -1 ||
        close(stderr_pipe[1]) == -1) {
        ALOGW("close() failed: %s", strerror(errno));
    }

    return new IptablesProcess(child_pid.value(), stdin_pipe[1], stdout_pipe[0], stderr_pipe[0]);
}

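The command to run, iptables-restore or ip6tables-restore, is chosen from the value of IptablesProcessType.
fork creates the child process. Parent and child communicate through stdin_pipe, stdout_pipe and stderr_pipe.
In the child, dup2 redirects standard input to the read end of stdin_pipe, and standard output and standard error to the write ends of stdout_pipe and stderr_pipe. execl then runs the iptables-restore command.
In the parent, the read end of stdin_pipe and the write ends of stdout_pipe and stderr_pipe are closed.
Finally, an IptablesProcess object is created. It wraps the child's PID and the file descriptors of the pipes used to communicate with the child.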
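
The same fork / dup2 / exec pattern can be tried out without root. The sketch below is not netd code: it uses cat as a stand-in child (an assumption that cat is on the PATH), uses pipe instead of pipe2, and leaves out the stderr pipe. The function name is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sends `input` to a child process running `cat` and returns what the
// child wrote to its standard output. Returns "" on setup failure.
std::string echoThroughChild(const std::string& input) {
    int toChild[2];    // parent writes [1], child reads [0]
    int fromChild[2];  // child writes [1], parent reads [0]
    if (pipe(toChild) == -1) return "";
    if (pipe(fromChild) == -1) {
        close(toChild[0]);
        close(toChild[1]);
        return "";
    }

    const pid_t pid = fork();
    if (pid == -1) {
        close(toChild[0]);
        close(toChild[1]);
        close(fromChild[0]);
        close(fromChild[1]);
        return "";
    }

    if (pid == 0) {
        // Child: wire the pipe ends to stdin/stdout, then exec.
        if (dup2(toChild[0], STDIN_FILENO) == -1 ||
            dup2(fromChild[1], STDOUT_FILENO) == -1) {
            _exit(127);
        }
        // Without O_CLOEXEC the originals must be closed by hand,
        // otherwise the child would never see EOF on its stdin.
        close(toChild[0]);
        close(toChild[1]);
        close(fromChild[0]);
        close(fromChild[1]);
        execlp("cat", "cat", static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }

    // Parent: close the ends that belong to the child.
    close(toChild[0]);
    close(fromChild[1]);

    const char* data = input.data();
    size_t left = input.size();
    while (left > 0) {
        const ssize_t n = write(toChild[1], data, left);
        if (n <= 0) break;
        data += n;
        left -= static_cast<size_t>(n);
    }
    close(toChild[1]);  // signals EOF to the child

    std::string output;
    char buf[256];
    ssize_t n;
    while ((n = read(fromChild[0], buf, sizeof(buf))) > 0) {
        output.append(buf, static_cast<size_t>(n));
    }
    close(fromChild[0]);
    waitpid(pid, nullptr, 0);
    return output;
}
```

Unlike this one-shot helper, netd keeps the child alive and the pipes open, which is why it needs a way to tell where the output of one command ends.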

execIptablesRestoreWithOutput

//system/netd/server/NetdConstants.cpp
int execIptablesRestoreWithOutput(IptablesTarget target, const std::string& commands,
                                  std::string *output) {
    return android::net::gCtls->iptablesRestoreCtrl.execute(target, commands, output);
}
//system/netd/server/IptablesRestoreController.cpp
int IptablesRestoreController::execute(const IptablesTarget target, const std::string& command,
                                       std::string *output) {
    std::lock_guard lock(mLock);

    std::string buffer;
    if (output == nullptr) {
        output = &buffer;
    } else {
        output->clear();
    }

    int res = 0;
    if (target == V4 || target == V4V6) {
        res |= sendCommand(IPTABLES_PROCESS, command, output);
    }
    if (target == V6 || target == V4V6) {
        res |= sendCommand(IP6TABLES_PROCESS, command, output);
    }
    return res;
}

// TODO: Return -errno on failure instead of -1.
// TODO: Maybe we should keep a rotating buffer of the last N commands
// so that they can be dumped on dumpsys.
int IptablesRestoreController::sendCommand(const IptablesProcessType type,
                                           const std::string& command,
                                           std::string *output) {
    std::unique_ptr<IptablesProcess> *process =
            (type == IPTABLES_PROCESS) ? &mIpRestore : &mIp6Restore;

    // We might need to fork a new process if we haven't forked one yet, or
    // if the forked process terminated.
    //
    // NOTE: For a given command, this is the last point at which we try to
    // recover from a child death. If the child dies at some later point during
    // the execution of this method, we will receive an EPIPE and return an
    // error. The command will then need to be retried at a higher level.
    IptablesProcess *existingProcess = process->get();
    if (existingProcess != nullptr && !existingProcess->outputReady()) {
        existingProcess->stop();
        existingProcess = nullptr;
    }

    if (existingProcess == nullptr) {
        // Fork a new iptables[6]-restore process.
        IptablesProcess *newProcess = IptablesRestoreController::forkAndExec(type);
        if (newProcess == nullptr) {
            LOG(ERROR) << "Unable to fork ip[6]tables-restore, type: " << type;
            return -1;
        }

        process->reset(newProcess);
    }

    if (!android::base::WriteFully((*process)->stdIn, command.data(), command.length())) {
        ALOGE("Unable to send command: %s", strerror(errno));
        return -1;
    }

    if (!android::base::WriteFully((*process)->stdIn, PING, PING_SIZE)) {
        ALOGE("Unable to send ping command: %s", strerror(errno));
        return -1;
    }

    if (!drainAndWaitForAck(*process, command, output)) {
        // drainAndWaitForAck has already logged an error.
        return -1;
    }

    return 0;
}

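execute decides which command to run from the value of target. If target is V4 or V4V6, it calls sendCommand to pass the commands to iptables-restore. If target is V6 or V4V6, it calls sendCommand to pass them to ip6tables-restore. In both cases the result is stored in output.

The sendCommand function of IptablesRestoreController does the actual sending and processing:

First, the process to use is chosen from type: the mIpRestore pointer for IPTABLES_PROCESS, otherwise the mIp6Restore pointer.
The existing process is then checked. If a process exists but its output is not ready, it is stopped. If there is no usable process, forkAndExec creates a new one, which is stored in the corresponding pointer.
WriteFully writes the command data to the standard input of the process.
WriteFully then writes a special "PING" command to the standard input of the process to mark the end of the command. Because the child runs with -v, the ping is echoed back.
drainAndWaitForAck reads the output of the process and waits for the acknowledgement of the command.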
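
The ping mechanism is a simple framing trick on a long-lived pipe. The sketch below shows the idea on plain strings. The marker value "#PING\n" and both function names are assumptions made for this example, and the real drainAndWaitForAck also handles polling, timeouts and stderr.

```cpp
#include <cassert>
#include <string>

// Assumed marker: a comment line that the child, running with -v,
// echoes back on its standard output.
const std::string kPing = "#PING\n";

// What gets written to the child's stdin: the command, then the marker.
std::string withPing(const std::string& command) {
    return command + kPing;
}

// Given everything read from the child's stdout so far, report whether
// the marker has been echoed. If so, *output receives the text that
// came before it, i.e. the output of the command itself.
bool ackReceived(const std::string& childStdout, std::string* output) {
    const std::string::size_type pos = childStdout.find(kPing);
    if (pos == std::string::npos) {
        return false;  // keep reading
    }
    output->assign(childStdout, 0, pos);
    return true;
}
```

The caller keeps appending what it reads to a buffer and calls ackReceived after each read. Seeing the marker means the child has consumed everything written before it.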

FirewallController

binder::Status NetdNativeService::firewallSetFirewallType(int32_t firewallType)
binder::Status NetdNativeService::firewallSetInterfaceRule(const std::string& ifName,
                                                           int32_t firewallRule)
binder::Status NetdNativeService::firewallEnableChildChain(int32_t childChain, bool enable)
binder::Status NetdNativeService::firewallSetUidRule(int32_t childChain, int32_t uid,
                                                     int32_t firewallRule)
binder::Status NetdNativeService::firewallReplaceUidChain(const std::string& chainName,
                                                          bool isAllowlist,
                                                          const std::vector<int32_t>& uids,
                                                          bool* ret)

TetherController

binder::Status NetdNativeService::tetherAddForward(const std::string& intIface,
                                                   const std::string& extIface)
binder::Status NetdNativeService::tetherRemoveForward(const std::string& intIface,
                                                      const std::string& extIface)
binder::Status NetdNativeService::tetherGetStats(
        std::vector<TetherStatsParcel>* tetherStatsParcelVec)

BandwidthController

binder::Status NetdNativeService::bandwidthEnableDataSaver(bool enable, bool *ret)
binder::Status NetdNativeService::bandwidthSetInterfaceQuota(const std::string& ifName,
                                                             int64_t bytes)
binder::Status NetdNativeService::bandwidthRemoveInterfaceQuota(const std::string& ifName)
binder::Status NetdNativeService::bandwidthSetGlobalAlert(int64_t bytes)
binder::Status NetdNativeService::bandwidthSetInterfaceAlert(const std::string& ifName,
                                                             int64_t bytes)
binder::Status NetdNativeService::bandwidthRemoveInterfaceAlert(const std::string& ifName)
"bw_happy_box",
"bw_penalty_box",
"bw_data_saver",
"bw_costly_shared",

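bw_happy_box: appears to hold the allowlisted UIDs that may keep using metered networks while Data Saver is on; it does not prioritize traffic such as real-time video or audio.
bw_penalty_box: appears to hold the denylisted UIDs whose traffic on metered networks is rejected, rather than throttling low-priority flows.
bw_data_saver: implements the Data Saver switch (see bandwidthEnableDataSaver above). When it is enabled, metered traffic from apps that are not allowlisted is rejected; no data compression is involved.
bw_costly_shared: appears to handle metered ("costly") interfaces that share one quota, which is where the interface quota rules are applied.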

IdletimerController

binder::Status NetdNativeService::idletimerAddInterface(const std::string& ifName, int32_t timeout,
                                                        const std::string& classLabel)
binder::Status NetdNativeService::idletimerRemoveInterface(const std::string& ifName,
                                                           int32_t timeout,
                                                           const std::string& classLabel)

StrictController

binder::Status NetdNativeService::strictUidCleartextPenalty(int32_t uid, int32_t policyPenalty)
const char* StrictController::LOCAL_CLEAR_DETECT = "st_clear_detect";
const char* StrictController::LOCAL_CLEAR_CAUGHT = "st_clear_caught";
const char* StrictController::LOCAL_PENALTY_LOG = "st_penalty_log";
const char* StrictController::LOCAL_PENALTY_REJECT = "st_penalty_reject";