2025-08-24

rt_tgsigqueueinfo系统调用及示例

我们来深入学习 rt_tgsigqueueinfo 系统调用,在 Linux 系统中，信号（Signal）是进程间通信的一种方式。kill() 系统调用可以向一个进程发送信号，而 sigqueue() 则可以发送信号并附带一些数据（union sigval）。rt_tgsigqueueinfo 是一个更底层、更具体、但也更复杂的系统调用。

1. 函数介绍

在 Linux 系统中，信号（Signal）是进程间通信的一种方式。kill() 系统调用可以向一个进程发送信号，而 sigqueue() 则可以发送信号并附带一些数据（union sigval）。

rt_tgsigqueueinfo 是一个更底层、更具体、但也更复杂的系统调用。它的名字可以拆解为：

rt_: 表示这是一个与实时信号相关的系统调用。
tg: 表示 Thread Group（线程组）。在 Linux 中，一个进程（Process）可以包含多个线程（Threads），这些线程共享同一个 PID，但每个线程有自己的唯一 Thread ID (TID)。所有共享同一个 PID 的线程构成了一个线程组。
sigqueueinfo: 表示它像 sigqueue 一样发送信号和数据，但允许发送者提供更详细的信号信息（通过 siginfo_t 结构体）。

所以，rt_tgsigqueueinfo 的作用是：向指定进程（线程组）中的指定线程（由 TID 指定）发送一个信号，并允许发送者完全指定该信号的 siginfo_t 信息。

与 sigqueue 的区别:

sigqueue(pid_t pid, …): 向进程 pid 发送信号。内核会选择该进程（线程组）中的某个线程来接收信号。
rt_tgsigqueueinfo(pid_t tgid, pid_t tid, …): tgid 是线程组 ID（通常就是进程的 PID），tid 是线程组内具体线程的 ID（TID）。它允许你指定哪个线程应该接收这个信号。

为什么复杂/危险?与 rt_sigqueueinfo 类似，它允许发送者完全伪造 siginfo_t 中的信息（如发送者 PID、si_code 等），这可能带来安全风险。因此，它的使用受到严格的权限检查。

对于 Linux 编程小白：你通常不需要直接使用 rt_tgsigqueueinfo。在需要向特定线程发送信号时，更常见的做法是在线程内部使用 pthread_kill()（POSIX 线程库函数）。rt_tgsigqueueinfo 主要供系统级程序或 C 库实现使用。

2. 函数原型

// 这是底层系统调用，用户空间标准 C 库通常不直接提供包装函数
#include <sys/syscall.h> // 包含系统调用号
#include <unistd.h>      // 包含 syscall 函数
#include <signal.h>      // 包含 siginfo_t 定义

long syscall(SYS_rt_tgsigqueueinfo, pid_t tgid, pid_t tid, int sig, siginfo_t *uinfo);

用户空间更常用的相关函数：

#include <signal.h>
int sigqueue(pid_t pid, int sig, const union sigval value); // 向进程发送信号和数据

#include <pthread.h> // (需要链接 -lpthread)
int pthread_kill(pthread_t thread, int sig); // 向特定 POSIX 线程发送信号

3. 功能

向指定线程组 ID (tgid) 和线程 ID (tid) 的线程发送指定信号 (sig)，并允许发送者完全指定 siginfo_t 结构体 (uinfo) 中包含的详细信息。

4. 参数

tgid:

pid_t 类型。
目标线程所属的线程组 ID (Thread Group ID)。对于单线程进程，这通常就是它的 PID。对于多线程程序，所有线程共享同一个 tgid（即主线程的 PID）。

tid:

pid_t 类型。
目标线程在内核中的唯一 ID (Thread ID)。在用户空间的 pthread_create 返回的 pthread_t 和内核的 tid 之间需要转换（可以使用 /proc/self/task/ 目录或特定的系统调用来获取）。

sig:

int 类型。
要发送的信号编号，例如 SIGUSR1, SIGRTMIN 等。注意，不能发送 SIGKILL 和 SIGSTOP。

uinfo:

siginfo_t * 类型。
一个指向 siginfo_t 结构体的指针。这个结构体包含了你想随信号一起传递给目标线程的所有详细信息。调用者需要自己填充这个结构体的各个字段。

5. 返回值

成功: 返回 0。

失败: 返回 -1，并设置全局变量 errno 来指示具体的错误原因：

EAGAIN: (对于实时信号) 已达到接收者排队信号的最大数量限制 (RLIMIT_SIGPENDING)。
EPERM: 调用者没有权限发送信号给目标线程（例如，权限不足，或试图伪造某些 si_code）。
EINVAL: sig 是无效的信号号，tgid 或 tid 无效，或者 uinfo 中的 si_code 是无效的或不允许由用户设置的。
ESRCH: 找不到 tgid 指定的线程组或 tid 指定的线程。

6. 相似函数或关联函数

sigqueue: 向进程（线程组）发送信号和数据，由内核选择线程接收。
pthread_kill: POSIX 标准函数，用于向指定的 pthread_t 线程发送信号，更易于使用。
tgkill: 一个更安全的系统调用，用于向指定线程组和线程发送信号（但不带 siginfo_t 数据），通常由 pthread_kill 内部调用。
kill: 向进程发送信号。
siginfo_t: 包含信号详细信息的结构体。
getpid: 获取当前进程的 PID (也是线程组 ID tgid)。
gettid: 获取当前线程的内核 TID。注意：这不是标准 C 库函数，需要通过 syscall(SYS_gettid) 调用。
pthread_self: 获取当前线程的 POSIX 线程 ID (pthread_t)。

7. 示例代码

由于直接使用 rt_tgsigqueueinfo 需要获取内核线程 ID (tid) 并且容易因权限或错误的 siginfo_t 而失败，下面的示例将演示如何获取 tid 并尝试调用 rt_tgsigqueueinfo，同时提供一个使用推荐的 pthread_kill 的对比示例。

警告：直接使用 rt_tgsigqueueinfo 可能会因为权限问题或 siginfo_t 构造不当而失败。

#define _GNU_SOURCE // 启用 GNU 扩展，获取 gettid
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h> // 包含 syscall 和 SYS_rt_tgsigqueueinfo, SYS_gettid
#include <errno.h>
#include <sys/types.h> // 包含 pid_t
#include <pthread.h>   // 包含 pthread_t, pthread_create 等
#include <stdatomic.h> // 用于线程间安全通信

// 全局变量用于线程间通信
_Atomic int signal_count = 0;

// 信号处理函数 (使用 SA_SIGINFO 获取详细信息)
void signal_handler(int sig, siginfo_t *info, void *context) {
    atomic_fetch_add(&signal_count, 1); // 原子增加计数
    printf("Thread %ld: Received signal %d\n", syscall(SYS_gettid), sig);
    printf("  si_signo: %d\n", info->si_signo);
    printf("  si_code:  %d", info->si_code);
    switch(info->si_code) {
        case SI_USER: printf(" (SI_USER: kill, sigsend or raise)\n"); break;
        case SI_QUEUE: printf(" (SI_QUEUE: sigqueue)\n"); break;
        case SI_TKILL: printf(" (SI_TKILL: tkill or tgkill)\n"); break;
        default: printf(" (Other code)\n"); break;
    }
    printf("  si_pid:   %d\n", info->si_pid);
    if (info->si_code == SI_QUEUE) {
        printf("  si_value.sival_int: %d\n", info->si_value.sival_int);
    }
}

// 线程函数
void* worker_thread(void *arg) {
    sigset_t set;
    int sig;
    siginfo_t info;

    printf("Worker thread started. TID: %ld, PID (TGID): %d\n", syscall(SYS_gettid), getpid());

    // 为这个线程设置信号掩码，只等待 SIGUSR1
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    // 阻塞 SIGUSR1，准备用 sigwaitinfo 接收
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1) {
        perror("sigprocmask (worker)");
        return NULL;
    }

    while (atomic_load(&signal_count) < 3) { // 等待接收3次信号
        printf("Worker thread (TID %ld) waiting for SIGUSR1...\n", syscall(SYS_gettid));
        sig = sigwaitinfo(&set, &info); // 同步等待 SIGUSR1
        if (sig == -1) {
            perror("sigwaitinfo (worker)");
            break;
        }
        // sigwaitinfo 成功返回后，信号处理函数 signal_handler 会被调用
        // 因为我们阻塞了信号，sigwaitinfo 会接收它，但处理函数仍然会执行
        // 这里我们主要依靠 signal_handler 中的计数
    }
    printf("Worker thread (TID %ld) finished.\n", syscall(SYS_gettid));
    return NULL;
}

int main() {
    pthread_t thread;
    pid_t my_tgid, my_tid, worker_tid;
    struct sigaction sa;
    siginfo_t si_to_send;
    union sigval data_to_send = {.sival_int = 54321};

    my_tgid = getpid();
    my_tid = syscall(SYS_gettid); // 获取主线程的内核 TID
    printf("Main thread: PID (TGID): %d, TID: %ld\n", my_tgid, my_tid);

    // 1. 设置 SIGUSR1 的处理函数
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = signal_handler; // 使用 sa_sigaction 获取 info
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO; // 必须设置
    if (sigaction(SIGUSR1, &sa, NULL) == -1) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }

    // 2. 创建一个工作线程
    if (pthread_create(&thread, NULL, worker_thread, NULL) != 0) {
        perror("pthread_create");
        exit(EXIT_FAILURE);
    }

    // 简单等待一下，让子线程启动并打印其 TID
    sleep(1);
    // 在真实的程序中，获取另一个线程的 TID 比较复杂，
    // 通常需要线程自己报告。这里我们简化处理，
    // 假设我们 somehow 知道了 worker 线程的 TID。
    // 为了演示，我们还是向主线程自己发送。
    worker_tid = my_tid; // 简化：发送给自己
    printf("\n--- Attempting to use rt_tgsigqueueinfo ---\n");
    printf("Targeting TGID: %d, TID: %ld\n", my_tgid, worker_tid);

    // 3. 准备 siginfo_t 结构体
    memset(&si_to_send, 0, sizeof(si_to_send));
    si_to_send.si_signo = SIGUSR1;
    si_to_send.si_code = SI_QUEUE; // 伪造为 sigqueue 发送
    si_to_send.si_pid = my_tgid;   // 伪造 PID
    si_to_send.si_uid = getuid();  // 伪造 UID
    si_to_send.si_value = data_to_send; // 附带数据

    printf("Sending SIGUSR1 with data %d using rt_tgsigqueueinfo()...\n", data_to_send.sival_int);
    printf("(This may fail with EPERM unless run with special privileges or on some systems)\n");

    // 4. 调用底层系统调用
    long result = syscall(SYS_rt_tgsigqueueinfo, my_tgid, worker_tid, SIGUSR1, &si_to_send);

    if (result == -1) {
        perror("rt_tgsigqueueinfo");
        printf("Error: rt_tgsigqueueinfo failed.\n");
        printf("Errno: %d\n", errno);
        if (errno == EPERM) {
            printf("Reason: EPERM - Operation not permitted (insufficient privileges or invalid si_code).\n");
        }
    } else {
        printf("rt_tgsigqueueinfo succeeded.\n");
    }

    sleep(2); // 等待信号处理

    // 5. 再发送一次，使用标准的 sigqueue (推荐方式)
    printf("\n--- Using sigqueue (Recommended) ---\n");
    printf("Sending SIGUSR1 with data %d using sigqueue()...\n", data_to_send.sival_int);
    if (sigqueue(my_tgid, SIGUSR1, data_to_send) == -1) {
        perror("sigqueue");
    }
    sleep(2);

    // 6. 等待线程结束
    if (pthread_join(thread, NULL) != 0) {
        perror("pthread_join");
    }

    printf("\nFinal signal count: %d\n", atomic_load(&signal_count));
    printf("Main program finished.\n");
    return 0;
}

使用 pthread_kill 的推荐示例:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <pthread.h>
#include <stdatomic.h>

_Atomic int signal_received = 0;

void thread_signal_handler(int sig) {
    atomic_store(&signal_received, 1);
    printf("Signal %d received by thread %lu\n", sig, pthread_self());
}

void* target_thread(void *arg) {
    // 设置本线程的信号处理掩码，解除对 SIGUSR1 的阻塞
    // (假设主线程已经阻塞了它)
    sigset_t set;
    sigemptyset(&set);
    // sigaddset(&set, SIGUSR1); // 如果需要在这里处理
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);

    printf("Target thread (pthread_t: %lu) running...\n", pthread_self());
    // 简单循环等待信号
    while(!atomic_load(&signal_received)) {
        sleep(1);
    }
    printf("Target thread received signal and is exiting.\n");
    return NULL;
}

int main() {
    pthread_t thread;
    struct sigaction sa;

    // 设置信号处理函数
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = thread_signal_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }

    // 阻塞 SIGUSR1 在主线程
    sigset_t block_set;
    sigemptyset(&block_set);
    sigaddset(&block_set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block_set, NULL);

    if (pthread_create(&thread, NULL, target_thread, NULL) != 0) {
        perror("pthread_create");
        exit(EXIT_FAILURE);
    }

    sleep(1); // 让子线程启动

    printf("Main thread sending SIGUSR1 to target thread using pthread_kill()...\n");
    if (pthread_kill(thread, SIGUSR1) != 0) { // 推荐方式
        perror("pthread_kill");
    }

    if (pthread_join(thread, NULL) != 0) {
        perror("pthread_join");
    }

    printf("Main thread finished.\n");
    return 0;
}

编译和运行:

# 假设代码保存在 rt_tgsigqueueinfo_example.c 和 pthread_kill_example.c 中
# 需要链接 pthread 库
gcc -o rt_tgsigqueueinfo_example rt_tgsigqueueinfo_example.c -lpthread
gcc -o pthread_kill_example pthread_kill_example.c -lpthread

# 运行示例 (rt_tgsigqueueinfo 可能会因权限失败)
./rt_tgsigqueueinfo_example

# 运行推荐示例 (pthread_kill)
./pthread_kill_example

总结:对于 Linux 编程新手，处理多线程信号时，请优先学习和使用 pthread_kill()。rt_tgsigqueueinfo 是一个底层、强大的工具，但使用不当会有安全风险且复杂，通常由系统库内部或高级系统编程使用。

https://www.calcguide.tech/2025/08/24/rt-tgsigqueueinfo系统调用及示例/

rt_tgsigqueueinfo系统调用及示例-CSDN博客

2025-08-24

Linux系统编程

select系统调用及示例

我们继续学习 Linux 系统编程中的重要函数。这次我们介绍 select 函数，它是一种经典的 I/O 多路复用机制，允许一个进程监视多个文件描述符，等待其中任何一个或多个文件描述符变为“就绪”状态（例如可读、可写或发生异常）。

注意：虽然 select 功能强大且历史悠久，但在处理大量文件描述符时，poll 和更现代的 epoll (Linux 特有) 通常性能更好。不过，select 因其可移植性（在多种 Unix 系统上都可用）和教学价值，仍然是需要了解的重要函数。

1. 函数介绍

select 是一个 Linux 系统调用（实际上在很多类 Unix 系统上都可用），用于实现 I/O 多路复用 (I/O multiplexing)。它的核心思想是让进程能够同时检查多个文件描述符（如套接字、管道、终端等）的状态，看它们是否准备好进行 I/O 操作（例如读取、写入），而无需对每个文件描述符都进行阻塞式等待。

在没有 select（或 poll、epoll）的情况下，如果一个程序需要同时处理多个网络连接或文件，它可能需要创建多个线程或进程，或者在一个文件描述符上阻塞等待，这会非常低效或复杂。select 允许一个线程/进程在一个调用中“监听”所有感兴趣的文件描述符，当其中任何一个准备好时，select 返回，程序就可以处理那个就绪的文件描述符。

你可以把它想象成一个“服务员”，同时照看多张餐桌（文件描述符）。服务员不需要一直站在某一张餐桌旁等客人点菜（数据），而是可以走一圈看看哪张餐桌的客人举手了（数据就绪），然后去为那张餐桌服务。

2. 函数原型

#include <sys/select.h> // 必需

// 标准形式
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

// pselect 是 select 的变体，增加了信号掩码参数，这里暂不讨论
// int pselect(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
//             const struct timespec *timeout, const sigset_t *sigmask);

3. 功能

监视文件描述符集合: select 会检查 readfds, writefds, exceptfds 这三个集合中列出的文件描述符的状态。

等待就绪: 调用 select 的进程会阻塞（挂起），直到以下情况之一发生：

readfds, writefds, exceptfds 中指定的至少一个文件描述符变为“就绪”状态。
调用被信号中断（返回 -1，并设置 errno 为 EINTR）。
达到指定的超时时间 timeout（如果 timeout 不为 NULL）。
返回就绪数量: 当 select 返回时，它会报告有多少个文件描述符已就绪。
更新集合: 关键点：select 会修改 readfds, writefds, exceptfds 这三个集合作为输出。调用返回后，这些集合中只保留那些已就绪的文件描述符，所有未就绪的文件描述符都会从集合中被清除。因此，如果需要在下一次 select 调用中继续监视相同的文件描述符集合，必须在每次调用 select 之前重新设置这些集合。

4. 参数

int nfds: 这是需要监视的文件描述符中的最大值加 1。它用于确定内核需要检查的文件描述符范围。例如，如果监视的文件描述符是 0, 1, 4, 7，那么 nfds 应该是 8 (7 + 1)。在现代实现中，这个参数可能不那么关键，但为了兼容性和正确性，应始终正确设置。
fd_set *readfds: 指向一个 fd_set 类型的集合。调用者将所有关心可读性的文件描述符加入此集合（使用 FD_SET 宏）。select 返回时，此集合会被修改，仅保留那些已准备好读取的文件描述符。
fd_set *writefds: 指向一个 fd_set 类型的集合。调用者将所有关心可写性的文件描述符加入此集合。select 返回时，此集合会被修改，仅保留那些已准备好写入的文件描述符。
fd_set *exceptfds: 指向一个 fd_set 类型的集合。用于监视“异常条件”（如带外数据）。在很多应用中（特别是 TCP 网络编程），这个参数可以设为 NULL。
struct timeval *timeout: 指定 select 调用阻塞等待的超时时间。timeout == NULL: select 会无限期阻塞，直到至少一个文件描述符就绪或被信号中断。
timeout->tv_sec == 0 && timeout->tv_usec == 0: select 执行非阻塞检查，立即返回，报告当前有多少文件描述符已就绪。
timeout->tv_sec > 0 || timeout->tv_usec > 0: select 最多阻塞 tv_sec 秒 + tv_usec 微秒。如果在超时前没有文件描述符就绪，则返回 0。struct timeval 定义如下：
struct timeval { long tv_sec; // 秒 long tv_usec; // 微秒 (0-999999) }; 重要: 在 Linux 上，select 返回时，timeout 的值可能会被修改，以反映剩余的超时时间。但在可移植代码中，不应依赖此行为，每次调用 select 前都应重新设置 timeout。

5. fd_set 集合操作宏

select 使用 fd_set 数据结构来表示文件描述符集合。操作 fd_set 需要使用以下标准宏：

FD_ZERO(fd_set *set): 清空（初始化）整个 fd_set 集合，移除所有文件描述符。
FD_SET(int fd, fd_set *set): 将文件描述符 fd 添加到 fd_set 集合 set 中。
FD_CLR(int fd, fd_set *set): 将文件描述符 fd 从 fd_set 集合 set 中移除。
FD_ISSET(int fd, fd_set *set): 检查文件描述符 fd 是否在 fd_set 集合 set 中。如果在，返回非零值；否则返回 0。

6. 返回值

成功时:

返回就绪的文件描述符的总数。这个数字可以是 0（表示超时）。

失败时:

返回 -1，并设置全局变量 errno 来指示具体的错误原因（例如 EBADF 某个 fd 无效，EINTR 调用被信号中断，EINVAL nfds 负数等）。

超时:

如果在 timeout 指定的时间内没有任何文件描述符就绪，返回 0。

7. 相似函数，或关联函数

poll: 功能与 select 类似，但使用 struct pollfd 数组而不是 fd_set 位掩码。在处理大量文件描述符时通常性能更好，且没有 FD_SETSIZE 的硬性限制。
epoll_wait / epoll_ctl / epoll_create: Linux 特有的、更高效的 I/O 多路复用机制，特别适合处理大量的并发连接。它使用一个内核事件表来管理监视的文件描述符，避免了 select/poll 每次调用都需要传递整个文件描述符集合的开销。
read, write: 在 select 返回某个文件描述符就绪后，通常会调用 read 或 write 来执行实际的 I/O 操作。

8. 示例代码

示例 1：监视标准输入和一个管道

这个例子演示如何使用 select 同时监视标准输入（键盘）和一个管道的读端，看哪个先有数据可读。

#include <sys/select.h> // select, fd_set, FD_*
#include <sys/time.h>   // struct timeval
#include <unistd.h>     // pipe, read, write, close, STDIN_FILENO
#include <stdio.h>      // perror, printf, fprintf
#include <stdlib.h>     // exit
#include <string.h>     // strlen

int main() {
    int pipefd&#91;2];
    fd_set readfds;           // 用于 select 的文件描述符集合
    int max_fd;               // select 需要的最大文件描述符 + 1
    struct timeval timeout;   // 超时设置
    int activity;             // select 返回值
    char buffer&#91;100];
    ssize_t bytes_read;

    // 1. 创建管道
    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    // 2. 计算 select 需要监视的最大文件描述符 + 1
    max_fd = (STDIN_FILENO > pipefd&#91;0]) ? STDIN_FILENO : pipefd&#91;0];
    max_fd += 1;

    printf("Waiting up to 5 seconds for input from stdin or data in pipe...\n");
    printf("Type something in the terminal, or run 'echo hello > /proc/%d/fd/%d' in another terminal.\n",
           getpid(), pipefd&#91;1]); // 提示用户如何向管道写入

    // 3. 主循环
    while (1) {
        // 4. 每次循环开始前，必须重新初始化 fd_set
        FD_ZERO(&readfds);             // 清空集合
        FD_SET(STDIN_FILENO, &readfds); // 添加标准输入
        FD_SET(pipefd&#91;0], &readfds);    // 添加管道读端

        // 5. 设置超时 (5秒)
        timeout.tv_sec = 5;
        timeout.tv_usec = 0;

        // 6. 调用 select 进行等待
        // 注意: timeout 结构体可能会被 select 修改，所以放在循环内
        activity = select(max_fd, &readfds, NULL, NULL, &timeout);

        // 7. 检查 select 的返回值
        if (activity == -1) {
            if (errno == EINTR) {
                printf("select was interrupted by a signal, continuing...\n");
                continue; // 被信号中断，通常继续循环
            } else {
                perror("select error");
                break; // 或 exit(EXIT_FAILURE);
            }
        } else if (activity == 0) {
            printf("Timeout occurred! No data within 5 seconds. Exiting.\n");
            break; // 超时退出循环
        } else {
            printf("%d file descriptor(s) became ready.\n", activity);

            // 8. 检查哪个文件描述符就绪了
            // 注意：select 返回后，readfds 只包含就绪的 fd
            if (FD_ISSET(STDIN_FILENO, &readfds)) {
                printf("  -> Data is ready on standard input (stdin).\n");
                bytes_read = read(STDIN_FILENO, buffer, sizeof(buffer) - 1);
                if (bytes_read > 0) {
                    buffer&#91;bytes_read] = '\0'; // 确保字符串结束
                    printf("  -> Read from stdin: %s", buffer); // buffer 可能已包含 \n
                }
            }

            if (FD_ISSET(pipefd&#91;0], &readfds)) {
                printf("  -> Data is ready on the pipe.\n");
                bytes_read = read(pipefd&#91;0], buffer, sizeof(buffer) - 1);
                if (bytes_read > 0) {
                    buffer&#91;bytes_read] = '\0';
                    printf("  -> Read from pipe: %s", buffer);
                }
            }
        }
    }

    // 9. 清理资源
    close(pipefd&#91;0]);
    close(pipefd&#91;1]);

    return 0;
}

代码解释:

使用 pipe() 创建一个管道。

计算需要监视的最大文件描述符 max_fd（STDIN_FILENO 和 pipefd[0] 中的较大者加 1）。

进入主循环。

关键: 在每次 select 调用前，使用 FD_ZERO(&readfds) 清空 fd_set。

使用 FD_SET 将 STDIN_FILENO 和 pipefd[0] 添加到 readfds 集合中。

设置 timeout 结构体为 5 秒。

调用 select(max_fd, &readfds, NULL, NULL, &timeout)。我们只关心可读性，所以 writefds 和 exceptfds 都是 NULL。

检查 select 的返回值：

-1：错误（检查 errno 是否为 EINTR）。
0：超时。
0：就绪的文件描述符数量。

如果有文件描述符就绪（activity > 0），使用 FD_ISSET 检查 STDIN_FILENO 和 pipefd[0] 是否在修改后的 readfds 集合中。

对于就绪的文件描述符，调用 read 读取数据。

循环继续或根据条件（如超时）退出。

最后关闭管道的两端。

示例 2：简单的 TCP 服务器（非阻塞 accept 和客户端 socket）

这个例子演示如何在 TCP 服务器中使用 select 来同时监听监听套接字（用于接受新连接）和已建立连接的客户端套接字（用于接收数据）。

#include <sys/select.h> // select, fd_set, FD_*
#include <sys/time.h>   // struct timeval
#include <sys/socket.h> // socket, bind, listen, accept, recv, send
#include <netinet/in.h> // sockaddr_in
#include <arpa/inet.h>  // inet_ntoa (简化版，非线程安全)
#include <unistd.h>     // close, read, write
#include <stdio.h>      // perror, printf, fprintf
#include <stdlib.h>     // exit
#include <string.h>     // memset, strlen

#define PORT 8080
#define MAX_CLIENTS 30 // select 的 fd_set 通常有 FD_SETSIZE (1024) 的限制
#define BUFFER_SIZE 1024

int main() {
    int server_fd, new_socket, client_socket&#91;MAX_CLIENTS];
    struct sockaddr_in address;
    int addrlen = sizeof(address);
    fd_set readfds;         // select 用的读集合
    int max_sd;             // 当前最大的文件描述符
    int activity, i, valread;
    char buffer&#91;BUFFER_SIZE] = {0};
    char *hello = "Hello from server";

    // 1. 初始化客户端套接字数组
    for (i = 0; i < MAX_CLIENTS; i++) {
        client_socket&#91;i] = 0; // 0 表示空闲槽位
    }

    // 2. 创建服务器套接字
    if ((server_fd = socket(AF_INET, SOCK_STREAM, 0)) == 0) {
        perror("socket failed");
        exit(EXIT_FAILURE);
    }

    // 3. 配置服务器地址
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY; // 绑定到所有本地接口
    address.sin_port = htons(PORT);

    // 4. 绑定套接字
    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0) {
        perror("bind failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    // 5. 监听连接
    if (listen(server_fd, 3) < 0) { // backlog=3
        perror("listen");
        close(server_fd);
        exit(EXIT_FAILURE);
    }

    printf("Server listening on port %d\n", PORT);

    // 6. 主循环
    while(1) {
        // 7. 清空 fd_set
        FD_ZERO(&readfds);

        // 8. 将监听套接字加入集合
        FD_SET(server_fd, &readfds);
        max_sd = server_fd;

        // 9. 将已连接的客户端套接字加入集合
        for (i = 0; i < MAX_CLIENTS; i++) {
            if (client_socket&#91;i] > 0) {
                FD_SET(client_socket&#91;i], &readfds);
            }
            // 更新最大文件描述符
            if (client_socket&#91;i] > max_sd) {
                max_sd = client_socket&#91;i];
            }
        }

        // 10. 设置超时 (这里设为 NULL，表示无限等待)
        struct timeval timeout;
        timeout.tv_sec = 30; // 30秒超时，避免永久阻塞
        timeout.tv_usec = 0;

        // 11. 调用 select 等待活动
        activity = select(max_sd + 1, &readfds, NULL, NULL, &timeout);

        if (activity < 0) {
            if (errno == EINTR) {
                printf("select interrupted by signal, continuing...\n");
                continue;
            }
            perror("select error");
            break;
        }

        if (activity == 0) {
             printf("select timeout (30s), continuing...\n");
             continue;
        }

        // 12. 检查监听套接字是否有活动 (新连接)
        if (FD_ISSET(server_fd, &readfds)) {
            if ((new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen)) < 0) {
                perror("accept");
                continue;
            }

            printf("New connection, socket fd is %d, ip is : %s, port : %d\n",
                   new_socket, inet_ntoa(address.sin_addr), ntohs(address.sin_port));

            // 13. 将新连接的套接字添加到客户端数组中
            int added = 0;
            for (i = 0; i < MAX_CLIENTS; i++) {
                if (client_socket&#91;i] == 0) {
                    client_socket&#91;i] = new_socket;
                    added = 1;
                    printf("Adding to list of sockets at index %d\n", i);
                    send(new_socket, hello, strlen(hello), 0); // 发送欢迎信息
                    break;
                }
            }
            if (!added) {
                printf("Too many clients, connection rejected\n");
                close(new_socket);
            }
        }

        // 14. 检查已连接的客户端套接字是否有活动
        for (i = 0; i < MAX_CLIENTS; i++) {
            int sd = client_socket&#91;i];

            if (FD_ISSET(sd, &readfds)) {
                // 15. 有数据从客户端发来
                valread = read(sd, buffer, BUFFER_SIZE - 1);
                if (valread == 0) {
                    // 客户端断开连接
                    getpeername(sd, (struct sockaddr*)&address, (socklen_t*)&addrlen);
                    printf("Host disconnected, ip %s, port %d. Closing socket %d\n",
                           inet_ntoa(address.sin_addr), ntohs(address.sin_port), sd);
                    close(sd);
                    client_socket&#91;i] = 0; // 标记槽位为空闲
                } else {
                    // 处理收到的数据
                    buffer&#91;valread] = '\0';
                    printf("Received message from socket %d: %s", sd, buffer);
                    // Echo 回去
                    send(sd, buffer, strlen(buffer), 0);
                }
            }
        }
    }

    // 16. 清理 (在真实应用中，需要更优雅的退出机制)
    for(i = 0; i < MAX_CLIENTS; i++) {
        if(client_socket&#91;i] != 0) {
            close(client_socket&#91;i]);
        }
    }
    close(server_fd);
    printf("Server closed.\n");
    return 0;
}

代码解释:

创建、绑定、监听 TCP 套接字。

初始化一个 client_socket 数组，用于存储已建立连接的客户端套接字。用 0 表示空闲槽位。

进入主循环。

每次循环开始，使用 FD_ZERO(&readfds) 清空 fd_set。

使用 FD_SET(server_fd, &readfds) 将监听套接字加入监视集合。

遍历 client_socket 数组，将所有有效的（非零）客户端套接字也加入 readfds 集合。

在添加文件描述符的过程中，动态计算并更新 max_sd（当前监视的最大文件描述符）。

设置 select 的超时时间（这里设为 30 秒，避免无限期阻塞）。

调用 select(max_sd + 1, &readfds, NULL, NULL, &timeout)。

检查 select 返回值 activity。

如果 FD_ISSET(server_fd, &readfds) 为真，说明监听套接字就绪，调用 accept 接受新连接，并将其存入 client_socket 数组的空闲槽位。

遍历 client_socket 数组，检查每个有效的客户端套接字 sd 是否在 readfds 中（即 FD_ISSET(sd, &readfds) 为真）。

如果某个客户端套接字就绪，调用 read 读取数据。

如果 read 返回 0，表示客户端关闭了连接，关闭该套接字，并将 client_socket 数组中对应的项标记为 0（空闲）。

如果 read 返回正数，表示读到了数据，这里简单地将其 echo 回客户端。

循环继续处理下一个事件。

这个例子展示了 select 如何在一个单线程服务器中高效地管理多个并发连接。与为每个连接创建一个线程或进程相比，select（以及更高效的 poll 和 epoll）是构建高性能网络服务器的基础技术之一。

总结:

select 是一个功能强大且历史悠久的 I/O 多路复用函数。理解其关键是掌握 fd_set 集合的操作（FD_ZERO, FD_SET, FD_CLR, FD_ISSET）、nfds 参数的正确设置、select 返回后集合的修改行为，以及如何根据返回的就绪文件描述符数量和状态来处理相应的 I/O 操作。虽然在处理大量连接时不如 poll 或 epoll 高效，但其良好的可移植性使其在很多场景下仍然有用。

2025-08-23

Linux系统编程

fanotify_init系统调用及示例

fanotify_init - 初始化fanotify监控实例

fanotify_init是一个Linux系统调用，用于创建和初始化fanotify文件系统监控实例。fanotify是Linux内核提供的高级文件系统事件监控机制，可以监控文件访问、修改等事件。

函数原型

#include <sys/fanotify.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <fcntl.h>

int fanotify_init(unsigned int flags, unsigned int event_f_flags);

功能

创建一个新的fanotify监控实例，返回文件描述符用于后续的监控操作。

fanotify_init系统调用及示例-CSDN博客

参数

unsigned int flags: fanotify实例标志

FAN_CLASS_NOTIF: 仅通知类事件
FAN_CLASS_CONTENT: 内容监控类事件
FAN_CLASS_PRE_CONTENT: 预内容监控类事件
FAN_CLOEXEC: 设置执行时关闭标志
FAN_NONBLOCK: 设置非阻塞模式
FAN_UNLIMITED_QUEUE: 无限制队列大小
FAN_UNLIMITED_MARKS: 无限制标记数量

unsigned int event_f_flags: 事件文件描述符标志

O_RDONLY: 只读打开事件文件
O_WRONLY: 只写打开事件文件
O_RDWR: 读写打开事件文件
O_LARGEFILE: 支持大文件
O_CLOEXEC: 执行时关闭

返回值

成功时返回fanotify文件描述符（非负整数）
失败时返回-1，并设置errno

特殊限制

需要Linux 2.6.39以上内核支持
需要CAP_SYS_ADMIN能力
某些标志需要特定内核版本

相似函数

fanotify_mark(): 添加监控标记
read(): 读取fanotify事件
inotify_init(): 旧版文件监控机制

示例代码

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/fanotify.h>
#include <sys/syscall.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <poll.h>

// 系统调用包装
static int fanotify_init_wrapper(unsigned int flags, unsigned int event_f_flags) {
    return syscall(__NR_fanotify_init, flags, event_f_flags);
}

// 读取fanotify事件
int read_fanotify_events(int fanotify_fd) {
    char buffer&#91;4096];
    ssize_t len;
    struct fanotify_event_metadata *metadata;
    
    len = read(fanotify_fd, buffer, sizeof(buffer));
    if (len == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return 0; // 非阻塞模式下无事件
        }
        perror("读取fanotify事件失败");
        return -1;
    }
    
    metadata = (struct fanotify_event_metadata *)buffer;
    while (FAN_EVENT_OK(metadata, len)) {
        if (metadata->vers != FANOTIFY_METADATA_VERSION) {
            fprintf(stderr, "不支持的元数据版本\n");
            return -1;
        }
        
        printf("事件: fd=%d, mask=0x%x, pid=%d\n", 
               metadata->fd, metadata->mask, metadata->pid);
        
        // 获取文件路径
        if (metadata->fd != FAN_NOFD) {
            char path&#91;PATH_MAX];
            char proc_path&#91;64];
            snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", metadata->fd);
            
            ssize_t path_len = readlink(proc_path, path, sizeof(path) - 1);
            if (path_len != -1) {
                path&#91;path_len] = '\0';
                printf("  文件路径: %s\n", path);
            }
            
            close(metadata->fd);
        }
        
        metadata = FAN_EVENT_NEXT(metadata, len);
    }
    
    return 0;
}

int main() {
    int fanotify_fd;
    int result;
    
    printf("=== Fanotify_init 函数示例 ===\n");
    printf("当前用户 UID: %d\n", getuid());
    printf("当前有效 UID: %d\n", geteuid());
    
    // 示例1: 基本使用
    printf("\n示例1: 基本使用\n");
    
    // 需要root权限或CAP_SYS_ADMIN能力
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_NOTIF, O_RDONLY);
    if (fanotify_fd == -1) {
        if (errno == EPERM) {
            printf("权限不足创建fanotify实例: %s\n", strerror(errno));
            printf("说明: 需要root权限或CAP_SYS_ADMIN能力\n");
        } else {
            printf("创建fanotify实例失败: %s\n", strerror(errno));
        }
        printf("在root权限下重新运行此程序以查看完整功能\n");
        return 0;
    }
    
    printf("成功创建fanotify实例，文件描述符: %d\n", fanotify_fd);
    
    // 检查文件描述符属性
    int flags = fcntl(fanotify_fd, F_GETFD);
    if (flags != -1) {
        printf("fanotify文件描述符验证成功\n");
    }
    
    // 关闭fanotify实例
    close(fanotify_fd);
    printf("关闭fanotify实例\n");
    
    // 示例2: 使用不同标志
    printf("\n示例2: 使用不同标志\n");
    
    // 使用FAN_CLOEXEC标志
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_NOTIF | FAN_CLOEXEC, O_RDONLY);
    if (fanotify_fd != -1) {
        printf("使用FAN_CLOEXEC标志创建成功: %d\n", fanotify_fd);
        
        // 验证CLOEXEC标志
        flags = fcntl(fanotify_fd, F_GETFD);
        if (flags != -1 && (flags & FD_CLOEXEC)) {
            printf("FAN_CLOEXEC标志已正确设置\n");
        }
        
        close(fanotify_fd);
    }
    
    // 使用FAN_NONBLOCK标志
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_NOTIF | FAN_NONBLOCK, O_RDONLY);
    if (fanotify_fd != -1) {
        printf("使用FAN_NONBLOCK标志创建成功: %d\n", fanotify_fd);
        
        // 测试非阻塞特性
        char buffer&#91;1024];
        ssize_t result = read(fanotify_fd, buffer, sizeof(buffer));
        if (result == -1) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                printf("非阻塞读取正确返回EAGAIN\n");
            }
        }
        
        close(fanotify_fd);
    }
    
    // 示例3: 不同类别的fanotify实例
    printf("\n示例3: 不同类别的fanotify实例\n");
    
    printf("fanotify类别说明:\n");
    
    // FAN_CLASS_NOTIF - 通知类
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_NOTIF, O_RDONLY);
    if (fanotify_fd != -1) {
        printf("FAN_CLASS_NOTIF (通知类) 创建成功\n");
        printf("  - 仅接收通知事件\n");
        printf("  - 不需要响应事件\n");
        printf("  - 权限要求较低\n");
        close(fanotify_fd);
    }
    
    // FAN_CLASS_CONTENT - 内容监控类
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_CONTENT, O_RDONLY);
    if (fanotify_fd != -1) {
        printf("FAN_CLASS_CONTENT (内容监控类) 创建成功\n");
        printf("  - 可以读取文件内容\n");
        printf("  - 需要更高权限\n");
        close(fanotify_fd);
    }
    
    // FAN_CLASS_PRE_CONTENT - 预内容监控类
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_PRE_CONTENT, O_RDONLY);
    if (fanotify_fd != -1) {
        printf("FAN_CLASS_PRE_CONTENT (预内容监控类) 创建成功\n");
        printf("  - 可以在操作前拦截\n");
        printf("  - 需要最高权限\n");
        close(fanotify_fd);
    }
    
    // 示例4: 错误处理演示
    printf("\n示例4: 错误处理演示\n");
    
    // 使用无效标志
    fanotify_fd = fanotify_init_wrapper(0x1000, O_RDONLY);
    if (fanotify_fd == -1) {
        if (errno == EINVAL) {
            printf("无效标志错误处理正确: %s\n", strerror(errno));
        }
    }
    
    // 使用无效的event_f_flags
    fanotify_fd = fanotify_init_wrapper(FAN_CLASS_NOTIF, 0x1000);
    if (fanotify_fd == -1) {
        if (errno == EINVAL) {
            printf("无效event_f_flags错误处理正确: %s\n", strerror(errno));
        }
    }
    
    // 示例5: 权限和能力说明
    printf("\n示例5: 权限和能力说明\n");
    
    printf("fanotify权限要求:\n");
    printf("FAN_CLASS_NOTIF:\n");
    printf("  - 需要对监控目录的读权限\n");
    printf("  - 一般用户可使用\n\n");
    
    printf("FAN_CLASS_CONTENT:\n");
    printf("  - 需要CAP_SYS_ADMIN能力\n");
    printf("  - 或root权限\n\n");
    
    printf("FAN_CLASS_PRE_CONTENT:\n");
    printf("  - 需要CAP_SYS_ADMIN能力\n");
    printf("  - 或root权限\n");
    printf("  - 最高权限级别\n\n");
    
    // 示例6: 实际使用场景
    printf("示例6: 实际使用场景\n");
    
    printf("fanotify典型应用场景:\n");
    printf("1. 文件完整性监控\n");
    printf("2. 入侵检测系统\n");
    printf("3. 文件访问审计\n");
    printf("4. 实时备份系统\n");
    printf("5. 安全策略执行\n");
    printf("6. 系统监控工具\n\n");
    
    // 示例监控代码框架
    printf("基本监控代码框架:\n");
    printf("int setup_file_monitoring(const char* path) {\n");
    printf("    int fanotify_fd = fanotify_init(FAN_CLASS_NOTIF, O_RDONLY);\n");
    printf("    if (fanotify_fd == -1) return -1;\n");
    printf("    \n");
    printf("    // 添加监控标记\n");
    printf("    if (fanotify_mark(fanotify_fd, FAN_MARK_ADD, \n");
    printf("                     FAN_OPEN | FAN_CLOSE_WRITE,\n");
    printf("                     AT_FDCWD, path) == -1) {\n");
    printf("        close(fanotify_fd);\n");
    printf("        return -1;\n");
    printf("    }\n");
    printf("    \n");
    printf("    return fanotify_fd;\n");
    printf("}\n\n");
    
    // 示例7: 事件处理循环
    printf("示例7: 事件处理循环\n");
    printf("fanotify事件处理循环示例:\n");
    printf("void monitor_files(int fanotify_fd) {\n");
    printf("    struct pollfd pfd = {.fd = fanotify_fd, .events = POLLIN};\n");
    printf("    \n");
    printf("    while (running) {\n");
    printf("        int ret = poll(&pfd, 1, 1000);  // 1秒超时\n");
    printf("        if (ret > 0 && (pfd.revents & POLLIN)) {\n");
    printf("            handle_fanotify_events(fanotify_fd);\n");
    printf("        }\n");
    printf("    }\n");
    printf("}\n\n");
    
    // 示例8: 与inotify的对比
    printf("示例8: 与inotify的对比\n");
    
    printf("fanotify vs inotify:\n");
    printf("fanotify优势:\n");
    printf("  - 可以监控整个文件系统\n");
    printf("  - 支持文件内容访问监控\n");
    printf("  - 可以获取文件描述符\n");
    printf("  - 支持权限检查\n");
    printf("  - 更高的性能\n\n");
    
    printf("inotify优势:\n");
    printf("  - 更广泛的内核支持\n");
    printf("  - 更简单的API\n");
    printf("  - 更低的权限要求\n");
    printf("  - 更好的可移植性\n\n");
    
    // 示例9: 性能和资源考虑
    printf("示例9: 性能和资源考虑\n");
    
    printf("fanotify性能特点:\n");
    printf("1. 内核级实现，性能优秀\n");
    printf("2. 批量事件处理\n");
    printf("3. 可配置的队列大小\n");
    printf("4. 低CPU开销\n\n");
    
    printf("资源使用考虑:\n");
    printf("1. 每个实例消耗内核资源\n");
    printf("2. 事件队列有大小限制\n");
    printf("3. 及时处理事件避免队列溢出\n");
    printf("4. 合理设置监控范围\n\n");
    
    // 示例10: 安全考虑
    printf("示例10: 安全考虑\n");
    
    printf("使用fanotify的安全注意事项:\n");
    printf("1. 需要适当权限\n");
    printf("2. 避免监控敏感目录\n");
    printf("3. 及时关闭不需要的实例\n");
    printf("4. 处理事件时注意安全\n");
    printf("5. 防止事件处理中的拒绝服务\n");
    printf("6. 限制监控的文件数量\n\n");
    
    printf("总结:\n");
    printf("fanotify_init是创建fanotify监控实例的基础函数\n");
    printf("提供了比inotify更强大的文件监控能力\n");
    printf("需要适当的权限和能力\n");
    printf("支持多种监控类别和标志\n");
    printf("是构建高级文件监控应用的重要工具\n");
    
    return 0;
}

2025-08-23

Linux系统编程

fanotify_mark系统调用及示例

fanotify_mark - 文件系统通知标记

函数介绍

fanotify_mark是一个Linux系统调用，用于在fanotify实例上添加、修改或删除文件系统对象的监视标记。fanotify是Linux的文件系统通知机制，可以监控文件访问和修改事件。

fanotify_init系统调用及示例-CSDN博客

https://www.calcguide.tech/2025/08/23/fanotify-mark系统调用及示例/

函数原型

#include <sys/fanotify.h>
#include <fcntl.h>
#include <unistd.h>

int fanotify_mark(int fanotify_fd, unsigned int flags,
                  uint64_t mask, int dirfd, const char *pathname);

功能

在fanotify文件描述符上设置监视标记，用于监控指定文件系统对象的事件。

参数

int fanotify_fd: fanotify实例的文件描述符（通过fanotify_init获得）

unsigned int flags: 操作标志

FAN_MARK_ADD: 添加标记
FAN_MARK_REMOVE: 删除标记
FAN_MARK_FLUSH: 刷新所有标记
FAN_MARK_DONT_FOLLOW: 不跟随符号链接
FAN_MARK_ONLYDIR: 只监视目录
FAN_MARK_MOUNT: 监视整个挂载点
FAN_MARK_FILESYSTEM: 监视整个文件系统

uint64_t mask: 事件掩码，指定要监视的事件类型

FAN_ACCESS: 文件被访问
FAN_OPEN: 文件被打开
FAN_CLOSE_WRITE: 可写文件被关闭
FAN_CLOSE_NOWRITE: 只读文件被关闭
FAN_MODIFY: 文件被修改
FAN_OPEN_PERM: 文件打开权限检查
FAN_ACCESS_PERM: 文件访问权限检查

int dirfd: 目录文件描述符（用于相对路径）

const char *pathname: 要监视的文件或目录路径

返回值

成功时返回0
失败时返回-1，并设置errno

特殊限制

需要Linux 2.6.39以上内核支持
需要CAP_SYS_ADMIN能力或特定权限
某些操作需要root权限

相似函数

fanotify_init(): 初始化fanotify实例
inotify_add_watch(): inotify监视添加
read(): 读取fanotify事件

示例代码

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/fanotify.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <poll.h>
#include <signal.h>

// 全局变量用于信号处理
volatile sig_atomic_t keep_running = 1;

// 信号处理函数
void signal_handler(int sig) {
    printf("\n接收到信号 %d，准备退出...\n", sig);
    keep_running = 0;
}

// 事件类型转换为字符串
const char* event_type_to_string(uint64_t mask) {
    if (mask & FAN_ACCESS) return "ACCESS";
    if (mask & FAN_OPEN) return "OPEN";
    if (mask & FAN_CLOSE_WRITE) return "CLOSE_WRITE";
    if (mask & FAN_CLOSE_NOWRITE) return "CLOSE_NOWRITE";
    if (mask & FAN_MODIFY) return "MODIFY";
    if (mask & FAN_OPEN_PERM) return "OPEN_PERM";
    if (mask & FAN_ACCESS_PERM) return "ACCESS_PERM";
    return "UNKNOWN";
}

int main() {
    int fan_fd, fd, result;
    char buffer&#91;4096];
    
    printf("=== Fanotify_mark 函数示例 ===\n");
    
    // 设置信号处理
    signal(SIGINT, signal_handler);
    signal(SIGTERM, signal_handler);
    
    // 示例1: 基本使用
    printf("\n示例1: 基本使用\n");
    
    // 创建测试文件
    fd = open("test_fanotify.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd != -1) {
        write(fd, "test content for fanotify", 25);
        close(fd);
        printf("创建测试文件: test_fanotify.txt\n");
    }
    
    // 初始化fanotify实例
    fan_fd = fanotify_init(FAN_CLASS_NOTIF, O_RDONLY);
    if (fan_fd == -1) {
        if (errno == EPERM) {
            printf("权限不足初始化fanotify: %s\n", strerror(errno));
            printf("说明: 需要CAP_SYS_ADMIN能力或root权限\n");
            unlink("test_fanotify.txt");
            exit(EXIT_FAILURE);
        } else {
            perror("fanotify_init失败");
            unlink("test_fanotify.txt");
            exit(EXIT_FAILURE);
        }
    }
    printf("成功初始化fanotify实例，文件描述符: %d\n", fan_fd);
    
    // 示例2: 添加监视标记
    printf("\n示例2: 添加监视标记\n");
    
    // 监视当前目录下的测试文件
    result = fanotify_mark(fan_fd, FAN_MARK_ADD, 
                          FAN_OPEN | FAN_CLOSE | FAN_ACCESS | FAN_MODIFY,
                          AT_FDCWD, "test_fanotify.txt");
    if (result == -1) {
        perror("添加监视标记失败");
    } else {
        printf("成功添加监视标记到 test_fanotify.txt\n");
        printf("监视事件: OPEN | CLOSE | ACCESS | MODIFY\n");
    }
    
    // 示例3: 监视目录
    printf("\n示例3: 监视目录\n");
    
    // 创建测试目录
    if (mkdir("fanotify_test_dir", 0755) == -1 && errno != EEXIST) {
        perror("创建测试目录失败");
    } else {
        printf("创建测试目录: fanotify_test_dir\n");
        
        // 监视整个目录
        result = fanotify_mark(fan_fd, FAN_MARK_ADD,
                              FAN_OPEN | FAN_CLOSE | FAN_ACCESS,
                              AT_FDCWD, "fanotify_test_dir");
        if (result == 0) {
            printf("成功添加目录监视标记\n");
        }
        
        // 在目录中创建文件进行测试
        fd = open("fanotify_test_dir/test_file.txt", O_CREAT | O_WRONLY, 0644);
        if (fd != -1) {
            write(fd, "test file in directory", 22);
            close(fd);
            printf("在目录中创建测试文件\n");
        }
    }
    
    // 示例4: 不同的标记操作
    printf("\n示例4: 不同的标记操作\n");
    
    // FAN_MARK_REMOVE - 删除标记
    result = fanotify_mark(fan_fd, FAN_MARK_REMOVE,
                          FAN_OPEN | FAN_CLOSE,
                          AT_FDCWD, "test_fanotify.txt");
    if (result == 0) {
        printf("成功删除部分监视标记\n");
    } else {
        printf("删除监视标记失败: %s\n", strerror(errno));
    }
    
    // FAN_MARK_FLUSH - 刷新所有标记
    result = fanotify_mark(fan_fd, FAN_MARK_FLUSH, 0, 0, NULL);
    if (result == 0) {
        printf("成功刷新所有监视标记\n");
    }
    
    // 重新添加标记进行后续测试
    fanotify_mark(fan_fd, FAN_MARK_ADD,
                  FAN_OPEN | FAN_CLOSE | FAN_ACCESS,
                  AT_FDCWD, "test_fanotify.txt");
    
    // 示例5: 错误处理演示
    printf("\n示例5: 错误处理演示\n");
    
    // 使用无效的fanotify文件描述符
    result = fanotify_mark(999, FAN_MARK_ADD, FAN_OPEN, AT_FDCWD, "test.txt");
    if (result == -1) {
        if (errno == EBADF) {
            printf("无效fanotify文件描述符错误处理正确: %s\n", strerror(errno));
        }
    }
    
    // 使用无效的标志
    result = fanotify_mark(fan_fd, 0x1000, FAN_OPEN, AT_FDCWD, "test.txt");
    if (result == -1) {
        if (errno == EINVAL) {
            printf("无效标志错误处理正确: %s\n", strerror(errno));
        }
    }
    
    // 监视不存在的文件
    result = fanotify_mark(fan_fd, FAN_MARK_ADD, FAN_OPEN, AT_FDCWD, "nonexistent.txt");
    if (result == -1) {
        if (errno == ENOENT) {
            printf("监视不存在文件错误处理正确: %s\n", strerror(errno));
        }
    }
    
    // 示例6: 权限相关操作
    printf("\n示例6: 权限相关操作\n");
    
    printf("fanotify权限要求:\n");
    printf("1. 需要CAP_SYS_ADMIN能力\n");
    printf("2. 或者root权限\n");
    printf("3. 某些操作可能需要额外权限\n");
    printf("4. 文件访问权限仍然适用\n\n");
    
    // 示例7: 实际应用场景演示
    printf("示例7: 实际应用场景演示\n");
    
    printf("文件完整性监控系统:\n");
    printf("int setup_file_monitoring(const char* path) {\n");
    printf("    int fan_fd = fanotify_init(FAN_CLASS_CONTENT, O_RDONLY);\n");
    printf("    if (fan_fd == -1) return -1;\n");
    printf("    \n");
    printf("    // 监视文件修改\n");
    printf("    fanotify_mark(fan_fd, FAN_MARK_ADD, \n");
    printf("                  FAN_CLOSE_WRITE | FAN_MODIFY,\n");
    printf("                  AT_FDCWD, path);\n");
    printf("    \n");
    printf("    return fan_fd;\n");
    printf("}\n\n");
    
    printf("实时备份系统:\n");
    printf("int setup_backup_monitoring(const char* directory) {\n");
    printf("    int fan_fd = fanotify_init(FAN_CLASS_CONTENT, O_RDONLY);\n");
    printf("    if (fan_fd == -1) return -1;\n");
    printf("    \n");
    printf("    // 监视目录及其子目录的所有修改\n");
    printf("    fanotify_mark(fan_fd, FAN_MARK_ADD | FAN_MARK_MOUNT,\n");
    printf("                  FAN_CLOSE_WRITE | FAN_MOVED_TO | FAN_CREATE,\n");
    printf("                  AT_FDCWD, directory);\n");
    printf("    \n");
    printf("    return fan_fd;\n");
    printf("}\n\n");
    
    // 示例8: 事件处理演示
    printf("示例8: 事件处理演示\n");
    
    printf("典型的fanotify事件处理循环:\n");
    printf("while (keep_running) {\n");
    printf("    ssize_t len = read(fan_fd, buffer, sizeof(buffer));\n");
    printf("    if (len == -1 && errno != EAGAIN) {\n");
    printf("        perror(\"读取fanotify事件失败\");\n");
    printf("        break;\n");
    printf("    }\n");
    printf("    \n");
    printf("    struct fanotify_event_metadata *metadata;\n");
    printf("    for (metadata = (void*)buffer;\n");
    printf("         FAN_EVENT_OK(metadata, len);\n");
    printf("         metadata = FAN_EVENT_NEXT(metadata, len)) {\n");
    printf("        \n");
    printf("        // 处理事件\n");
    printf("        printf(\"事件: %%s, 文件描述符: %%d\\n\",\n");
    printf("               event_type_to_string(metadata->mask),\n");
    printf("               metadata->fd);\n");
    printf("        \n");
    printf("        // 关闭事件提供的文件描述符\n");
    printf("        close(metadata->fd);\n");
    printf("    }\n");
    printf("}\n\n");
    
    // 示例9: 不同监视类型说明
    printf("示例9: 不同监视类型说明\n");
    
    printf("FAN_MARK_ADD:\n");
    printf("  - 添加新的监视标记\n");
    printf("  - 可以与现有标记共存\n\n");
    
    printf("FAN_MARK_REMOVE:\n");
    printf("  - 删除指定的监视标记\n");
    printf("  - 只删除指定的事件类型\n\n");
    
    printf("FAN_MARK_FLUSH:\n");
    printf("  - 删除所有监视标记\n");
    printf("  - 清空整个监视设置\n\n");
    
    printf("FAN_MARK_DONT_FOLLOW:\n");
    printf("  - 不跟随符号链接\n");
    printf("  - 直接监视符号链接本身\n\n");
    
    printf("FAN_MARK_ONLYDIR:\n");
    printf("  - 只监视目录\n");
    printf("  - 如果路径不是目录则失败\n\n");
    
    printf("FAN_MARK_MOUNT:\n");
    printf("  - 监视整个挂载点\n");
    printf("  - 包括挂载点下的所有文件\n\n");
    
    printf("FAN_MARK_FILESYSTEM:\n");
    printf("  - 监视整个文件系统\n");
    printf("  - 包括所有挂载的实例\n\n");
    
    // 示例10: 事件类型说明
    printf("示例10: 事件类型说明\n");
    
    printf("基本事件类型:\n");
    printf("FAN_ACCESS: 文件被访问（读取）\n");
    printf("FAN_OPEN: 文件被打开\n");
    printf("FAN_CLOSE_WRITE: 可写文件被关闭\n");
    printf("FAN_CLOSE_NOWRITE: 只读文件被关闭\n");
    printf("FAN_MODIFY: 文件内容被修改\n\n");
    
    printf("权限检查事件类型:\n");
    printf("FAN_OPEN_PERM: 文件打开权限检查（需要响应）\n");
    printf("FAN_ACCESS_PERM: 文件访问权限检查（需要响应）\n\n");
    
    printf("创建和删除事件:\n");
    printf("FAN_CREATE: 文件或目录被创建\n");
    printf("FAN_DELETE: 文件或目录被删除\n");
    printf("FAN_MOVED_FROM: 文件被移出\n");
    printf("FAN_MOVED_TO: 文件被移入\n\n");
    
    // 示例11: 性能和资源考虑
    printf("示例11: 性能和资源考虑\n");
    
    printf("fanotify性能特点:\n");
    printf("1. 内核级别的通知机制\n");
    printf("2. 比轮询方式更高效\n");
    printf("3. 支持大量文件监视\n");
    printf("4. 低延迟事件通知\n\n");
    
    printf("资源使用考虑:\n");
    printf("1. 每个监视标记消耗内核资源\n");
    printf("2. 事件队列有大小限制\n");
    printf("3. 需要及时处理事件\n");
    printf("4. 避免监视过多文件\n\n");
    
    printf("优化建议:\n");
    printf("1. 合理选择监视的事件类型\n");
    printf("2. 及时处理和响应事件\n");
    printf("3. 避免不必要的监视标记\n");
    printf("4. 使用适当的fanotify类\n");
    printf("5. 监控系统资源使用\n\n");
    
    // 示例12: 安全考虑
    printf("示例12: 安全考虑\n");
    
    printf("fanotify安全特性:\n");
    printf("1. 需要特殊权限才能使用\n");
    printf("2. 不能监视没有访问权限的文件\n");
    printf("3. 提供权限检查事件类型\n");
    printf("4. 事件包含文件描述符便于验证\n\n");
    
    printf("安全使用建议:\n");
    printf("1. 验证事件中的文件描述符\n");
    printf("2. 谨慎使用权限检查事件\n");
    printf("3. 限制监视的文件范围\n");
    printf("4. 及时响应权限检查事件\n");
    printf("5. 避免泄露敏感文件信息\n\n");
    
    // 清理资源
    printf("清理测试资源...\n");
    close(fan_fd);
    unlink("test_fanotify.txt");
    unlink("fanotify_test_dir/test_file.txt");
    rmdir("fanotify_test_dir");
    
    printf("\n总结:\n");
    printf("fanotify_mark是fanotify文件系统监控的核心函数\n");
    printf("支持灵活的监视标记管理和多种监视类型\n");
    printf("需要适当的权限才能使用\n");
    printf("适用于文件完整性监控、实时备份等场景\n");
    printf("是Linux系统管理和安全监控的重要工具\n");
    
    return 0;
}

2025-08-23

Linux系统编程

fchownat系统调用及示例

fchownat函数详解

fchownat函数是Linux系统中用于改变文件所有者和组的高级函数，它是chown()和fchown()函数的增强版本。可以把fchownat想象成一个”精确的所有权修改器”，它不仅能够修改文件的所有者和组，还支持相对路径操作和符号链接控制。fchownat的主要优势在于它提供了更多的控制选项，特别是相对于指定目录文件描述符的路径解析能力。这使得在复杂目录结构中进行批量文件所有权修改变得更加安全和高效。

本文链接

函数介绍

fchownat函数是Linux系统中用于改变文件所有者和组的高级函数，它是chown()和fchown()函数的增强版本。可以把fchownat想象成一个”精确的所有权修改器”，它不仅能够修改文件的所有者和组，还支持相对路径操作和符号链接控制。

fchownat的主要优势在于它提供了更多的控制选项，特别是相对于指定目录文件描述符的路径解析能力。这使得在复杂目录结构中进行批量文件所有权修改变得更加安全和高效。

使用场景：

系统管理工具中的文件权限管理
批量修改目录树中文件的所有权
安全的相对路径文件操作
系统恢复和维护工具
容器和虚拟化环境中的权限管理

函数原型

#include <fcntl.h>
#include <unistd.h>

int fchownat(int dirfd, const char *pathname, uid_t owner, gid_t group, int flags);

功能

fchownat函数的主要功能是修改指定文件的所有者和组。它支持相对于目录文件描述符的路径解析，并提供了控制符号链接行为的选项。

参数

dirfd: 目录文件描述符

类型：int
含义：基准目录的文件描述符，用于相对路径解析
特殊值：AT_FDCWD表示使用当前工作目录

pathname: 文件路径名

类型：const char*
含义：要修改所有权的文件路径名
可以是相对路径（相对于dirfd）或绝对路径

owner: 新的所有者用户ID

类型：uid_t
含义：文件的新所有者用户ID
-1表示不改变所有者

group: 新的组ID

类型：gid_t
含义：文件的新组ID
-1表示不改变组

flags: 操作标志

类型：int
含义：控制操作行为的标志位

常用值：

0：默认行为
AT_SYMLINK_NOFOLLOW：不跟随符号链接（修改符号链接本身）
AT_EMPTY_PATH：允许空路径名（Linux 2.6.39+）

返回值

成功: 返回0

失败: 返回-1，并设置errno错误码

EACCES：权限不足
EBADF：dirfd无效或不是目录
EFAULT：pathname指向无效内存
EIO：I/O错误
ELOOP：符号链接循环
ENAMETOOLONG：路径名过长
ENOENT：文件或路径不存在
ENOTDIR：路径前缀不是目录
EPERM：操作不被允许
EROFS：文件系统只读

相似函数或关联函数

chown(): 改变文件所有权
fchown(): 通过文件描述符改变文件所有权
lchown(): 改变符号链接所有权（不跟随链接）
chmod(): 改变文件权限
fchmodat(): 相对于目录文件描述符改变文件权限
openat(): 相对于目录文件描述符打开文件
readlinkat(): 相对于目录文件描述符读取符号链接

示例代码

示例1：基础fchownat使用 - 简单所有权修改

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <pwd.h>
#include <grp.h>
#include <string.h>
#include <errno.h>

// 创建测试文件
int create_test_file(const char* filename, const char* content) {
    int fd = open(filename, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd == -1) {
        perror("创建文件失败");
        return -1;
    }
    
    if (write(fd, content, strlen(content)) == -1) {
        perror("写入文件失败");
        close(fd);
        return -1;
    }
    
    close(fd);
    printf("创建文件: %s\n", filename);
    return 0;
}

// 显示文件所有权信息
void show_file_ownership(const char* filename) {
    struct stat file_stat;
    
    if (stat(filename, &file_stat) == -1) {
        perror("获取文件状态失败");
        return;
    }
    
    // 获取用户名
    struct passwd* pwd = getpwuid(file_stat.st_uid);
    struct group* grp = getgrgid(file_stat.st_gid);
    
    printf("文件 %s 的所有权信息:\n", filename);
    printf("  用户ID: %d", (int)file_stat.st_uid);
    if (pwd) {
        printf(" (%s)", pwd->pw_name);
    }
    printf("\n");
    
    printf("  组ID: %d", (int)file_stat.st_gid);
    if (grp) {
        printf(" (%s)", grp->gr_name);
    }
    printf("\n");
    
    printf("  权限: %o\n", file_stat.st_mode & 0777);
}

// 获取当前用户和组信息
void show_current_user_info() {
    uid_t uid = getuid();
    gid_t gid = getgid();
    
    struct passwd* pwd = getpwuid(uid);
    struct group* grp = getgrgid(gid);
    
    printf("当前用户信息:\n");
    printf("  用户ID: %d", (int)uid);
    if (pwd) {
        printf(" (%s)", pwd->pw_name);
    }
    printf("\n");
    
    printf("  组ID: %d", (int)gid);
    if (grp) {
        printf(" (%s)", grp->gr_name);
    }
    printf("\n");
}

int main() {
    printf("=== 基础fchownat使用示例 ===\n");
    
    // 检查权限
    if (geteuid() != 0) {
        printf("警告: 修改文件所有权通常需要root权限\n");
        printf("某些操作可能失败\n");
    }
    
    show_current_user_info();
    
    // 创建测试文件
    const char* test_file = "fchownat_test.txt";
    if (create_test_file(test_file, "这是fchownat测试文件的内容\n") == -1) {
        exit(EXIT_FAILURE);
    }
    
    // 显示初始所有权
    printf("\n1. 初始文件所有权:\n");
    show_file_ownership(test_file);
    
    // 获取root用户信息（用于测试）
    struct passwd* root_pwd = getpwnam("root");
    struct group* root_grp = getgrnam("root");
    
    uid_t root_uid = root_pwd ? root_pwd->pw_uid : 0;
    gid_t root_gid = root_grp ? root_grp->gr_gid : 0;
    
    // 使用fchownat修改所有权为root
    printf("\n2. 使用fchownat修改所有权为root:\n");
    if (fchownat(AT_FDCWD, test_file, root_uid, root_gid, 0) == 0) {
        printf("✓ 所有权修改成功\n");
    } else {
        printf("✗ 所有权修改失败: %s\n", strerror(errno));
    }
    
    show_file_ownership(test_file);
    
    // 使用fchownat只修改用户ID
    printf("\n3. 使用fchownat只修改用户ID:\n");
    uid_t current_uid = getuid();
    if (fchownat(AT_FDCWD, test_file, current_uid, -1, 0) == 0) {
        printf("✓ 用户ID修改成功\n");
    } else {
        printf("✗ 用户ID修改失败: %s\n", strerror(errno));
    }
    
    show_file_ownership(test_file);
    
    // 使用fchownat只修改组ID
    printf("\n4. 使用fchownat只修改组ID:\n");
    gid_t current_gid = getgid();
    if (fchownat(AT_FDCWD, test_file, -1, current_gid, 0) == 0) {
        printf("✓ 组ID修改成功\n");
    } else {
        printf("✗ 组ID修改失败: %s\n", strerror(errno));
    }
    
    show_file_ownership(test_file);
    
    // 演示错误处理
    printf("\n5. 错误处理演示:\n");
    
    // 尝试修改不存在的文件
    if (fchownat(AT_FDCWD, "nonexistent.txt", 0, 0, 0) == -1) {
        printf("修改不存在文件的所有权: %s (预期行为)\n", strerror(errno));
    }
    
    // 尝试使用无效的dirfd
    if (fchownat(999, test_file, 0, 0, 0) == -1) {
        printf("使用无效dirfd: %s (预期行为)\n", strerror(errno));
    }
    
    // 清理测试文件
    unlink(test_file);
    
    printf("\n=== 基础fchownat演示完成 ===\n");
    
    return 0;
}

示例2：相对路径和符号链接控制

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <pwd.h>
#include <grp.h>
#include <string.h>
#include <errno.h>

// 创建测试目录结构
int create_test_directory_structure() {
    // 创建测试目录
    if (mkdir("fchownat_test_dir", 0755) == -1 && errno != EEXIST) {
        perror("创建测试目录失败");
        return -1;
    }
    
    if (mkdir("fchownat_test_dir/subdir", 0755) == -1 && errno != EEXIST) {
        perror("创建子目录失败");
        return -1;
    }
    
    // 创建测试文件
    int fd = open("fchownat_test_dir/test_file.txt", 
                  O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd != -1) {
        write(fd, "测试文件内容", 12);
        close(fd);
        printf("创建测试文件: fchownat_test_dir/test_file.txt\n");
    }
    
    // 创建子目录中的文件
    fd = open("fchownat_test_dir/subdir/sub_file.txt", 
              O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd != -1) {
        write(fd, "子目录文件内容", 14);
        close(fd);
        printf("创建子目录文件: fchownat_test_dir/subdir/sub_file.txt\n");
    }
    
    // 创建符号链接
    if (symlink("test_file.txt", "fchownat_test_dir/link_to_file") == 0) {
        printf("创建符号链接: fchownat_test_dir/link_to_file\n");
    }
    
    return 0;
}

// 显示目录内容和所有权
void show_directory_ownership(const char* dir_path) {
    printf("目录 %s 的内容和所有权:\n", dir_path);
    
    DIR* dir = opendir(dir_path);
    if (dir == NULL) {
        perror("打开目录失败");
        return;
    }
    
    struct dirent* entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        
        char full_path&#91;512];
        snprintf(full_path, sizeof(full_path), "%s/%s", dir_path, entry->d_name);
        
        struct stat file_stat;
        if (stat(full_path, &file_stat) == 0) {
            struct passwd* pwd = getpwuid(file_stat.st_uid);
            struct group* grp = getgrgid(file_stat.st_gid);
            
            char type_char = '?';
            if (S_ISREG(file_stat.st_mode)) type_char = 'f';
            else if (S_ISDIR(file_stat.st_mode)) type_char = 'd';
            else if (S_ISLNK(file_stat.st_mode)) type_char = 'l';
            
            printf("  %c %s", type_char, entry->d_name);
            if (pwd) printf(" (用户:%s", pwd->pw_name);
            if (grp) printf(" 组:%s", grp->gr_name);
            printf(")\n");
        }
    }
    
    closedir(dir);
}

int main() {
    printf("=== fchownat相对路径和符号链接控制示例 ===\n");
    
    if (geteuid() != 0) {
        printf("警告: 此演示需要root权限以修改文件所有权\n");
    }
    
    // 创建测试目录结构
    printf("1. 创建测试目录结构:\n");
    if (create_test_directory_structure() == -1) {
        exit(EXIT_FAILURE);
    }
    
    show_directory_ownership("fchownat_test_dir");
    
    // 演示相对路径操作
    printf("\n2. 演示相对路径操作:\n");
    
    // 打开基准目录
    int base_dirfd = open("fchownat_test_dir", O_RDONLY);
    if (base_dirfd == -1) {
        perror("打开基准目录失败");
        exit(EXIT_FAILURE);
    }
    
    printf("打开基准目录 fd=%d\n", base_dirfd);
    
    // 相对于基准目录修改文件所有权
    printf("相对于基准目录修改文件所有权:\n");
    
    // 修改test_file.txt的所有权
    if (fchownat(base_dirfd, "test_file.txt", 0, 0, 0) == 0) {
        printf("  ✓ 修改 test_file.txt 成功\n");
    } else {
        printf("  ✗ 修改 test_file.txt 失败: %s\n", strerror(errno));
    }
    
    // 修改子目录中文件的所有权
    if (fchownat(base_dirfd, "subdir/sub_file.txt", 0, 0, 0) == 0) {
        printf("  ✓ 修改 subdir/sub_file.txt 成功\n");
    } else {
        printf("  ✗ 修改 subdir/sub_file.txt 失败: %s\n", strerror(errno));
    }
    
    show_directory_ownership("fchownat_test_dir");
    
    // 演示AT_FDCWD的使用
    printf("\n3. 演示AT_FDCWD的使用:\n");
    
    // 切换到测试目录
    if (chdir("fchownat_test_dir") == -1) {
        perror("切换目录失败");
        close(base_dirfd);
        exit(EXIT_FAILURE);
    }
    
    printf("切换到测试目录\n");
    
    // 使用AT_FDCWD修改文件所有权
    if (fchownat(AT_FDCWD, "test_file.txt", 1, 1, 0) == 0) {
        printf("  ✓ 使用AT_FDCWD修改所有权成功\n");
    } else {
        printf("  ✗ 使用AT_FDCWD修改所有权失败: %s\n", strerror(errno));
    }
    
    // 切换回原目录
    chdir("..");
    show_directory_ownership("fchownat_test_dir");
    
    // 演示符号链接控制
    printf("\n4. 演示符号链接控制:\n");
    
    // 显示符号链接信息
    struct stat link_stat, target_stat;
    if (lstat("fchownat_test_dir/link_to_file", &link_stat) == 0 &&
        stat("fchownat_test_dir/link_to_file", &target_stat) == 0) {
        printf("符号链接信息:\n");
        printf("  链接文件所有权: UID=%d, GID=%d\n", 
               (int)link_stat.st_uid, (int)link_stat.st_gid);
        printf("  目标文件所有权: UID=%d, GID=%d\n", 
               (int)target_stat.st_uid, (int)target_stat.st_gid);
    }
    
    // 默认行为：修改目标文件所有权
    printf("默认行为（修改目标文件）:\n");
    if (fchownat(base_dirfd, "link_to_file", 2, 2, 0) == 0) {
        printf("  ✓ 修改符号链接目标成功\n");
        // 检查目标文件所有权
        if (stat("fchownat_test_dir/link_to_file", &target_stat) == 0) {
            printf("  目标文件新所有权: UID=%d, GID=%d\n", 
                   (int)target_stat.st_uid, (int)target_stat.st_gid);
        }
    }
    
    // 使用AT_SYMLINK_NOFOLLOW：修改符号链接本身
    printf("使用AT_SYMLINK_NOFOLLOW（修改链接本身）:\n");
    if (fchownat(base_dirfd, "link_to_file", 3, 3, AT_SYMLINK_NOFOLLOW) == 0) {
        printf("  ✓ 修改符号链接本身成功\n");
        // 检查链接文件所有权
        if (lstat("fchownat_test_dir/link_to_file", &link_stat) == 0) {
            printf("  链接文件新所有权: UID=%d, GID=%d\n", 
                   (int)link_stat.st_uid, (int)link_stat.st_gid);
        }
    }
    
    show_directory_ownership("fchownat_test_dir");
    
    // 清理资源
    close(base_dirfd);
    
    // 清理测试文件
    unlink("fchownat_test_dir/link_to_file");
    unlink("fchownat_test_dir/test_file.txt");
    unlink("fchownat_test_dir/subdir/sub_file.txt");
    rmdir("fchownat_test_dir/subdir");
    rmdir("fchownat_test_dir");
    
    printf("\n=== 相对路径和符号链接控制演示完成 ===\n");
    
    return 0;
}

示例3：批量文件所有权管理

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <pwd.h>
#include <grp.h>
#include <string.h>
#include <errno.h>
#include <dirent.h>
#include <time.h>

#define MAX_FILES 1000

// 文件信息结构
typedef struct {
    char path&#91;512];
    uid_t old_uid;
    gid_t old_gid;
    uid_t new_uid;
    gid_t new_gid;
    int changed;
} file_ownership_info_t;

file_ownership_info_t file_list&#91;MAX_FILES];
int file_count = 0;

// 创建批量测试文件
int create_batch_test_files(const char* base_dir) {
    printf("创建批量测试文件...\n");
    
    // 创建基础目录
    if (mkdir(base_dir, 0755) == -1 && errno != EEXIST) {
        perror("创建基础目录失败");
        return -1;
    }
    
    // 创建子目录结构
    char dirs&#91;]&#91;256] = {
        "documents", "images", "videos", "music", "archives"
    };
    
    for (int i = 0; i < 5; i++) {
        char full_path&#91;512];
        snprintf(full_path, sizeof(full_path), "%s/%s", base_dir, dirs&#91;i]);
        if (mkdir(full_path, 0755) == -1 && errno != EEXIST) {
            printf("创建目录失败 %s: %s\n", full_path, strerror(errno));
        }
    }
    
    // 创建测试文件
    srand(time(NULL));
    const char* extensions&#91;] = {".txt", ".pdf", ".jpg", ".mp3", ".zip"};
    const char* contents&#91;] = {"文档内容", "PDF内容", "图片数据", "音频数据", "压缩数据"};
    
    for (int i = 0; i < 50; i++) {
        int dir_index = rand() % 6;  // 0-4是子目录，5是根目录
        int ext_index = rand() % 5;
        
        char file_path&#91;512];
        if (dir_index < 5) {
            snprintf(file_path, sizeof(file_path), "%s/%s/file_%03d%s", 
                     base_dir, dirs&#91;dir_index], i, extensions&#91;ext_index]);
        } else {
            snprintf(file_path, sizeof(file_path), "%s/root_file_%03d%s", 
                     base_dir, i, extensions&#91;ext_index]);
        }
        
        int fd = open(file_path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd != -1) {
            write(fd, contents&#91;ext_index], strlen(contents&#91;ext_index]));
            close(fd);
            printf("创建文件: %s\n", file_path);
        }
    }
    
    return 0;
}

// 递归扫描目录中的文件
int scan_directory_files(int dirfd, const char* base_path) {
    DIR* dir = fdopendir(dirfd);
    if (dir == NULL) {
        close(dirfd);
        return -1;
    }
    
    struct dirent* entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        
        // 构造完整路径
        char full_path&#91;1024];
        if (base_path&#91;0] == '\0') {
            snprintf(full_path, sizeof(full_path), "%s", entry->d_name);
        } else {
            snprintf(full_path, sizeof(full_path), "%s/%s", base_path, entry->d_name);
        }
        
        // 获取文件状态
        struct stat file_stat;
        if (fstatat(dirfd, entry->d_name, &file_stat, AT_SYMLINK_NOFOLLOW) == 0) {
            if (file_count < MAX_FILES) {
                strncpy(file_list&#91;file_count].path, full_path, 
                        sizeof(file_list&#91;file_count].path) - 1);
                file_list&#91;file_count].old_uid = file_stat.st_uid;
                file_list&#91;file_count].old_gid = file_stat.st_gid;
                file_list&#91;file_count].new_uid = -1;  // 保持不变
                file_list&#91;file_count].new_gid = -1;  // 保持不变
                file_list&#91;file_count].changed = 0;
                file_count++;
            }
            
            // 如果是目录，递归扫描
            if (S_ISDIR(file_stat.st_mode)) {
                int sub_dirfd = openat(dirfd, entry->d_name, O_RDONLY);
                if (sub_dirfd != -1) {
                    scan_directory_files(sub_dirfd, full_path);
                }
            }
        }
    }
    
    closedir(dir);
    return 0;
}

// 显示文件列表
void show_file_list(int max_show) {
    printf("文件列表 (共 %d 个文件):\n", file_count);
    printf("%-40s %-8s %-8s %-8s %-8s %-8s\n", 
           "路径", "旧UID", "旧GID", "新UID", "新GID", "状态");
    printf("%-40s %-8s %-8s %-8s %-8s %-8s\n", 
           "----", "------", "------", "------", "------", "----");
    
    int show_count = (max_show > 0 && max_show < file_count) ? max_show : file_count;
    
    for (int i = 0; i < show_count; i++) {
        const file_ownership_info_t* info = &file_list&#91;i];
        printf("%-40s %-8d %-8d %-8d %-8d %-8s\n",
               info->path,
               (int)info->old_uid, (int)info->old_gid,
               (int)info->new_uid, (int)info->new_gid,
               info->changed ? "已修改" : "未修改");
    }
    
    if (show_count < file_count) {
        printf("... (还有 %d 个文件)\n", file_count - show_count);
    }
    printf("\n");
}

// 批量修改文件所有权
int batch_change_ownership(int base_dirfd, uid_t new_uid, gid_t new_gid, 
                          const char* pattern) {
    printf("批量修改文件所有权:\n");
    printf("  新UID: %d, 新GID: %d\n", (int)new_uid, (int)new_gid);
    if (pattern) {
        printf("  匹配模式: %s\n", pattern);
    }
    
    int changed_count = 0;
    
    for (int i = 0; i < file_count; i++) {
        file_ownership_info_t* info = &file_list&#91;i];
        
        // 检查是否匹配模式（如果指定了模式）
        if (pattern && strstr(info->path, pattern) == NULL) {
            continue;
        }
        
        // 确定新的UID和GID
        uid_t target_uid = (new_uid == (uid_t)-1) ? info->old_uid : new_uid;
        gid_t target_gid = (new_gid == (gid_t)-1) ? info->old_gid : new_gid;
        
        // 只有当所有权需要改变时才执行
        if (target_uid != info->old_uid || target_gid != info->old_gid) {
            if (fchownat(base_dirfd, info->path, target_uid, target_gid, 0) == 0) {
                info->new_uid = target_uid;
                info->new_gid = target_gid;
                info->changed = 1;
                changed_count++;
                printf("  修改成功: %s (UID:%d->%d, GID:%d->%d)\n",
                       info->path, (int)info->old_uid, (int)target_uid,
                       (int)info->old_gid, (int)target_gid);
            } else {
                printf("  修改失败: %s (%s)\n", info->path, strerror(errno));
            }
        }
    }
    
    printf("总共修改了 %d 个文件的所有权\n", changed_count);
    return changed_count;
}

// 按文件类型修改所有权
int change_ownership_by_type(int base_dirfd, const char* extension, 
                            uid_t new_uid, gid_t new_gid) {
    printf("按文件类型修改所有权: %s\n", extension);
    
    int changed_count = 0;
    
    for (int i = 0; i < file_count; i++) {
        file_ownership_info_t* info = &file_list&#91;i];
        
        // 检查文件扩展名
        char* dot = strrchr(info->path, '.');
        if (dot && strcmp(dot, extension) == 0) {
            if (fchownat(base_dirfd, info->path, new_uid, new_gid, 0) == 0) {
                info->new_uid = new_uid;
                info->new_gid = new_gid;
                info->changed = 1;
                changed_count++;
                printf("  修改成功: %s\n", info->path);
            } else {
                printf("  修改失败: %s (%s)\n", info->path, strerror(errno));
            }
        }
    }
    
    printf("按类型修改了 %d 个文件的所有权\n", changed_count);
    return changed_count;
}

int main() {
    printf("=== 批量文件所有权管理示例 ===\n");
    
    if (geteuid() != 0) {
        printf("警告: 批量所有权修改需要root权限\n");
    }
    
    const char* test_dir = "batch_ownership_test";
    
    // 创建批量测试文件
    printf("1. 创建批量测试文件:\n");
    if (create_batch_test_files(test_dir) == -1) {
        exit(EXIT_FAILURE);
    }
    
    // 扫描文件
    printf("\n2. 扫描目录中的文件:\n");
    int base_dirfd = open(test_dir, O_RDONLY);
    if (base_dirfd == -1) {
        perror("打开测试目录失败");
        exit(EXIT_FAILURE);
    }
    
    file_count = 0;
    scan_directory_files(dup(base_dirfd), "");  // dup因为scan_directory_files会关闭fd
    
    printf("扫描完成，找到 %d 个文件\n", file_count);
    show_file_list(10);
    
    // 批量修改所有文件的所有权为root
    printf("3. 批量修改所有文件所有权为root:\n");
    batch_change_ownership(base_dirfd, 0, 0, NULL);
    
    show_file_list(10);
    
    // 按文件类型修改所有权
    printf("\n4. 按文件类型修改所有权:\n");
    struct passwd* test_user = getpwnam("nobody");
    struct group* test_group = getgrnam("nobody");
    uid_t test_uid = test_user ? test_user->pw_uid : 65534;
    gid_t test_gid = test_group ? test_group->gr_gid : 65534;
    
    change_ownership_by_type(base_dirfd, ".txt", test_uid, test_gid);
    change_ownership_by_type(base_dirfd, ".jpg", test_uid + 1, test_gid + 1);
    
    // 按目录修改所有权
    printf("\n5. 按目录修改所有权:\n");
    batch_change_ownership(base_dirfd, test_uid + 2, test_gid + 2, "documents/");
    batch_change_ownership(base_dirfd, test_uid + 3, test_gid + 3, "images/");
    
    // 显示最终结果
    printf("\n6. 最终文件所有权状态:\n");
    show_file_list(20);
    
    // 统计修改情况
    int changed_count = 0;
    int unchanged_count = 0;
    for (int i = 0; i < file_count; i++) {
        if (file_list&#91;i].changed) {
            changed_count++;
        } else {
            unchanged_count++;
        }
    }
    
    printf("统计信息:\n");
    printf("  已修改文件: %d\n", changed_count);
    printf("  未修改文件: %d\n", unchanged_count);
    printf("  总文件数: %d\n", file_count);
    
    // 清理测试文件
    printf("\n7. 清理测试文件:\n");
    
    // 由于是递归目录，需要特殊清理
    DIR* dir = opendir(test_dir);
    if (dir) {
        struct dirent* entry;
        while ((entry = readdir(dir)) != NULL) {
            if (strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0) {
                char full_path&#91;512];
                snprintf(full_path, sizeof(full_path), "%s/%s", test_dir, entry->d_name);
                
                if (entry->d_type == DT_DIR) {
                    // 递归删除目录
                    DIR* subdir = opendir(full_path);
                    if (subdir) {
                        struct dirent* subentry;
                        while ((subentry = readdir(subdir)) != NULL) {
                            if (strcmp(subentry->d_name, ".") != 0 && 
                                strcmp(subentry->d_name, "..") != 0) {
                                char sub_path&#91;512];
                                snprintf(sub_path, sizeof(sub_path), "%s/%s", 
                                         full_path, subentry->d_name);
                                unlink(sub_path);
                            }
                        }
                        closedir(subdir);
                        rmdir(full_path);
                    }
                } else {
                    unlink(full_path);
                }
            }
        }
        closedir(dir);
        rmdir(test_dir);
        printf("清理完成\n");
    }
    
    close(base_dirfd);
    
    printf("\n=== 批量文件所有权管理演示完成 ===\n");
    
    return 0;
}

示例4：高级所有权管理和安全特性

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <pwd.h>
#include <grp.h>
#include <string.h>
#include <errno.h>
#include <dirent.h>
#include <time.h>
#include <sys/xattr.h>

// 安全的所有权修改结构
typedef struct {
    char path&#91;512];
    uid_t current_uid;
    gid_t current_gid;
    uid_t target_uid;
    gid_t target_gid;
    mode_t current_mode;
    time_t modify_time;
    int success;
    char error_msg&#91;128];
} ownership_change_record_t;

#define MAX_RECORDS 1000
ownership_change_record_t change_records&#91;MAX_RECORDS];
int record_count = 0;

// 创建安全测试环境
int create_secure_test_environment() {
    printf("创建安全测试环境...\n");
    
    // 创建测试目录
    if (mkdir("secure_ownership_test", 0755) == -1 && errno != EEXIST) {
        perror("创建测试目录失败");
        return -1;
    }
    
    // 创建敏感文件
    const char* sensitive_files&#91;] = {
        "secure_ownership_test/config.txt",
        "secure_ownership_test/private.key",
        "secure_ownership_test/database.db",
        "secure_ownership_test/logs/app.log"
    };
    
    // 创建日志目录
    if (mkdir("secure_ownership_test/logs", 0755) == -1 && errno != EEXIST) {
        perror("创建日志目录失败");
    }
    
    // 创建敏感文件
    for (int i = 0; i < 4; i++) {
        int fd = open(sensitive_files&#91;i], O_CREAT | O_WRONLY | O_TRUNC, 
                     (i == 1) ? 0600 : 0644);  // private.key使用更严格的权限
        if (fd != -1) {
            const char* content = (i == 1) ? "PRIVATE KEY DATA" : "SENSITIVE DATA";
            write(fd, content, strlen(content));
            close(fd);
            printf("创建敏感文件: %s\n", sensitive_files&#91;i]);
        }
    }
    
    // 创建符号链接
    if (symlink("config.txt", "secure_ownership_test/link_to_config") == 0) {
        printf("创建符号链接\n");
    }
    
    return 0;
}

// 安全的所有权修改函数
int secure_chownat(int dirfd, const char* pathname, uid_t owner, gid_t group, 
                   int flags, const char* reason) {
    if (record_count >= MAX_RECORDS) {
        fprintf(stderr, "记录缓冲区已满\n");
        return -1;
    }
    
    ownership_change_record_t* record = &change_records&#91;record_count];
    strncpy(record->path, pathname, sizeof(record->path) - 1);
    record->target_uid = owner;
    record->target_gid = group;
    record->modify_time = time(NULL);
    record->success = 0;
    record->error_msg&#91;0] = '\0';
    
    // 获取当前状态
    struct stat current_stat;
    if (fstatat(dirfd, pathname, &current_stat, 
                (flags & AT_SYMLINK_NOFOLLOW) ? AT_SYMLINK_NOFOLLOW : 0) == 0) {
        record->current_uid = current_stat.st_uid;
        record->current_gid = current_stat.st_gid;
        record->current_mode = current_stat.st_mode;
    } else {
        snprintf(record->error_msg, sizeof(record->error_msg), 
                 "获取当前状态失败: %s", strerror(errno));
        record_count++;
        return -1;
    }
    
    // 执行所有权修改
    if (fchownat(dirfd, pathname, owner, group, flags) == 0) {
        record->success = 1;
        printf("✓ 安全修改所有权: %s (%s)\n", pathname, reason ? reason : "无原因");
    } else {
        snprintf(record->error_msg, sizeof(record->error_msg), 
                 "%s", strerror(errno));
        printf("✗ 修改所有权失败: %s (%s) - %s\n", 
               pathname, reason ? reason : "无原因", strerror(errno));
    }
    
    record_count++;
    return record->success ? 0 : -1;
}

// 验证所有权修改的安全性
int verify_ownership_change(int dirfd, const char* pathname, 
                           uid_t expected_uid, gid_t expected_gid) {
    struct stat verify_stat;
    if (fstatat(dirfd, pathname, &verify_stat, 0) == 0) {
        if (verify_stat.st_uid == expected_uid && verify_stat.st_gid == expected_gid) {
            printf("✓ 所有权验证通过: %s (UID:%d, GID:%d)\n", 
                   pathname, (int)expected_uid, (int)expected_gid);
            return 1;
        } else {
            printf("✗ 所有权验证失败: %s (期望UID:%d,GID:%d 实际UID:%d,GID:%d)\n", 
                   pathname, (int)expected_uid, (int)expected_gid,
                   (int)verify_stat.st_uid, (int)verify_stat.st_gid);
            return 0;
        }
    } else {
        printf("✗ 验证失败，无法获取文件状态: %s\n", strerror(errno));
        return 0;
    }
}

// 显示修改记录
void show_change_records() {
    printf("\n=== 所有权修改记录 ===\n");
    printf("%-30s %-8s %-8s %-8s %-8s %-10s %s\n",
           "路径", "原UID", "原GID", "新UID", "新GID", "状态", "时间");
    printf("%-30s %-8s %-8s %-8s %-8s %-10s %s\n",
           "----", "------", "------", "------", "------", "----", "----");
    
    for (int i = 0; i < record_count; i++) {
        const ownership_change_record_t* record = &change_records&#91;i];
        char time_str&#91;32];
        strftime(time_str, sizeof(time_str), "%H:%M:%S", 
                 localtime(&record->modify_time));
        
        printf("%-30s %-8d %-8d %-8d %-8d %-10s %s\n",
               record->path,
               (int)record->current_uid, (int)record->current_gid,
               (int)record->target_uid, (int)record->target_gid,
               record->success ? "成功" : "失败",
               time_str);
        
        if (!record->success && record->error_msg&#91;0] != '\0') {
            printf("  错误信息: %s\n", record->error_msg);
        }
    }
    printf("========================\n\n");
}

// 模拟权限检查
int check_permission_for_chown(uid_t current_uid, uid_t file_uid) {
    // Root用户可以修改任何文件的所有权
    if (current_uid == 0) {
        return 1;
    }
    
    // 文件所有者可以修改自己文件的组
    if (current_uid == file_uid) {
        return 1;
    }
    
    // 需要CAP_CHOWN能力的其他情况
    printf("警告: 非root用户修改他人文件所有权可能失败\n");
    return 0;
}

int main() {
    printf("=== 高级所有权管理和安全特性示例 ===\n");
    
    uid_t current_uid = getuid();
    printf("当前用户ID: %d (%s)\n", (int)current_uid, 
           current_uid == 0 ? "root" : "非root");
    
    // 创建安全测试环境
    printf("\n1. 创建安全测试环境:\n");
    if (create_secure_test_environment() == -1) {
        exit(EXIT_FAILURE);
    }
    
    // 打开测试目录
    int test_dirfd = open("secure_ownership_test", O_RDONLY);
    if (test_dirfd == -1) {
        perror("打开测试目录失败");
        exit(EXIT_FAILURE);
    }
    
    // 显示初始状态
    printf("\n2. 初始文件状态:\n");
    DIR* dir = fdopendir(dup(test_dirfd));
    if (dir) {
        struct dirent* entry;
        while ((entry = readdir(dir)) != NULL) {
            if (strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0) {
                struct stat file_stat;
                if (fstatat(test_dirfd, entry->d_name, &file_stat, AT_SYMLINK_NOFOLLOW) == 0) {
                    struct passwd* pwd = getpwuid(file_stat.st_uid);
                    struct group* grp = getgrgid(file_stat.st_gid);
                    printf("  %s: UID=%d(%s) GID=%d(%s) 权限=%o\n",
                           entry->d_name,
                           (int)file_stat.st_uid, pwd ? pwd->pw_name : "unknown",
                           (int)file_stat.st_gid, grp ? grp->gr_name : "unknown",
                           file_stat.st_mode & 0777);
                }
            }
        }
        closedir(dir);
    }
    
    // 演示安全的所有权修改
    printf("\n3. 安全所有权修改演示:\n");
    
    // 获取测试用户
    struct passwd* test_user1 = getpwnam("nobody");
    struct passwd* test_user2 = getpwnam("daemon");
    uid_t user1_uid = test_user1 ? test_user1->pw_uid : 65534;
    uid_t user2_uid = test_user2 ? test_user2->pw_uid : 1;
    
    struct group* test_group1 = getgrnam("nobody");
    struct group* test_group2 = getgrnam("daemon");
    gid_t group1_gid = test_group1 ? test_group1->gr_gid : 65534;
    gid_t group2_gid = test_group2 ? test_group2->gr_gid : 1;
    
    // 安全修改配置文件所有权
    secure_chownat(test_dirfd, "config.txt", user1_uid, group1_gid, 0, 
                   "配置文件所有权转移");
    verify_ownership_change(test_dirfd, "config.txt", user1_uid, group1_gid);
    
    // 安全修改日志文件所有权
    secure_chownat(test_dirfd, "logs/app.log", user2_uid, group2_gid, 0, 
                   "日志文件所有权转移");
    verify_ownership_change(test_dirfd, "logs/app.log", user2_uid, group2_gid);
    
    // 演示符号链接处理
    printf("\n4. 符号链接处理演示:\n");
    
    // 获取符号链接当前状态
    struct stat link_stat, target_stat;
    if (fstatat(test_dirfd, "link_to_config", &link_stat, AT_SYMLINK_NOFOLLOW) == 0 &&
        fstatat(test_dirfd, "link_to_config", &target_stat, 0) == 0) {
        printf("符号链接状态:\n");
        printf("  链接文件: UID=%d, GID=%d\n", 
               (int)link_stat.st_uid, (int)link_stat.st_gid);
        printf("  目标文件: UID=%d, GID=%d\n", 
               (int)target_stat.st_uid, (int)target_stat.st_gid);
    }
    
    // 默认行为：修改目标文件
    secure_chownat(test_dirfd, "link_to_config", user1_uid + 1, group1_gid + 1, 0, 
                   "修改符号链接目标");
    verify_ownership_change(test_dirfd, "config.txt", user1_uid + 1, group1_gid + 1);
    
    // 使用AT_SYMLINK_NOFOLLOW：修改链接本身
    secure_chownat(test_dirfd, "link_to_config", user2_uid + 1, group2_gid + 1, 
                   AT_SYMLINK_NOFOLLOW, "修改符号链接本身");
    
    // 验证符号链接状态
    if (fstatat(test_dirfd, "link_to_config", &link_stat, AT_SYMLINK_NOFOLLOW) == 0) {
        printf("修改后链接文件所有权: UID=%d, GID=%d\n", 
               (int)link_stat.st_uid, (int)link_stat.st_gid);
    }
    
    // 演示错误处理和权限检查
    printf("\n5. 错误处理和权限检查演示:\n");
    
    // 尝试修改不存在的文件
    secure_chownat(test_dirfd, "nonexistent.txt", 0, 0, 0, "测试不存在文件");
    
    // 尝试使用无效的UID/GID
    secure_chownat(test_dirfd, "config.txt", 999999, 999999, 0, "测试无效UID/GID");
    
    // 尝试在只读文件系统上修改（如果可能）
    // 这个测试需要特殊的环境设置
    
    // 显示所有修改记录
    show_change_records();
    
    // 统计成功和失败的修改
    int success_count = 0, failure_count = 0;
    for (int i = 0; i < record_count; i++) {
        if (change_records&#91;i].success) {
            success_count++;
        } else {
            failure_count++;
        }
    }
    
    printf("修改统计:\n");
    printf("  成功: %d\n", success_count);
    printf("  失败: %d\n", failure_count);
    printf("  总计: %d\n", record_count);
    
    // 清理测试环境
    printf("\n6. 清理测试环境:\n");
    
    // 删除文件
    const char* test_files&#91;] = {
        "secure_ownership_test/link_to_config",
        "secure_ownership_test/config.txt",
        "secure_ownership_test/private.key",
        "secure_ownership_test/database.db",
        "secure_ownership_test/logs/app.log"
    };
    
    for (int i = 0; i < 5; i++) {
        if (unlink(test_files&#91;i]) == 0) {
            printf("删除文件: %s\n", test_files&#91;i]);
        }
    }
    
    // 删除目录
    rmdir("secure_ownership_test/logs");
    rmdir("secure_ownership_test");
    printf("删除测试目录\n");
    
    close(test_dirfd);
    
    printf("\n=== 高级所有权管理演示完成 ===\n");
    
    return 0;
}

编译和运行

# 编译示例1
sudo gcc -o fchownat_example1 fchownat_example1.c
sudo ./fchownat_example1

# 编译示例2
sudo gcc -o fchownat_example2 fchownat_example2.c
sudo ./fchownat_example2

# 编译示例3
sudo gcc -o fchownat_example3 fchownat_example3.c
sudo ./fchownat_example3

# 编译示例4
sudo gcc -o fchownat_example4 fchownat_example4.c
sudo ./fchownat_example4

重要注意事项

权限要求: 修改文件所有权通常需要root权限或CAP_CHOWN能力

相对路径: 使用dirfd参数支持相对路径操作

符号链接: 默认修改目标文件，使用AT_SYMLINK_NOFOLLOW修改链接本身

错误处理: 必须检查返回值并适当处理错误

安全性: 验证输入参数以防止安全问题

原子性: fchownat操作是原子的

审计: 建议记录所有权修改操作

最佳实践

权限检查: 在修改前检查必要的权限

输入验证: 验证路径和ID参数的有效性

错误处理: 完善的错误处理和日志记录

原子操作: 利用fchownat的原子性特性

安全审计: 记录所有权修改操作

批量操作: 使用相对路径提高批量操作效率

符号链接: 明确处理符号链接的行为

通过这些示例，你可以理解fchownat在文件所有权管理方面的强大功能，它为Linux系统提供了灵活、安全的文件权限控制能力，特别适用于系统管理、安全控制和批量文件操作等场景。

2025-08-23

Linux系统编程

getitimer系统调用及示例

我们继续学习 Linux 系统编程中的重要函数。这次我们介绍 setitimer 函数的“伙伴”——getitimer 函数。虽然我们在介绍 setitimer 时已经涉及了 getitimer，但为了完整性和强调其重要性，我们再单独详细讲解一下。

1. 函数介绍

getitimer 是一个 Linux 系统调用，专门用于查询当前进程的间隔计时器（interval timers）的当前状态。它允许程序了解指定类型计时器的剩余时间（距离下次超时还有多久）和重载时间（超时后会自动重启的周期）。

你可以把它想象成查看你设置的闹钟或定时器上还剩多少时间。

2. 函数原型

#include <sys/time.h> // 必需

int getitimer(int which, struct itimerval *curr_value);

3. 功能

查询状态: 获取由 which 参数指定的计时器类型的当前设置。
填充结构体: 将获取到的计时器状态信息填充到调用者提供的 struct itimerval 结构体指针 curr_value 所指向的内存中。

4. 参数

int which: 指定要查询的计时器类型。与 setitimer 相同，主要有以下三种：

ITIMER_REAL: 实时计时器。时钟源是系统实时时间，超时发送 SIGALRM。
ITIMER_VIRTUAL: 虚拟计时器。时钟源是进程在用户态的 CPU 时间，超时发送 SIGVTALRM。
ITIMER_PROF: 性能计时器。时钟源是进程在用户态和内核态的总 CPU 时间，超时发送 SIGPROF。

struct itimerval *curr_value: 这是一个指向 struct itimerval 结构体的指针。getitimer 调用成功后，会将查询到的计时器当前状态信息存储到这个结构体中。

5. struct itimerval 结构体

这个结构体在 setitimer 和 getitimer 中都使用，用于表示计时器的时间设置：

struct itimerval {
    struct timeval it_interval; // 重载时间 (Interval)
    struct timeval it_value;    // 当前值/剩余时间 (Value)
};

其中 struct timeval 定义了秒和微秒：

struct timeval {
    time_t      tv_sec;         // 秒
    suseconds_t tv_usec;        // 微秒 (0 - 999,999)
};

通过 getitimer 返回的 curr_value:

curr_value->it_value: 表示该计时器当前还剩多少时间就会超时。如果计时器未启动或已超时，这个值通常是 0。
curr_value->it_interval: 表示该计时器被设置的重载时间（周期）。这个值是在调用 setitimer 时设置的，getitimer 只是将其读取出来。

6. 返回值

成功时: 返回 0。同时，curr_value 指向的结构体被成功填充。
失败时: 返回 -1，并设置全局变量 errno 来指示具体的错误原因（例如 EINVAL which 参数无效）。

7. 相似函数，或关联函数

setitimer: 用于设置间隔计时器。
getitimer 与 setitimer: 通常成对出现，用于查询和设置计时器。
alarm / setitimer: alarm(seconds) 大致等价于 setitimer(ITIMER_REAL, …)。
timer_gettime: POSIX 定时器 API 中用于获取定时器时间的函数，功能更强大。
clock_gettime: 用于获取各种系统时钟的当前时间。

8. 示例代码

示例 1：监控 ITIMER_REAL 的倒计时

这个例子演示了如何在启动一个 ITIMER_REAL 计时器后，使用 getitimer 在循环中监控其剩余时间。

#include <sys/time.h> // getitimer, setitimer
#include <signal.h>   // signal
#include <unistd.h>   // pause
#include <stdio.h>    // printf, perror
#include <stdlib.h>   // exit
#include <string.h>   // memset

volatile sig_atomic_t alarm_fired = 0;

void alarm_handler(int sig) {
    write(STDERR_FILENO, "Timer expired! SIGALRM received.\n", 32);
    alarm_fired = 1;
}

int main() {
    struct itimerval timer, current_timer;
    int seconds_to_wait = 5;

    if (signal(SIGALRM, alarm_handler) == SIG_ERR) {
        perror("signal SIGALRM");
        exit(EXIT_FAILURE);
    }

    // 1. 设置 ITIMER_REAL 计时器
    memset(&timer, 0, sizeof(timer)); // 清零结构体是个好习惯
    timer.it_value.tv_sec = seconds_to_wait; // 5 秒后超时
    timer.it_interval.tv_sec = 0;            // 一次性，不重复

    printf("Setting ITIMER_REAL to expire in %d seconds.\n", seconds_to_wait);

    if (setitimer(ITIMER_REAL, &timer, NULL) == -1) {
        perror("setitimer");
        exit(EXIT_FAILURE);
    }

    printf("Monitoring timer countdown:\n");

    // 2. 循环监控计时器剩余时间
    while (!alarm_fired) {
        if (getitimer(ITIMER_REAL, &current_timer) == -1) {
            perror("getitimer");
            // 即使 getitimer 失败，也继续循环或退出
            break;
        }

        // 打印剩余时间
        printf("  Time remaining: %ld.%06ld seconds\n",
               (long)current_timer.it_value.tv_sec,
               (long)current_timer.it_value.tv_usec);

        // 简单延时 0.5 秒再检查 (可以使用 nanosleep 实现更精确的延时)
        usleep(500000); // 500,000 微秒 = 0.5 秒
    }

    if (alarm_fired) {
        printf("Timer has fired. Program exiting.\n");
    } else {
        printf("Loop exited before timer fired.\n");
    }

    return 0;
}

代码解释:

设置 SIGALRM 信号处理函数。

使用 setitimer 启动一个 5 秒后超时的一次性 ITIMER_REAL 计时器。

进入一个 while 循环，循环条件是 alarm_fired 标志为假。

在循环内部：

调用 getitimer(ITIMER_REAL, &current_timer) 获取计时器的当前状态。
检查返回值，处理可能的错误。
打印 current_timer.it_value 中的剩余秒数和微秒数。
调用 usleep(500000) 延时 0.5 秒，然后继续下一次循环。

当 SIGALRM 信号到达，alarm_handler 被调用，设置 alarm_fired 为真，主循环退出。

示例 2：检查周期性计时器的状态

这个例子演示了如何查询一个周期性计时器的剩余时间和重载时间。

#include <sys/time.h>
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdatomic.h>
#include <string.h>

volatile atomic_int prof_count = 0;

void prof_handler(int sig) {
    atomic_fetch_add(&prof_count, 1);
    // 简单打印，实际信号处理应更谨慎
    write(STDERR_FILENO, "SIGPROF\n", 8);
}

int main() {
    struct itimerval timer, status;
    const int period_sec = 2;
    const int num_intervals = 3;

    if (signal(SIGPROF, prof_handler) == SIG_ERR) {
        perror("signal SIGPROF");
        exit(EXIT_FAILURE);
    }

    // 1. 设置 ITIMER_PROF 周期性计时器
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_sec = period_sec;     // 第一次 2 秒后超时
    timer.it_interval.tv_sec = period_sec;  // 之后每 2 秒超时一次

    printf("Setting periodic ITIMER_PROF timer (every %d seconds).\n", period_sec);

    if (setitimer(ITIMER_PROF, &timer, NULL) == -1) {
        perror("setitimer ITIMER_PROF");
        exit(EXIT_FAILURE);
    }

    printf("Checking timer status immediately after setting:\n");
    if (getitimer(ITIMER_PROF, &status) == -1) {
        perror("getitimer ITIMER_PROF");
        exit(EXIT_FAILURE);
    }
    printf("  Current value (remaining time): %ld.%06ld seconds\n",
           (long)status.it_value.tv_sec, (long)status.it_value.tv_usec);
    printf("  Interval (reload time): %ld.%06ld seconds\n",
           (long)status.it_interval.tv_sec, (long)status.it_interval.tv_usec);

    // 2. 等待几个周期
    printf("Waiting for %d intervals...\n", num_intervals);
    int last_count = 0;
    while (atomic_load(&prof_count) < num_intervals) {
        // 可以在这里做些消耗 CPU 的工作，让 ITIMER_PROF 计时
        // 为了简单，我们用 sleep 模拟时间流逝
        // 注意：sleep 是墙钟时间，而 ITIMER_PROF 是 CPU 时间
        // 如果进程大部分时间在 sleep，ITIMER_PROF 可能不会按预期计时
        // 这里只是演示 getitimer，实际使用中要考虑时钟源
        sleep(1);

        // 检查计时器状态
        if (getitimer(ITIMER_PROF, &status) == -1) {
            perror("getitimer during loop");
        } else {
            // 只在计数变化时打印，减少输出
            int current_count = atomic_load(&prof_count);
            if (current_count != last_count) {
                printf("  After %d SIGPROF signals:\n", current_count);
                printf("    Current value: %ld.%06ld seconds\n",
                       (long)status.it_value.tv_sec, (long)status.it_value.tv_usec);
                printf("    Interval: %ld.%06ld seconds (unchanged)\n",
                       (long)status.it_interval.tv_sec, (long)status.it_interval.tv_usec);
                last_count = current_count;
            }
        }
    }

    printf("Received %d SIGPROF signals. Stopping timer.\n", num_intervals);

    // 3. 停止计时器
    struct itimerval stop_timer = {{0, 0}, {0, 0}};
    if (setitimer(ITIMER_PROF, &stop_timer, NULL) == -1) {
        perror("setitimer stop ITIMER_PROF");
    }

    // 4. 再次检查状态 (应该都是 0)
    printf("Checking timer status after stopping:\n");
    if (getitimer(ITIMER_PROF, &status) == -1) {
        perror("getitimer after stop");
    } else {
        printf("  Current value: %ld.%06ld seconds\n",
               (long)status.it_value.tv_sec, (long)status.it_value.tv_usec);
        printf("  Interval: %ld.%06ld seconds\n",
               (long)status.it_interval.tv_sec, (long)status.it_interval.tv_usec);
    }

    return 0;
}

代码解释:

设置 SIGPROF 信号处理函数，并使用原子变量 prof_count 来计数。

使用 setitimer 启动一个每 2 秒触发一次的周期性 ITIMER_PROF 计时器。

立即调用 getitimer 检查并打印计时器的初始状态，可以看到 it_value 和 it_interval 都被正确设置。

进入循环等待 SIGPROF 信号。为了简化，循环中使用 sleep(1)，但这并不会增加 ITIMER_PROF 的计时，因为 sleep 是墙钟时间，而 ITIMER_PROF 计算的是 CPU 时间。在实际性能分析中，这里应该是消耗 CPU 的工作。

在循环中定期调用 getitimer，并打印剩余时间。注意 it_interval 始终保持不变，因为它表示的是设置的周期。

接收到指定数量的信号后，通过设置 it_value 和 it_interval 都为 0 来停止计时器。

最后再次调用 getitimer，确认计时器已停止（值为 0）。

示例 3：结合 setitimer 的 old_value 和 getitimer

这个例子演示了如何结合使用 setitimer 的 old_value 参数和 getitimer 来管理计时器状态。

#include <sys/time.h>
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void alarm_handler(int sig) {
    write(STDERR_FILENO, "SIGALRM\n", 8);
}

void print_timer_status(const char *msg, const struct itimerval *timer) {
    printf("%s:\n", msg);
    printf("  Current value (remaining): %ld.%06ld seconds\n",
           (long)timer->it_value.tv_sec, (long)timer->it_value.tv_usec);
    printf("  Interval (reload): %ld.%06ld seconds\n",
           (long)timer->it_interval.tv_sec, (long)timer->it_interval.tv_usec);
}

int main() {
    struct itimerval initial_timer, old_timer, current_timer;

    if (signal(SIGALRM, alarm_handler) == SIG_ERR) {
        perror("signal SIGALRM");
        exit(EXIT_FAILURE);
    }

    // 1. 设置一个初始计时器 (10秒后超时，不重复)
    memset(&initial_timer, 0, sizeof(initial_timer));
    initial_timer.it_value.tv_sec = 10;
    printf("Setting initial timer for 10 seconds.\n");
    if (setitimer(ITIMER_REAL, &initial_timer, NULL) == -1) {
        perror("setitimer initial");
        exit(EXIT_FAILURE);
    }

    sleep(3); // 等待 3 秒

    // 2. 使用 setitimer 的 old_value 参数获取并保存旧设置
    struct itimerval temp_timer = {{0, 0}, {0, 0}}; // 新的临时设置 (这里设置为 0)
    printf("\nCalling setitimer to get old value (without changing timer).\n");
    // 通过设置一个 '0' 计时器并获取 old_value，可以读取当前状态
    // 但这会取消当前计时器，不太符合 '只读' 的目的
    // 更标准的 '只读' 方式是使用 getitimer

    // 让我们用更清晰的方式：先 getitimer，再用 setitimer 的 old_value
    printf("--- Correct way to get timer status ---\n");
    if (getitimer(ITIMER_REAL, &current_timer) == -1) {
        perror("getitimer");
        exit(EXIT_FAILURE);
    }
    print_timer_status("Status after 3 seconds (using getitimer)", &current_timer);

    // 3. 现在，设置一个新计时器 (5秒)，并保存旧设置
    struct itimerval new_timer;
    memset(&new_timer, 0, sizeof(new_timer));
    new_timer.it_value.tv_sec = 5; // 5 秒后超时

    printf("\n--- Setting new timer (5s) and saving old setting ---\n");
    if (setitimer(ITIMER_REAL, &new_timer, &old_timer) == -1) {
        perror("setitimer with old_value");
        exit(EXIT_FAILURE);
    }
    print_timer_status("Saved old timer setting", &old_timer);

    sleep(2); // 再等待 2 秒

    // 4. 检查当前计时器状态
    if (getitimer(ITIMER_REAL, &current_timer) == -1) {
        perror("getitimer current");
    } else {
        print_timer_status("Current timer status (after 2s of new timer)", &current_timer);
    }

    printf("Waiting for new 5-second timer to expire...\n");
    pause(); // 等待 SIGALRM

    // 5. 恢复旧的计时器设置
    printf("\n--- Restoring old timer setting ---\n");
    if (setitimer(ITIMER_REAL, &old_timer, NULL) == -1) {
        perror("setitimer restore old");
        exit(EXIT_FAILURE);
    }

    if (getitimer(ITIMER_REAL, &current_timer) == -1) {
        perror("getitimer after restore");
    } else {
        print_timer_status("Timer status after restore", &current_timer);
    }

    printf("Waiting for restored timer to expire...\n");
    pause(); // 等待恢复的计时器 SIGALRM

    printf("Restored timer expired. Program finished.\n");
    return 0;
}

代码解释:

设置一个 10 秒后超时的初始计时器。

等待 3 秒后，调用 getitimer 获取并打印当前计时器状态（剩余约 7 秒）。

调用 setitimer(ITIMER_REAL, &new_timer, &old_timer)：

设置一个新的 5 秒计时器。
将旧计时器的设置（剩余约 7 秒）通过 old_value 参数保存在 old_timer 变量中。

等待 2 秒后，再次调用 getitimer 检查当前（新）计时器的状态（剩余约 3 秒）。

等待新计时器超时。

调用 setitimer(ITIMER_REAL, &old_timer, NULL) 将计时器恢复为之前保存的设置（约 7 秒）。

调用 getitimer 验证恢复是否成功。

等待恢复的计时器超时。

重要提示与注意事项:

“只读”查询: getitimer 是查询计时器状态的标准且推荐方式。虽然可以通过 setitimer 的 old_value 参数间接获取信息（例如设置一个 0 秒的定时器），但这通常会改变计时器状态（取消它），不符合“只读”查询的初衷。

精度: getitimer 返回的时间精度是微秒（tv_usec）。

时钟源: 返回的 it_value（剩余时间）是基于对应计时器的时钟源的。ITIMER_REAL 是墙钟时间，ITIMER_VIRTUAL 和 ITIMER_PROF 是 CPU 时间。

状态检查: getitimer 是检查计时器是否仍在运行以及还剩多少时间的有效方法。

与 setitimer 配合: getitimer 和 setitimer 经常一起使用，用于保存、恢复或动态调整计时器设置。

总结:

getitimer 是一个专门用于查询进程间隔计时器当前状态的系统调用。它通过填充 struct itimerval 结构体，返回指定计时器的剩余时间和重载时间。理解其与 setitimer 的配合使用对于精确控制和监控基于信号的定时任务至关重要。虽然 setitimer 的 old_value 参数也能提供一些信息，但 getitimer 是进行“只读”状态查询的标准和清晰的方法。

getitimer系统调用及示例-CSDN博客

2025-08-23

Linux命令

Kali Linux Download Websites and Mirror Sites Collection

KEYWORDS：kalilinux，kali Linux downaload，Kali Linux download, Kali Linux mirror sites, Download Kali Linux official site, Kali Linux alternative download sites, Best Kali Linux download mirrors, Kali Linux latest version download, Kali Linux download guide, Kali Linux mirror list, Where to download Kali Linux, Kali Linux secure download sources

For Chinese：https://www.calcguide.tech/2025/08/23/kali-linux-download-官方站点及镜像站点整理/

For English：https://www.calcguide.tech/2025/08/23/kali-linux-download-websites-and-mirror-sites-collection/

Kali Linux Download

🌐 Official Download Websites

Official Main Sites

Official Website: https://www.kali.org/
Official Download Page: https://www.kali.org/get-kali/
Official Documentation: https://www.kali.org/docs/

Official Direct Download Links

Latest ISO: https://cdimage.kali.org/kali-current/
Weekly Builds: https://cdimage.kali.org/kali-weekly/
Historical Versions: https://cdimage.kali.org/

🌍 Domestic Mirror Sites (China)

Educational Network Mirrors

Tsinghua University Mirror

Address: https://mirrors.tuna.tsinghua.edu.cn/kali/
Features: Fast speed, stable

USTC Mirror

Address: https://mirrors.ustc.edu.cn/kali/
Features: Timely synchronization

Shanghai Jiao Tong University Mirror

Address: https://mirror.sjtu.edu.cn/kali/
Features: Fast access in East China region

Huazhong University of Science and Technology Mirror

Address: https://mirrors.hust.edu.cn/kali/
Features: Optimized for Central China region

Commercial Mirrors

Alibaba Cloud Mirror

Address: https://mirrors.aliyun.com/kali/
Features: Nationwide CDN acceleration

Huawei Cloud Mirror

Address: https://mirrors.huaweicloud.com/kali/
Features: Enterprise-level stability

Tencent Cloud Mirror

Address: https://mirrors.cloud.tencent.com/kali/
Features: Optimized for Southern China region

🌎 International Mirror Sites

Asia Region

Japan Mirror

Address: https://kali.download/kali/
Features: Official recommended mirror

Korea Mirror

Address: https://ftp.kaist.ac.kr/kali/
Features: Alternative for Asia region

Europe Region

Germany Mirror

Address: https://ftp.halifax.rwth-aachen.de/kali/
Features: Preferred for European region

France Mirror

Address: https://kali.mirror.garr.it/kali/
Features: High-speed European mirror

UK Mirror

Address: https://www.mirrorservice.org/sites/ftp.kali.org/kali/
Features: UK educational network mirror

North America Region

USA Mirror

Address: https://kali.mirror.globo.tech/kali/
Features: West Coast US mirror

📦 Different Version Download Links

Desktop Versions

Standard Edition: kali-linux-default-amd64.iso
Large Edition: kali-linux-large-amd64.iso
Full Feature Edition: kali-linux-everything-amd64.iso

Lightweight Versions

Light Edition: kali-linux-light-amd64.iso
Minimal Edition: kali-linux-minimal-amd64.iso

Special Purpose Versions

NetHunter: Designed for mobile devices
ARM Version: For ARM architecture devices
Live Version: Bootable Live system

⚡ Download Recommendations

Mirror Selection Suggestions

Domestic Users: Prioritize Tsinghua, USTC, Alibaba Cloud mirrors

Educational Network Users: Prioritize educational network mirror sites

International Users: Choose mirrors geographically closer

Download Methods

# Using wget (Recommended)
wget https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

# Using aria2 for multi-threaded download
aria2c -x 16 -s 16 https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

# Using curl
curl -O https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

Verify File Integrity

# Verify SHA256 hash
sha256sum kali-linux-default-amd64.iso

# Verify GPG signature
gpg --verify kali-linux-default-amd64.iso.sig kali-linux-default-amd64.iso

🔧 Related Tools

Image Writing Tools

Rufus (Windows)

Etcher (Cross-platform)

UNetbootin (Cross-platform)

Ventoy (Multi-image management)

Virtual Machine Images

VMware: Pre-configured virtual machine images
VirtualBox: OVA format virtual machines
Hyper-V: VHDX format images

⚠️ Important Notes

Security Reminder: Download only from official or trusted mirrors

File Verification: Always verify file integrity after download

Legal Usage: Use only for legitimate security testing purposes

System Requirements: Ensure hardware meets minimum requirements

🔄 Repository Configuration

Using domestic mirrors can speed up package updates:

# Edit sources list
sudo nano /etc/apt/sources.list

# Add domestic mirror (Tsinghua example)
deb https://mirrors.tuna.tsinghua.edu.cn/kali kali-rolling main non-free contrib
deb-src https://mirrors.tuna.tsinghua.edu.cn/kali kali-rolling main non-free contrib

These mirror sites can provide you with fast and stable Kali Linux download experience. It is recommended to choose the most suitable mirror site based on your geographical location and network environment.

Kali Linux Smart Download Script

#!/bin/bash

# Kali Linux Smart Download Script
# Automatically detects language, country, and OS to select the fastest mirror

# Colors for output
RED='\033&#91;0;31m'
GREEN='\033&#91;0;32m'
YELLOW='\033&#91;1;33m'
BLUE='\033&#91;0;34m'
PURPLE='\033&#91;0;35m'
CYAN='\033&#91;0;36m'
NC='\033&#91;0m' # No Color

# Global variables
SCRIPT_VERSION="1.0"
SELECTED_MIRROR=""
SELECTED_VERSION=""
DOWNLOAD_URL=""
DOWNLOAD_DIR=""
LANGUAGE=""
COUNTRY=""
OS_TYPE=""

# Mirror lists by region
declare -A MIRRORS_CN=(
    &#91;"tsinghua"]="https://mirrors.tuna.tsinghua.edu.cn/kali"
    &#91;"ustc"]="https://mirrors.ustc.edu.cn/kali"
    &#91;"aliyun"]="https://mirrors.aliyun.com/kali"
    &#91;"huaweicloud"]="https://mirrors.huaweicloud.com/kali"
    &#91;"sjtu"]="https://mirror.sjtu.edu.cn/kali"
    &#91;"hust"]="https://mirrors.hust.edu.cn/kali"
    &#91;"tencent"]="https://mirrors.cloud.tencent.com/kali"
)

declare -A MIRRORS_GLOBAL=(
    &#91;"official"]="https://http.kali.org/kali"
    &#91;"jp"]="https://kali.download/kali"
    &#91;"de"]="https://ftp.halifax.rwth-aachen.de/kali"
    &#91;"fr"]="https://kali.mirror.garr.it/kali"
    &#91;"uk"]="https://www.mirrorservice.org/sites/ftp.kali.org/kali"
    &#91;"us"]="https://kali.mirror.globo.tech/kali"
)

# Kali versions
declare -A KALI_VERSIONS=(
    &#91;"default"]="kali-linux-default-amd64.iso"
    &#91;"large"]="kali-linux-large-amd64.iso"
    &#91;"everything"]="kali-linux-everything-amd64.iso"
    &#91;"light"]="kali-linux-light-amd64.iso"
    &#91;"minimal"]="kali-linux-minimal-amd64.iso"
)

# Function to print banner
print_banner() {
    echo -e "${CYAN}"
    echo "  _  __       _ _    _     _     _ _     "
    echo " | |/ /      | | |  | |   (_)   | (_)    "
    echo " | ' / __ _  | | |  | |___ _ ___| |_ ___ "
    echo " |  < / _\` | | | |  | / __| / __| | / __|"
    echo " | . \ (_| | | | |__| \__ \ \__ \ | \__ \\"
    echo " |_|\_\__,_|_|_|\____/|___/_|___/_|_|___/"
    echo -e "${NC}"
    echo -e "${YELLOW}Kali Linux Smart Download Script v${SCRIPT_VERSION}${NC}"
    echo -e "${BLUE}Automatically selecting the fastest mirror for your location${NC}"
    echo ""
}

# Function to detect system information
detect_system() {
    echo -e "${GREEN}&#91;INFO]${NC} Detecting system information..."

    # Detect language
    if &#91; -n "$LANG" ]; then
        LANGUAGE=$(echo "$LANG" | cut -d'_' -f1 | tr '&#91;:upper:]' '&#91;:lower:]')
    else
        LANGUAGE="en"
    fi

    # Detect country
    if command -v curl >/dev/null 2>&1; then
        COUNTRY=$(curl -s https://ipapi.co/country_code/ 2>/dev/null || echo "US")
    elif command -v wget >/dev/null 2>&1; then
        COUNTRY=$(wget -qO- https://ipapi.co/country_code/ 2>/dev/null || echo "US")
    else
        COUNTRY="US"
    fi

    # Detect OS type
    if &#91;&#91; "$OSTYPE" == "linux-gnu"* ]]; then
        OS_TYPE="Linux"
    elif &#91;&#91; "$OSTYPE" == "darwin"* ]]; then
        OS_TYPE="macOS"
    elif &#91;&#91; "$OSTYPE" == "cygwin" ]] || &#91;&#91; "$OSTYPE" == "msys" ]] || &#91;&#91; "$OSTYPE" == "win32" ]]; then
        OS_TYPE="Windows"
    else
        OS_TYPE="Unknown"
    fi

    echo -e "${BLUE}Language:${NC} $LANGUAGE"
    echo -e "${BLUE}Country:${NC} $COUNTRY"
    echo -e "${BLUE}OS Type:${NC} $OS_TYPE"
    echo ""
}

# Function to test mirror speed
test_mirror_speed() {
    local mirror_url=$1
    local timeout=10

    echo -ne "${YELLOW}Testing${NC} $mirror_url ... "

    # Test connection speed
    if command -v curl >/dev/null 2>&1; then
        local start_time=$(date +%s%3N)
        local response=$(curl -s -o /dev/null -w "%{http_code}" --max-time $timeout "$mirror_url" 2>/dev/null)
        local end_time=$(date +%s%3N)
        local duration=$((end_time - start_time))
    elif command -v wget >/dev/null 2>&1; then
        local start_time=$(date +%s%3N)
        local response=$(wget -q --spider --timeout=$timeout "$mirror_url" 2>&1 && echo "200" || echo "0")
        local end_time=$(date +%s%3N)
        local duration=$((end_time - start_time))
    else
        echo -e "${RED}&#91;FAIL]${NC} No download tool available"
        return 999
    fi

    if &#91; "$response" = "200" ] || &#91; "$response" = "301" ] || &#91; "$response" = "302" ]; then
        echo -e "${GREEN}&#91;OK]${NC} (${duration}ms)"
        return $duration
    else
        echo -e "${RED}&#91;FAIL]${NC}"
        return 999
    fi
}

# Function to find fastest mirror
find_fastest_mirror() {
    echo -e "${GREEN}&#91;INFO]${NC} Finding the fastest mirror..."
    echo ""

    local fastest_time=999999
    local fastest_mirror=""
    local mirror_name=""

    # Test Chinese mirrors if in China or Chinese language
    if &#91;&#91; "$COUNTRY" == "CN" ]] || &#91;&#91; "$LANGUAGE" == "zh" ]]; then
        echo -e "${PURPLE}&#91;Testing Chinese Mirrors]${NC}"
        for name in "${!MIRRORS_CN&#91;@]}"; do
            local url="${MIRRORS_CN&#91;$name]}"
            test_mirror_speed "$url"
            local result=$?
            if &#91; $result -lt $fastest_time ] && &#91; $result -ne 999 ]; then
                fastest_time=$result
                fastest_mirror=$url
                mirror_name=$name
            fi
        done
        echo ""
    fi

    # Test global mirrors
    echo -e "${PURPLE}&#91;Testing Global Mirrors]${NC}"
    for name in "${!MIRRORS_GLOBAL&#91;@]}"; do
        local url="${MIRRORS_GLOBAL&#91;$name]}"
        test_mirror_speed "$url"
        local result=$?
        if &#91; $result -lt $fastest_time ] && &#91; $result -ne 999 ]; then
            fastest_time=$result
            fastest_mirror=$url
            mirror_name=$name
        fi
    done

    if &#91; -n "$fastest_mirror" ]; then
        SELECTED_MIRROR=$fastest_mirror
        echo ""
        echo -e "${GREEN}&#91;SUCCESS]${NC} Fastest mirror selected: ${BLUE}$mirror_name${NC} (${fastest_time}ms)"
        echo -e "${BLUE}URL:${NC} $SELECTED_MIRROR"
    else
        echo -e "${RED}&#91;ERROR]${NC} No working mirror found. Using official mirror."
        SELECTED_MIRROR="${MIRRORS_GLOBAL&#91;official]}"
    fi
    echo ""
}

# Function to select Kali version
select_kali_version() {
    echo -e "${GREEN}&#91;INFO]${NC} Select Kali Linux version:"
    echo ""

    local counter=1
    local versions_list=()

    for version in "${!KALI_VERSIONS&#91;@]}"; do
        echo -e "${counter}. ${YELLOW}${version}${NC} - ${KALI_VERSIONS&#91;$version]}"
        versions_list&#91;$counter]=$version
        ((counter++))
    done

    echo ""
    echo -e "${BLUE}Enter your choice (1-${#versions_list&#91;@]}):${NC} "
    read -r choice

    if &#91;&#91; $choice -ge 1 && $choice -le ${#versions_list&#91;@]} ]]; then
        SELECTED_VERSION="${versions_list&#91;$choice]}"
        echo -e "${GREEN}&#91;SELECTED]${NC} Version: $SELECTED_VERSION"
    else
        echo -e "${YELLOW}&#91;DEFAULT]${NC} Using default version"
        SELECTED_VERSION="default"
    fi

    echo ""
}

# Function to select download directory
select_download_directory() {
    echo -e "${GREEN}&#91;INFO]${NC} Select download directory:"
    echo ""

    # Default download locations
    local default_dir=""
    if &#91;&#91; "$OS_TYPE" == "Windows" ]]; then
        default_dir="$HOME/Downloads"
    elif &#91;&#91; "$OS_TYPE" == "macOS" ]]; then
        default_dir="$HOME/Downloads"
    else
        default_dir="$HOME/Downloads"
    fi

    echo -e "Default directory: ${BLUE}$default_dir${NC}"
    echo -e "Press Enter to use default, or enter custom path:"
    read -r custom_dir

    if &#91; -n "$custom_dir" ]; then
        DOWNLOAD_DIR="$custom_dir"
    else
        DOWNLOAD_DIR="$default_dir"
    fi

    # Create directory if it doesn't exist
    mkdir -p "$DOWNLOAD_DIR"

    echo -e "${GREEN}&#91;SELECTED]${NC} Download directory: $DOWNLOAD_DIR"
    echo ""
}

# Function to construct download URL
construct_download_url() {
    local iso_file="${KALI_VERSIONS&#91;$SELECTED_VERSION]}"

    # Handle different mirror URL structures
    if &#91;&#91; "$SELECTED_MIRROR" == *"kali.org"* ]] || &#91;&#91; "$SELECTED_MIRROR" == *"kali.download"* ]]; then
        DOWNLOAD_URL="${SELECTED_MIRROR}/current/$iso_file"
    else
        DOWNLOAD_URL="$SELECTED_MIRROR/$iso_file"
    fi

    echo -e "${GREEN}&#91;INFO]${NC} Download URL constructed:"
    echo -e "${BLUE}$DOWNLOAD_URL${NC}"
    echo ""
}

# Function to download file
download_file() {
    local filename=$(basename "$DOWNLOAD_URL")
    local filepath="$DOWNLOAD_DIR/$filename"

    echo -e "${GREEN}&#91;INFO]${NC} Starting download..."
    echo -e "${BLUE}File:${NC} $filename"
    echo -e "${BLUE}Destination:${NC} $filepath"
    echo ""

    # Choose download method
    if command -v aria2c >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} aria2c (multi-threaded)"
        aria2c -x 16 -s 16 -d "$DOWNLOAD_DIR" "$DOWNLOAD_URL"
    elif command -v axel >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} axel (multi-threaded)"
        axel -n 8 -o "$filepath" "$DOWNLOAD_URL"
    elif command -v wget >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} wget"
        cd "$DOWNLOAD_DIR" && wget "$DOWNLOAD_URL"
    elif command -v curl >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} curl"
        cd "$DOWNLOAD_DIR" && curl -O "$DOWNLOAD_URL"
    else
        echo -e "${RED}&#91;ERROR]${NC} No download tool available!"
        exit 1
    fi

    local download_status=$?

    if &#91; $download_status -eq 0 ]; then
        echo ""
        echo -e "${GREEN}&#91;SUCCESS]${NC} Download completed!"
        echo -e "${BLUE}Location:${NC} $filepath"
        echo -e "${BLUE}Size:${NC} $(ls -lh "$filepath" | awk '{print $5}')"

        # Verify file integrity
        verify_file "$filepath"
    else
        echo -e "${RED}&#91;ERROR]${NC} Download failed!"
        exit 1
    fi
}

# Function to verify file integrity
verify_file() {
    local filepath=$1
    local filename=$(basename "$filepath")

    echo ""
    echo -e "${GREEN}&#91;INFO]${NC} Verifying file integrity..."

    if command -v sha256sum >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;CHECKING]${NC} SHA256 checksum..."
        sha256sum "$filepath"
    elif command -v shasum >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;CHECKING]${NC} SHA256 checksum..."
        shasum -a 256 "$filepath"
    fi

    echo ""
    echo -e "${GREEN}&#91;TIPS]${NC} Compare checksum with official values at:"
    echo -e "${BLUE}https://www.kali.org/get-kali/#kali-installer-images${NC}"
}

# Function to show download summary
show_summary() {
    echo ""
    echo -e "${CYAN}========== DOWNLOAD SUMMARY ==========${NC}"
    echo -e "${BLUE}Language:${NC} $LANGUAGE"
    echo -e "${BLUE}Country:${NC} $COUNTRY"
    echo -e "${BLUE}OS Type:${NC} $OS_TYPE"
    echo -e "${BLUE}Mirror:${NC} $SELECTED_MIRROR"
    echo -e "${BLUE}Version:${NC} $SELECTED_VERSION"
    echo -e "${BLUE}Download Directory:${NC} $DOWNLOAD_DIR"
    echo -e "${CYAN}=====================================${NC}"
    echo ""
}

# Function to show help
show_help() {
    echo "Kali Linux Smart Download Script"
    echo ""
    echo "Usage: $0 &#91;OPTIONS]"
    echo ""
    echo "Options:"
    echo "  -h, --help     Show this help message"
    echo "  -v, --version  Show script version"
    echo "  --auto         Run in automatic mode"
    echo ""
    echo "Features:"
    echo "  • Automatically detects your location and language"
    echo "  • Tests multiple mirrors to find the fastest one"
    echo "  • Supports multiple Kali Linux versions"
    echo "  • Uses the best available download tool"
    echo "  • Verifies file integrity after download"
}

# Function to check dependencies
check_dependencies() {
    echo -e "${GREEN}&#91;INFO]${NC} Checking dependencies..."

    local tools=("curl" "wget")
    local available_tools=()

    for tool in "${tools&#91;@]}"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo -e "  ✓ ${GREEN}$tool${NC} - Available"
            available_tools+=("$tool")
        else
            echo -e "  ✗ ${RED}$tool${NC} - Not found"
        fi
    done

    # Check download tools
    local download_tools=("aria2c" "axel" "wget" "curl")
    local download_available=0

    for tool in "${download_tools&#91;@]}"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo -e "  ✓ ${GREEN}$tool${NC} - Available (download tool)"
            download_available=1
        fi
    done

    if &#91; $download_available -eq 0 ]; then
        echo -e "${RED}&#91;ERROR]${NC} No download tool available!"
        echo -e "Please install one of: aria2c, axel, wget, curl"
        exit 1
    fi

    echo ""
}

# Main function
main() {
    # Parse command line arguments
    case "$1" in
        -h|--help)
            show_help
            exit 0
            ;;
        -v|--version)
            echo "Kali Linux Smart Download Script v$SCRIPT_VERSION"
            exit 0
            ;;
        --auto)
            AUTO_MODE=1
            ;;
        *)
            AUTO_MODE=0
            ;;
    esac

    # Print banner
    print_banner

    # Check dependencies
    check_dependencies

    # Detect system information
    detect_system

    # Find fastest mirror
    find_fastest_mirror

    # Select Kali version
    if &#91; $AUTO_MODE -eq 0 ]; then
        select_kali_version
    else
        SELECTED_VERSION="default"
        echo -e "${YELLOW}&#91;AUTO]${NC} Using default version"
    fi

    # Select download directory
    if &#91; $AUTO_MODE -eq 0 ]; then
        select_download_directory
    else
        DOWNLOAD_DIR="$HOME/Downloads"
        mkdir -p "$DOWNLOAD_DIR"
        echo -e "${YELLOW}&#91;AUTO]${NC} Using default directory: $DOWNLOAD_DIR"
    fi

    # Construct download URL
    construct_download_url

    # Show summary
    show_summary

    # Confirm download
    if &#91; $AUTO_MODE -eq 0 ]; then
        echo -e "${BLUE}Start download? (y/N):${NC} "
        read -r confirm
        if &#91;&#91; ! "$confirm" =~ ^&#91;Yy]$ ]]; then
            echo -e "${YELLOW}&#91;CANCELLED]${NC} Download cancelled by user"
            exit 0
        fi
    fi

    # Download file
    download_file

    # Final message
    echo ""
    echo -e "${GREEN}&#91;COMPLETED]${NC} Kali Linux download finished successfully!"
    echo -e "${BLUE}Next steps:${NC}"
    echo -e "  1. Verify the checksum matches official values"
    echo -e "  2. Write the ISO to a USB drive using Rufus, Etcher, or dd"
    echo -e "  3. Boot from the USB and install Kali Linux"
    echo ""
}

# Run main function
main "$@"

Usage Instructions

1. Save the script

1
2
3

# Save as kali-downloader.sh
nano kali-downloader.sh
# Paste the script content and save

2. Make it executable

1	chmod +x kali-downloader.sh

3. Run the script

# Interactive mode
./kali-downloader.sh

# Automatic mode (uses defaults)
./kali-downloader.sh --auto

# Show help
./kali-downloader.sh --help

Features

🌍 Smart Detection

Automatically detects system language and country
Identifies OS type (Linux, macOS, Windows)
Selects appropriate mirrors based on location

⚡ Speed Testing

Tests multiple mirrors simultaneously
Measures response time for each mirror
Automatically selects the fastest working mirror

📦 Version Selection

Multiple Kali Linux versions available:
Default (standard desktop)
Large (extended tools)
Everything (full package set)
Light (minimal installation)
Minimal (smallest footprint)

🛠️ Intelligent Download

Uses the best available download tool:
aria2c (multi-threaded, fastest)
axel (multi-threaded)
wget (standard)
curl (fallback)
Supports resume interrupted downloads
Shows real-time progress

🔍 Verification

Automatically verifies file integrity
Displays SHA256 checksum
Provides official verification links

🌈 User Experience

Colorful, informative output
Interactive menu system
Progress indicators
Error handling and recovery

Requirements

Dependencies (at least one required):

aria2c - Multi-threaded download (recommended)
axel - Multi-threaded download
wget - Standard download tool
curl - HTTP client

Installation of dependencies:

# Ubuntu/Debian
sudo apt update
sudo apt install aria2 wget curl

# CentOS/RHEL/Fedora
sudo yum install aria2 wget curl
# or
sudo dnf install aria2 wget curl

# macOS
brew install aria2 wget curl

# Windows (WSL)
sudo apt install aria2 wget curl

Advanced Usage

Command Line Options:

1
2
3

./kali-downloader.sh --help     # Show help
./kali-downloader.sh --version  # Show version
./kali-downloader.sh --auto     # Automatic mode

Environment Variables:

1 2	export KALI_VERSION="everything" # Set default version export DOWNLOAD_DIR="/custom/path" # Set custom download directory

This script provides an intelligent, automated way to download Kali Linux from the fastest available mirror based on your location and system configuration.

2025-08-23

Linux命令

Kali Linux Download 官方站点及镜像站点整理

Kali Linux 官方下载网站及镜像站点整理，附一键下载脚本；

关键词：Kali Linux download, Kali Linux 官方下载, Kali Linux 镜像站点, Kali Linux 官方网站, Kali Linux 下载地址, Kali Linux 官方源地址, Kali Linux 下载教程, Kali Linux 官方镜像, Kali Linux 官网下载, Kali Linux 官方站点地址

For English：Kali Linux Download Websites and Mirror Sites Collection

For Chinese：Kali Linux Download 官方站点及镜像站点整理

🌐 官方下载网站

官方主站

官方网站: https://www.kali.org/
官方下载页面: https://www.kali.org/get-kali/
官方文档: https://www.kali.org/docs/

官方直接下载链接

最新版 ISO: https://cdimage.kali.org/kali-current/
每周构建版: https://cdimage.kali.org/kali-weekly/
历史版本: https://cdimage.kali.org/

🌍 国内镜像站点

教育网镜像

清华大学镜像站

地址: https://mirrors.tuna.tsinghua.edu.cn/kali/
特点: 速度快，稳定

中科大镜像站

地址: https://mirrors.ustc.edu.cn/kali/
特点: 同步及时

上海交通大学镜像站

地址: https://mirror.sjtu.edu.cn/kali/
特点: 华东地区访问快

华中科技大学镜像站

地址: https://mirrors.hust.edu.cn/kali/
特点: 华中地区优化

商业镜像

阿里云镜像站

地址: https://mirrors.aliyun.com/kali/
特点: 全国 CDN 加速

华为云镜像站

地址: https://mirrors.huaweicloud.com/kali/
特点: 企业级稳定性

腾讯云镜像站

地址: https://mirrors.cloud.tencent.com/kali/
特点: 南方地区访问优化

🌎 国际镜像站点

亚洲地区

日本镜像

地址: https://kali.download/kali/
特点: 官方推荐镜像

韩国镜像

地址: https://ftp.kaist.ac.kr/kali/
特点: 亚洲地区备选

欧洲地区

德国镜像

地址: https://ftp.halifax.rwth-aachen.de/kali/
特点: 欧洲地区优选

法国镜像

地址: https://kali.mirror.garr.it/kali/
特点: 欧洲高速镜像

英国镜像

地址: https://www.mirrorservice.org/sites/ftp.kali.org/kali/
特点: 英国教育网镜像

北美地区

美国镜像

地址: https://kali.mirror.globo.tech/kali/
特点: 美国西海岸镜像

📦 不同版本下载链接

桌面版

标准版: kali-linux-default-amd64.iso
大型版: kali-linux-large-amd64.iso
全功能版: kali-linux-everything-amd64.iso

轻量版

轻量版: kali-linux-light-amd64.iso
最小版: kali-linux-minimal-amd64.iso

特殊用途版

NetHunter: 专为移动设备设计
ARM版: 适配ARM架构设备
Live版: 可直接启动的Live系统

⚡ 下载建议

选择镜像的建议

国内用户: 优先选择清华、中科大、阿里云等镜像

教育网用户: 优先选择教育网镜像站点

海外用户: 选择地理位置较近的官方镜像

下载方式

# 使用 wget 下载（推荐）
wget https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

# 使用 aria2 多线程下载
aria2c -x 16 -s 16 https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

# 使用 curl 下载
curl -O https://cdimage.kali.org/kali-current/kali-linux-default-amd64.iso

校验文件完整性

# 校验 SHA256 哈希值
sha256sum kali-linux-default-amd64.iso

# 校验 GPG 签名
gpg --verify kali-linux-default-amd64.iso.sig kali-linux-default-amd64.iso

🔧 相关工具

镜像写入工具

Rufus (Windows)

Etcher (跨平台)

UNetbootin (跨平台)

Ventoy (多镜像管理)

虚拟机镜像

VMware: 提供预配置的虚拟机镜像
VirtualBox: OVA 格式虚拟机
Hyper-V: VHDX 格式镜像

⚠️ 注意事项

安全提醒: 请从官方或可信镜像下载

校验文件: 下载后务必校验文件完整性

合法使用: 仅用于合法的安全测试目的

系统要求: 确认硬件满足最低配置要求

🔄 更新源配置

使用国内镜像源可以加快软件包更新速度：

# 编辑源列表
sudo nano /etc/apt/sources.list

# 添加国内镜像源（以清华为例）
deb https://mirrors.tuna.tsinghua.edu.cn/kali kali-rolling main non-free contrib
deb-src https://mirrors.tuna.tsinghua.edu.cn/kali kali-rolling main non-free contrib

这些镜像站点可以为您提供快速稳定的 Kali Linux 下载体验。建议根据您的地理位置和网络环境选择最适合的镜像站点。

一键下载脚本：

Kali Linux Smart Download Script

#!/bin/bash

# Kali Linux Smart Download Script
# Automatically detects language, country, and OS to select the fastest mirror

# Colors for output
RED='\033&#91;0;31m'
GREEN='\033&#91;0;32m'
YELLOW='\033&#91;1;33m'
BLUE='\033&#91;0;34m'
PURPLE='\033&#91;0;35m'
CYAN='\033&#91;0;36m'
NC='\033&#91;0m' # No Color

# Global variables
SCRIPT_VERSION="1.0"
SELECTED_MIRROR=""
SELECTED_VERSION=""
DOWNLOAD_URL=""
DOWNLOAD_DIR=""
LANGUAGE=""
COUNTRY=""
OS_TYPE=""

# Mirror lists by region
declare -A MIRRORS_CN=(
    &#91;"tsinghua"]="https://mirrors.tuna.tsinghua.edu.cn/kali"
    &#91;"ustc"]="https://mirrors.ustc.edu.cn/kali"
    &#91;"aliyun"]="https://mirrors.aliyun.com/kali"
    &#91;"huaweicloud"]="https://mirrors.huaweicloud.com/kali"
    &#91;"sjtu"]="https://mirror.sjtu.edu.cn/kali"
    &#91;"hust"]="https://mirrors.hust.edu.cn/kali"
    &#91;"tencent"]="https://mirrors.cloud.tencent.com/kali"
)

declare -A MIRRORS_GLOBAL=(
    &#91;"official"]="https://http.kali.org/kali"
    &#91;"jp"]="https://kali.download/kali"
    &#91;"de"]="https://ftp.halifax.rwth-aachen.de/kali"
    &#91;"fr"]="https://kali.mirror.garr.it/kali"
    &#91;"uk"]="https://www.mirrorservice.org/sites/ftp.kali.org/kali"
    &#91;"us"]="https://kali.mirror.globo.tech/kali"
)

# Kali versions
declare -A KALI_VERSIONS=(
    &#91;"default"]="kali-linux-default-amd64.iso"
    &#91;"large"]="kali-linux-large-amd64.iso"
    &#91;"everything"]="kali-linux-everything-amd64.iso"
    &#91;"light"]="kali-linux-light-amd64.iso"
    &#91;"minimal"]="kali-linux-minimal-amd64.iso"
)

# Function to print banner
print_banner() {
    echo -e "${CYAN}"
    echo "  _  __       _ _    _     _     _ _     "
    echo " | |/ /      | | |  | |   (_)   | (_)    "
    echo " | ' / __ _  | | |  | |___ _ ___| |_ ___ "
    echo " |  < / _\` | | | |  | / __| / __| | / __|"
    echo " | . \ (_| | | | |__| \__ \ \__ \ | \__ \\"
    echo " |_|\_\__,_|_|_|\____/|___/_|___/_|_|___/"
    echo -e "${NC}"
    echo -e "${YELLOW}Kali Linux Smart Download Script v${SCRIPT_VERSION}${NC}"
    echo -e "${BLUE}Automatically selecting the fastest mirror for your location${NC}"
    echo ""
}

# Function to detect system information
detect_system() {
    echo -e "${GREEN}&#91;INFO]${NC} Detecting system information..."

    # Detect language
    if &#91; -n "$LANG" ]; then
        LANGUAGE=$(echo "$LANG" | cut -d'_' -f1 | tr '&#91;:upper:]' '&#91;:lower:]')
    else
        LANGUAGE="en"
    fi

    # Detect country
    if command -v curl >/dev/null 2>&1; then
        COUNTRY=$(curl -s https://ipapi.co/country_code/ 2>/dev/null || echo "US")
    elif command -v wget >/dev/null 2>&1; then
        COUNTRY=$(wget -qO- https://ipapi.co/country_code/ 2>/dev/null || echo "US")
    else
        COUNTRY="US"
    fi

    # Detect OS type
    if &#91;&#91; "$OSTYPE" == "linux-gnu"* ]]; then
        OS_TYPE="Linux"
    elif &#91;&#91; "$OSTYPE" == "darwin"* ]]; then
        OS_TYPE="macOS"
    elif &#91;&#91; "$OSTYPE" == "cygwin" ]] || &#91;&#91; "$OSTYPE" == "msys" ]] || &#91;&#91; "$OSTYPE" == "win32" ]]; then
        OS_TYPE="Windows"
    else
        OS_TYPE="Unknown"
    fi

    echo -e "${BLUE}Language:${NC} $LANGUAGE"
    echo -e "${BLUE}Country:${NC} $COUNTRY"
    echo -e "${BLUE}OS Type:${NC} $OS_TYPE"
    echo ""
}

# Function to test mirror speed
test_mirror_speed() {
    local mirror_url=$1
    local timeout=10

    echo -ne "${YELLOW}Testing${NC} $mirror_url ... "

    # Test connection speed
    if command -v curl >/dev/null 2>&1; then
        local start_time=$(date +%s%3N)
        local response=$(curl -s -o /dev/null -w "%{http_code}" --max-time $timeout "$mirror_url" 2>/dev/null)
        local end_time=$(date +%s%3N)
        local duration=$((end_time - start_time))
    elif command -v wget >/dev/null 2>&1; then
        local start_time=$(date +%s%3N)
        local response=$(wget -q --spider --timeout=$timeout "$mirror_url" 2>&1 && echo "200" || echo "0")
        local end_time=$(date +%s%3N)
        local duration=$((end_time - start_time))
    else
        echo -e "${RED}&#91;FAIL]${NC} No download tool available"
        return 999
    fi

    if &#91; "$response" = "200" ] || &#91; "$response" = "301" ] || &#91; "$response" = "302" ]; then
        echo -e "${GREEN}&#91;OK]${NC} (${duration}ms)"
        return $duration
    else
        echo -e "${RED}&#91;FAIL]${NC}"
        return 999
    fi
}

# Function to find fastest mirror
find_fastest_mirror() {
    echo -e "${GREEN}&#91;INFO]${NC} Finding the fastest mirror..."
    echo ""

    local fastest_time=999999
    local fastest_mirror=""
    local mirror_name=""

    # Test Chinese mirrors if in China or Chinese language
    if &#91;&#91; "$COUNTRY" == "CN" ]] || &#91;&#91; "$LANGUAGE" == "zh" ]]; then
        echo -e "${PURPLE}&#91;Testing Chinese Mirrors]${NC}"
        for name in "${!MIRRORS_CN&#91;@]}"; do
            local url="${MIRRORS_CN&#91;$name]}"
            test_mirror_speed "$url"
            local result=$?
            if &#91; $result -lt $fastest_time ] && &#91; $result -ne 999 ]; then
                fastest_time=$result
                fastest_mirror=$url
                mirror_name=$name
            fi
        done
        echo ""
    fi

    # Test global mirrors
    echo -e "${PURPLE}&#91;Testing Global Mirrors]${NC}"
    for name in "${!MIRRORS_GLOBAL&#91;@]}"; do
        local url="${MIRRORS_GLOBAL&#91;$name]}"
        test_mirror_speed "$url"
        local result=$?
        if &#91; $result -lt $fastest_time ] && &#91; $result -ne 999 ]; then
            fastest_time=$result
            fastest_mirror=$url
            mirror_name=$name
        fi
    done

    if &#91; -n "$fastest_mirror" ]; then
        SELECTED_MIRROR=$fastest_mirror
        echo ""
        echo -e "${GREEN}&#91;SUCCESS]${NC} Fastest mirror selected: ${BLUE}$mirror_name${NC} (${fastest_time}ms)"
        echo -e "${BLUE}URL:${NC} $SELECTED_MIRROR"
    else
        echo -e "${RED}&#91;ERROR]${NC} No working mirror found. Using official mirror."
        SELECTED_MIRROR="${MIRRORS_GLOBAL&#91;official]}"
    fi
    echo ""
}

# Function to select Kali version
select_kali_version() {
    echo -e "${GREEN}&#91;INFO]${NC} Select Kali Linux version:"
    echo ""

    local counter=1
    local versions_list=()

    for version in "${!KALI_VERSIONS&#91;@]}"; do
        echo -e "${counter}. ${YELLOW}${version}${NC} - ${KALI_VERSIONS&#91;$version]}"
        versions_list&#91;$counter]=$version
        ((counter++))
    done

    echo ""
    echo -e "${BLUE}Enter your choice (1-${#versions_list&#91;@]}):${NC} "
    read -r choice

    if &#91;&#91; $choice -ge 1 && $choice -le ${#versions_list&#91;@]} ]]; then
        SELECTED_VERSION="${versions_list&#91;$choice]}"
        echo -e "${GREEN}&#91;SELECTED]${NC} Version: $SELECTED_VERSION"
    else
        echo -e "${YELLOW}&#91;DEFAULT]${NC} Using default version"
        SELECTED_VERSION="default"
    fi

    echo ""
}

# Function to select download directory
select_download_directory() {
    echo -e "${GREEN}&#91;INFO]${NC} Select download directory:"
    echo ""

    # Default download locations
    local default_dir=""
    if &#91;&#91; "$OS_TYPE" == "Windows" ]]; then
        default_dir="$HOME/Downloads"
    elif &#91;&#91; "$OS_TYPE" == "macOS" ]]; then
        default_dir="$HOME/Downloads"
    else
        default_dir="$HOME/Downloads"
    fi

    echo -e "Default directory: ${BLUE}$default_dir${NC}"
    echo -e "Press Enter to use default, or enter custom path:"
    read -r custom_dir

    if &#91; -n "$custom_dir" ]; then
        DOWNLOAD_DIR="$custom_dir"
    else
        DOWNLOAD_DIR="$default_dir"
    fi

    # Create directory if it doesn't exist
    mkdir -p "$DOWNLOAD_DIR"

    echo -e "${GREEN}&#91;SELECTED]${NC} Download directory: $DOWNLOAD_DIR"
    echo ""
}

# Function to construct download URL
construct_download_url() {
    local iso_file="${KALI_VERSIONS&#91;$SELECTED_VERSION]}"

    # Handle different mirror URL structures
    if &#91;&#91; "$SELECTED_MIRROR" == *"kali.org"* ]] || &#91;&#91; "$SELECTED_MIRROR" == *"kali.download"* ]]; then
        DOWNLOAD_URL="${SELECTED_MIRROR}/current/$iso_file"
    else
        DOWNLOAD_URL="$SELECTED_MIRROR/$iso_file"
    fi

    echo -e "${GREEN}&#91;INFO]${NC} Download URL constructed:"
    echo -e "${BLUE}$DOWNLOAD_URL${NC}"
    echo ""
}

# Function to download file
download_file() {
    local filename=$(basename "$DOWNLOAD_URL")
    local filepath="$DOWNLOAD_DIR/$filename"

    echo -e "${GREEN}&#91;INFO]${NC} Starting download..."
    echo -e "${BLUE}File:${NC} $filename"
    echo -e "${BLUE}Destination:${NC} $filepath"
    echo ""

    # Choose download method
    if command -v aria2c >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} aria2c (multi-threaded)"
        aria2c -x 16 -s 16 -d "$DOWNLOAD_DIR" "$DOWNLOAD_URL"
    elif command -v axel >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} axel (multi-threaded)"
        axel -n 8 -o "$filepath" "$DOWNLOAD_URL"
    elif command -v wget >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} wget"
        cd "$DOWNLOAD_DIR" && wget "$DOWNLOAD_URL"
    elif command -v curl >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;USING]${NC} curl"
        cd "$DOWNLOAD_DIR" && curl -O "$DOWNLOAD_URL"
    else
        echo -e "${RED}&#91;ERROR]${NC} No download tool available!"
        exit 1
    fi

    local download_status=$?

    if &#91; $download_status -eq 0 ]; then
        echo ""
        echo -e "${GREEN}&#91;SUCCESS]${NC} Download completed!"
        echo -e "${BLUE}Location:${NC} $filepath"
        echo -e "${BLUE}Size:${NC} $(ls -lh "$filepath" | awk '{print $5}')"

        # Verify file integrity
        verify_file "$filepath"
    else
        echo -e "${RED}&#91;ERROR]${NC} Download failed!"
        exit 1
    fi
}

# Function to verify file integrity
verify_file() {
    local filepath=$1
    local filename=$(basename "$filepath")

    echo ""
    echo -e "${GREEN}&#91;INFO]${NC} Verifying file integrity..."

    if command -v sha256sum >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;CHECKING]${NC} SHA256 checksum..."
        sha256sum "$filepath"
    elif command -v shasum >/dev/null 2>&1; then
        echo -e "${YELLOW}&#91;CHECKING]${NC} SHA256 checksum..."
        shasum -a 256 "$filepath"
    fi

    echo ""
    echo -e "${GREEN}&#91;TIPS]${NC} Compare checksum with official values at:"
    echo -e "${BLUE}https://www.kali.org/get-kali/#kali-installer-images${NC}"
}

# Function to show download summary
show_summary() {
    echo ""
    echo -e "${CYAN}========== DOWNLOAD SUMMARY ==========${NC}"
    echo -e "${BLUE}Language:${NC} $LANGUAGE"
    echo -e "${BLUE}Country:${NC} $COUNTRY"
    echo -e "${BLUE}OS Type:${NC} $OS_TYPE"
    echo -e "${BLUE}Mirror:${NC} $SELECTED_MIRROR"
    echo -e "${BLUE}Version:${NC} $SELECTED_VERSION"
    echo -e "${BLUE}Download Directory:${NC} $DOWNLOAD_DIR"
    echo -e "${CYAN}=====================================${NC}"
    echo ""
}

# Function to show help
show_help() {
    echo "Kali Linux Smart Download Script"
    echo ""
    echo "Usage: $0 &#91;OPTIONS]"
    echo ""
    echo "Options:"
    echo "  -h, --help     Show this help message"
    echo "  -v, --version  Show script version"
    echo "  --auto         Run in automatic mode"
    echo ""
    echo "Features:"
    echo "  • Automatically detects your location and language"
    echo "  • Tests multiple mirrors to find the fastest one"
    echo "  • Supports multiple Kali Linux versions"
    echo "  • Uses the best available download tool"
    echo "  • Verifies file integrity after download"
}

# Function to check dependencies
check_dependencies() {
    echo -e "${GREEN}&#91;INFO]${NC} Checking dependencies..."

    local tools=("curl" "wget")
    local available_tools=()

    for tool in "${tools&#91;@]}"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo -e "  ✓ ${GREEN}$tool${NC} - Available"
            available_tools+=("$tool")
        else
            echo -e "  ✗ ${RED}$tool${NC} - Not found"
        fi
    done

    # Check download tools
    local download_tools=("aria2c" "axel" "wget" "curl")
    local download_available=0

    for tool in "${download_tools&#91;@]}"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo -e "  ✓ ${GREEN}$tool${NC} - Available (download tool)"
            download_available=1
        fi
    done

    if &#91; $download_available -eq 0 ]; then
        echo -e "${RED}&#91;ERROR]${NC} No download tool available!"
        echo -e "Please install one of: aria2c, axel, wget, curl"
        exit 1
    fi

    echo ""
}

# Main function
main() {
    # Parse command line arguments
    case "$1" in
        -h|--help)
            show_help
            exit 0
            ;;
        -v|--version)
            echo "Kali Linux Smart Download Script v$SCRIPT_VERSION"
            exit 0
            ;;
        --auto)
            AUTO_MODE=1
            ;;
        *)
            AUTO_MODE=0
            ;;
    esac

    # Print banner
    print_banner

    # Check dependencies
    check_dependencies

    # Detect system information
    detect_system

    # Find fastest mirror
    find_fastest_mirror

    # Select Kali version
    if &#91; $AUTO_MODE -eq 0 ]; then
        select_kali_version
    else
        SELECTED_VERSION="default"
        echo -e "${YELLOW}&#91;AUTO]${NC} Using default version"
    fi

    # Select download directory
    if &#91; $AUTO_MODE -eq 0 ]; then
        select_download_directory
    else
        DOWNLOAD_DIR="$HOME/Downloads"
        mkdir -p "$DOWNLOAD_DIR"
        echo -e "${YELLOW}&#91;AUTO]${NC} Using default directory: $DOWNLOAD_DIR"
    fi

    # Construct download URL
    construct_download_url

    # Show summary
    show_summary

    # Confirm download
    if &#91; $AUTO_MODE -eq 0 ]; then
        echo -e "${BLUE}Start download? (y/N):${NC} "
        read -r confirm
        if &#91;&#91; ! "$confirm" =~ ^&#91;Yy]$ ]]; then
            echo -e "${YELLOW}&#91;CANCELLED]${NC} Download cancelled by user"
            exit 0
        fi
    fi

    # Download file
    download_file

    # Final message
    echo ""
    echo -e "${GREEN}&#91;COMPLETED]${NC} Kali Linux download finished successfully!"
    echo -e "${BLUE}Next steps:${NC}"
    echo -e "  1. Verify the checksum matches official values"
    echo -e "  2. Write the ISO to a USB drive using Rufus, Etcher, or dd"
    echo -e "  3. Boot from the USB and install Kali Linux"
    echo ""
}

# Run main function
main "$@"

Usage Instructions

1. Save the script

1
2
3

# Save as kali-downloader.sh
nano kali-downloader.sh
# Paste the script content and save

2. Make it executable

1	chmod +x kali-downloader.sh

3. Run the script

# Interactive mode
./kali-downloader.sh

# Automatic mode (uses defaults)
./kali-downloader.sh --auto

# Show help
./kali-downloader.sh --help

Features

🌍 Smart Detection

Automatically detects system language and country
Identifies OS type (Linux, macOS, Windows)
Selects appropriate mirrors based on location

⚡ Speed Testing

Tests multiple mirrors simultaneously
Measures response time for each mirror
Automatically selects the fastest working mirror

📦 Version Selection

Multiple Kali Linux versions available:
Default (standard desktop)
Large (extended tools)
Everything (full package set)
Light (minimal installation)
Minimal (smallest footprint)

🛠️ Intelligent Download

Uses the best available download tool:
aria2c (multi-threaded, fastest)
axel (multi-threaded)
wget (standard)
curl (fallback)
Supports resume interrupted downloads
Shows real-time progress

🔍 Verification

Automatically verifies file integrity
Displays SHA256 checksum
Provides official verification links

🌈 User Experience

Colorful, informative output
Interactive menu system
Progress indicators
Error handling and recovery

Requirements

Dependencies (at least one required):

aria2c - Multi-threaded download (recommended)
axel - Multi-threaded download
wget - Standard download tool
curl - HTTP client

Installation of dependencies:

# Ubuntu/Debian
sudo apt update
sudo apt install aria2 wget curl

# CentOS/RHEL/Fedora
sudo yum install aria2 wget curl
# or
sudo dnf install aria2 wget curl

# macOS
brew install aria2 wget curl

# Windows (WSL)
sudo apt install aria2 wget curl

Advanced Usage

Command Line Options:

1
2
3

./kali-downloader.sh --help     # Show help
./kali-downloader.sh --version  # Show version
./kali-downloader.sh --auto     # Automatic mode

Environment Variables:

1 2	export KALI_VERSION="everything" # Set default version export DOWNLOAD_DIR="/custom/path" # Set custom download directory

This script provides an intelligent, automated way to download Kali Linux from the fastest available mirror based on your location and system configuration.

2025-08-23

Linux系统编程

linux命名空间系统调用及示例

我们来通过一个完整、易懂的示例来演示 Linux 命名空间相关的四个核心系统调用：clone, unshare, setns 和 ioctl_ns (通过 ioctl)。

这个示例将模拟一个简单的容器环境创建和管理过程，包含以下步骤：

使用 clone 创建一个带有隔离环境的新进程（容器的“init”进程）。

在新进程中使用 unshare 进一步隔离其网络命名空间。

父进程使用 setns 加入子进程的 Mount 命名空间。

父进程使用 ioctl 查询子进程命名空间的信息。

linux命名空间系统调用及示例

getgroups系统调用及示例

linux命名空间系统调用及示例-CSDN博客

为了简化，我们将重点放在 Mount (mnt) 和 Network (net) 命名空间上。

重要提示：

需要 root 权限：操作命名空间，特别是挂载文件系统，通常需要 root 权限。
环境要求：你的 Linux 内核需要支持这些命名空间（现代 Linux 发行版默认支持）。
安全：此示例涉及系统级操作，请在测试环境中运行。

完整示例代码

#define _GNU_SOURCE // 启用 GNU 扩展
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>    // 包含 clone 标志和 unshare
#include <sys/syscall.h> // 包含 syscall 和系统调用号
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h> // 包含 waitpid
#include <fcntl.h>    // 包含 open, O_RDONLY 等
#include <errno.h>
#include <string.h>
#include <sys/mount.h> // 包含 mount
#include <linux/nsfs.h> // 包含 ioctl_ns 的常量 (NS_GET_nstype 等)
#include <sys/ioctl.h>  // 包含 ioctl

// 定义子进程栈大小
#define STACK_SIZE (1024 * 1024) // 1MB

// 子进程1：由 clone 创建，拥有自己的 Mount 和 UTS 命名空间
// 然后它会调用 unshare 来获得独立的 Network 命名空间
int container_init(void *arg) {
    printf("\n--- Inside Container Init Process (PID: %d) ---\n", getpid());

    // 1. 更改容器内的主机名 (在 UTS 命名空间内)
    // 这不会影响宿主机的主机名
    if (sethostname("my-container", strlen("my-container")) == -1) {
        perror("sethostname (in container)");
    } else {
        printf("Container: Set hostname to 'my-container'.\n");
    }

    // 2. 创建一个挂载点并挂载 tmpfs (在 Mount 命名空间内)
    // 这个挂载在宿主机上不可见
    const char* mount_point = "/tmp/container_root";
    if (mkdir(mount_point, 0755) == -1 && errno != EEXIST) {
        perror("mkdir (in container)");
        return 1;
    }

    if (mount("tmpfs", mount_point, "tmpfs", 0, NULL) == -1) {
        perror("mount tmpfs (in container)");
        return 1;
    }
    printf("Container: Mounted tmpfs on %s\n", mount_point);

    // 3. 在挂载点内创建一个文件
    char file_path&#91;256];
    snprintf(file_path, sizeof(file_path), "%s/container_file.txt", mount_point);
    FILE *f = fopen(file_path, "w");
    if (f) {
        fprintf(f, "This file exists only inside the container's mount namespace.\n");
        fclose(f);
        printf("Container: Created file %s\n", file_path);
    } else {
        perror("fopen (in container)");
    }

    // 4. 使用 unshare 脱离当前的 Network 命名空间，进入一个新的空的 Network 命名空间
    // 这使得容器拥有完全隔离的网络视图
    printf("Container: Calling unshare(CLONE_NEWNET) to get isolated network...\n");
    if (unshare(CLONE_NEWNET) == -1) {
        perror("unshare CLONE_NEWNET");
        umount(mount_point); // 清理
        return 1;
    }
    printf("Container: Successfully unshared network namespace.\n");
    printf("Container: Network is now isolated. 'ip link' should show only loopback.\n");

    // 5. 容器主循环：等待信号或执行任务
    // 这里我们简单地睡眠，以便我们可以从外部观察
    printf("Container: Sleeping for 60 seconds. Explore from host and container.\n");
    printf("Container: From host terminal, run:\n");
    printf("  - 'ls /tmp/container_root' (should NOT see the file)\n");
    printf("  - 'sudo nsenter -t %d -n ip link' (should see only loopback)\n", getpid());
    printf("Container: From another terminal (as root), run this program's second part:\n");
    printf("  - './namespace_demo join %d'\n", getpid());

    sleep(60); // 睡眠 60 秒

    // 6. 清理 (退出时内核通常会自动清理命名空间和挂载)
    printf("Container: Cleaning up and exiting.\n");
    umount(mount_point);
    rmdir(mount_point);
    return 0;
}

// 辅助函数：打开并返回指定进程的指定类型命名空间的文件描述符
int open_namespace_fd(pid_t pid, const char* ns_type) {
    char path&#91;256];
    snprintf(path, sizeof(path), "/proc/%d/ns/%s", pid, ns_type);
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open_namespace_fd");
        fprintf(stderr, "Failed to open %s namespace for PID %d\n", ns_type, pid);
    }
    return fd;
}

// 辅助函数：使用 ioctl_ns 获取命名空间类型
void query_namespace_type(int ns_fd) {
    // NS_GET_NSTYPE 是一个 ioctl 命令，用于获取命名空间类型
    int ns_type = ioctl(ns_fd, NS_GET_NSTYPE);
    if (ns_type == -1) {
        perror("ioctl NS_GET_NSTYPE");
        return;
    }

    const char* type_str;
    switch (ns_type) {
        case CLONE_NEWNS: type_str = "Mount (CLONE_NEWNS)"; break;
        case CLONE_NEWCGROUP: type_str = "Cgroup (CLONE_NEWCGROUP)"; break;
        case CLONE_NEWUTS: type_str = "UTS (CLONE_NEWUTS)"; break;
        case CLONE_NEWIPC: type_str = "IPC (CLONE_NEWIPC)"; break;
        case CLONE_NEWUSER: type_str = "User (CLONE_NEWUSER)"; break;
        case CLONE_NEWPID: type_str = "PID (CLONE_NEWPID)"; break;
        case CLONE_NEWNET: type_str = "Network (CLONE_NEWNET)"; break;
        default: type_str = "Unknown";
    }
    printf("  Namespace fd %d type is: %s\n", ns_fd, type_str);
}

// 主函数：演示创建和加入命名空间
int main(int argc, char *argv&#91;]) {
    // --- 场景 1: 加入已存在的命名空间 ---
    if (argc == 3 && strcmp(argv&#91;1], "join") == 0) {
        pid_t target_pid = atoi(argv&#91;2]);
        if (target_pid <= 0) {
            fprintf(stderr, "Invalid PID provided.\n");
            exit(EXIT_FAILURE);
        }

        printf("--- Joining Existing Namespace (as separate process) ---\n");
        printf("This process (PID: %d) will join the mount namespace of PID: %d\n", getpid(), target_pid);

        // 1. 打开目标进程的 Mount 命名空间文件描述符
        int target_mnt_ns_fd = open_namespace_fd(target_pid, "mnt");
        if (target_mnt_ns_fd == -1) exit(EXIT_FAILURE);

        // 2. 查询并打印命名空间类型 (使用 ioctl)
        printf("Querying namespace type using ioctl...\n");
        query_namespace_type(target_mnt_ns_fd);

        // 3. 使用 setns 加入目标的 Mount 命名空间
        printf("Calling setns() to join the mount namespace...\n");
        if (syscall(SYS_setns, target_mnt_ns_fd, CLONE_NEWNS) == -1) {
            perror("setns");
            close(target_mnt_ns_fd);
            exit(EXIT_FAILURE);
        }
        printf("Successfully joined the mount namespace of PID %d.\n", target_pid);
        close(target_mnt_ns_fd);

        // 4. 现在，当前进程的文件系统视图与目标进程相同
        printf("Checking for file that exists in the target namespace...\n");
        if (access("/tmp/container_root/container_file.txt", F_OK) == 0) {
            printf("SUCCESS: Found '/tmp/container_root/container_file.txt'. We are inside the container's mount namespace!\n");
        } else {
            printf("FAIL: Could not find the file. setns might have failed or file doesn't exist.\n");
        }

        printf("Exiting join process.\n");
        exit(EXIT_SUCCESS);
    }

    // --- 场景 2: 创建新的隔离进程 (容器) ---
    printf("--- Creating Isolated Process (Container) ---\n");
    printf("Main process PID: %d\n", getpid());

    // 1. 分配子进程栈
    char *child_stack = malloc(STACK_SIZE);
    if (!child_stack) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }

    // 2. 使用 clone 创建子进程，并让它在新的 Mount 和 UTS 命名空间中启动
    // SIGCHLD: 子进程退出时发送 SIGCHLD 信号给父进程
    pid_t container_pid = syscall(SYS_clone,
                                 CLONE_NEWNS | CLONE_NEWUTS | SIGCHLD,
                                 child_stack + STACK_SIZE, // 栈是向下增长的
                                 NULL, NULL);

    if (container_pid == -1) {
        perror("clone");
        free(child_stack);
        exit(EXIT_FAILURE);
    }

    if (container_pid == 0) {
        // --- 在子进程中 ---
        free(child_stack); // 子进程不需要父进程的栈指针
        container_init(NULL); // 运行容器初始化函数
        exit(EXIT_SUCCESS);
    }

    // --- 回到父进程 ---
    printf("\n--- Back in Parent Process ---\n");
    printf("Parent: Created container process with PID: %d\n", container_pid);

    // 等待几秒，让子进程完成初始化 (挂载等)
    sleep(3);

    // 3. 父进程演示：获取子进程的命名空间文件描述符并查询信息
    printf("\n--- Parent: Inspecting Container's Namespaces ---\n");
    int child_mnt_ns_fd = open_namespace_fd(container_pid, "mnt");
    int child_net_ns_fd = open_namespace_fd(container_pid, "net");

    if (child_mnt_ns_fd != -1) {
        printf("Parent: Opened child's Mount namespace fd: %d\n", child_mnt_ns_fd);
        query_namespace_type(child_mnt_ns_fd);
        // 注意：此时父进程还未加入，所以访问 /tmp/container_root 应该失败
        if (access("/tmp/container_root/container_file.txt", F_OK) != 0) {
             printf("Parent: As expected, '/tmp/container_root/container_file.txt' is NOT accessible from parent namespace.\n");
        }
        // close(child_mnt_ns_fd); // 暂时不关闭，后面 setns 还要用
    }

    if (child_net_ns_fd != -1) {
        printf("Parent: Opened child's Network namespace fd: %d\n", child_net_ns_fd);
        query_namespace_type(child_net_ns_fd);
        close(child_net_ns_fd);
    }

    // 4. 父进程演示：使用 setns 加入子进程的 Mount 命名空间
    // (注意：实际应用中，你可能不会在父进程中这样做，这里仅为演示)
    /*
    printf("\n--- Parent: Attempting to join child's Mount Namespace ---\n");
    if (child_mnt_ns_fd != -1) {
        if (syscall(SYS_setns, child_mnt_ns_fd, CLONE_NEWNS) == 0) {
            printf("Parent: Successfully joined child's mount namespace.\n");
            // 现在父进程可以访问容器内的文件了
            if (access("/tmp/container_root/container_file.txt", F_OK) == 0) {
                printf("Parent: Now I can access '/tmp/container_root/container_file.txt'.\n");
            }
        } else {
            perror("Parent: setns failed");
        }
        close(child_mnt_ns_fd);
    }
    */

    // 5. 等待子进程结束
    printf("\n--- Parent: Waiting for container process (PID: %d) to finish ---\n", container_pid);
    printf("Parent: The container will exit automatically after its sleep.\n");
    printf("Parent: Or, you can manually stop it with 'kill %d'\n", container_pid);
    int status;
    waitpid(container_pid, &status, 0);

    if (WIFEXITED(status)) {
        printf("Parent: Container process exited with status %d.\n", WEXITSTATUS(status));
    } else {
        printf("Parent: Container process did not exit normally.\n");
    }

    free(child_stack);
    printf("Parent: Main process finished.\n");

    return 0;
}

如何编译和运行

# 1. 保存代码为 namespace_demo.c

# 2. 编译 (需要 root 权限来运行，但编译不需要)
gcc -o namespace_demo namespace_demo.c

# 3. 打开两个终端 (Terminal 1 和 Terminal 2)

# --- Terminal 1: 创建容器 ---
# 运行程序的第一部分，创建一个隔离的容器进程
sudo ./namespace_demo

# 你会看到类似输出：
# --- Creating Isolated Process (Container) ---
# Main process PID: 12345
# Parent: Created container process with PID: 12346
#
# --- Inside Container Init Process (PID: 12346) ---
# Container: Set hostname to 'my-container'.
# Container: Mounted tmpfs on /tmp/container_root
# Container: Created file /tmp/container_root/container_file.txt
# Container: Calling unshare(CLONE_NEWNET) to get isolated network...
# Container: Successfully unshared network namespace.
# Container: Network is now isolated. 'ip link' should show only loopback.
# Container: Sleeping for 60 seconds. Explore from host and container.
# ...

# --- Terminal 2: 加入容器 ---
# 在程序输出中找到容器的 PID (例如 12346)，然后运行程序的第二部分
# 加入容器的 Mount 命名空间
sudo ./namespace_demo join 12346

# 你会看到类似输出：
# --- Joining Existing Namespace (as separate process) ---
# This process (PID: 12347) will join the mount namespace of PID: 12346
# Querying namespace type using ioctl...
#   Namespace fd 3 type is: Mount (CLONE_NEWNS)
# Calling setns() to join the mount namespace...
# Successfully joined the mount namespace of PID 12346.
# Checking for file that exists in the target namespace...
# SUCCESS: Found '/tmp/container_root/container_file.txt'. We are inside the container's mount namespace!
# Exiting join process.

详细说明

场景 1: 使用 clone 创建隔离进程

main 函数：作为父进程启动。

malloc(STACK_SIZE)：为子进程分配独立的栈空间。

syscall(SYS_clone, …)：调用 clone 系统调用。

CLONE_NEWNS：告诉内核为新进程创建一个新的 Mount 命名空间。
CLONE_NEWUTS：告诉内核为新进程创建一个新的 UTS 命名空间（用于隔离主机名）。
child_stack + STACK_SIZE：传递子进程栈的顶部指针（栈向下增长）。

if (container_pid == 0)：在 clone 返回后，执行流分叉。在子进程中，clone 返回 0。

container_init：子进程执行此函数，它现在运行在一个隔离的 Mount 和 UTS 命名空间中。它设置了主机名、挂载了文件系统并创建了文件。然后它调用 unshare。

场景 2: 在进程中使用 unshare 增加隔离

container_init 函数内部：在子进程完成初步设置后。

unshare(CLONE_NEWNET)：调用 unshare 系统调用。

CLONE_NEWNET：告诉内核让当前进程脱离当前的 Network 命名空间，并加入一个新创建的、空的 Network 命名空间。

效果：现在这个子进程拥有完全独立的网络视图（例如，只有 lo 回环接口）。

场景 3: 使用 setns 加入已存在的命名空间

main 函数 (第二个实例)：我们运行 ./namespace_demo join 来启动一个新的、独立的进程，专门用于加入命名空间。

open(“/proc//ns/mnt”, O_RDONLY)：打开目标进程（容器）的 Mount 命名空间文件。这返回一个文件描述符。

syscall(SYS_setns, fd, CLONE_NEWNS)：调用 setns 系统调用。

fd：之前打开的命名空间文件描述符。
CLONE_NEWNS：指定要加入的命名空间类型（Mount）。

效果：调用 setns 的这个新进程，其文件系统视图现在与目标容器进程完全相同。它可以访问容器内挂载的 /tmp/container_root 及其内部的文件。

场景 4: 使用 ioctl 查询命名空间信息

open_namespace_fd：一个辅助函数，用于获取命名空间文件描述符。

ioctl(ns_fd, NS_GET_NSTYPE)：对命名空间文件描述符调用 ioctl 系统调用。

NS_GET_NSTYPE：这是一个特定的 ioctl 命令，请求内核返回该文件描述符对应的命名空间类型。

效果：程序可以动态地确定一个未知的命名空间文件描述符代表的是哪种类型的命名空间（Mount, Network, PID 等）。

总结

这个示例完整地演示了：

clone 如何在创建新进程时就为其分配隔离的环境（Mount, UTS）。
unshare 如何让一个已经在运行的进程（容器内）进一步获得新的隔离（Network）。
setns 如何让一个外部进程（或另一个独立的工具进程）加入到已存在的隔离环境（Mount）中。
ioctl (with NS_GET_NSTYPE) 如何查询命名空间文件描述符的类型。

通过这种方式，Linux 命名空间系统调用共同协作，提供了强大的资源隔离能力，构成了现代容器技术的基础。

2

好的，我们来构建一个高度简化但原理正确的“现代容器环境”示例。这个示例将展示创建一个容器所需的核心步骤，并对每一部分进行详细解释。

我们将模拟一个像 docker run 那样的命令，它会：

创建一个隔离的环境（使用命名空间）。

设置资源限制（使用 cgroups）。

准备一个文件系统根目录（使用 chroot 和 mount）。

在这个隔离的环境中运行一个指定的命令。

重要提示：

需要 root 权限：操作命名空间、cgroups、挂载文件系统都需要 root 权限。
简化：真实的容器运行时（如 runc）极其复杂，涉及大量细节、安全加固、兼容性处理等。此示例仅展示核心原理。
环境要求：你的 Linux 内核需要支持所需的特性（现代发行版默认支持）。
安全性：此示例代码不安全，仅用于学习，请勿在生产环境使用。

完整示例代码：my_simple_container.c

#define _GNU_SOURCE // 启用 GNU 扩展
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>      // 命名空间相关
#include <sys/syscall.h> // syscall
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>   // waitpid
#include <fcntl.h>      // open, O_* flags
#include <errno.h>
#include <string.h>
#include <sys/mount.h>  // mount, umount
#include <limits.h>     // PATH_MAX
#include <ftw.h>        // nftw (用于递归删除目录)
#include <signal.h>     // kill

// --- 配置部分 ---
#define STACK_SIZE (1024 * 1024) // 1MB 子进程栈
#define CGROUP_NAME "my_simple_container"
#define MEMORY_LIMIT "50M" // 限制内存为 50MB

// --- 全局变量 ---
char child_stack&#91;STACK_SIZE]; // 子进程栈
char *container_root = "/tmp/my_container_root"; // 容器的根文件系统目录
char *host_fs_path = "/tmp/host_fs_for_container"; // 容器内挂载宿主机目录的位置

// --- 辅助函数 ---

// 递归删除目录的回调函数 (用于 nftw)
int remove_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf) {
    (void)sb; (void)typeflag; (void)ftwbuf; // 忽略未使用的参数
    int res = remove(fpath);
    if (res == -1) {
        perror(fpath);
    }
    return res;
}

// 递归删除整个目录树
int remove_directory(const char *path) {
    return nftw(path, remove_callback, 64, FTW_DEPTH | FTW_PHYS);
}

// 安全地写入文件内容
int write_file(const char *path, const char *content) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open (write_file)");
        return -1;
    }
    if (write(fd, content, strlen(content)) != (ssize_t)strlen(content)) {
        perror("write (write_file)");
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}

// --- 核心功能函数 ---

// 1. 准备容器的根文件系统
// 这是容器运行的基础，它需要包含运行命令所需的所有文件 (如 /bin, /lib, /etc 等)
int prepare_rootfs() {
    printf("&#91;*] Preparing root filesystem at %s\n", container_root);

    // 创建容器根目录
    if (mkdir(container_root, 0755) == -1 && errno != EEXIST) {
        perror("mkdir container_root");
        return -1;
    }

    // 挂载 tmpfs 作为容器的根文件系统
    // tmpfs 是内存中的临时文件系统，非常适合做实验性的 rootfs
    if (mount("tmpfs", container_root, "tmpfs", 0, NULL) == -1) {
        perror("mount tmpfs root");
        return -1;
    }
    printf("    Mounted tmpfs on %s\n", container_root);

    // 在容器根目录下创建基本的目录结构
    const char *dirs&#91;] = {"/bin", "/lib", "/lib64", "/etc", "/proc", "/sys", "/dev", "/tmp", host_fs_path+strlen(container_root)};
    for (int i = 0; i < sizeof(dirs)/sizeof(dirs&#91;0]); i++) {
        char path&#91;PATH_MAX];
        snprintf(path, sizeof(path), "%s%s", container_root, dirs&#91;i]);
        if (mkdir(path, 0755) == -1) {
            perror("mkdir (dirs)");
            return -1;
        }
    }
    printf("    Created basic directory structure\n");

    // 复制或绑定挂载一些必要的二进制文件和库
    // 这里我们只复制一个简单的命令: /bin/sh (shell)
    // 注意：实际的容器镜像会包含一个完整的文件系统
    const char *bins&#91;] = {"/bin/sh"};
    for (int i = 0; i < sizeof(bins)/sizeof(bins&#91;0]); i++) {
        char dst_path&#91;PATH_MAX];
        snprintf(dst_path, sizeof(dst_path), "%s%s", container_root, bins&#91;i]);
        // 使用硬链接或复制。这里简单使用系统调用 `cp`
        char cmd&#91;PATH_MAX * 2];
        snprintf(cmd, sizeof(cmd), "cp -p %s %s", bins&#91;i], dst_path);
        printf("    Copying %s...\n", bins&#91;i]);
        if (system(cmd) != 0) {
             fprintf(stderr, "Failed to copy %s\n", bins&#91;i]);
             return -1;
        }
    }

    // 创建一个简单的 /etc/passwd 和 /etc/group 文件，这样 `whoami` 等命令能工作
    char passwd_path&#91;PATH_MAX], group_path&#91;PATH_MAX];
    snprintf(passwd_path, sizeof(passwd_path), "%s/etc/passwd", container_root);
    snprintf(group_path, sizeof(group_path), "%s/etc/group", container_root);
    write_file(passwd_path, "root:x:0:0:root:/root:/bin/sh\nnobody:x:65534:65534:nobody:/:/bin/false\n");
    write_file(group_path, "root:x:0:\nnobody:x:65534:\n");
    printf("    Created minimal /etc/passwd and /etc/group\n");

    // 绑定挂载宿主机的 /lib, /lib64 目录到容器内，这样 /bin/sh 能找到它需要的共享库
    // 注意：这会暴露宿主机的库，实际容器会自带库或使用更精细的复制
    char lib_dst&#91;PATH_MAX], lib64_dst&#91;PATH_MAX];
    snprintf(lib_dst, sizeof(lib_dst), "%s/lib", container_root);
    snprintf(lib64_dst, sizeof(lib64_dst), "%s/lib64", container_root);
    if (mount("/lib", lib_dst, NULL, MS_BIND | MS_REC, NULL) == -1 ||
        mount("/lib64", lib64_dst, NULL, MS_BIND | MS_REC, NULL) == -1) {
        perror("mount /lib or /lib64");
        return -1;
    }
    printf("    Bind-mounted /lib and /lib64\n");

    // 在容器内创建一个挂载点，用于挂载宿主机的一个目录，演示数据共享
    // 这类似于 `docker run -v /host/path:/container/path`
    if (mkdir(host_fs_path, 0755) == -1 && errno != EEXIST) {
        perror("mkdir host_fs_path (host)");
        return -1;
    }
    char container_host_fs_path&#91;PATH_MAX];
    snprintf(container_host_fs_path, sizeof(container_host_fs_path), "%s%s", container_root, host_fs_path + strlen(container_root));
    if (mount(host_fs_path, container_host_fs_path, NULL, MS_BIND, NULL) == -1) {
        perror("mount host_fs_path");
        return -1;
    }
    printf("    Bind-mounted host directory %s to container directory %s\n", host_fs_path, container_host_fs_path);

    printf("&#91;+] Root filesystem prepared successfully.\n");
    return 0;
}

// 2. 设置 cgroups 以限制资源
// 这里我们只演示内存限制
int setup_cgroups(pid_t pid) {
    printf("&#91;*] Setting up cgroups (memory limit: %s)\n", MEMORY_LIMIT);
    char cgroup_path&#91;256];
    char tasks_path&#91;256];
    char limit_path&#91;256];

    // 创建我们自己的 cgroup 子目录
    snprintf(cgroup_path, sizeof(cgroup_path), "/sys/fs/cgroup/memory/%s", CGROUP_NAME);
    if (mkdir(cgroup_path, 0755) == -1 && errno != EEXIST) {
        perror("mkdir cgroup");
        // 如果 cgroup v2 或其他问题，可能需要更复杂的处理，这里简化
        fprintf(stderr, "Warning: Failed to create cgroup, skipping resource limits.\n");
        return 0; // 不算致命错误
    }

    // 设置内存限制
    snprintf(limit_path, sizeof(limit_path), "%s/memory.limit_in_bytes", cgroup_path);
    if (write_file(limit_path, MEMORY_LIMIT) == -1) {
        fprintf(stderr, "Warning: Failed to set memory limit, skipping.\n");
        return 0;
    }

    // 将子进程 PID 写入 tasks 文件，使其受此 cgroup 控制
    snprintf(tasks_path, sizeof(tasks_path), "%s/tasks", cgroup_path);
    char pid_str&#91;32];
    snprintf(pid_str, sizeof(pid_str), "%d", pid);
    if (write_file(tasks_path, pid_str) == -1) {
        fprintf(stderr, "Warning: Failed to add process to cgroup, skipping.\n");
        return 0;
    }

    printf("&#91;+] Cgroups configured.\n");
    return 0;
}

// 3. 容器初始化函数 (在子进程中运行)
// 这是容器内第一个执行的用户态进程 (PID 1)
int container_main(void *arg) {
    char **args = (char **)arg;
    printf("\n&#91;*** INSIDE CONTAINER ***]\n");
    printf("Container PID 1: %d\n", getpid());

    // --- 关键步骤 1: 切换根文件系统 (chroot) ---
    // 将容器的根目录设置为我们准备好的目录
    if (chdir(container_root) == -1) {
        perror("chdir to container root");
        return 1;
    }
    // chroot 系统调用将当前进程看到的文件系统根目录 '/' 改变
    if (chroot(".") == -1) {
        perror("chroot");
        return 1;
    }
    printf("    Changed root to %s using chroot\n", container_root);

    // --- 关键步骤 2: 挂载必要的虚拟文件系统 ---
    // 容器内进程需要 /proc, /sys, /dev 等来获取系统信息和设备访问
    if (mount("proc", "/proc", "proc", 0, NULL) == -1 ||
        mount("sysfs", "/sys", "sysfs", 0, NULL) == -1 ||
        mount("tmpfs", "/dev", "tmpfs", MS_NOSUID, "size=65536k,mode=755") == -1) {
        perror("mount essential filesystems");
        return 1;
    }
    // 创建基本的设备节点 (简化)
    if (mknod("/dev/null", S_IFCHR | 0666, makedev(1, 3)) == -1 && errno != EEXIST) {
        perror("mknod /dev/null");
    }
    printf("    Mounted /proc, /sys, /dev\n");

    // --- 关键步骤 3: 运行用户指定的命令 ---
    printf("    Executing command: ");
    for (int i = 0; args&#91;i] != NULL; i++) {
        printf("%s ", args&#91;i]);
    }
    printf("\n");
    printf("&#91;*** END OF CONTAINER SETUP ***]\n\n");

    // execvp 会用新程序替换当前进程的镜像
    execvp(args&#91;0], args);
    // 如果 execvp 返回，说明执行失败
    perror("execvp");
    return 1;
}

// --- 主函数 ---
int main(int argc, char *argv&#91;]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <command> &#91;args...]\n", argv&#91;0]);
        fprintf(stderr, "Example: sudo %s /bin/sh\n", argv&#91;0]);
        exit(EXIT_FAILURE);
    }

    printf("=== My Simple Container Runtime ===\n");

    // 1. 准备 rootfs
    if (prepare_rootfs() == -1) {
        fprintf(stderr, "Failed to prepare root filesystem\n");
        exit(EXIT_FAILURE);
    }

    // 2. 使用 clone 创建子进程，并设置命名空间隔离
    // CLONE_NEWPID: 新的 PID 命名空间 (容器内 PID 从 1 开始)
    // CLONE_NEWNS: 新的 Mount 命名空间 (隔离文件系统挂载点)
    // CLONE_NEWUTS: 新的 UTS 命名空间 (隔离主机名)
    // CLONE_NEWIPC: 新的 IPC 命名空间 (隔离 IPC 资源)
    // CLONE_NEWNET: 新的 Network 命名空间 (隔离网络)
    // SIGCHLD: 子进程结束时通知父进程
    int flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWNET | SIGCHLD;

    pid_t container_pid = syscall(SYS_clone, flags, child_stack + STACK_SIZE, NULL, NULL);
    if (container_pid == -1) {
        perror("clone");
        // 清理
        umount(container_root);
        remove_directory(container_root);
        exit(EXIT_FAILURE);
    }

    if (container_pid == 0) {
        // --- 在子进程中 ---
        // 设置主机名
        sethostname("my-container", strlen("my-container"));

        // 调用容器主函数
        container_main(argv + 1); // 传递命令行参数 (跳过 argv&#91;0])
        exit(EXIT_FAILURE); // 如果 container_main 返回，说明 exec 失败
    }

    // --- 回到父进程 ---
    printf("&#91;*] Started container process with PID %d\n", container_pid);

    // 3. 设置 cgroups (在子进程启动后)
    if (setup_cgroups(container_pid) == -1) {
        fprintf(stderr, "Warning: Cgroup setup failed\n");
    }

    // 4. 等待容器进程结束
    printf("&#91;*] Waiting for container (PID %d) to finish...\n", container_pid);
    int status;
    if (waitpid(container_pid, &status, 0) == -1) {
        perror("waitpid");
    }

    if (WIFEXITED(status)) {
        printf("&#91;*] Container exited with status %d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("&#91;*] Container killed by signal %d\n", WTERMSIG(status));
    }

    // 5. 清理资源
    printf("&#91;*] Cleaning up...\n");
    // 卸载挂载点
    char container_host_fs_path&#91;PATH_MAX];
    snprintf(container_host_fs_path, sizeof(container_host_fs_path), "%s%s", container_root, host_fs_path + strlen(container_root));
    umount(container_host_fs_path);
    umount("/lib64");
    umount("/lib");
    // 注意：chroot 后的卸载需要在 chroot 环境内或特殊处理，这里简化，依赖系统重启或手动清理
    // 通常容器运行时会更仔细地管理这些挂载
    umount(container_root);
    // 删除临时目录
    remove_directory(container_root);
    remove_directory(host_fs_path);
    printf("&#91;+] Cleanup finished.\n");

    printf("=== Container Runtime Finished ===\n");
    return 0;
}

代码详细解释

1. 配置和辅助部分

#define 和全局变量：定义了栈大小、cgroup 名称、容器根目录路径等常量和全局变量，方便修改和使用。
remove_directory：使用 nftw 递归遍历并删除整个目录树，用于清理工作。
write_file：一个安全的小函数，用于向文件写入内容，避免重复代码。

2. prepare_rootfs - 准备文件系统

这是容器技术中最复杂的部分之一，因为容器需要一个完整的、自包含的文件系统。

mkdir 和 mount(“tmpfs”, …)：创建容器根目录，并挂载一个 tmpfs。tmpfs 是基于内存的文件系统，非常适合做实验，因为它启动快且隔离性好。
创建基本目录：/bin, /lib, /etc, /proc, /sys, /dev 是 Linux 系统运行程序所必需的目录。
复制/绑定挂载二进制文件：这里简化地只复制了 /bin/sh。真实的容器镜像（如 Docker 镜像）会包含一个完整的根文件系统（/bin, /usr, /lib 等所有内容）。
绑定挂载库文件：为了让 /bin/sh 能运行，它需要依赖宿主机的共享库（.so 文件）。我们通过 mount –bind 将宿主机的 /lib 和 /lib64 挂载到容器内对应位置。注意：这在实际生产中是不安全的，因为容器会使用宿主机的库，可能导致版本不兼容或安全问题。真正的容器会自带所需的库。
创建 /etc/passwd 等：为了让一些基础命令（如 whoami）能正常工作，需要创建这些用户/组信息文件。
绑定挂载宿主机目录：模拟 docker run -v 功能，展示容器与宿主机之间的数据共享。

3. setup_cgroups - 设置资源限制

cgroups (Control Groups) 用于限制、记录和隔离进程组的资源使用（CPU、内存、磁盘 I/O 等）。

mkdir：在 /sys/fs/cgroup/memory/ 下创建一个子目录，作为我们容器专用的 cgroup。
write_file(limit_path, MEMORY_LIMIT)：向 memory.limit_in_bytes 文件写入限制值（如 “50M”），内核会自动应用这个限制。
write_file(tasks_path, pid_str)：将容器进程的 PID 写入 cgroup 的 tasks 文件。这一步是关键，它告诉内核：“请把这个 PID 的进程放进这个 cgroup 里，让它受到资源限制。”

4. container_main - 容器内的初始化

这是容器内第一个运行的用户态进程（通常 PID 为 1）。

chdir 和 chroot：

chdir(container_root)：先切换当前工作目录到我们准备好的容器根目录。
chroot(“.”)：这是核心操作。chroot 系统调用会将调用进程及其子进程的根目录（/）永久性地更改为当前工作目录（.，即 container_root）。执行此操作后，进程将无法访问原宿主机根目录之外的任何文件。对它来说，container_root 就是世界的尽头，里面的 / 就是真正的 /。

挂载虚拟文件系统：

mount(“proc”, “/proc”, “proc”, …)：挂载 proc 文件系统。进程需要通过 /proc 来读取自身信息（如 /proc/self/status）、查看子进程、获取 CPU 信息等。
mount(“sysfs”, “/sys”, “sysfs”, …)：挂载 sysfs，用于访问内核和硬件设备信息。
mount(“tmpfs”, “/dev”, “tmpfs”, …)：挂载一个 tmpfs 到 /dev，容器内的程序可能需要在这里创建设备节点或临时文件。
mknod：创建基本的设备文件，如 /dev/null。

execvp(args[0], args)：这是最后一步，也是至关重要的一步。execvp 系列函数会用磁盘上的一个新程序（由 args[0] 指定）完全替换当前进程的内存镜像（代码、数据、堆栈等）。执行成功后，容器内运行的就是用户指定的命令了，而不再是我们的 container_main 函数。这个新程序的 PID 仍然是 1（因为在新的 PID 命名空间中）。

5. main - 主流程控制

参数检查：确保用户提供了要运行的命令。
调用 prepare_rootfs：准备隔离环境。

调用 clone：

这是启动隔离进程的核心。我们传递了多个 CLONE_NEW* 标志：

CLONE_NEWPID：创建新的 PID 命名空间。这使得容器内的第一个进程 PID 为 1，并且它无法看到或操作宿主机上 PID 命名空间中的进程。
CLONE_NEWNS：创建新的 Mount 命名空间。这使得容器内的 mount 和 umount 操作不会影响宿主机的文件系统视图。
CLONE_NEWUTS：创建新的 UTS 命名空间。这使得容器可以拥有独立的主机名 (hostname) 和 NIS 域名。
CLONE_NEWIPC：创建新的 IPC 命名空间。这隔离了 System V IPC 和 POSIX 消息队列。
CLONE_NEWNET：创建新的 Network 命名空间。这使得容器拥有独立的网络设备、IP 地址、路由表、端口等。示例中未配置网络，所以容器内网络功能受限。

child_stack + STACK_SIZE：传递子进程栈的顶部指针。

子进程分支 (if (container_pid == 0))：

在子进程中，设置主机名，然后调用 container_main 进行初始化。

父进程分支：

调用 setup_cgroups 来限制子进程的资源。
使用 waitpid 等待容器进程结束。
容器结束后，执行清理工作：卸载文件系统、删除临时目录。

如何编译和运行

# 1. 将代码保存为 my_simple_container.c

# 2. 编译 (需要 root 权限来运行，但编译不需要)
gcc -o my_simple_container my_simple_container.c

# 3. 运行容器，执行 /bin/sh
# 必须使用 sudo，因为涉及命名空间、cgroups、挂载等特权操作
sudo ./my_simple_container /bin/sh

# 你会看到类似输出：
# === My Simple Container Runtime ===
# &#91;*] Preparing root filesystem at /tmp/my_container_root
#     Mounted tmpfs on /tmp/my_container_root
#     Created basic directory structure
#     Copying /bin/sh...
#     Created minimal /etc/passwd and /etc/group
#     Bind-mounted /lib and /lib64
#     Bind-mounted host directory /tmp/host_fs_for_container to container directory /tmp/host_fs_for_container
# &#91;+] Root filesystem prepared successfully.
# &#91;*] Started container process with PID 12345
# &#91;*] Setting up cgroups (memory limit: 50M)
# &#91;+] Cgroups configured.
# &#91;*] Waiting for container (PID 12345) to finish...
#
# &#91;*** INSIDE CONTAINER ***]
# Container PID 1: 1
#     Changed root to /tmp/my_container_root using chroot
#     Mounted /proc, /sys, /dev
#     Executing command: /bin/sh
# &#91;*** END OF CONTAINER SETUP ***]
#
# # <--- 你现在在容器的 shell 里面了，PID 为 1 ---
# # ps aux
# USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
# root           1  0.0  0.0   2284  1536 ?        Ss   10:30   0:00 /bin/sh
# root          12  0.0  0.0   3864  3168 ?        R+   10:31   0:00 ps aux
# #
# # hostname
# my-container
# #
# # df -h
# Filesystem      Size  Used Avail Use% Mounted on
# tmpfs           1.9G     0  1.9G   0% /
# proc            1.9G     0  1.9G   0% /proc
# sysfs           1.9G     0  1.9G   0% /sys
# tmpfs            64M     0   64M   0% /dev
# tmpfs           1.9G     0  1.9G   0% /tmp
# /dev/sda1        20G  5.0G   15G  26% /tmp/host_fs_for_container
# #
# # cd /tmp/host_fs_for_container
# # touch file_created_in_container
# # exit
# # <--- 退出容器 ---
#
# &#91;*] Container exited with status 0
# &#91;*] Cleaning up...
# &#91;+] Cleanup finished.
# === Container Runtime Finished ===
#
# # <--- 回到宿主机 ---
# # ls /tmp/host_fs_for_container/
# file_created_in_container  # 你在容器里创建的文件在宿主机上也能看到

总结原理

这个示例通过组合 Linux 内核的几个关键特性，模拟了现代容器的运行原理：

隔离 (Isolation) - 命名空间 (Namespaces):

clone 系统调用配合 CLONE_NEW* 标志，在创建新进程时就为其分配了独立的视图，包括进程 ID (CLONE_NEWPID)、文件系统挂载点 (CLONE_NEWNS)、主机名 (CLONE_NEWUTS)、IPC 资源 (CLONE_NEWIPC) 和网络 (CLONE_NEWNET)。
chroot 系统调用进一步将进程的文件系统根目录 (/) 切换到一个预先准备好的、与宿主机隔离的目录，实现了文件系统的彻底隔离。

资源限制 (Resource Limiting) - Control Groups (Cgroups):

通过在 /sys/fs/cgroup 下创建子目录并配置参数（如 memory.limit_in_bytes），然后将容器进程的 PID 添加到该 cgroup 的 tasks 列表中，实现了对该进程资源使用的限制（此处为内存）。

文件系统 (Filesystem) - Rootfs:

准备一个包含运行所需程序和库的目录 (/tmp/my_container_root)。
使用 tmpfs、bind mount 等技术构建这个目录。
通过 chroot 使其成为容器进程的根目录。

执行 (Execution):

在完成所有隔离和设置后，使用 execvp 系统调用，用用户指定的命令（如 /bin/sh）替换容器初始化进程（PID 1）的镜像，从而在隔离环境中运行该命令。

通过以上步骤，一个与宿主机环境隔离、资源受限、拥有独立文件系统和进程/网络视图的“沙盒”环境就被创建出来了，这就是容器的核心工作原理。

2025-08-23

Linux系统编程

sendto系统调用及示例

我们继续学习 Linux 系统编程中的重要函数。这次我们介绍 sendto 和 recvfrom 函数，它们是用于无连接（数据报）套接字（如 UDP）进行数据传输的核心系统调用，但也可以用于面向连接（流式）套接字。

sendto系统调用及示例-CSDN博客

sendto系统调用及示例

1. 函数介绍

sendto 和 recvfrom 是 Linux 系统调用，专门设计用于在套接字上传输数据报（datagrams）。它们与 send/write 和 recv/read 的主要区别在于：sendto 和 recvfrom 显式地处理目标地址和源地址。

sendto: 将数据从套接字发送到指定的目标地址。对于 UDP 套接字，这会创建一个数据报并发送到指定的主机和端口。对于 TCP 套接字，如果尚未连接，调用会失败。
recvfrom: 从套接字接收数据，并获取数据的来源地址（发送方的 IP 地址和端口号）。对于 UDP 套接字，这会接收一个数据报。对于 TCP 套接字，地址信息通常不被使用，因为连接是点对点的。

你可以把它们想象成写信和收信：

sendto: 你写一封信（数据），并在信封上明确写上收信人的地址（目标地址），然后投递出去。
recvfrom: 你收到一封信（数据），信封上写着寄件人的地址（源地址），你可以知道是谁寄来的。

2. 函数原型

#include <sys/socket.h> // 必需

// 发送数据到指定地址
ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
               const struct sockaddr *dest_addr, socklen_t addrlen);

// 从套接字接收数据，并获取源地址
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                 struct sockaddr *src_addr, socklen_t *addrlen);

3. 功能

sendto:

通过套接字 sockfd 发送 len 个字节的数据（从 buf 指向的缓冲区）。
数据被发送到由 dest_addr 和 addrlen 指定的目标地址。
对于数据报（如 UDP）套接字，这会创建一个独立的数据报。

recvfrom:

通过套接字 sockfd 接收最多 len 个字节的数据，并将其存储在 buf 指向的缓冲区中。
如果 src_addr 和 addrlen 非 NULL，则将发送方的地址信息填充到 src_addr 指向的结构体中，并更新 *addrlen 为实际地址结构的大小。

4. 参数

这两个函数的参数非常相似，分别处理发送和接收。

sendto

int sockfd: 有效的套接字文件描述符。
const void *buf: 指向包含要发送数据的缓冲区的指针。
size_t len: 要发送的字节数。

int flags: 控制发送行为的标志位。常见的有：

0: 使用默认行为。
MSG_DONTWAIT: 使发送操作非阻塞（如果套接字是阻塞的）。
MSG_NOSIGNAL: 在面向连接的套接字上，如果连接断开，不产生 SIGPIPE 信号。

const struct sockaddr *dest_addr: 指向目标地址结构的指针（例如 sockaddr_in 或 sockaddr_in6）。

socklen_t addrlen: dest_addr 指向的地址结构的大小。

recvfrom

int sockfd: 有效的套接字文件描述符。
void *buf: 指向用于存储接收数据的缓冲区的指针。
size_t len: 缓冲区 buf 的大小，也是期望接收的最大字节数。

int flags: 控制接收行为的标志位。常见的有：

0: 使用默认行为。
MSG_DONTWAIT: 使接收操作非阻塞（如果套接字是阻塞的）。
MSG_PEEK: 查看传入的数据，但不从输入队列中移除。
MSG_WAITALL: 请求等待，直到读入请求的字节数。但当检测到错误或断开连接时，或套接字为非阻塞时，仍可能返回少于请求字节数的数据。

struct sockaddr *src_addr:

如果不需要获取发送方地址，可以传入 NULL。
如果需要获取发送方地址，应传入指向 sockaddr_in 或 sockaddr_in6 等结构的指针。

socklen_t *addrlen:

如果 src_addr 是 NULL，则 addrlen 也必须是 NULL。
如果 src_addr 非 NULL，则 addrlen 必须指向一个 socklen_t 变量。
输入: 调用时，该变量应初始化为 src_addr 指向的缓冲区的大小（例如 sizeof(struct sockaddr_in)）。
输出: 返回时，该变量被更新为实际存储在 src_addr 中的地址结构的大小。

5. 返回值

成功时:

sendto: 返回实际发送的字节数。对于数据报套接字，这通常等于 len（如果成功发送了整个数据报）。
recvfrom: 返回实际接收的字节数。如果返回 0，可能表示对端已关闭连接（对于面向连接的套接字）或收到了一个零长度的数据报（对于数据报套接字，理论上可能）。

失败时:

两个函数在失败时都返回 -1，并设置全局变量 errno 来指示具体的错误原因（例如 EAGAIN/EWOULDBLOCK 非阻塞套接字上无数据可读/写，ECONNREFUSED 远程主机拒绝连接，EINTR 调用被信号中断等）。

6. 相似函数，或关联函数

send / write: 用于发送数据，但不指定目标地址（通常用于已连接的套接字）。
recv / read: 用于接收数据，但不获取源地址信息（通常用于已连接的套接字）。
connect: 对于数据报套接字，connect 可以设置默认目标地址，之后就可以使用 send/write 和 recv/read 而无需指定地址。
socket / bind / sendto / recvfrom: 构成了 UDP 网络编程的基本工具集。

7. 示例代码

示例 1：UDP 客户端 (使用 sendto 和 recvfrom)

这个例子演示了一个 UDP 客户端如何使用 sendto 向服务器发送消息，并使用 recvfrom 接收服务器的回复，同时获取服务器的地址。

// udp_client.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SERVER_PORT 8081
#define SERVER_IP "127.0.0.1"
#define BUFFER_SIZE 1024

int main() {
    int sock;
    struct sockaddr_in server_addr, client_addr;
    socklen_t client_addr_len = sizeof(client_addr);
    char *message = "Hello UDP Server!";
    char buffer&#91;BUFFER_SIZE];
    ssize_t bytes_sent, bytes_received;

    // 1. 创建 UDP 套接字
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        perror("socket creation failed");
        exit(EXIT_FAILURE);
    }
    printf("UDP client socket created (fd: %d)\n", sock);

    // 2. 配置服务器地址结构
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(SERVER_PORT);
    if (inet_pton(AF_INET, SERVER_IP, &server_addr.sin_addr) <= 0) {
        fprintf(stderr, "Invalid address/ Address not supported\n");
        close(sock);
        exit(EXIT_FAILURE);
    }

    // 3. 发送数据到服务器 (使用 sendto)
    printf("Sending message to %s:%d\n", SERVER_IP, SERVER_PORT);
    bytes_sent = sendto(sock, message, strlen(message), 0,
                        (const struct sockaddr *)&server_addr, sizeof(server_addr));
    if (bytes_sent < 0) {
        perror("sendto failed");
        close(sock);
        exit(EXIT_FAILURE);
    } else {
        printf("Sent %zd bytes: %s\n", bytes_sent, message);
    }

    // 4. 接收服务器的回复 (使用 recvfrom 并获取源地址)
    printf("Waiting for reply from server...\n");
    bytes_received = recvfrom(sock, buffer, BUFFER_SIZE - 1, 0,
                              (struct sockaddr *)&client_addr, &client_addr_len);
    if (bytes_received < 0) {
        perror("recvfrom failed");
        close(sock);
        exit(EXIT_FAILURE);
    } else {
        buffer&#91;bytes_received] = '\0'; // 确保字符串结束
        printf("Received %zd bytes from server %s:%d: %s\n",
               bytes_received,
               inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port),
               buffer);
    }

    // 5. 关闭套接字
    close(sock);
    printf("UDP client socket closed.\n");

    return 0;
}

代码解释:

创建一个 AF_INET 和 SOCK_DGRAM 的 UDP 套接字。

填充 sockaddr_in 结构 server_addr，指定服务器的 IP 和端口。

关键: 调用 sendto(sock, message, …, &server_addr, sizeof(server_addr)) 将数据发送到指定的服务器地址。

关键: 调用 recvfrom(sock, buffer, …, &client_addr, &client_addr_len) 接收数据。

&client_addr 和 &client_addr_len 用于接收发送方（即服务器）的地址信息。
client_addr_len 在调用前初始化为 sizeof(client_addr)。

打印接收到的数据和服务器的地址（IP 和端口）。

关闭套接字。

示例 2：UDP 服务器 (使用 recvfrom 和 sendto)

这个例子演示了一个 UDP 服务器如何使用 recvfrom 接收来自任意客户端的消息，并使用 sendto 将回复发送回消息的发送方。

// udp_server.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PORT 8081
#define BUFFER_SIZE 1024

int main() {
    int server_fd;
    struct sockaddr_in server_addr, client_addr;
    socklen_t client_addr_len = sizeof(client_addr);
    char buffer&#91;BUFFER_SIZE];
    ssize_t bytes_received, bytes_sent;
    char reply&#91;] = "Echo: ";

    // 1. 创建 UDP 套接字
    server_fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (server_fd < 0) {
        perror("socket creation failed");
        exit(EXIT_FAILURE);
    }
    printf("UDP server socket created (fd: %d)\n", server_fd);

    // 2. 配置服务器地址结构
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = INADDR_ANY; // 绑定到所有接口
    server_addr.sin_port = htons(PORT);

    // 3. (可选) 绑定套接字到地址和端口
    // 对于 UDP 服务器，绑定是常见的做法，以便客户端知道连接到哪个端口
    if (bind(server_fd, (const struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        perror("bind failed");
        close(server_fd);
        exit(EXIT_FAILURE);
    }
    printf("UDP server bound to port %d\n", PORT);

    printf("UDP server is listening for messages...\n");

    while (1) {
        printf("Waiting for a datagram...\n");

        // 4. 接收数据报 (使用 recvfrom 并获取客户端地址)
        bytes_received = recvfrom(server_fd, buffer, BUFFER_SIZE - 1, 0,
                                  (struct sockaddr *)&client_addr, &client_addr_len);
        if (bytes_received < 0) {
            perror("recvfrom failed");
            continue; // 或 exit(EXIT_FAILURE);
        }

        buffer&#91;bytes_received] = '\0'; // 确保字符串结束
        printf("Received %zd bytes from client %s:%d: %s\n",
               bytes_received,
               inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port),
               buffer);

        // 5. 构造回复消息
        char reply_buffer&#91;BUFFER_SIZE];
        int reply_len = snprintf(reply_buffer, BUFFER_SIZE, "%s%s", reply, buffer);
        if (reply_len >= BUFFER_SIZE) {
            fprintf(stderr, "Reply message truncated.\n");
            reply_len = BUFFER_SIZE - 1;
        }

        // 6. 发送回复到客户端 (使用 sendto 和之前获取的客户端地址)
        printf("Sending reply to client %s:%d\n",
               inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port));
        bytes_sent = sendto(server_fd, reply_buffer, reply_len, 0,
                            (const struct sockaddr *)&client_addr, client_addr_len);
        if (bytes_sent < 0) {
            perror("sendto reply failed");
        } else {
            printf("Sent %zd bytes as reply.\n", bytes_sent);
        }
    }

    // close(server_fd); // 不会执行到这里
    return 0;
}

代码解释:

创建一个 UDP 套接字。

配置服务器地址 server_addr，并调用 bind() 将套接字绑定到该地址和端口。这是 UDP 服务器的标准做法。

进入一个无限循环。

关键: 调用 recvfrom(server_fd, buffer, …, &client_addr, &client_addr_len) 等待并接收数据报。

该调用会阻塞，直到有数据报到达。
client_addr 和 client_addr_len 会被自动填充为发送该数据报的客户端的地址信息。

处理接收到的数据（这里简单地打印）。

关键: 调用 sendto(server_fd, reply_buffer, …, &client_addr, client_addr_len) 将回复发送回刚才接收数据的那个客户端。地址信息直接来自上一步 recvfrom 的输出。

循环继续，处理下一个客户端的数据报。

示例 3：对比 sendto/recvfrom 与 connect + send/recv (UDP)

这个例子通过代码片段对比两种 UDP 客户端编程方式。

// 方式一：使用 sendto/recvfrom (显式地址)
void client_method_one() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in server_addr;
    // ... 配置 server_addr ...

    char *msg = "Hello";
    // 发送时必须指定地址
    sendto(sock, msg, strlen(msg), 0, (struct sockaddr*)&server_addr, sizeof(server_addr));

    char buffer&#91;1024];
    struct sockaddr_in src_addr;
    socklen_t src_len = sizeof(src_addr);
    // 接收时可以获取源地址
    recvfrom(sock, buffer, sizeof(buffer), 0, (struct sockaddr*)&src_addr, &src_len);
    close(sock);
}

// 方式二：使用 connect + send/recv (隐式地址)
void client_method_two() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in server_addr;
    // ... 配置 server_addr ...

    // 使用 connect 设置默认目标地址
    connect(sock, (struct sockaddr*)&server_addr, sizeof(server_addr));

    char *msg = "Hello";
    // 发送时无需指定地址
    send(sock, msg, strlen(msg), 0);

    char buffer&#91;1024];
    // 接收时通常不关心源地址 (因为已连接)
    recv(sock, buffer, sizeof(buffer), 0);
    close(sock);
}

代码解释:

方式一 (sendto/recvfrom):

每次发送都必须明确指定目标地址 (sendto)。
接收时可以选择性地获取源地址 (recvfrom)。
更加灵活，一个套接字可以与多个不同的目标通信。

方式二 (connect + send/recv):

通过 connect 一次性设置默认目标地址。
后续的 send/write 和 recv/read 操作就像 TCP 一样，无需指定地址。
简化了编程模型，但牺牲了灵活性（主要针对单个目标通信）。

重要提示与注意事项:

数据报边界: 对于 UDP (SOCK_DGRAM)，sendto 发送的是一个完整的数据报，recvfrom 接收的也是一个完整的数据报。这与 TCP (SOCK_STREAM) 的字节流不同。

无连接: UDP 是无连接的。服务器不需要 listen 和 accept。客户端不需要 connect（除非使用方式二）。

地址参数: sendto 的 dest_addr 和 recvfrom 的 src_addr 是它们与 send/recv 的核心区别。

错误处理: sendto 可能因为目标不可达而失败（ECONNREFUSED）。recvfrom 在没有数据时会阻塞（阻塞套接字）。

缓冲区大小: recvfrom 的 len 参数是缓冲区大小，返回值是实际收到的字节数。确保缓冲区足够大。

addrlen 初始化: 在调用 recvfrom 时，务必在之前将 *addrlen 初始化为目标缓冲区的大小。

总结:

sendto 和 recvfrom 是进行 UDP 网络编程（以及某些特殊情况下的 TCP 编程）的基础。它们提供了对数据报源地址和目标地址的直接控制，是实现无连接、不可靠但高效通信的关键工具。理解它们的参数和使用场景对于掌握网络编程至关重要。

https://www.calcguide.tech/2025/08/23/sendto系统调用及示例/