The readv System Call, with Examples

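This installment of our tour of Linux system programming covers readv and writev, the vectored (scatter/gather) counterparts of read and write. A single call can read data into several non-contiguous buffers, or write the contents of several non-contiguous buffers to a file descriptor.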

1. Function Overview

readv (read vector) and writev (write vector) are Linux system calls that implement scatter input and gather output, respectively.

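readv: reads data from the file descriptor fd and scatters it across the buffers described by the iovec array that iov points to. The kernel fills the first buffer completely before moving on to the second, and so on, until all iovcnt buffers are full or no more data is available.
writev: gathers data from the buffers described by the iovec array that iov points to and writes it to the file descriptor fd. The kernel writes all of the first buffer, then the second, and so on, until the data from all iovcnt buffers has been written or an error occurs.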

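You can think of them as "batch" versions of read and write. readv works like a funnel that pours one stream of data into several cups (buffers) in turn; writev works like glue that sticks several scraps of paper (buffers) together in order and then pastes the result onto a wall (the file).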
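
To make the fill order concrete, here is a minimal sketch that pushes a short message through a pipe and reads it back with a single readv into two buffers. The helper name scatter_from_pipe and the use of a pipe as the data source are assumptions chosen for illustration, not part of any API.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: sends msg through a pipe, then scatters it into
 * `first` (first_len bytes) and `second` (second_len bytes) with one readv.
 * Returns the total number of bytes read, or -1 on error. */
ssize_t scatter_from_pipe(const char *msg, char *first, size_t first_len,
                          char *second, size_t second_len)
{
    assert(msg != NULL && first != NULL && second != NULL);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    size_t len = strlen(msg);
    /* A short message fits in the pipe buffer, so this write does not block. */
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    struct iovec iov[2] = {
        { .iov_base = first,  .iov_len = first_len  },
        { .iov_base = second, .iov_len = second_len },
    };
    ssize_t n = readv(fds[0], iov, 2);   /* fills `first`, then `second` */
    close(fds[0]);
    return n;
}
```

Called with "ABCDEFGH", a 3-byte buffer, and a 5-byte buffer, the first buffer should receive "ABC" and the second "DEFGH". Neither buffer is NUL-terminated by readv.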

2. Function Prototypes

#include <sys/uio.h> // required: declares readv, writev, and struct iovec

// Scatter read
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

// Gather write
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

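3. What They Do

readv(fd, iov, iovcnt):
- Reads data from fd.
- Fills the iovcnt buffers described by the iov array, in order.
- Returns the total number of bytes actually read.
writev(fd, iov, iovcnt):
- Gathers data from the iovcnt buffers described by the iov array.
- Writes all of the gathered data to fd, in order.
- Returns the total number of bytes actually written.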

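The main advantage of readv/writev is fewer system calls. When the data to be read or written is spread across several buffers, plain read/write needs one call per buffer, whereas readv/writev can handle all of them in one call. That cuts the number of transitions between user mode and kernel mode and so improves efficiency.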
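
The difference can be sketched side by side. The two helpers below are hypothetical (their names and the header/body split are assumptions for illustration); both put the same bytes on the descriptor, but the first issues two system calls and the second only one. For brevity, both treat a short write as a failure.

```c
#include <assert.h>
#include <sys/uio.h>
#include <unistd.h>

/* Two system calls: one write per buffer. Returns 0 on success, -1 otherwise. */
int put_record_two_calls(int fd, const void *hdr, size_t hdr_len,
                         const void *body, size_t body_len)
{
    assert(hdr != NULL && body != NULL);
    if (write(fd, hdr, hdr_len) != (ssize_t)hdr_len)
        return -1;
    if (write(fd, body, body_len) != (ssize_t)body_len)
        return -1;
    return 0;
}

/* One system call: both buffers are gathered by a single writev. */
int put_record_one_call(int fd, const void *hdr, size_t hdr_len,
                        const void *body, size_t body_len)
{
    assert(hdr != NULL && body != NULL);
    struct iovec iov[2] = {
        /* iov_base is not const-qualified, so the const is cast away;
         * writev only reads from these buffers. */
        { .iov_base = (void *)hdr,  .iov_len = hdr_len  },
        { .iov_base = (void *)body, .iov_len = body_len },
    };
    ssize_t n = writev(fd, iov, 2);
    return n == (ssize_t)(hdr_len + body_len) ? 0 : -1;
}
```

Whether the saved call matters in practice depends on how often the code path runs; the gain is largest for many small writes.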

4. Parameters

int fd: a valid file descriptor.
const struct iovec *iov: a pointer to an array of struct iovec. The array describes where each data buffer is located and how large it is. struct iovec is defined as follows:
```
struct iovec {
    void  *iov_base; // starting address of the buffer
    size_t iov_len;  // size of the buffer, in bytes
};
```
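- iov_base: a pointer to a memory buffer.
- iov_len: the size of that buffer.
int iovcnt: the number of elements in the iov array, that is, the number of buffers. Under POSIX the valid range is generally 1 to IOV_MAX, and IOV_MAX is typically 1024 on Linux.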
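
Rather than hard-coding 1024, a program can ask for the limit at run time. The sketch below uses sysconf(_SC_IOV_MAX); the helper names max_iov_count and clamp_iov_count are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/* Returns the largest iovcnt the system accepts for readv/writev.
 * Falls back to the compile-time IOV_MAX, or to 16 (_XOPEN_IOV_MAX, the
 * minimum that POSIX allows), when sysconf cannot report the limit. */
long max_iov_count(void)
{
    long limit = sysconf(_SC_IOV_MAX);
    if (limit == -1) {
#ifdef IOV_MAX
        limit = IOV_MAX;
#else
        limit = 16;
#endif
    }
    return limit;
}

/* Clamps a requested buffer count to the range [0, max_iov_count()]. */
int clamp_iov_count(long wanted)
{
    assert(wanted >= 0);
    long limit = max_iov_count();
    return (int)(wanted < limit ? wanted : limit);
}
```

A caller with more buffers than the limit would then issue several readv/writev calls, each covering at most clamp_iov_count(remaining) entries.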

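5. Return Value

On success:
- Returns the total number of bytes transferred (read or written). This may be less than the combined size of all buffers, for example when a read reaches the end of the file or a write runs out of disk space.
On failure:
- Returns -1 and sets errno to indicate the cause (for example, EBADF when fd is invalid, EINVAL when iovcnt is invalid or exceeds the limit, EIO on an I/O error).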
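
Because a successful writev may transfer fewer bytes than requested, callers that need everything written usually loop. The following is one possible sketch of such a loop; writev_all is a hypothetical helper name, and the choice to retry on EINTR is an assumption about the desired behavior.

```c
#include <assert.h>
#include <errno.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling writev until every byte described by
 * iov[0..iovcnt-1] has been written. The iov array is modified in place.
 * Returns 0 on success, or -1 with errno set by writev. */
int writev_all(int fd, struct iovec *iov, int iovcnt)
{
    assert(iov != NULL && iovcnt >= 0);

    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;
        }

        size_t done = (size_t)n;
        /* Skip every buffer that was written completely. */
        while (iovcnt > 0 && done >= iov->iov_len) {
            done -= iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Trim the written prefix off a partially written buffer. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + done;
            iov->iov_len -= done;
        }
    }
    return 0;
}
```

Since the helper advances iov_base and shrinks iov_len as it goes, pass a copy of the array if the original descriptions are needed afterwards.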

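6. Similar and Related Functions

read, write: the basic read and write functions, which operate on a single buffer.
preadv, pwritev: extensions of readv/writev that add the behavior of pread/pwrite: they take an explicit file offset and leave the descriptor's current file position unchanged.
mmap: a different way to access file contents, through memory mapping.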
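
As a brief illustration of preadv, the sketch below scatters data from a given offset into two buffers and leaves the file position where it was. The helper names read_pair_at and make_digits_file, and the temporary file under /tmp, are assumptions for this example; the _DEFAULT_SOURCE define is what glibc expects before it exposes preadv under strict compiler modes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes preadv and mkstemp in glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: scatters data starting at `offset` into two buffers
 * without touching the descriptor's current file position. */
ssize_t read_pair_at(int fd, off_t offset, void *first, size_t first_len,
                     void *second, size_t second_len)
{
    assert(first != NULL && second != NULL);
    struct iovec iov[2] = {
        { .iov_base = first,  .iov_len = first_len  },
        { .iov_base = second, .iov_len = second_len },
    };
    return preadv(fd, iov, 2, offset);
}

/* Creates an unlinked temporary file holding "0123456789" and returns its
 * descriptor (positioned at the end of the data), or -1 on error. */
int make_digits_file(void)
{
    char path[] = "/tmp/preadv_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);   /* the file disappears once fd is closed */
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Reading at offset 4 into two 3-byte buffers should yield "456" and "789", while lseek(fd, 0, SEEK_CUR) reports the same position before and after the call.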

7. Example Code

Example 1: Gather Write with `writev`

This example uses writev to write several strings, each stored in its own buffer, to standard output in one call.

#include <sys/uio.h>  // writev, struct iovec
#include <unistd.h>   // STDOUT_FILENO
#include <stdio.h>    // perror
#include <stdlib.h>   // exit
#include <string.h>   // strlen

int main(void) {
    // Data to write
    char part1[] = "Hello, ";
    char part2[] = "this is ";
    char part3[] = "a message ";
    char part4[] = "written using writev!\n";

    // Array of iovec structures describing each buffer
    struct iovec iov[4];
    ssize_t bytes_written;

    // Fill in the iovec array
    iov[0].iov_base = part1;
    iov[0].iov_len  = strlen(part1);

    iov[1].iov_base = part2;
    iov[1].iov_len  = strlen(part2);

    iov[2].iov_base = part3;
    iov[2].iov_len  = strlen(part3);

    iov[3].iov_base = part4;
    iov[3].iov_len  = strlen(part4);

    // Gather all parts and write them to standard output (STDOUT_FILENO)
    bytes_written = writev(STDOUT_FILENO, iov, 4);

    if (bytes_written == -1) {
        perror("writev failed");
        exit(EXIT_FAILURE);
    }

    printf("writev successfully wrote %zd bytes.\n", bytes_written);

    return 0;
}

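Code walkthrough:

Four character arrays, part1 through part4, each hold a different string.
A struct iovec array iov with 4 elements is declared.
Each element of iov is filled in turn:
- iov_base points to the corresponding string buffer.
- iov_len is set to the length of the corresponding string (using strlen, so the terminating \0 is not written).
writev(STDOUT_FILENO, iov, 4) is called.
- STDOUT_FILENO: the file descriptor, here standard output.
- iov: pointer to the iovec array.
- 4: the number of elements in the iovec array.
The return value of writev is checked. On success it is the total number of bytes written, which in this example should equal the sum of the four string lengths.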
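
The expected total can be computed from the iovec array itself instead of being worked out by hand. A minimal sketch, assuming a helper named iov_total_len (not a library function):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Sums the iov_len fields of an iovec array: the byte count that a complete
 * readv/writev over that array would transfer. */
size_t iov_total_len(const struct iovec *iov, int iovcnt)
{
    assert(iov != NULL || iovcnt == 0);
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;
    return total;
}
```

In Example 1, comparing bytes_written against iov_total_len(iov, 4) would reveal a partial write.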

Example 2: Scatter Read with `readv`

This example uses readv to read data from a file and scatter it across several buffers.

#include <sys/uio.h>  // readv, struct iovec
#include <fcntl.h>    // open
#include <unistd.h>   // write, close
#include <stdio.h>    // perror, printf
#include <stdlib.h>   // exit
#include <string.h>   // memset, strlen

int main(void) {
    int fd;
    const char *filename = "source_data.txt";
    const char *source_content = "This is a sample text file with enough content to demonstrate readv.\n"
                                 "It has multiple lines and words to fill the buffers.\n"
                                 "End of sample data.\n";
    struct iovec iov[3];
    char buf1[20], buf2[30], buf3[50]; // three buffers of different sizes
    ssize_t bytes_read;
    ssize_t total_len;

    // 1. Create the source file and write the sample content
    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open file for writing");
        exit(EXIT_FAILURE);
    }
    total_len = write(fd, source_content, strlen(source_content));
    if (total_len == -1) {
        perror("write source content");
        close(fd);
        exit(EXIT_FAILURE);
    }
    printf("Created source file '%s' with %zd bytes.\n", filename, total_len);
    close(fd);

    // 2. Reopen the file for reading
    fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror("open file for reading");
        exit(EXIT_FAILURE);
    }

    // 3. Initialize the buffers (optional; makes any unfilled portion visible)
    memset(buf1, '.', sizeof(buf1) - 1);
    buf1[sizeof(buf1) - 1] = '\0';
    memset(buf2, '.', sizeof(buf2) - 1);
    buf2[sizeof(buf2) - 1] = '\0';
    memset(buf3, '.', sizeof(buf3) - 1);
    buf3[sizeof(buf3) - 1] = '\0';

    // 4. Set up the iovec array
    iov[0].iov_base = buf1;
    iov[0].iov_len  = sizeof(buf1) - 1; // leave one byte for the terminating \0

    iov[1].iov_base = buf2;
    iov[1].iov_len  = sizeof(buf2) - 1;

    iov[2].iov_base = buf3;
    iov[2].iov_len  = sizeof(buf3) - 1;


    // 5. Call readv to perform the scatter read
    bytes_read = readv(fd, iov, 3);

    if (bytes_read == -1) {
        perror("readv failed");
        close(fd);
        exit(EXIT_FAILURE);
    }

    printf("\nreadv successfully read %zd bytes.\n", bytes_read);

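    // 6. Terminate each buffer with \0 (readv does not add one).
    // This assumes all three buffers were filled completely, which holds here
    // because the file is longer than their combined capacity (97 bytes).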
    buf1[iov[0].iov_len] = '\0';
    buf2[iov[1].iov_len] = '\0';
    buf3[iov[2].iov_len] = '\0';

    // 7. Print what was read
    printf("\n--- Content of buffers after readv ---\n");
    printf("Buffer 1 (%zu bytes max): '%s'\n", sizeof(buf1) - 1, buf1);
    printf("Buffer 2 (%zu bytes max): '%s'\n", sizeof(buf2) - 1, buf2);
    printf("Buffer 3 (%zu bytes max): '%s'\n", sizeof(buf3) - 1, buf3);

    // 8. Close the file
    if (close(fd) == -1) {
        perror("close");
        exit(EXIT_FAILURE);
    }

    return 0;
}

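Code walkthrough:

The program first creates a file named source_data.txt and writes some sample content to it.
It reopens the file in read-only mode.
Three character buffers of different sizes are defined: buf1, buf2, and buf3.
memset fills the buffers with dots (.). This is not required; it only makes it easier to see what readv filled in.
A struct iovec array iov is declared and filled so that it points to the three buffers and gives their sizes, leaving one byte in each for \0.
readv(fd, iov, 3) reads data from the file.
The return value is checked. On success it is the total number of bytes read.
A terminating \0 is added to each buffer by hand, because readv reads raw bytes and never adds one.
The contents of each buffer are printed, showing how the data was scattered across the buffers.
The file is closed.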
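
Example 2 places each \0 at iov_len, which is only right when every buffer is filled completely. If the file were shorter, the terminators would need to follow the bytes actually read. One way to do that is sketched below; terminate_filled is a hypothetical helper, and it assumes each buffer has one spare byte beyond iov_len, as in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper: given the byte count returned by readv, writes a '\0'
 * right after the last byte stored in each buffer. Every buffer must have
 * one spare byte beyond iov_len. */
void terminate_filled(const struct iovec *iov, int iovcnt, size_t nread)
{
    assert(iov != NULL && iovcnt >= 0);
    for (int i = 0; i < iovcnt; i++) {
        /* Bytes that landed in buffer i: all of it, the remainder, or none. */
        size_t used = nread < iov[i].iov_len ? nread : iov[i].iov_len;
        ((char *)iov[i].iov_base)[used] = '\0';
        nread -= used;
    }
}
```

In Example 2 this would be called as terminate_filled(iov, 3, (size_t)bytes_read) after the error check, in place of the three fixed assignments.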

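Example 3: Building an HTTP Response with `writev` in Network Code

This conceptual example shows a typical use of writev in a network server: sending the response header and the response body together.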

#include <sys/uio.h>  // writev, struct iovec
#include <sys/socket.h> // send, sendmsg (for reference only; not called here)
#include <stdio.h>    // printf, fprintf, perror
#include <string.h>   // strlen
// #include <unistd.h> // not needed here: writev is declared in <sys/uio.h>

// Assume this data is generated dynamically
const char *http_header = "HTTP/1.1 200 OK\r\n"
                          "Content-Type: text/plain\r\n"
                          "Connection: close\r\n"
                          "\r\n"; // a blank line separates the header from the body

const char *http_body = "Hello, World! This is the response body.\n";

// Sends an HTTP response (conceptual sketch)
int send_http_response(int client_socket_fd /* , const char *header, const char *body */) {
    struct iovec iov[2];
    ssize_t total_bytes, bytes_sent;

    // Set up the iovec array
    iov[0].iov_base = (void*)http_header; // cast drops const: iov_base is a plain void*
    iov[0].iov_len  = strlen(http_header);

    iov[1].iov_base = (void*)http_body;
    iov[1].iov_len  = strlen(http_body);

    // Send with writev (it works on socket descriptors just as on files).
    // Real network code often uses sendmsg, or loops over send/write, instead;
    // writev is used here to illustrate gather output.
    bytes_sent = writev(client_socket_fd, iov, 2);

    if (bytes_sent == -1) {
        perror("writev failed in send_http_response");
        return -1; // Indicate failure
    }

    total_bytes = iov[0].iov_len + iov[1].iov_len;
    printf("Sent HTTP response using writev: %zd of %zd bytes.\n", bytes_sent, total_bytes);

    if (bytes_sent != total_bytes) {
        fprintf(stderr, "Warning: Incomplete writev. Sent %zd, Expected %zd.\n", bytes_sent, total_bytes);
        // A real application would need to handle a partial send
        return -1;
    }

    return 0; // Success
}

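// Main function (conceptual)
int main(void) {
    int simulated_client_socket_fd = 1; // stand-in for a valid socket fd: 1 is stdout

    printf("--- Simulating sending HTTP response ---\n");
    fflush(stdout); // writev bypasses stdio buffering, so flush first to keep the output in order
    // Send the response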
    if (send_http_response(simulated_client_socket_fd) == -1) {
        fprintf(stderr, "Failed to send HTTP response.\n");
        return 1;
    }

    printf("--- HTTP response sent ---\n");
    return 0;
}

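Code walkthrough (conceptual):

The HTTP response header http_header and the response body http_body are defined.
send_http_response takes a simulated client socket file descriptor.
It creates a two-element iovec array, iov.
The first element points to http_header and records its length.
The second element points to http_body and records its length.
writev(client_socket_fd, iov, 2) sends the header and the body as one unit.
This is more efficient than two separate write calls (one for the header, one for the body) because it makes fewer system calls.
Note: in real network code, calling writev directly on a socket works, but the sendmsg system call is more common because it offers richer control, such as sending ancillary data. writev is used here to highlight the core idea of gather output.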
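
For comparison, the same gather send can be expressed with sendmsg, which takes the iovec array through a struct msghdr. This is a minimal sketch: send_two_parts is a hypothetical helper name, and passing MSG_NOSIGNAL is an assumption about how the caller wants a closed connection reported.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: sends header and body with one sendmsg call.
 * Returns the number of bytes sent, or -1 with errno set. */
ssize_t send_two_parts(int sock, const char *header, const char *body)
{
    assert(header != NULL && body != NULL);

    struct iovec iov[2] = {
        /* sendmsg only reads these buffers, so casting away const is safe. */
        { .iov_base = (void *)header, .iov_len = strlen(header) },
        { .iov_base = (void *)body,   .iov_len = strlen(body)   },
    };

    struct msghdr msg;
    memset(&msg, 0, sizeof msg);   /* no destination address, no ancillary data */
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    /* MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE on a closed peer. */
    return sendmsg(sock, &msg, MSG_NOSIGNAL);
}
```

As with writev, the return value can be smaller than the combined length of the two parts, so production code would still need to handle a partial send.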

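Summary:

readv and writev are efficient I/O functions, particularly well suited to data that is spread across several buffers. They improve performance by reducing the number of system calls. The key to using them is understanding the struct iovec array and how it drives scatter reads and gather writes.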