System I/O

Basic system I/O interfaces

Basic API functions

1. Opening a file: open

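Common flag combinations (quick reference)	Description
`O_RDONLY | O_CREAT`	Read-only; creates the file if it does not exist, in which case `mode` must be supplied
`O_WRONLY | O_CREAT | O_TRUNC`	Create or overwrite for writing: the classic "truncate, then write"
`O_RDWR | O_CREAT | O_APPEND`	Read/write with appending writes; typical for log files
`O_SYNC`	Every write reaches the storage device before returning; slow, but the safest for data
`O_DSYNC`	Only the data (plus the metadata needed to read it back) is flushed; other metadata may lag behind
`O_DIRECT`	Bypasses the page cache; requires aligned buffers; common in databases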

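Header file:
    #include <fcntl.h>
Prototype:
    int open(const char *pathname, int flags, ... /* mode_t mode */ );
Parameters:
    pathname --> path and name of the file to open
    flags --> options for opening the file (several can be OR-ed together)
        Required (exactly one of):
            O_RDONLY   read only
            O_WRONLY   write only
            O_RDWR     read and write
        Optional:
            O_APPEND   append mode: every write goes to the end of the file
            O_TRUNC    truncate the file to length 0 (needs write access)
            O_ASYNC    enable signal-driven I/O
            O_CREAT    create the file if it does not exist (the third argument, mode, is then required)
            O_NONBLOCK open in non-blocking mode
    mode --> initial permissions of a newly created file (variadic argument; the process umask is applied to it)
        0654: the leading 0 marks an octal constant
            6 is the owner's permission: 6 = 4 (read) + 2 (write)
            5 is the group's permission: 5 = 4 (read) + 1 (execute)
            4 is the permission for others: 4 = read

Return value:
    Success: a file descriptor (fd), the number that identifies the file within this process [in essence an array index]
    Failure: -1, with errno set (perror prints the specific reason)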

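Notes:

When open creates a file, the third argument mode must be supplied to give the file's initial permissions. For safety, however, the process umask usually defaults to 0022 or 0002, where 2 is the write bit, so the system strips write permission for others (0002) or for both group and others (0022) from files a program creates.
Actual permissions = requested mode & ~umask
0666 & ~0022 = 0644
To keep the system from interfering, set the umask to 0 in the shell before running the program (or call umask(0) inside the program):

umask 0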
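
The masking rule above can be checked from inside a program. The following is a minimal sketch, assuming a POSIX system and a writable current directory; the helper name created_mode and the temporary file name used to exercise it are made up for illustration. It creates a file under a chosen umask and reads back the permission bits the file actually received.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with `requested` permissions while the
 * process umask is `mask`, and return the permission bits the new file
 * actually received, or -1 on error. The file is removed again afterwards. */
int created_mode(const char *path, mode_t requested, mode_t mask)
{
    struct stat st;
    int result = -1;

    unlink(path);                      /* make sure open() really creates it */
    mode_t old_mask = umask(mask);     /* umask() returns the previous mask */

    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd != -1) {
        if (fstat(fd, &st) == 0)
            result = (int)(st.st_mode & 0777);
        close(fd);
        unlink(path);
    }

    umask(old_mask);                   /* restore the caller's mask */
    return result;
}
```

Calling created_mode with a requested mode of 0666 and a mask of 0022 should report 0644, matching the calculation above; with a mask of 0 the requested mode comes back unchanged.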

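Opening a file with O_APPEND means that writes are appended to the end of the file, but the initial file offset is still 0, so a read issued right after open starts from the beginning of the file.
- Every write first moves the file offset to the end of the file, so after any write the shared read/write offset sits at the end of the file.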
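
The O_APPEND behavior described above can be observed with lseek. This is a small sketch under the assumption of a POSIX system; append_offsets and its temporary file are hypothetical names used only for the demonstration. It reports the offset right after open and again after one write.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo helper. Creates `path` holding "abc", reopens it with
 * O_APPEND, and reports the file offset right after open() and right after
 * a 2-byte write. Returns 0 on success, -1 on any error. */
int append_offsets(const char *path, off_t *after_open, off_t *after_write)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    if (write(fd, "abc", 3) != 3) {
        close(fd);
        return -1;
    }
    close(fd);

    fd = open(path, O_RDWR | O_APPEND);
    if (fd == -1)
        return -1;

    *after_open = lseek(fd, 0, SEEK_CUR);    /* offset starts at 0 */

    int ok = (write(fd, "XY", 2) == 2);      /* lands after "abc", not at 0 */
    *after_write = lseek(fd, 0, SEEK_CUR);   /* offset is now the file size */

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

The first offset is 0 even though the file already holds three bytes; after the write the offset is 5, the new end of the file.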

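2. Reading a file: read

Header file:
    #include <unistd.h>
Prototype:
    ssize_t read(int fd, void *buf, size_t count);
Parameters:
    fd --> the file descriptor to read from
    buf --> user buffer that receives the data read from the file
    count --> size of the user buffer, i.e. the maximum number of bytes to read (prevents overrunning the buffer)
Return value:
    Success: the number of bytes actually read (may be less than count); 0 means end of file
    Failure: -1, with errno set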
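
Because read may return fewer bytes than requested, callers that need a fixed amount of data usually loop. The following is one possible sketch, not part of the original API; read_full is a hypothetical helper name, and retrying on EINTR is a common convention assumed here.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `count` bytes have arrived,
 * end of file is reached, or a real error occurs. Returns the number of bytes
 * stored in `buf` (less than `count` only at end of file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n == 0)
            break;                  /* end of file */
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A return value smaller than count then reliably means that the end of the file (or the closed end of a pipe) was reached.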

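3. Writing a file: write

Header file:
    #include <unistd.h>
Prototype:
    ssize_t write(int fd, const void *buf, size_t count);
Parameters:
    fd --> the file descriptor to write to
    buf --> user buffer that holds the data to be written to the file
    count --> number of bytes to write (at most the size of the user buffer)
Return value:
    Success: the number of bytes actually written (may be less than count)
    Failure: -1, with errno set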
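
Since write may also transfer fewer bytes than requested (for example on pipes and sockets), a loop is a common companion to it. A minimal sketch, assuming blocking descriptors; write_all is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly `count` bytes from `buf`, resuming after
 * partial writes and signal interruptions. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        p += n;                     /* skip what has already been written */
        count -= (size_t)n;
    }
    return 0;
}
```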

4. Closing a file: close

Header file:
    #include <unistd.h>
Prototype:
    int close(int fd);
Parameters:
    fd --> the file descriptor to close
Return value:
    Success: 0
    Failure: -1, with errno set

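5. Setting the offset: lseek

Header file:
    #include <unistd.h>
Prototype:
    off_t lseek(int fd, off_t offset, int whence);
Parameters:
    fd --> descriptor of the file whose offset is to be changed
    offset --> the offset, in bytes
    whence --> how the offset is interpreted
        SEEK_SET set the read/write position to offset, counted from the start of the file (position 0)
        SEEK_CUR move forward or backward relative to the current position (offset may be positive or negative)
        SEEK_END position relative to the end of the file; offset is usually negative
Return value:
    Success: the resulting position, measured from the beginning of the file
    Failure: (off_t) -1, with errno set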

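Examples:

lseek( fd , 100 , SEEK_SET ); // set the read/write position to 100
lseek( fd , -100 , SEEK_CUR ); // move 100 bytes back from the current position
lseek( fd , -100 , SEEK_END ); // position 100 bytes before the end of the file

// Seeking beyond the existing content creates a hole once data is written there
// Used, for example, to let several threads work on different regions of a large file
lseek( fd , 10240 , SEEK_SET );

// Get the current read/write position
off_t curr = lseek( fd , 0 , SEEK_CUR );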
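
A frequent use of SEEK_END is finding a file's size. The sketch below assumes a regular, seekable file; fd_size and path_size are hypothetical helper names. The first variant restores the caller's position so that later reads and writes are not affected.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: size of the file behind `fd`, leaving the current
 * read/write position unchanged. Returns -1 if the descriptor cannot seek. */
off_t fd_size(int fd)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);    /* remember where we are */
    if (saved == (off_t)-1)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);     /* offset of the end == size */

    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
        return -1;
    return size;
}

/* Hypothetical helper: size of the file at `path`, or -1 on error. */
off_t path_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

On descriptors that cannot seek, such as pipes, lseek fails with ESPIPE and both helpers return -1.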

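6. What a file descriptor really is

The return value of open() is an int. This integer is actually an index into a kernel array called fd_array:

When a file is opened, the kernel creates a pointer to a struct file and stores it in the fd_array[] array inside the process's struct files_struct. The index of that pointer in the array is what open() returns to the user, and this index is what is called a file descriptor.

Conclusions:

File descriptors start at 0, and every successful open produces a new descriptor (the lowest number not currently in use).
The same file can be opened more than once; each open makes the kernel create a new set of structures and returns a different descriptor.
Because every process normally starts with the keyboard opened once and the screen opened twice, descriptors 0, 1 and 2 stand for standard input (stdin), standard output (stdout) and standard error (stderr): three files backed by two hardware devices.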

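#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char const *argv[])
{
    // Open the same (existing) file twice; each open returns its own descriptor
    int fd_1 = open ("abc.txt" , O_RDWR);
    int fd_2 = open ("abc.txt" , O_RDWR);
    if (fd_1 == -1 || fd_2 == -1)
    {
        perror("open abc.txt");
        return -1;
    }

    printf("fd1:%d fd2:%d\n" , fd_1 , fd_2) ;

    // Write to the file through the first descriptor
    write( fd_1 , "Hello" , 5 );

    char msg [32] = {0};
    // Read through the second descriptor; leave room for the terminating '\0'
    read(fd_2 , msg , sizeof(msg) - 1 );
    printf("msg:%s\n" , msg );

    // The two descriptors have independent file offsets: the write moved only
    // fd_1's offset, so fd_2 still reads from the start of the file

    close(fd_1);
    close(fd_2);
    return 0;
}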
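
The conclusions above also imply that descriptor numbers are recycled: open hands out the lowest number that is free at that moment. A minimal sketch of this, assuming a single-threaded process on a POSIX system; closed_fd_is_reused and its scratch file are hypothetical names.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo helper: opens `path` twice, closes the first descriptor,
 * then opens the file again. Returns 1 if the third open reused the number
 * that was just freed, 0 if it did not, and -1 on error. */
int closed_fd_is_reused(const char *path)
{
    int first = open(path, O_RDWR | O_CREAT, 0644);
    if (first == -1)
        return -1;

    int second = open(path, O_RDWR);
    if (second == -1) {
        close(first);
        unlink(path);
        return -1;
    }

    close(first);                       /* frees the lower of the two numbers */
    int third = open(path, O_RDWR);     /* lowest unused number is `first` */
    int reused = (third == first);

    if (third != -1)
        close(third);
    close(second);
    unlink(path);
    return third == -1 ? -1 : reused;
}
```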

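7. Duplicating file descriptors: dup and dup2

dup is short for the English word "duplicate".
The two functions are similar: both "duplicate" a file descriptor. Both are declared in <unistd.h>, as int dup(int oldfd); and int dup2(int oldfd, int newfd);

dup() makes a copy of the given descriptor oldfd and returns the lowest-numbered descriptor currently unused by the process. The old and new descriptors are interchangeable because they refer to the same open file, so they share the file offset and the file status flags. For example, changing the offset with lseek() on the new descriptor also affects oldfd, and after read() consumes part of the file through the new descriptor, a read through the old one continues with the rest.
dup2() is almost identical to dup(); the difference is that it lets the caller choose the number of the new descriptor instead of taking the lowest unused one. Passing a descriptor number that is already in use therefore redirects that descriptor to another file.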

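#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    // Open a.txt (it must already exist) and obtain its descriptor fd1.
    // fd1 now stands for the file and the kernel resources behind it
    int fd1 = open("a.txt", O_RDWR);
    if (fd1 == -1)
    {
        perror("open a.txt");
        return -1;
    }

    // Duplicate fd1; the copy gets the lowest unused descriptor number
    int fd2 = dup(fd1);

    // Duplicate fd1 again, this time onto descriptor 100
    int fd3 = dup2(fd1, 100);

    printf("fd1:%d fd2:%d fd3:%d\n", fd1, fd2, fd3);
    return 0;
}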

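Explanation:
With dup, the lowest unused descriptor is assigned automatically. In the code above, the process starts with 0, 1 and 2 open as standard input, output and error, so fd1 is 3 and the descriptor returned by dup is 4. dup2 can target any descriptor number; if that number already refers to an open file, that file is closed first and the number is then made to refer to the new one. This is called "redirection".

Example: a rudimentary log file
dup2 makes a user-specified file take over the role of standard output, so every printf after the replacement writes into that file.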
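
That a duplicated descriptor shares its offset with the original can be checked directly. The following is a sketch under the assumption of a POSIX system; byte_seen_by_dup and its scratch file are hypothetical names. It reads three bytes through the original descriptor and then one more through the copy.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo helper: writes "abcdef" to `path`, reads 3 bytes through
 * the original descriptor, then 1 byte through a dup() copy. Returns the byte
 * seen through the copy, or -1 on error. */
int byte_seen_by_dup(const char *path)
{
    int result = -1;
    char buf[3], next;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    int copy = dup(fd);
    if (copy != -1) {
        if (write(fd, "abcdef", 6) == 6 &&
            lseek(copy, 0, SEEK_SET) == 0 &&   /* rewinds fd as well */
            read(fd, buf, 3) == 3 &&           /* consumes "abc" */
            read(copy, &next, 1) == 1)         /* continues at 'd' */
            result = (unsigned char)next;
        close(copy);
    }

    close(fd);
    unlink(path);
    return result;
}
```

The byte read through the copy is 'd' rather than 'a', which is the shared-offset behavior described above; two independent open calls on the same file would each start at 'a'.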

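#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char const *argv[])
{

    printf("__%d__\n" , __LINE__ ) ;

    // abc.txt must already exist, because O_CREAT is not given
    int fd = open("abc.txt" , O_RDWR | O_APPEND);
    if (fd < 0)
    {
        perror("open file error") ;
        return -1 ;
    }

    printf("__%d__\n" , __LINE__ ) ;

    // Flush pending output so that it still reaches the original stdout
    fflush(stdout);

    // Descriptor 1 is already in use as standard output
    // dup2 first silently closes descriptor 1 (standard output)
    // and then makes descriptor 1 a copy of fd
    dup2( fd , 1 ) ;

    // From here on, every printf is silently redirected into abc.txt
    // instead of standard output, much like a log file
    printf("__%d__\n" , __LINE__ ) ;

    
    close(fd);
    return 0;
}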
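
The redirection above is permanent for the rest of the process. When only part of the output should go to the file, one common approach is to save the original standard output with dup first and put it back afterwards. A minimal sketch of that idea; print_to_file is a hypothetical helper name, and the explicit fflush calls assume that stdout may be buffered.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print `text` with printf while standard output is
 * temporarily redirected to `path`, then restore the original standard
 * output. Returns 0 on success, -1 on error. */
int print_to_file(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    fflush(stdout);                      /* deliver pending output first */
    int saved = dup(STDOUT_FILENO);      /* keep the original stdout alive */
    if (saved == -1 || dup2(fd, STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fd);
        return -1;
    }
    close(fd);                           /* descriptor 1 now refers to the file */

    printf("%s", text);
    fflush(stdout);                      /* push the text into the file */

    dup2(saved, STDOUT_FILENO);          /* descriptor 1 is the old stdout again */
    close(saved);
    return 0;
}
```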

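8. File control: fcntl

The name is short for "file control". As the name suggests, the function "controls" a file; as with ioctl, "control" is meant broadly, and what actually happens is decided by the second argument, the command. Its prototype, declared in <fcntl.h>, is int fcntl(int fd, int cmd, ... /* arg */ );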

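Key points:

fcntl is a variadic function: the first two parameters are fixed, and the number and type of the remaining arguments depend on the value of cmd.
The second parameter, cmd, is called the command.
There are many commands; commonly used ones include F_DUPFD (duplicate a descriptor), F_GETFD / F_SETFD (get or set descriptor flags such as FD_CLOEXEC), F_GETFL / F_SETFL (get or set file status flags such as O_NONBLOCK) and F_GETLK / F_SETLK / F_SETLKW (record locks).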

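Example (setting non-blocking mode):

// Set non-blocking mode
// Get the current status flags of standard input (descriptor 0)
int stat = fcntl( 0 , F_GETFL );

// Add the non-blocking option
stat |= O_NONBLOCK ;

// Write the flags, now including O_NONBLOCK, back
fcntl( 0 , F_SETFL , stat );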
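
The same get, modify, set sequence can be wrapped in a function with error checking. This is a sketch rather than a canonical implementation; set_nonblocking and read_would_block are hypothetical helper names. The second helper shows what a non-blocking read reports when no data is available.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the status flags of `fd`, keeping
 * all other flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to read one byte from a non-blocking descriptor.
 * Returns 1 if the read would have blocked, 0 if it returned data or end of
 * file (the byte, if any, is consumed), and -1 on any other error. */
int read_would_block(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);

    if (n >= 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

On an empty pipe, for instance, a blocking read would wait for data, whereas after set_nonblocking the read fails immediately with EAGAIN.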

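9. Error codes

The error code is the variable errno, declared in <errno.h>. It behaves like a global variable (on modern systems each thread has its own copy). Most system calls store a value in it when they fail, so a program can inspect it after a failed call to learn what went wrong. Its value is only meaningful right after a call has reported failure.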

#include <stdio.h>
#include <errno.h>
#include <string.h>


int main(int argc, char const *argv[])
{

    // strerror returns the message that corresponds to the given error number.
    // A separate counter is used because library calls such as printf may change errno
    for (int i = 0; i < 135 ; i ++)
    {
        printf("【%d】:%s\n" , i , strerror(i) );
    }
    

    return 0;
}

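All known errors, as printed by the program above (the numbers and messages are specific to Linux and glibc):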

【0】:Success
【1】:Operation not permitted
【2】:No such file or directory
【3】:No such process
【4】:Interrupted system call
【5】:Input/output error
【6】:No such device or address
【7】:Argument list too long
【8】:Exec format error
【9】:Bad file descriptor
【10】:No child processes
【11】:Resource temporarily unavailable
【12】:Cannot allocate memory
【13】:Permission denied
【14】:Bad address
【15】:Block device required
【16】:Device or resource busy
【17】:File exists
【18】:Invalid cross-device link
【19】:No such device
【20】:Not a directory
【21】:Is a directory
【22】:Invalid argument
【23】:Too many open files in system
【24】:Too many open files
【25】:Inappropriate ioctl for device
【26】:Text file busy
【27】:File too large
【28】:No space left on device
【29】:Illegal seek
【30】:Read-only file system
【31】:Too many links
【32】:Broken pipe
【33】:Numerical argument out of domain
【34】:Numerical result out of range
【35】:Resource deadlock avoided
【36】:File name too long
【37】:No locks available
【38】:Function not implemented
【39】:Directory not empty
【40】:Too many levels of symbolic links
【41】:Unknown error 41
【42】:No message of desired type
【43】:Identifier removed
【44】:Channel number out of range
【45】:Level 2 not synchronized
【46】:Level 3 halted
【47】:Level 3 reset
【48】:Link number out of range
【49】:Protocol driver not attached
【50】:No CSI structure available
【51】:Level 2 halted
【52】:Invalid exchange
【53】:Invalid request descriptor
【54】:Exchange full
【55】:No anode
【56】:Invalid request code
【57】:Invalid slot
【58】:Unknown error 58
【59】:Bad font file format
【60】:Device not a stream
【61】:No data available
【62】:Timer expired
【63】:Out of streams resources
【64】:Machine is not on the network
【65】:Package not installed
【66】:Object is remote
【67】:Link has been severed
【68】:Advertise error
【69】:Srmount error
【70】:Communication error on send
【71】:Protocol error
【72】:Multihop attempted
【73】:RFS specific error
【74】:Bad message
【75】:Value too large for defined data type
【76】:Name not unique on network
【77】:File descriptor in bad state
【78】:Remote address changed
【79】:Can not access a needed shared library
【80】:Accessing a corrupted shared library
【81】:.lib section in a.out corrupted
【82】:Attempting to link in too many shared libraries
【83】:Cannot exec a shared library directly
【84】:Invalid or incomplete multibyte or wide character
【85】:Interrupted system call should be restarted
【86】:Streams pipe error
【87】:Too many users
【88】:Socket operation on non-socket
【89】:Destination address required
【90】:Message too long
【91】:Protocol wrong type for socket
【92】:Protocol not available
【93】:Protocol not supported
【94】:Socket type not supported
【95】:Operation not supported
【96】:Protocol family not supported
【97】:Address family not supported by protocol
【98】:Address already in use
【99】:Cannot assign requested address
【100】:Network is down
【101】:Network is unreachable
【102】:Network dropped connection on reset
【103】:Software caused connection abort
【104】:Connection reset by peer
【105】:No buffer space available
【106】:Transport endpoint is already connected
【107】:Transport endpoint is not connected
【108】:Cannot send after transport endpoint shutdown
【109】:Too many references: cannot splice
【110】:Connection timed out
【111】:Connection refused
【112】:Host is down
【113】:No route to host
【114】:Operation already in progress
【115】:Operation now in progress
【116】:Stale file handle
【117】:Structure needs cleaning
【118】:Not a XENIX named type file
【119】:No XENIX semaphores available
【120】:Is a named type file
【121】:Remote I/O error
【122】:Disk quota exceeded
【123】:No medium found
【124】:Wrong medium type
【125】:Operation canceled
【126】:Required key not available
【127】:Key has expired
【128】:Key has been revoked
【129】:Key was rejected by service
【130】:Owner died
【131】:State not recoverable
【132】:Operation not possible due to RF-kill
【133】:Memory page has hardware error
【134】:Unknown error 134
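
In real code errno is usually examined right after a call fails, and compared against the symbolic names from <errno.h> rather than the raw numbers listed above. A minimal sketch; try_open is a hypothetical helper name, and the message format is only an example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open `path` read-only. On failure it reports the
 * reason on stderr, stores the error number in *err_out and returns -1;
 * on success it stores 0 and returns the new descriptor. */
int try_open(const char *path, int *err_out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;   /* save at once: later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        *err_out = saved;
        return -1;
    }

    *err_out = 0;
    return fd;
}
```

For a path that does not exist, the stored value is ENOENT, which corresponds to entry 2, "No such file or directory", in the list above.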

10. Example code

// Example code for opening, reading, writing and seeking in files

#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

// Example 1: open a file read-only
int example1() {
    int fd = open("test.txt", O_RDONLY);
    if (fd == -1) {
        perror("failed to open file");
        return -1;
    }
    printf("file opened, file descriptor: %d\n", fd);
    close(fd);
    return 0;
}

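// Example 2: open a file for reading and writing, creating it if it does not exist
int example2() {
    int fd = open("test.txt", O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror("failed to open/create file");
        return -1;
    }
    printf("file opened/created, file descriptor: %d\n", fd);
    
    // Write some data
    const char *data = "Hello, World!";
    if (write(fd, data, strlen(data)) == -1) {
        perror("failed to write file");
    }
    
    close(fd);
    return 0;
}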

// Example 3: open a file in append mode
int example3() {
    int fd = open("test.txt", O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd == -1) {
        perror("failed to open file");
        return -1;
    }
    printf("file opened in append mode, file descriptor: %d\n", fd);
    
    // Append data
    const char *append_data = "\nAppended text";
    if (write(fd, append_data, strlen(append_data)) == -1) {
        perror("failed to append to file");
    }
    
    close(fd);
    return 0;
}

// Example 4: create a new file, failing if it already exists
int example4() {
    int fd = open("newfile.txt", O_RDWR | O_CREAT | O_EXCL, 0644);
    if (fd == -1) {
        if (errno == EEXIST) {
            printf("file already exists, not created\n");
        } else {
            perror("failed to create file");
        }
        return -1;
    }
    printf("new file created, file descriptor: %d\n", fd);
    
    close(fd);
    return 0;
}

// Example 5: read the contents of a file
int example5() {
    int fd = open("test.txt", O_RDONLY);
    if (fd == -1) {
        perror("failed to open file");
        return -1;
    }
    
    char buffer[1024];
    ssize_t bytes_read = read(fd, buffer, sizeof(buffer) - 1);
    if (bytes_read == -1) {
        perror("failed to read file");
        close(fd);
        return -1;
    }
    
    buffer[bytes_read] = '\0';  // add the string terminator
    printf("content read:\n%s\n", buffer);
    printf("read %zd bytes\n", bytes_read);
    
    close(fd);
    return 0;
}

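// Example 6: set the file offset with lseek
int example6() {
    int fd = open("test.txt", O_RDWR);
    if (fd == -1) {
        perror("failed to open file");
        return -1;
    }
    
    // Get the current offset (off_t is cast to long for printing)
    off_t current_pos = lseek(fd, 0, SEEK_CUR);
    printf("current file position: %ld\n", (long)current_pos);
    
    // Move to the beginning of the file
    lseek(fd, 0, SEEK_SET);
    
    // Read the first 10 bytes; terminate after what was actually read
    char buffer[11];
    ssize_t n = read(fd, buffer, 10);
    buffer[n > 0 ? n : 0] = '\0';
    printf("first 10 bytes: %s\n", buffer);
    
    // Move to offset 5
    lseek(fd, 5, SEEK_SET);
    current_pos = lseek(fd, 0, SEEK_CUR);
    printf("position after moving to offset 5: %ld\n", (long)current_pos);
    
    // Move 3 bytes forward from the current position
    lseek(fd, 3, SEEK_CUR);
    current_pos = lseek(fd, 0, SEEK_CUR);
    printf("position after moving 3 bytes forward: %ld\n", (long)current_pos);
    
    // Move to 5 bytes before the end of the file
    lseek(fd, -5, SEEK_END);
    current_pos = lseek(fd, 0, SEEK_CUR);
    printf("position 5 bytes before the end: %ld\n", (long)current_pos);
    
    close(fd);
    return 0;
}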

// Example 7: create a file with a hole
int example7() {
    int fd = open("sparse.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("failed to create file");
        return -1;
    }
    
    // Write data at the beginning of the file
    write(fd, "Start", 5);
    
    // Seek to offset 10240, leaving a hole after "Start"
    lseek(fd, 10240, SEEK_SET);
    
    // Write data after the hole
    write(fd, "End", 3);
    
    printf("created a file with a hole; its size should be 10243 bytes\n");
    
    close(fd);
    return 0;
}

int main() {
    printf("=== Example 1: open a file read-only ===\n");
    example1();
    
    printf("\n=== Example 2: open/create a file for reading and writing ===\n");
    example2();
    
    printf("\n=== Example 3: open a file in append mode ===\n");
    example3();
    
    printf("\n=== Example 4: create a new file (fail if it exists) ===\n");
    example4();
    
    printf("\n=== Example 5: read the contents of a file ===\n");
    example5();
    
    printf("\n=== Example 6: set the file offset with lseek ===\n");
    example6();
    
    printf("\n=== Example 7: create a file with a hole ===\n");
    example7();
    
    return 0;
}

Posted 2025-11-05 08:25 by 林明杰


Jaklin