Files
A file is one of the core abstractions in an OS. It turns interactions between processes and devices or other system components into binary data operations on files.
In other words, lower-level implementation details are hidden from upper-layer software. Through mechanisms such as pipes, communication between a process and the rest of the system becomes a flow of binary data.
Files can be shared by multiple processes. On top of this abstraction, systems provide the file descriptor (fd).
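An fd is a view of a file, consisting of a reference to the file and a current offset. The offset is maintained implicitly: the system updates it on every read and write, so the process never has to track it.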
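To make the "view with its own offset" idea concrete, here is a minimal sketch in Python, whose os module wraps the underlying system calls fairly directly. It assumes a scratch file created with tempfile; the function name offset_demo is made up for this illustration. The same file is opened twice, and each descriptor advances through it independently.

```python
import os
import tempfile


def offset_demo():
    # Create a scratch file with known content (hypothetical demo data).
    tmp_fd, path = tempfile.mkstemp()
    os.write(tmp_fd, b"abcdef")
    os.close(tmp_fd)

    a = os.open(path, os.O_RDONLY)  # first view, offset starts at 0
    b = os.open(path, os.O_RDONLY)  # second view of the same file, own offset

    first = os.read(a, 2)   # reads b"ab"; a's offset is now 2
    second = os.read(a, 2)  # reads b"cd"; continues where the last read stopped
    other = os.read(b, 2)   # reads b"ab"; b was not affected by reads on a

    os.close(a)
    os.close(b)
    os.remove(path)
    return first, second, other


if __name__ == "__main__":
    print(offset_demo())
```

Neither descriptor is ever told where to read from; each one simply continues from its own position.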
TIP
In Kafka, a Topic is similarly consumed with an offset tracked by each consumer, representing current progress.
Q: In both multi-process systems and Kafka consumers, concurrent writes to the same file/stream can occur. How is this handled?
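A file descriptor is a process-scoped integer. Conventionally, 0 is stdin, 1 is stdout, and 2 is stderr. Because a program simply reads from fd 0 and writes to fd 1 without caring what is attached to them, a pipe can redirect one command's output into another command's input, so multiple commands can be composed freely.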
Take lsof | grep 'Surge' as an example:
- grep is the processing stage for filtering/searching the stream. Each tool focuses on one thing.
- The pipe redirects lsof output to grep input: input from stdin, output to stdout.
- Pipes chain small tools into a more complex processing pipeline.
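The same wiring can be sketched programmatically. The example below is only an illustration under stated assumptions: instead of the real lsof and grep, it starts two small Python child processes (a producer that prints lines and a filter that keeps matching ones) so that it runs anywhere, and the function name run_pipeline and the sample lines are made up. The producer's stdout is connected to the filter's stdin through a pipe, which is what the shell does for the | operator.

```python
import subprocess
import sys

# Stand-in for lsof: prints each command-line argument on its own line.
PRODUCER_SRC = "import sys\nfor line in sys.argv[1:]:\n    print(line)"

# Stand-in for grep: copies stdin lines containing the needle to stdout.
FILTER_SRC = (
    "import sys\n"
    "needle = sys.argv[1]\n"
    "for line in sys.stdin:\n"
    "    if needle in line:\n"
    "        sys.stdout.write(line)"
)


def run_pipeline(lines, needle):
    # Stage 1 writes to its stdout (fd 1), which is the write end of a pipe.
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER_SRC, *lines],
        stdout=subprocess.PIPE,
    )
    # Stage 2 reads its stdin (fd 0) from the read end of that same pipe.
    consumer = subprocess.Popen(
        [sys.executable, "-c", FILTER_SRC, needle],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Drop the parent's copy of the read end; only the filter holds it now.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.splitlines()


if __name__ == "__main__":
    print(run_pipeline(["Surge 1234", "Finder 42", "Surge 5678"], "Surge"))
```

Neither child knows it is part of a pipeline: the producer only writes to fd 1 and the filter only reads fd 0, which is why the stages can be swapped or extended freely.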
File Operation Semantics
- open: Opens a file in a specified mode, returns an fd, and implicitly initializes its offset to 0.
- read(fd, buf, n): Reads up to n bytes from the current offset of file fd into buf. On success, the offset advances automatically by the number of bytes read, so the caller does not manage offsets manually; repeated reads are enough. Returns the number of bytes actually read on success (0 at end of file); returns < 0 on failure.
- write(fd, buf, n): Writes up to n bytes from buf to file fd at the current offset, then advances the offset automatically by the number of bytes written. Returns the number of bytes written on success, which is normally n but can be smaller (a short write); returns < 0 on failure.
- close(fd): Releases fd and related resources (file pointer, offset, etc.). The process can reuse the descriptor number later.
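The four operations above can be walked through in a short sketch. It uses Python's os module, which wraps these calls closely, with two differences worth noting: os.read returns the bytes themselves (an empty result means end of file) and failures raise OSError instead of returning a negative number. The file name demo.txt and the function name file_ops_demo are made up for this illustration.

```python
import os
import tempfile


def file_ops_demo():
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")  # hypothetical scratch file

    # open: returns a descriptor whose offset starts at 0.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    # write: returns the number of bytes written and advances the offset.
    written = os.write(fd, b"hello world")
    offset_after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query offset, no move

    # Rewind, then read in small chunks; no offset is passed to read.
    os.lseek(fd, 0, os.SEEK_SET)
    chunks = []
    while True:
        chunk = os.read(fd, 4)  # up to 4 bytes from the current offset
        if not chunk:           # empty result: end of file
            break
        chunks.append(chunk)

    # close: releases the descriptor; its number may be handed out again.
    os.close(fd)

    os.remove(path)
    os.rmdir(directory)
    return written, offset_after_write, chunks


if __name__ == "__main__":
    print(file_ops_demo())
```

The read loop is the usual pattern that follows from these semantics: keep calling read until it reports end of file, and let the descriptor track the position.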