
0x01 - Revisiting OS

MasakiMu319

Files

A file is one of the core abstractions in an OS. It turns interactions between processes and devices or other components into operations on streams of bytes.

In other words, lower-level implementation details are hidden from upper-layer software. Through mechanisms such as pipes, communication between a process and the rest of the system becomes a flow of bytes.

Files can be shared by multiple processes. On top of this abstraction, systems provide the file descriptor (fd).

An fd is a view of an open file: it carries a reference to the file and a current offset. The offset is maintained implicitly by the kernel, so the caller never passes it to read or write.
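A minimal sketch of the implicit offset, using Python's os module as a thin wrapper over the underlying system calls. The scratch file and the variable names are illustrative assumptions, not part of the original notes. It opens the same file twice to show that each open gets its own offset.

```python
import os
import tempfile

# Create a scratch file with known contents (a throwaway example path).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    path = tmp.name

fd_a = os.open(path, os.O_RDONLY)
fd_b = os.open(path, os.O_RDONLY)  # a second, independent open of the same file

first = os.read(fd_a, 5)                   # reads from offset 0; fd_a's offset moves to 5
offset_a = os.lseek(fd_a, 0, os.SEEK_CUR)  # query the current offset without moving it
second = os.read(fd_a, 6)                  # continues from where the last read stopped
from_b = os.read(fd_b, 5)                  # fd_b has its own offset, still at 0

os.close(fd_a)
os.close(fd_b)
os.unlink(path)
```

Here the two reads on fd_a return consecutive slices of the file without any explicit position argument, while fd_b starts from the beginning again.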

TIP

In Kafka, a Topic is consumed in a similar way: an offset is tracked per consumer (more precisely, per consumer group and partition) and represents current progress.

Q: In both multi-process systems and Kafka consumers, concurrent writes to the same file/stream can occur. How is this handled?

A file descriptor is a process-scoped integer. By convention, 0 is stdin, 1 is stdout, and 2 is stderr. Because programs read and write through these fixed numbers, a shell can redirect them into pipes, so multiple commands can be composed freely.

Take lsof | grep 'Surge' as an example:

  1. grep is the processing stage for filtering/searching the stream. Each tool focuses on one thing.
  2. The pipe connects lsof's stdout to grep's stdin: grep reads its input from stdin and writes matching lines to stdout.
  3. Pipes chain small tools into a more complex processing pipeline.
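The composition above can be sketched with subprocess. To keep the example self-contained, two small Python child processes stand in for lsof and grep. Both scripts and their output lines are made up for illustration. The point is only the wiring: the producer's stdout becomes the filter's stdin.

```python
import subprocess
import sys

# Stand-ins for "small tools" (hypothetical): a producer that prints lines
# and a filter that keeps only lines containing a keyword.
producer_src = "print('Surge 101'); print('Finder 202'); print('Surge 303')"
filter_src = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'Surge' in line:\n"
    "        sys.stdout.write(line)\n"
)

producer = subprocess.Popen(
    [sys.executable, "-c", producer_src],
    stdout=subprocess.PIPE,  # the producer's fd 1 is the write end of a pipe
)
filt = subprocess.Popen(
    [sys.executable, "-c", filter_src],
    stdin=producer.stdout,   # the filter's fd 0 is the read end of that pipe
    stdout=subprocess.PIPE,
    text=True,
)
# The parent drops its copy of the read end, leaving the filter as the only reader.
producer.stdout.close()

output, _ = filt.communicate()
producer.wait()
matched = output.splitlines()
```

Neither child knows it is part of a pipeline. Each one just reads stdin and writes stdout, which is what makes the tools composable.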

File Operation Semantics

  1. open: Opens a file in a specified mode, returns an fd, and implicitly initializes its offset to 0.
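  2. read(fd, buf, n): Reads up to n bytes from the current offset of file fd into buf. On success, the offset advances automatically by the number of bytes read. The caller does not manage offsets manually; repeated reads are enough. Returns the number of bytes actually read on success (possibly fewer than n), 0 at end of file, and -1 on failure.
  3. write(fd, buf, n): Writes up to n bytes from buf to file fd at the current offset, then advances the offset automatically by the number of bytes written. Returns the number of bytes written on success, which can be less than n (a partial write, so the caller should retry with the remainder); returns -1 on failure.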
  4. close(fd): Releases fd and related resources (file pointer, offset, etc.). The process can reuse the descriptor number later.
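The four operations can be exercised together in a short sketch. It uses Python's os module, where failures raise OSError instead of returning -1. The helper names write_all and read_all, the file name, and the chunk size are assumptions chosen for illustration.

```python
import os
import tempfile

def write_all(fd, data):
    """Call os.write until every byte is written, handling partial writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        total += os.write(fd, view[total:])  # returns the bytes actually written
    return total

def read_all(fd, chunk_size=4):
    """Call os.read repeatedly; an empty result signals end of file."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)  # the offset advances after each read
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.txt")  # throwaway example path

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    written = write_all(fd, b"file descriptors")
    os.close(fd)  # the number held in fd is now free for reuse

    fd2 = os.open(path, os.O_RDONLY)  # offset starts at 0 again
    content = read_all(fd2)
    os.close(fd2)

# In a single-threaded program the kernel usually hands back the lowest free
# number, so fd2 is typically equal to fd here.
reused = (fd2 == fd)
```

The read loop never tracks a position itself. It stops only when read returns no data, which matches the end-of-file convention described above.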