Introduction to Computer Systems
The Von Neumann Hardware Architecture
Processors and Assembly Code
Processors
State:
- Program counter (PC): address of the next instruction
- Memory
- Registers: local store for heavily used data in the processor (faster than memory)
Break instruction execution into stages and implement them as hardware modules:
- Fetch an instruction specified by PC and update the PC
- Decode the instruction and read operands from registers
- Execute an arithmetic/logic operation
- Load/store a value from/to memory if needed
- Write back the result to a register
Optimization 1: Pipelining
instruction-level parallelism
Optimization 2: Superscalar
also instruction-level parallelism
Optimization 3: Multi-Core
multiple levels of parallelism
Specialized/Customized Hardware
High efficiency but inflexible.
RISC-V
State
- PC: 32-bit
- Registers: x0-x31, each 32-bit. x0 is always 0
- Memory: 4-byte aligned
Instruction (32-bit long)
- Basic compute: arithmetic/logic/shift/compare
- Memory access: load/store
- Control flow: branch/jump




ISA (instruction set architecture)
software program -> ISA -> hardware processor. (abstraction)
two types:
- General-purpose/programmable processors: x86
- Specialized/customized accelerators: AI accelerators
ISA defines states and instructions.
two styles:
- RISC: reduced instruction set computer. E.g. RISC-V.
- CISC: complex instruction set computer. E.g. x86.
ISA Support for Parallelism
SIMD (single-instruction, multiple-data)
Data-level parallelism

Multiple Levels of Parallelism

ISA Support for Synchronization
Avoid data races by synchronizing loads/stores to shared data
Atomic instructions (a C11 sketch follows this list):
- Test and set
- Compare and swap
- Fetch and add/sub
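These primitives are exposed to software through the ISA; as a rough sketch (not tied to any particular ISA), C11's <stdatomic.h> maps onto them roughly as follows:

#include <stdatomic.h>
#include <stdbool.h>

atomic_flag lock_flag = ATOMIC_FLAG_INIT;  // target of test-and-set
atomic_int  counter   = 0;                 // target of fetch-and-add

void spin_lock(void) {
    // Test-and-set: atomically set the flag and return its old value.
    while (atomic_flag_test_and_set(&lock_flag))
        ;  // keep spinning until the old value was 0 (lock was free)
}

void spin_unlock(void) {
    atomic_flag_clear(&lock_flag);
}

bool try_update(atomic_int *val, int expected, int desired) {
    // Compare-and-swap: write desired only if *val still equals expected.
    return atomic_compare_exchange_strong(val, &expected, desired);
}

void increment(void) {
    atomic_fetch_add(&counter, 1);         // fetch-and-add by 1
}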

Memory layout of a program


Memory Hierarchy
SRAM & DRAM
RAM = able to access any address
2D array and row/column address
S (static) RAM
- Use same technology as processors, on the same chip
- Similarly fast access speed as processors
- Each cell is large (6T) → small capacity
D (dynamic) RAM
- Different from processor technology; separate chips
- Slower access due to charge and discharge
- Each cell is small (1T1C) → large capacity
- Cell access is destructive; need extra complex circuits
Locality
Principle: predict which data will be accessed next, based on recently accessed data
- Temporal locality: recently accessed data are likely to be accessed again soon
- Spatial locality: data near recently accessed data are likely to be accessed soon
Exploiting Locality: Memory Hierarchy

Caches

AMAT (average memory access time) = hit latency + miss rate × miss penalty
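For example, with assumed numbers, a 1-cycle hit latency, a 5% miss rate, and a 100-cycle miss penalty give AMAT = 1 + 0.05 × 100 = 6 cycles.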
Cache organization
Fully Associative Caches
Use any available location to store a block
low miss rate but high hit latency
Direct-Mapped Caches
Each block has a designated location, which is fully determined by its memory address
low hit latency but high miss rate
Index of a block = block address % #(blocks in cache) = lower bits in block address
Block address = (byte) address / block size
Set-Associative Caches
A block can go to one of \(N\) (associativity of the cache) locations
Middle ground between direct-mapped (\(N=1\)) and fully associative (\(N=\#(\text{blocks})\))
Set index = block address % #(sets) = ((byte) address / block size) % #(sets)
A cache entry = data block + tag + valid bit + (more metadata)
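A minimal sketch of splitting an address into tag / set index / block offset using the formulas above, assuming example parameters of 64-byte blocks and 128 sets:

#include <stdint.h>
#include <stdio.h>

// Assumed example parameters: 64-byte blocks, 128 sets.
#define BLOCK_SIZE 64
#define NUM_SETS   128

int main(void) {
    uint32_t addr = 0x12345678;                    // example byte address

    uint32_t block_addr = addr / BLOCK_SIZE;       // drop the byte offset
    uint32_t offset     = addr % BLOCK_SIZE;       // byte within the block
    uint32_t set_index  = block_addr % NUM_SETS;   // lower bits of the block address
    uint32_t tag        = block_addr / NUM_SETS;   // remaining upper bits

    printf("tag=0x%x set=%u offset=%u\n",
           (unsigned)tag, (unsigned)set_index, (unsigned)offset);
    return 0;
}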

Replacement Policy
determine whether to evict and which entry to evict
Typical Replacement Policies
- Random
- Least Recently Used (LRU)
- Least Frequently Used (LFU)
Write policy
- Write hit (write-back policy): write the data only to the cache; set a dirty bit in the metadata and delay the write to memory until eviction.
- Write miss (write-allocate policy): first treat it as a read miss (fetch the block from memory into the cache), then write to the cached block.
The 3Cs of Cache Misses
- Compulsory/cold -> reduced by larger block size
- Conflict -> reduced by higher associativity
- Capacity -> reduced by larger capacity
Multi-level cache hierarchy and multi-core caches

Use coherence protocols to solve the coherence problem among per-core caches.

I/O Devices
I/O challenges:
- Thousands of devices, all different from each other: need new, standardized interfaces and mechanisms.
- Devices are slow and have unpredictable behavior: need access approaches different from memory loads/stores.
Operational Parameters:
- Data granularity: byte vs. block
- Access pattern: sequential vs. random
- Device speed: rates vary over many orders of magnitude
General Techniques
Memory-Mapped I/O
Assign an unused range of physical addresses to each I/O device; this range is called its I/O addresses.
The I/O address space is protected by the virtual memory mechanism:
- Method 1: use syscalls to access I/O addresses
- Method 2: ask the OS to map I/O addresses to virtual addresses
Standardizing I/O Interface
Provide uniform interfaces for a wide range of different devices:
- Abstract all I/O devices to fit into several standard types
- Use a device driver to both
  - expose the standard abstract interface to the OS, and
  - internally translate it into low-level actions to perform on the device.
I/O Types
By Access Granularity/Pattern:
- Block device: Access blocks of data; can randomly address a location
- Stream device: Sequential interface, no addressing; just read/write next block or character/byte
By Timing:
- Blocking interface: "wait"
- Non-blocking interface: "do not wait"
- Asynchronous interface: "tell me later"
I/O Notification
How do the OS and user programs know that something interesting happened in the I/O device, if events are infrequent and occur unexpectedly at any time?
Two approaches:
- Polling: the I/O device places information in a status register; the OS periodically checks the status register (which may be memory-mapped).
  - Advantages: simple to implement; the processor is in control and manages the work
  - Disadvantages: difficult to determine how frequently to poll
  - Works best for frequent and predictable events
- Interrupts: when the I/O device needs attention, it interrupts the processor; the OS finds out the reason from specific places.
  - Advantages: the CPU is free to do other computation
  - Disadvantages: software overhead (e.g., context switch); implementation complexity
  - Works best for infrequent and unpredictable events
Direct Memory Access (DMA)
DMA is a custom hardware engine for data movement. Using it, we can transfer blocks of data to or from memory without CPU intervention: the CPU sends a data transfer task to the DMA engine and gets notified when the task is done. The DMA engine itself is an I/O device.
The processor sets up DMA by supplying: the identity of the device and the operation (read/write), the memory address for the source/destination, and the number of bytes to transfer. Each task generates a data transfer descriptor: source, destination, length.
The DMA engine accepts data transfer descriptors from the processor. It can keep multiple descriptors in a queue and serve them one after another. The DMA engine starts a transfer when the data is ready and notifies the processor when the transfer completes or on error, typically with interrupts.
Design issues:
- The OS may swap pages out to disk.
  Solution: memory pinning
- Contiguous virtual addresses may not be physically contiguous!
  Solution 1: chain a series of single-page requests, with a single interrupt at the end
  Solution 2: have the DMA engine use virtual addresses
- Data involved in DMA may reside in the processor cache.
  Solution 1: software flushes the cache or forces writebacks before I/O
  Solution 2: hardware routes memory accesses for I/O through the cache
Disks, SSDs and GPUs
Magnetic Hard Disk
Structure and Performance
Long-term, non-volatile, large, inexpensive. But... slow.
Usage: virtual memory, file system.
Organization: disk → platter → surface → track → sector; aligned tracks form a cylinder. Software always transfers groups of sectors together, as blocks.
Access time breakdown:
- Controller overheads
- Queuing delay if other accesses are pending
- Seek time: move heads to the track
- Rotational latency: wait for target sector in the track
- Data transfer time: the whole sector passes under the heads
Disk Scheduling
Access to small data is dominated by seek time and rotational latency. Disk scheduling that exploits data locality can improve performance!
Disk Scheduling Policies (considering only track IDs; a toy FIFO vs. SSTF comparison follows this list):
- FIFO: serve requests in the order of arrival.
  Pros: fair among requests. Cons: many seeks.
- Shortest seek time first (SSTF): pick the request that is closest to the head.
  Pros: reduces seeks. Cons: may lead to starvation.
- SCAN (elevator algorithm): take the closest request in the travel direction.
  Pros: low seek, no starvation. Cons: still unfair, as it favors middle tracks.
- C-SCAN (circular SCAN): like SCAN but only serves requests in one direction.
  Pros: more fair than SCAN; more consistent response time. Cons: longer seeks on the way back; lower throughput.
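A toy sketch comparing total head movement under FIFO and SSTF; the request queue and starting head position are made-up example values:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define N 5

int main(void) {
    int reqs[N] = {98, 183, 37, 122, 14};  // assumed pending track IDs
    int head = 53;                         // assumed current head position

    // FIFO: serve requests in arrival order.
    int fifo = 0, pos = head;
    for (int i = 0; i < N; i++) { fifo += abs(reqs[i] - pos); pos = reqs[i]; }

    // SSTF: repeatedly pick the closest unserved request.
    bool done[N] = {false};
    int sstf = 0;
    pos = head;
    for (int k = 0; k < N; k++) {
        int best = -1;
        for (int i = 0; i < N; i++)
            if (!done[i] && (best < 0 || abs(reqs[i] - pos) < abs(reqs[best] - pos)))
                best = i;
        sstf += abs(reqs[best] - pos);
        pos = reqs[best];
        done[best] = true;
    }

    printf("FIFO total seek distance: %d\n", fifo);   // 469 for these values
    printf("SSTF total seek distance: %d\n", sstf);   // 208 for these values
    return 0;
}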
Solid State Drive (SSD)
Structure and Performance
SSDs are usually built from NAND/NOR multi-level cell Flash memory. An SSD contains an array of Flash chips, a DRAM buffer (a.k.a. cache), and a controller (either a general-purpose core or a specialized chip). Since there are no mechanical moving parts, seek and rotational latencies are eliminated. Flash is organized at two levels: pages of about 4 kB each, and blocks of 32 to 128 pages.
Reads are in units of pages, with latency of about 20 us; writes are more complicated, with latency of 200 us to 1.7 ms (we can only write to erased pages, which takes about 200 us, and erases must happen in units of blocks and are slow, e.g., 1.5 ms).
Note that there is no in-place modification in SSD/Flash: data must be written to a new page in an erased block, and then the address is updated to point to this page. Such management is handled by the Flash Translation Layer (FTL).
Flash Translation Layer (FTL)
Functionalities:
- Data address mapping
- Tracking physical block status
- Wear leveling
Data structures:
- Address mapping: logical to physical
- Block info table for each physical block: track block status, write pages in a block sequentially, wear leveling.
Wear out: an SSD has a limited lifetime during which it can reliably store data; a block wears out after a certain number of erases, so frequently written data may wear out their pages/blocks sooner. To address this, the FTL performs wear leveling to avoid hotspots, i.e., it uniformly moves and remaps data to different physical pages/blocks.
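A highly simplified sketch of the FTL's out-of-place write idea; the table sizes and the toy "next free page" allocator are invented for illustration and omit garbage collection and wear leveling:

#include <stdio.h>

#define NUM_LOGICAL  8
#define NUM_PHYSICAL 16

static int l2p[NUM_LOGICAL];            // logical page -> physical page
static int next_free = 0;               // next erased physical page (toy allocator)
static char flash[NUM_PHYSICAL][4];     // pretend 4-byte pages

// A write never modifies a page in place: pick a fresh erased page and remap.
void ftl_write(int lpage, const char *data) {
    int ppage = next_free++;            // real FTLs also do GC and wear leveling
    for (int i = 0; i < 4; i++) flash[ppage][i] = data[i];
    l2p[lpage] = ppage;                 // the old physical page becomes stale
}

void ftl_read(int lpage, char *out) {
    int ppage = l2p[lpage];
    for (int i = 0; i < 4; i++) out[i] = flash[ppage][i];
}

int main(void) {
    ftl_write(3, "aaa");                // first write of logical page 3
    ftl_write(3, "bbb");                // overwrite: goes to a new physical page
    char buf[4];
    ftl_read(3, buf);
    printf("logical 3 -> physical %d, data %.3s\n", l2p[3], buf);
    return 0;
}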
vs. disks
Advantages: no moving parts (faster, less power, more rugged); random access is only slightly slower than sequential access.
Disadvantages: asymmetric performance between reads and writes; potential wear-out issues.
GPUs
In a system, the GPU is connected to the CPU as an I/O device, but the two have physically separate memories.
The performance overhead of data movement is significant; overlapping computation with communication is the standard optimization.
As management processor + accelerator: the CPU is dedicated to GPU management and other auxiliary functions, while the GPU is the workhorse of the application and is offloaded with most of the computation (offloading computation).
As cooperative co-processors: the CPU and GPU work together on a problem.
Operating Systems: Manage Hardware Resources
Processes and Threads
Processes
A process is an instance of an executing program.
In multiprocessing, each process seems to have exclusive use of processor and memory.
A thread is a single unique execution context.
One process has one address space, which is shared by all threads of that process
A process = one or multiple threads + an address space
User vs. Kernel Mode
Kernel mode: high privilege
User mode: limited privilege
One bit in the processor serves as the user/kernel mode bit.
Kernel Transition Mechanisms
- System call: proactive
- Interrupt: reactive; external causes (e.g., I/O devices, timer)
- Exception and trap: reactive; internal causes (e.g., page fault, divide by zero)
Context switch

Triggered by a timer interrupt, or by a syscall that voluntarily yields the processor.
Management
Terminating Processes
A process is terminated for one of the following reasons:
- Returning from the main function
- Calling the exit syscall
- Receiving a signal whose default action is to terminate
exit is called once but never returns.
Creating Processes
Parent process creates a new running child process by calling fork.
The child process is almost identical (i.e., a copy) to the parent process:
- Child gets an identical (but separate) copy of the parent's address space
- Child gets identical access permissions to the parent's I/O devices
- Child has a different PID than the parent
Parent and child processes execute concurrently. We cannot predict their execution order; they may interleave in any way.
pid_t fork(void):
- Returns 0 to the child process, and the child's PID to the parent process
- Or returns a negative value to indicate an error
fork is called once but returns twice.
Reaping Child Processes
pid_t wait(int* wstatus):
- Suspends the current process until one of its children terminates
- The return value is the PID of the child process that terminated
- If wstatus != NULL, the child's exit status is stored into that address
pid_t waitpid(pid_t pid, int* wstatus, int options):
- Suspends the current process until the specific child process pid terminates
If a parent terminates without reaping a child, the orphaned child is reaped by the special init process (whose PID is 1).
Loading and Running Programs
The exec family loads and runs a new program in the current process.
int execve(char* filename, char* argv[], char* envp[]):
- With the executable filename, argument list argv, and environment variable list envp
- Retains the PID, but overwrites code, data, and stack
execve is called once and never returns, except if there is an error.
The common style: fork-then-exec
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int status;
    pid_t cpid = fork();
    if (cpid == 0) {                      // Child process
        // Child loads and runs a new program
        char *args[] = {"ls", "-l", NULL};
        execv("/bin/ls", args);           // execv takes (path, argv)
        // execv only returns if it failed
        perror("execv");
        exit(1);
    } else if (cpid > 0) {                // Parent process
        // Parent waits for the child and reaps it
        pid_t tcpid = wait(&status);
        (void)tcpid;
    }
    return 0;
}
Signals
A signal is a small message to notify a process that some event has occurred. (Sent from the kernel, sometimes at the request of another process, to a process.)
Type:
| Signal ID | Name | Default Action | Event |
|---|---|---|---|
| 2 | SIGINT | Terminate | Ctrl-C |
| 9 | SIGKILL | Terminate | Kill program |
| 11 | SIGSEGV | Terminate | Segmentation violation |
| 14 | SIGALRM | Terminate | Timer |
| 17 | SIGCHLD | Ignore | Child terminated |
Sending a signal: the kernel detects a system event, or another process calls the kill syscall to request that the kernel send a signal: int kill(pid_t pid, int sig)
Receiving a signal: Terminate the process, or execute a user-level function called signal handler, or just ignore the signal.
Shell
A shell is a job control system that runs programs on behalf of the user. For example:
- sh: original Unix shell (Stephen Bourne, AT&T Bell Labs, 1977)
- csh/tcsh: BSD Unix C shell
- bash: “Bourne-Again” shell (default Linux shell)
Shell is a process itself: fork itself, exec program, and wait (or not)
- Foreground jobs: shell waits and reaps
- Background jobs: not reaped by shell, and also not reaped by init because shell (typically) will not terminate
Scheduling policies
Life Cycle of Processes/Threads:
When there are more threads than hardware cores, active threads reside in a ready queue, waiting to be put onto the cores to run. OS decides which threads to run first, using a scheduling policy.
Classes:
- Non-preemptive: once a job is given some period of time to execute (not necessarily to completion), the processor cannot be taken back before the period ends
- Preemptive: the processor can be taken back at any time and used for other jobs; the preempted job goes back to the ready queue and can resume later
Metrics:
- Waiting time: time when the job waits in the ready queue
- Response time = waiting time + execution time. This is what the user sees
- Throughput: number of jobs completed per unit of time. Related to response time, but not always the same thing
- Fairness: all jobs use resource in some equal way
FCFS (First-Come, First-Served)
Run until done, non-preemptive
- Simple
- Depends on arrival order
- Average wait/response time suffers from the convoy effect (short jobs are blocked behind long jobs)
Round Robin (RR)
Each job gets a small unit of time (time quantum, q) to execute, and then gets preempted and added back to end of queue
- More fair for short jobs than FCFS
- Context switch adds overheads
- Not easy to choose the right time quantum value
- Large q: average wait/response time suffers (behaves like FCFS)
- Small q: throughput suffers (context switch overhead dominates)
Typically choose q so that context switch overhead is small enough (e.g., 5%)
Optimal Scheduling: SJF/SRTF
SJF (shortest job first): run the job with the shortest execution time first
SRTF (shortest remaining time first): preemptive version of SJF. (If a new job arrives with a shorter remaining time than the current one, preempt.) Also called STCF (shortest time to completion first).
- Optimal average response time (see the example after this list)
- Unfair; may lead to starvation if many short jobs keep arriving: long jobs never run!
- Difficult to predict job execution time; sometimes impractical
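For example, with assumed job lengths: if jobs of length 10, 1, and 1 arrive together, FCFS in that order gives waiting times 0, 10, and 11 (average ≈ 7), while SJF gives 0, 1, and 2 (average = 1), which is exactly the convoy effect that SJF avoids.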
Concurrency and Synchronization
Cooperating Threads
Advantages:
- shared resource
- speedup
- modularity
But... We need to consider Correctness Problems.
Some definitions:
- Synchronization: using atomic operations to ensure cooperation between threads
- Critical section: piece of code that only one thread can execute at once
- Mutual exclusion: ensuring that only one thread executes the critical section
- Lock: prevents someone from doing something
  - Lock.Acquire(): wait until the lock is free, then grab it
  - Lock.Release(): unlock, waking up anyone waiting
- Important idea: all synchronization involves waiting.
The big picture of synchronization: Hardware \(\to\) Higher-level API \(\to\) Programs
Semaphores
A Semaphore has a non-negative integer value and supports the following two operations:
- P(): an atomic operation that waits for semaphore to become positive, then decrements it by 1.
- V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any
Applications:
- Mutual Exclusion (initial value = 1):
  semaphore.P();
  // Critical section goes here
  semaphore.V();
- Scheduling Constraints (initial value = 0): allow thread 1 to wait for a signal from thread 2, i.e., thread 2 schedules thread 1 when a given constraint is satisfied.
  Initial value of semaphore = 0
  ThreadJoin   { semaphore.P(); }
  ThreadFinish { semaphore.V(); }
- Producer-consumer with a bounded buffer: producer -> buffer -> consumer
  Semaphore fullSlots = 0;         // Initially, no coke
  Semaphore emptySlots = bufSize;  // Initially, num empty slots
  Semaphore mutex = 1;             // No one using machine
  Producer(item) {
      emptySlots.P();  // Wait until space
      mutex.P();       // Wait until machine free
      Enqueue(item);
      mutex.V();
      fullSlots.V();   // Tell consumers there is more coke
  }
  Consumer() {
      fullSlots.P();   // Check if there is a coke
      mutex.P();       // Wait until machine free
      item = Dequeue();
      mutex.V();
      emptySlots.V();  // Tell producer need more
      return item;
  }
Monitors and Condition Variables
In the bounded buffer problem, semaphores serve dual purposes, for both mutual exclusion and scheduling constraints, which is error-prone and can lead to deadlock.
Monitor: a lock plus one or more condition variables.
Condition Variable: a queue of threads waiting for something inside a critical section
Operations:
- Wait(&lock): Atomically release lock and go to sleep. Re-acquire lock later, before returning.
- Signal(): Wake up one waiter, if any
- Broadcast(): Wake up all waiters
Application: synchronized queue
Lock lock;
Condition dataready;
Queue queue;
AddToQueue(item) {
lock.Acquire(); // Get Lock
queue.enqueue(item); // Add item
dataready.signal(); // Signal any waiters
lock.Release(); // Release Lock
}
RemoveFromQueue() {
lock.Acquire(); // Get Lock
while (queue.isEmpty()) {
dataready.wait(&lock); // If nothing, sleep
}
item = queue.dequeue(); // Get next item
lock.Release(); // Release Lock
return(item);
}
Reader / Writer Problem
Consider a shared database with two classes of users:
-
Readers: never modify database
-
Writers: read and modify database
We would like to have many readers at the same time, but only one writer at a time.
Correctness Constraints:
- Readers can access database when no writers
- Writers can access database when no readers or writers
- Only one thread manipulates state variables at a time
State variables (protected by a lock called "lock"):
- int AR: number of active readers; initially = 0
- int WR: number of waiting readers; initially = 0
- int AW: number of active writers; initially = 0
- int WW: number of waiting writers; initially = 0
- Condition okToRead = NIL
- Condition okToWrite = NIL
Reader() {
// First check self into system
lock.Acquire();
while ((AW + WW) > 0) { // Is it safe to read?
WR++; // No. Writers exist
okToRead.wait(&lock); // Sleep on cond var
WR--; // No longer waiting
}
AR++; // Now we are active!
lock.Release();
AccessDatabase(ReadOnly); // Perform actual read-only access
// Now, check out of system
lock.Acquire();
AR--; // No longer active
if (AR == 0 && WW > 0) // No other active readers
okToWrite.signal(); // Wake up one writer
lock.Release();
}
Writer() {
// First check self into system
lock.Acquire();
while ((AW + AR) > 0) { // Is it safe to write?
WW++; // No. Active users exist
okToWrite.wait(&lock); // Sleep on cond var
WW--; // No longer waiting
}
AW++; // Now we are active!
lock.Release();
AccessDatabase(ReadWrite); // Perform actual read/write access
// Now, check out of system
lock.Acquire();
AW--; // No longer active
if (WW > 0){ // Give priority to writers
okToWrite.signal(); // Wake up one writer
} else if (WR > 0) { // Otherwise, wake reader
okToRead.broadcast(); // Wake all readers
}
lock.Release();
}
Context Switch:
- User->Kernel context switch
- Acquire lock
- Kernel->User context switch
- <perform critical section work>
- User->Kernel context switch
- Release lock
- Kernel->User context switch
Resource contention and deadlocks
Two types of resources:
- Preemptable: can be taken away, e.g., CPU, embedded security chip
- Non-preemptable: must be left with the thread, e.g., disk space, printer, chunk of virtual address space, critical section
But resources may require exclusive access or may be sharable:
- Sharable: read-only files
- Non-shareable: printers
Def. Starvation: thread waits indefinitely. For example, low-priority thread waiting for resources constantly in use by high-priority threads.
Def. Deadlock: circular waiting for resources. For example, Thread A owns Res 1 and is waiting for Res 2, meanwhile Thread B owns Res 2 and is waiting for Res 1.
Rem. All deadlocks are starvation, but not vice versa.
Four requirements for Deadlock:
- Mutual exclusion: only one thread at a time can use a resource
- Hold and wait: a thread holding at least one resource is waiting to acquire additional resources held by other threads
- No preemption: resources are released only voluntarily by the thread holding the resource, after the thread is finished with it
- Circular wait: there exists a set \(\{T_1, ..., T_n\}\) of waiting threads, and \(T_i\) is waiting for a resource that is held by \(T_{i\bmod n+1}\).
Handling Deadlocks:
- Allow system to enter deadlock and then recover
- Deadlock prevention: ensure that system will never enter a deadlock
- Ignore the problem and pretend that deadlocks never occur in the system
Preventing Deadlock:
- Infinite resources
- No Sharing of resources (totally independent threads)
- Don’t allow waiting
- Make all threads request everything they’ll need at the beginning
- Force all threads to request resources in a particular order preventing any cyclic use of resources
Behind the Scenes: Implementing a lock
Language support for synchronization
C: All locking/unlocking is explicit: you need to check every possible exit path from a critical section.
int Rtn() {
lock.acquire();
...
if (error) {
lock.release();// Needed!
return errReturnCode;
}
...
lock.release();
return OK;
}
C++: Languages that support exceptions are more challenging: exceptions create many new exit paths from the critical section.
void Rtn() {
lock.acquire();
try {
...
DoFoo();
...
}
catch (...) { // really three dots!
// catch all exceptions
lock.release(); // release lock
throw; // re-throw unknown exception
}
lock.release();
}
void DoFoo() {
...
if (exception) throw errException;
...
}
Alternative: Use the lock class destructor to release the lock.
class lock {
mutex &m_;
public:
lock(mutex &m) : m_(m) {
m.acquire();
}
~lock() {
m_.release();
}
};
mutex m;
...
{ //Critical Section
lock mylock(m);
...
...
... // no explicit unlock
}
Another way: Communicate to Share
- Share no memory; if you want to share, send a message!
- Can use a queue to send messages
Naïve use of Interrupt Enable/Disable
We know that the dispatcher gets control in two ways:
- Internal: the thread does something to relinquish the CPU
- External: interrupts cause the dispatcher to take the CPU
On a uniprocessor, we can avoid context switching by avoiding internal events (setting aside virtual memory tricks) and preventing external events by disabling interrupts.
Consequently, naïve Implementation of locks:
LockAcquire { disable Ints; }
LockRelease { enable Ints; }
But consider the following: the caller loops forever with interrupts disabled, so the whole system hangs.
LockAcquire();
while (TRUE) {;}
Better Implementation of Locks by Disabling Interrupts
Key idea: maintain a lock variable and impose mutual exclusion only during operations on that variable
int value = FREE;

void Acquire() {
    disable_interrupts();
    if (value == BUSY) {
        put_thread_on_wait_queue();
        go_to_sleep();
        // Enable interrupts?
    } else {
        value = BUSY;
    }
    enable_interrupts();
}

void Release() {
    disable_interrupts();
    if (anyone_on_wait_queue()) {
        take_thread_off_wait_queue();
        put_at_front_of_ready_queue();
    } else {
        value = FREE;
    }
    enable_interrupts();
}
Problems: Doesn’t work well on multiprocessor
Implementing with Atomic Instructions Sequences
Atomic Read-Modify-Write instruction: read a value from memory and write a new value atomically. Examples:
/* test&set - most architectures */
test&set (&address) {
result = M[address];
M[address] = 1;
return result;
}
/* swap - x86 */
swap (&address, register) {
temp = M[address];
M[address] = register;
register = temp;
}
/* compare&swap - 68000 */
compare&swap (&address, reg1, reg2) {
if (reg1 == M[address]) {
M[address] = reg2;
return success;
} else {
return failure;
}
}
Let's implement locks using test&set:
int guard = 0;
int value = FREE;
void Acquire() {
// Short busy-wait time
while (test&set(guard));
if (value == BUSY) {
put_thread_on_wait_queue();
go_to_sleep();
guard = 0;
} else {
value = BUSY;
guard = 0;
}
}
void Release() {
// Short busy-wait time
while (test&set(guard));
if (anyone_on_wait_queue()) {
take_thread_off_wait_queue();
place_on_ready_queue();
} else {
value = FREE;
}
guard = 0;
}
Virtual Memory
PTE and TLB
Base and bound (B&B) splits the entire physical memory into contiguous ranges that are assigned to different processes. It has many drawbacks: it is restricted to contiguous address ranges, and it is difficult to extend dynamically or to support regions shared across address spaces.
The virtual memory mechanism provides fine-grained and dynamic management of address spaces. Each process has its own virtual address space that is contiguous and linear, i.e., from \(0\) to \(N = 2^n\) (e.g., n = 32 or 48 bits). All virtual address spaces are mapped to the physical address space in a flexible way, in units of pages.
A page table is used to translate virtual addresses to physical addresses. Each process has its own page table, which is managed by the OS and used by the hardware. It is stored in main memory as an OS data structure. There is one page table entry (PTE) per virtual page; each PTE includes a valid bit, the physical page number (a.k.a. frame number), and metadata (e.g., R/W/X permissions).

Virtual memory as a tool for
- Memory management: Allocate & map pages in a virtual space.
- Memory protection: Process cannot access physical memory not mapped to its virtual space
- Data caching: Only a subset of allocated virtual pages are mapped, other pages are swapped out to external storage. Main memory (DRAM) only caches the most frequently accessed data. If a page fault (allocated but not mapped) occurs, the page is paged in on demand, by the page fault handler in OS kernel.
The OS manages the page table by allocating physical space to virtual space, while hardware looks up the page table to translate virtual addresses to physical addresses.
The Translation Look-aside Buffer (TLB) is a hardware cache just for address translation entries, i.e., PTEs. Each TLB entry consists of data (a PTE, including the physical page number and R/W/X bits) plus its own valid bit, tag, etc. The TLB acts as a cache for the page table and speeds up translation.
If we miss in TLB, access PTE in memory; this is called page table walk.


The virtual address space is much larger than the typical physical address space! Ideally, we only need PTEs for allocated pages, not for all virtual pages. To save page table size, we introduce:
Multi-Level Page Tables: use a hierarchical page table structure, i.e., a (sparse) radix tree. Only the top level must be resident in memory; the remaining levels can be in memory or on disk. For example, the Intel x86-64 four-level page table is a 512-ary radix tree with a span of 9 bits:
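A sketch of how a 48-bit x86-64 virtual address splits into four 9-bit table indices plus a 12-bit offset for 4 kB pages (the example address is arbitrary):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t va = 0x00007f1234567abcULL;   // example 48-bit virtual address

    uint64_t offset = va & 0xfff;          // bits 11..0: byte within 4 kB page
    uint64_t l1     = (va >> 12) & 0x1ff;  // bits 20..12: page table index
    uint64_t l2     = (va >> 21) & 0x1ff;  // bits 29..21: page directory index
    uint64_t l3     = (va >> 30) & 0x1ff;  // bits 38..30: PDPT index
    uint64_t l4     = (va >> 39) & 0x1ff;  // bits 47..39: PML4 index

    printf("PML4=%llu PDPT=%llu PD=%llu PT=%llu offset=0x%llx\n",
           (unsigned long long)l4, (unsigned long long)l3,
           (unsigned long long)l2, (unsigned long long)l1,
           (unsigned long long)offset);
    return 0;
}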

Memory Allocation
Memory Mapping
Inter-process sharing is allowed: different processes can share a page by setting their virtual addresses to point to the same physical address.
Copy-on-Write (CoW): When copying pages, just make destination virtual address point to the same physical page as source virtual address, and mark the page as write-protected in PTE. When writing to either virtual address, trigger a permission fault and do actual copy, i.e., allocate a new physical page.
Memory Mapping is about how to associate a virtual memory area with a file object. It is a unified way to deal with all following cases, i.e., all data sources to the memory area are treated as files:
- Get initial data from this file into memory: regular file on disk
- Swap the physical memory content from/to this file: special swap file
- For newly allocated space initialized to all zero values: special anonymous file
When doing memory mapping, OS only sets up PTE in page table. Wait until the first page fault (when data are actually accessed), and rely on demand paging to actually get data into memory. For anonymous file, physical page is only allocated at the time of the first write; before that, just return 0 for read accesses.
Recall: when fork is called, the child gets an identical but separate copy of the parent's address space.
- Create exact copies of the page table and other kernel structures
- Mark each PTE in both page tables as write-protected
- At this point, both processes have identical contents in their address spaces
- Subsequent writes to either address space use CoW to create separate pages
To load and run a new program by execve, etc.:
- Free the old page table and initialize a new page table
- Memory map the program and data
- Set the PC to the entry point of the program
User-Level Memory Mapping: the mmap and munmap syscalls
void* mmap(void* addr, size_t length, int prot, int flags, int fd, off_t offset): used both for accessing files and for allocating new virtual address ranges (anonymous file). Maps length bytes starting at offset offset of the file specified by the descriptor fd, preferably at virtual address addr, with the desired protection prot.
- addr is only a hint; it can also be NULL to let the kernel choose the virtual address
- flags: MAP_ANON, MAP_PRIVATE, MAP_SHARED, …
- Returns the mapped virtual address (which may not equal addr), or MAP_FAILED on error
int munmap(void* addr, size_t length)
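A small usage sketch of both kinds of mapping; error handling is minimal, /etc/hostname is just an example path, and MAP_ANONYMOUS is the common spelling of MAP_ANON:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void) {
    // Anonymous mapping: fresh zero-filled memory with no backing file.
    size_t len = 4096;
    char *anon = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (anon == MAP_FAILED) { perror("mmap"); exit(1); }
    strcpy(anon, "hello");          // physical page allocated on first write

    // File-backed mapping: read a file's contents through ordinary loads.
    int fd = open("/etc/hostname", O_RDONLY);
    if (fd >= 0) {
        struct stat st;
        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            char *file = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (file != MAP_FAILED) {
                fwrite(file, 1, st.st_size, stdout);
                munmap(file, st.st_size);
            }
        }
        close(fd);
    }

    munmap(anon, len);
    return 0;
}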
Dynamic Memory Allocation: Explicit Memory Management
A program can dynamically allocate memory at runtime using the dynamic allocator. The dynamic allocator manages an area of the process virtual address space, called the heap.
Allocator API:
void* malloc(size_t size): get a pointer to a memory space of at least size bytes. Other variants: calloc, realloc, …
void free(void *p): give up the memory space pointed to by p, previously obtained from malloc/calloc/…
C++ new and delete are similar.
Allocator Policy: Segregated Free Lists
- Organize free spaces into multiple linked lists for different sizes, e.g., one list for each power-of-two size.
- To allocate a block of \(n\) bytes, search the free list of size \(m > n\) until a free space is found, and use the first \(n\) bytes of it. If no space is available in any list, grow the heap from the OS.
- When the requested size \(n\) is smaller than the free space \(m\) found, the unused remainder would become internal fragmentation; to reduce it, split the block and append the remaining \((m-n)\) bytes to its corresponding free list, where it can serve a future small request.
Good allocators should be fast, and minimize both external and internal fragmentation.
Garbage Collection: Implicit Memory Management
Garbage collection: automatic reclamation of heap-allocated storage.
A memory manager runs along with the user program. When a memory space is not needed, the memory manager frees it. So user programs never need to free allocated memory!
How does the memory manager know when a memory space can be freed? If for a memory object, there are no pointers anywhere that point to it, it can never be used any more. But still we may miss some opportunities to reclaim memory.
Model: view memory as a graph, where each object is a node. An edge exists if a pointer in one object points to another object. Root nodes are objects that are not dynamically allocated in the heap (e.g., globals, stack variables, registers). Heap objects that are reachable from some root node are reachable nodes, so non-reachable nodes are garbage to be collected.
GC may start due to out-of-memory, or is periodically triggered. Classical GC Algorithm:
-
Mark-and-Sweep
- Mark: start from root nodes, traverse graph and mark a bit in each reachable object; non-reachable objects are not visited and not marked.
- Sweep: scan all objects in the memory, and free unmarked objects.
-
Reference Counting
- Each object maintains a reference counter, which keeps the number of other pointers that point to this object.
- When the reference counter becomes zero, we free the object.
- Problem: does not work for circular references (see the sketch below)
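A minimal reference-counting sketch, with invented names; real collectors also have to handle concurrency and the circular-reference problem noted above:

#include <stdio.h>
#include <stdlib.h>

typedef struct Obj {
    int refcount;
    int data;
} Obj;

Obj *obj_new(int data) {
    Obj *o = malloc(sizeof(Obj));
    o->refcount = 1;            // the creator holds the first reference
    o->data = data;
    return o;
}

void obj_retain(Obj *o)  { o->refcount++; }

void obj_release(Obj *o) {
    if (--o->refcount == 0) {   // last pointer dropped: the object is garbage
        printf("freeing object with data %d\n", o->data);
        free(o);
    }
}

int main(void) {
    Obj *a = obj_new(42);
    obj_retain(a);              // a second pointer now refers to the object
    obj_release(a);             // drop one reference: still alive
    obj_release(a);             // drop the last reference: freed here
    return 0;
}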
File Systems
A file system is a component of the OS that transforms the block interface into a high-level interface with files and directories. A file system mainly consists of two parts:
- A collection of files, each storing relevant data in a group of disk blocks;
- A directory structure, organizing files and mapping names to files.
File, Directory and Links
File Basic
Files must be stored durably, i.e., persist even after the creating process is terminated and even after system is shutdown. Files are named, and referred by the names. When a file is named, it becomes independent of the process, the user, and even the system that created it. From a user’s perspective, a file is the smallest granularity of logical secondary storage; that is, data cannot be written to secondary storage unless they are within a file.
The file descriptor (a.k.a. inode) is an OS data structure holding a file's metadata. It is kept in kernel memory while the file is open, for convenient operations. It consists of:
- File size;
- File location (e.g., sectors on disk);
- Access timestamps (e.g., create, last read, last write);
- Protection info (e.g., access permissions (R/W/X), owner ID, group ID);
- ...
File Operations:
- Open: use name to search a file in the directory, i.e., resolve file name. First, translate file name into a file number, which is used to locate the file descriptor on disk. Then copy file descriptor from disk to OS kernel memory space. Finally, return a file handle (usually an integer) to user process.
- Later read/write/seek/sync operate on file handles. OS internally uses file handle to locate file descriptor, and to find data blocks
- Close: when no longer operating on the file.
Two levels of open-file tables: per-process and system-wide. Multiple processes may open the same file at the same time. The second process simply adds an entry to its per-process table, pointing to the already existing entry in the system-wide table. System-wide table entries have reference counters, automatically remove entry when the last process closes the file.
File Access Patterns:
- Sequential: data are processed in order, one byte after another;
- Random: can address any byte in the middle of a file directly;
- Keyed (or indexed): search for blocks with particular contents (Most OSes do not support this; instead, build a database)
Directory
Directory is used to map names to file numbers. Modern systems support hierarchical directory structures:
- Each directory is a collection of files and other directories
- Names have slashes “/” separating the levels, i.e., a path
- Each directory, like a file, has name and attributes, and “data”
Let's focus on Unix/Linux directory structures.
Directories are just like regular files, e.g., with a file descriptor, except that the file descriptor has a special bit indicating it is a directory. The "data" of a directory is an unordered list of <name, pointer> pairs; each pair is for a child file or a sub-directory, and the pointer is a file number that points to its file descriptor. A special directory called the root has no name and has i-number \(2\); i-number \(0\) is null, and i-number \(1\) means bad blocks.
User programs can read directories just like regular files. Usually only OS can write directories.
Links
Hard link: set another directory entry to the file number for a file. Make directory structure a DAG (directed acyclic graph), not a tree.
Soft or symbolic link: special file whose content is another name (path). Stored as regular files, but with a flag set in file descriptor.
The current working directory is remembered by the OS per process: the inode of the working directory is kept in the PCB. If a path does not start with "/", lookup starts from the working directory.
Space Management
Data Block Organization
How to allocate disk blocks of a file? Principle:
- Most files are small (a few kB or less): per-file overheads must be low
- Most of the disk space is in large files
- Performance must be good for large files even for random accesses
- File sizes may grow unpredictably over time
Contiguous allocation: allocate contiguous blocks all at once.
- Pros: simple; easy to access; few disk seeks.
- Cons: harder and harder to allocate large files (due to external fragmentation); hard to grow file sizes.
Linked Files: keep a linked list of all blocks of a file. The file descriptor only needs to keep a pointer to the first block; to randomly access a specific position, start from the file descriptor and follow the linked list until reaching the block. E.g., Windows FAT.
- Pros: no external fragmentation; high utilization of disk space; easy to allocate new blocks; file size can grow, and may even grow in the middle.
- Cons: more disk seeks, even for sequential accesses (poor locality); overhead of tracking pointers.
Indexed Files: keep the set of pointers to data blocks. E.g., 4.3 BSD Unix (Multi-Level Index).
- Pros: combines the pros of contiguous and linked allocation: supports direct access; fast random access; no external fragmentation.
- Cons: still some amount of seeking; file size is limited, or there are many unused pointers.
Free Space Management
Many early systems use a linked list of free blocks, which is similar to free space tracking in dynamic heap allocators.
4.3 BSD Unix approach: a bitmap. During allocation, search for a block close to the previous last block of the file, so that seek distance is minimized and locality is improved. When the disk is nearly full, the search takes a long time and locality is not improved. Solution: do not let the disk fill up!
Performance optimization: Buffer Cache
The buffer cache uses part of main memory to retain recently accessed disk blocks. It is implemented entirely in OS software and can cache inodes, data blocks for directories and files, indirect blocks, free bitmaps, etc. It is managed in units of blocks; the replacement policy is usually LRU. Many OSes now unify the buffer cache and the VM page pool, in order to avoid double caching of memory-mapped files.
What happens when a block in the buffer cache is modified?
Synchronous writes: immediately write through to disk. Safe but slow.
Delayed writes: do not immediately write to disk, e.g., wait for a while. Fast but dangerous.
Protection
Protection: prevent accidental and malicious misuse.
Authentication: identify a responsible party (principal).
Authorization: determine what actions can be performed by a principal on an object
Access enforcement: combine authentication and authorization to control access
Authentication is typically done with passwords. Systems store one-way hash results instead of plaintext passwords, concatenated with random salts.
Once authenticated, e.g., user log in, the identity of the principal, e.g., user ID, is attached to every process executed under that login. After authentication, the authenticated identity of the principal must be protected from tampering, since other parts of the system rely on it.
In authorization, authorization information can be represented as an access matrix, with one row per principal and one column per object. Such a full matrix is too large to store directly, so we compress it in one of two ways:
Access control lists (ACL): Organized by columns. With each object, store info about which principals are allowed to perform what operations. Users can be organized into groups, and a single ACL entry for a group.
- Pros: all info close to object; easy to know who is allowed
- Cons: hard to store if principals cannot be grouped; hard to know everything one principal can access
Capabilities: Organized by rows. With each principal, indicate which objects may be accessed and in what ways. Capabilities must be unforgeable and revocable.
- Pros: easy to know all resources one can access; combined with access
- Cons: hard to know everyone allowed to access one object; capability list is long
Networking and Data Systems
Networking
Basic Network Functionalities:
- Delivery: deliver packets between any two hosts in the Internet without a large number of physical wires
- Reliability: tolerate packet losses
- Flow control and congestion control: avoid overflowing the receiver and avoid overflowing the routers
Deliver on Network
Multiplexing
Links (wires) are the basic building blocks of a network, but we cannot build direct links between all pairs of hosts. When sharing network resources among hosts, we need multiplexing, which is about efficiently utilizing the wires.
Packet switching: the source sends information as self-contained packets that have an address. The source may have to break a single message into multiple packets; each packet travels independently to the destination host. Switches use the address in the packet to determine how to forward it.
Addressing
Addresses
Network (interface) card/controller (NIC): hardware that physically connects a computer to the network.
- MAC address: 48-bit unique identifier assigned by the card vendor
- IP address: 32-bit (or 128-bit for IPv6) address assigned by a network administrator or dynamically when the computer connects to the network
Use IP address for routing.
Connection: communication channel between two processes. Each endpoint is identified by a port number: 16-bit identifier assigned by app or OS. Globally, an endpoint is identified by (IP address, port number).
- DNS (Domain Name System): domain name to IP address
- ARP (Address Resolution Protocol): IP address to MAC address
LANs and WANs
How to find the destinations in networks?
Local Area Network (LAN): all hosts can share the same physical communication medium (wire).
- Hubs: forward from one wire to all the others.
- Switches: forward frames only to their intended recipients. A "learning switch" remembers which MAC address is connected to which port.
Wide Area Network (WAN): network that covers a broad area (e.g., city, state, country, entire world). WAN connects multiple datalink layer networks (LANs). Datalink layer networks are connected by routers.
- Routers: forward each packet received on an incoming link to an outgoing link based on packet’s destination IP address (towards its destination). Packets are buffered before being forwarded. Record the mapping between IP address and the output link in the forwarding table.
- read the IP destination address of the packet
- consults its forwarding table to get the output port
- forwards packet to corresponding output port
Reliability and flow control
Parameters of a communication channel:
- Latency: how long does it take for the first bit to reach destination
- Capacity: how many bits/sec can we push through? (often termed “bandwidth”)
- Jitter: how much variation in latency?
- Loss / reliability: can the channel drop packets?
Common problems:
- Lost. Solution: Timeout and Retransmit.
- Corrupted. Solution: Add a checksum.
- Out of order. Solution: Add Sequence Numbers.
- Overloaded. Solution: Buffering and Congestion Control.
Packet delay consists of:
- Propagation delay on each link, proportional to the length of the link.
- Transmission delay on each link, proportional to the packet size and 1/(link speed).
- Processing delay on each router, depends on the speed of the router.
- Queuing delay on each router, depends on the traffic load and queue size.
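For example, with assumed numbers, a 1500-byte packet sent over a 100 Mbps link that is 2000 km long (propagation speed about 2×10^8 m/s) has a transmission delay of 1500 × 8 / 10^8 s = 0.12 ms and a propagation delay of 2×10^6 / 2×10^8 s = 10 ms, before any processing or queuing delay at routers.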
How to design reliable protocol?
Stop & Wait protocol: reliability is accomplished by using acknowledgements (ACKs) and timeouts.
TCP protocol: send multiple packets before waiting for an ACK.
- Retransmit lost packets: keep a sequence number on each packet
- Figure out the sending rate by AIMD (additive increase, multiplicative decrease): slow start until overflow is detected, then slow down quickly, and increase slowly until overflow again (a toy sketch of the window update follows).
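A toy sketch of the AIMD window update only; the loss condition and constants are invented, slow start is omitted, and this is not the real TCP state machine:

#include <stdio.h>
#include <stdbool.h>

int main(void) {
    double cwnd = 1.0;                 // congestion window, in packets
    for (int rtt = 0; rtt < 20; rtt++) {
        bool loss = (cwnd > 16.0);     // pretend the network drops above 16
        if (loss)
            cwnd = cwnd / 2.0;         // multiplicative decrease on congestion
        else
            cwnd = cwnd + 1.0;         // additive increase per RTT
        printf("rtt %2d: cwnd = %.1f\n", rtt, cwnd);
    }
    return 0;
}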
Layering
Different applications (Skype, SSH, NFS, HTTP) must run over many different transmission media. We need layering to adapt any application to any medium.
Layering is a modular approach to networking. It introduces intermediate layers that provide a set of abstractions for various network functionalities and technologies. Each layer relies solely on services from the layer below and exports services solely to the layer above. The interface between layers defines their interaction and hides implementation details.
A protocol is a set of rules and formats that specify the communication between network elements. It specifies how a layer is implemented between machines.
OSI Layering Model: seven layers as follows
Application \(\to\) Presentation \(\to\) Session \(\to\) Transport \(\to\) Network \(\to\) Datalink \(\to\) Physical
Internet Protocol (IP): only five layers, the functionalities of the missing layers (i.e., Presentation and Session) are provided by the Application layer.
Physical Layer
Service: move information between two systems connected by a physical link
Interface: specifies how to send and receive bits
Protocol: coding scheme used to represent a bit, voltage levels, duration of a bit
Datalink Layer
Service: Enable end hosts to exchange frames (atomic messages) on the same physical line or wireless link
Interface: send frames to other end hosts; receive frames addressed to end host
Protocols: Addressing ARP; Media Access Control (MAC). Each frame has a header which contains a source and a destination MAC address, use broadcast ARP to address.
Network Layer
Service: Deliver packets to specified network (IP) addresses across multiple datalink layer networks
Interface: send packets to specified network address destination; receive packets destined for end host
Protocols: define network addresses (globally unique); construct forwarding tables; packet forwarding
Examples: IP (Internet Protocol) is the network layer of the Internet. The service it provides: "best-effort" packet delivery.
Transport Layer
Service: Port (end-to-end communication between processes); Multiple processes on the same hosts communicating simultaneously
Interface: send message to specific process at given destination; local process receives messages sent to it
Protocol: port numbers, perhaps implement reliability, flow control, packetization of large messages, framing
Examples: TCP (Reliable, in-order delivery) and UDP (Datagram service, No-frills extension of “best-effort” IP).
Application Layer
Service: any service provided to the end user
Interface: depends on the application
Protocol: depends on the application
Examples: Skype, SMTP (email), HTTP (Web), BitTorrent…
Summary
The lower three layers are implemented everywhere; ideally, the top two layers are implemented only at hosts.

The Internet Hourglass: There is just one network-layer protocol, IP. The “narrow waist” facilitates interoperability.

End-to-End Argument: If you have to implement a function end-to-end anyway, don’t implement it inside the communication system.
Remote procedure calls
Communication modes:
- Clients and servers
- Peer to peer: each host is both server and client
A protocol is an agreement on how to communicate, and includes:
- Syntax: how a communication is specified and structured (format, order in which messages are sent and received)
- Semantics: what a communication means (actions taken when transmitting, receiving, or when a timer expires)
IPC mechanisms (Socket API) share raw bytes. Can be wrapped into an RPC as the underlying communication mechanism.
Remote Procedure Call (RPC): Encapsulates network communication and simulates local function calls. Looks like a local procedure call on client. Translated automatically into a procedure call on remote machine (server).
Marshalling involves converting values to a canonical form, serializing objects, copying arguments passed by reference, etc. Client and server use “stubs” to glue pieces together. Also called “serialize” / “de-serialize”.
Databases and Transactions
Databases and Queries
A data model is a collection of entities and their relationships.
A schema is an instance of a data model.
A relational data model organizes data into tables (relations) with rows and columns, where each table's schema defines its structure and fields.
A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases. A DBMS provides: Data Definition Language (DDL) and Data Manipulation Language (DML), guarantees about durability, concurrency, semantics, etc. System handles query plan generation & optimization; ensures correct execution.
Transactions
A transaction is an atomic sequence of database actions (reads/writes). It takes DB from one consistent state to another.
The ACID properties of Transactions:
- Atomicity: all actions in the transaction happen, or none happen.
- Consistency: transactions maintain data integrity, e.g.,
  - balance cannot be negative;
  - cannot reschedule a meeting on February 30.
- Isolation: execution of one transaction is isolated from that of all others; no problems from concurrency.
- Durability: if a transaction commits, its effects persist despite crashes.
By Log
The log is stored in non-volatile storage (Flash or disk) and is used to seal the commitment to a whole series of actions. For example, creating a file:
- Start a transaction.
- Write the metadata changes to the log (not yet applied to disk): write the free map (i.e., mark blocks used); write the inode entry to point to the block(s); write the directory entry to point to the inode.
- Commit the transaction.
- "Redo log": later, first look in the log, then copy the changes to the actual file system.
- Free the log space.
When the system crashes, upon recovery: scan the log; detect transaction starts with no commit; discard those log entries.
Cons: expensive, since all data are written twice.
By Lock
Serial schedule: A schedule that does not interleave the operations of different transactions.
Equivalent schedules: For any storage/database state, the effect (on storage/database) and output of executing the first schedule is identical to the effect of executing the second schedule.
Serializable schedule: A schedule that is equivalent to some serial execution of the transactions
Goals of transaction scheduling: find a serializable schedule that maximizes concurrency.
Anomalies with interleaved execution:
- Write-write conflict (overwriting uncommitted data)
- Read-Write conflict (Unrepeatable reads)
- Write-read conflict (Reading uncommitted data)
Two operations conflict if they belong to different transactions, are on the same data, and at least one of them is a write.
Two schedules are conflict equivalent iff they involve same operations of same transactions and every pair of conflicting operations is ordered the same way.
Schedule \(S\) is conflict serializable if \(S\) is conflict equivalent to some serial schedule.
If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable.
A dependency graph is a graph with transactions as nodes; there is an edge from \(T_i\) to \(T_j\) if an operation of \(T_i\) conflicts with an operation of \(T_j\) and \(T_i\)'s operation appears earlier in the schedule.
Thm. Schedule is conflict serializable if and only if its dependency graph is acyclic.
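For example, in the schedule R1(A), W2(A), W1(A): R1(A) conflicts with W2(A) (edge \(T_1 \to T_2\)) and W2(A) conflicts with W1(A) (edge \(T_2 \to T_1\)), so the dependency graph has a cycle and the schedule is not conflict serializable.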
We know that Conflict Serializability \(\Rightarrow\) Serializability but Serializability \(\not\Rightarrow\) Conflict Serializability. Actually, deciding whether a schedule is serializable (not conflict-serializable) is \(\mathsf{NP}\)-complete.
Locks are used to control access to data. There are two types of locks:
- Shared (S) lock: multiple concurrent transactions are allowed to operate on the data;
- Exclusive (X) lock: only one transaction can operate on the data at a time.
Two-Phase Locking (2PL):
- Each transaction must obtain:
  - an S (shared) or X (exclusive) lock on data before reading,
  - an X (exclusive) lock on data before writing
- A transaction cannot request additional locks once it releases any lock
Thus, each transaction has a "growing phase" followed by a "shrinking phase".
Thm. 2PL-compatible schedules are conflict serializable.
Distributed Systems
Distributed Systems overview
A Distributed System is a collection of independent computers that appears to its users as a single coherent system. Features:
- No shared memory: message-based communication
- Each machine runs its own local OS
- Heterogeneity
A distributed system can be organized using middleware: the middleware layer runs on all machines and offers a uniform interface to the system.

But coordination is more difficult. It has:
- Worse availability: depends on every machine being up
- Worse reliability: can lose data if any machine crashes
- Worse security: anyone in the world can break into the system
Goals:
- Resource sharing
- Transparency
- Openness
- Scalability
Remote procedure calls: Looks like a local procedure call on client; Translated automatically into a procedure call on remote machine (server). Use marshalling to converting values to a canonical form, serializing objects, copying arguments passed by reference, etc.
With RPC, a failure within a procedure call may mean the remote machine crashed while the local one continues working. Performance, however: cost of a local procedure call << same-machine RPC << network RPC.
Network File System (NFS) and Andrew File System (AFS)
File access consistency
Method | Comment |
---|---|
UNIX semantics | Every operation on a file is instantly visible to all processes |
Session semantics | No changes are visible to other processes until the file is closed |
Immutable files | No updates are possible; simplifies sharing and replication |
Transactions | All changes occur atomically |
In UNIX local file system, concurrent file reads and writes have “sequential” consistency semantics: Each file read/write from user-level app is an atomic operation; Each file write is immediately visible to all file readers. Such "One Copy Semantics" is provided by neither NFS nor AFS.
Network File System
Distributed file systems require transparent access to files stored on a remote disk.
Intuitively, we could use RPC so that EVERY read and write gets forwarded to the server, but the performance is bad. Instead, consider caching to reduce network load: if open/read/write/close can be done locally, no network traffic is needed. Although this is fast, it may lead to cache consistency problems.
Write-through cache and NFS protocol
Write-through cache. With a write-through caching policy, the system's processor writes data to the cache first and then immediately copies that new cache data to the corresponding memory or disk. The application that's working to produce this data holds execution until the new data writes to storage are completed. Write-through caching ensures data is always consistent between cache and storage, though application performance could see a slight penalty because the application must wait for longer input/output operations to memory or even to disk.
Network File System protocol: the client polls the server periodically to check for changes. It has weak consistency: if multiple clients write to the same file, the result is completely arbitrary.
Failures Handling
NFS servers are stateless; each request provides all arguments required for execution.
Idempotent: performing a request multiple times has the same effect as performing it exactly once.
With stateless + idempotent, if the server crashes and is restarted, requests can continue where they left off (in many cases).
Failure model: transparent to the client system. Options for handling failures: hang until the server comes back up, or return an error.
Andrew File System
Andrew File System: a global distributed file system. The goal is to minimize or eliminate work per client operation.
Assumptions: client machines have disks; write/write and write/read sharing are rare; access from the local disk is faster and cheaper than over the LAN.
AFS Cell/Volume Architecture:
- Cells correspond to administrative groups
- Cells are broken into volumes (miniature file systems): one user's files, project source tree, ...
- Client machine has cell-server database: protection server, volume location server
Write-back cache and AFS protocol
AFS caching: More aggressive caching, caches on disk in addition to just in memory
- Prefetching (On open, AFS gets entire file from server, making later ops local & fast).
- Callbacks (Clients register with server that they have a copy of file; server tells them: “Invalidate!” if the file changes)
Session semantics: A file write is visible to processes on the same box immediately, but not visible to processes on other machines until the file is closed. When a file is closed, changes are visible to new opens, but are not visible to “old” opens. Last writer (i.e. the last close()) wins.
Write-back cache. With a write-back caching policy, the processor writes data to the cache first, and then application execution is allowed to continue before the cached data is copied to main memory or disk. Cache contents are copied to storage later as a background task -- such as during idle processor cycles. This allows the processor to continue working on the application sooner and helps to improve application performance. However, this also means there could be brief time periods when application data in cache differs from data in storage, which raises a possibility of application data loss in the event of application disruption or crashes. For example, in AFS, dirty data are buffered at the client machine until file close, then flushed back to server, which leads the server to send “break callback” to other clients.
Failures Handling
If the server crashes, it loses all callback state! It must reconstruct callback information from the clients: go ask everyone "who has which files cached?"
If a client crashes, it may have missed callbacks! It must revalidate any cached content before using it.
Disconnected operation (Coda): “Optimistic Replica Control”
- Hoarding: Whenever the client is connected to the network, it is in hoarding mode. In this mode, the client does aggressive caching, that is, hoard useful data for disconnection. Need to balance the needs of connected and disconnected operation.
- Emulation: Whenever the client is disconnected, it makes a transition into the emulation mode. In this mode, all the file requests are serviced by the local cache and the version vectors get updated. The local cache file acts just like the actual file. Attempts to access files that are not in the client caches appear as failures to application. All changes are written in a persistent log, the client modification log.
- Reintegration: on reconnection, the client goes into reintegration mode. Resynchronization and version vector comparisons are made in this mode. Only write/write conflicts matter, and in practice they rarely happen.
Practical Skills
Command line
wc -l # Count the number of lines
ps # Show the status of current process
tail –f # Track file changes in real time
sort -nk1,1 # -n: numeric order; -k1,1: sort by the first column only (start,stop of the key)
uniq # Works only on sorted data
sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' # Second capturing group is the username
$0 # Name of the script
$1 to $9 # Arguments to the script. $1 is the first argument and so on.
$@ # All the arguments
$# # Number of arguments
$? # Return code of the previous command (argument to the exit() function in C)
$$ # Process identification number (PID) for the current script
!! # The entire previous command line
Permissions: Owners/Groups/Others
drwxr-xr-x #d: directory. rwx:read/write/execute
stdin 0 / stdout 1 / stderr 2
ls -l / | tail -n1 # pipe the output of one command into another
Ctrl-C = SIGINT
Ctrl-\ = SIGQUIT
kill -TERM # sends SIGTERM
Ctrl-Z = SIGTSTP (terminal stop)
# Find files by extension:
find root_path -name '*.ext'
# Find files matching multiple path/name patterns:
find root_path -path '**/path/**/*.ext' -or -name '*pattern*'
# Find directories matching a given name, in case-insensitive mode:
find root_path -type d -iname '*lib*'
# Find files matching a given pattern, excluding specific paths:
find root_path -name '*.py' -not -path '*/site-packages/*'
# Find files matching a given size range:
find root_path -size +500k -size -10M
# Run a command for each file (use `{}` within the command to access the filename):
find root_path -name '*.ext' -exec wc -l {} \;
# Find files modified in the last 7 days and delete them:
find root_path -daystart -mtime -7 -delete
# Find empty (0 byte) files and delete them:
find root_path -type f -empty -delete
Regular expressions
. means “any single character” except newline
* zero or more of the preceding match
+ one or more of the preceding match
? zero or one of the preceding match
a{3} matches three a’s
[abc] any one character of a, b, and c
(RX1|RX2) either something that matches RX1 or RX2
^ the start of the line
$ the end of the line
\d{k} matches k digits