
🔗 Why Directory Hard Links Are Forbidden

Watch what happens when you try to create cycles in the filesystem


The Infinite Loop Problem

A filesystem's directory tree must remain a directed acyclic graph (DAG). If hard links to directories were allowed, they could create cycles that break traversal tools.

💡 Interactive Demo
Click "Create Directory Cycle" to set up the problematic structure, then "Run find Command" to watch it loop forever.

Technical Breakdown

Directory Entry Structure

Each directory entry contains a name and inode number. When you create a hard link to a directory, both entries point to the same inode.

dir-a (inode 1000)
├── dir-b (inode 2000)
└── link-to-a (inode 1000)   ← Same as dir-a!
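
The same pairing is visible from user space: readdir(3) exposes each entry's name and inode number. A minimal C sketch (plain POSIX, not part of the demo) that prints both; hard-linked entries show up with identical d_ino values:

/* Minimal sketch: list each directory entry's name and inode number.
 * Hard-linked entries report the same d_ino. */
#include <dirent.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    const char *path = (argc > 1) ? argv[1] : ".";
    DIR *dir = opendir(path);
    if (!dir) {
        perror("opendir");
        return 1;
    }

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
        printf("%-20s inode %lu\n", entry->d_name,
               (unsigned long)entry->d_ino);

    closedir(dir);
    return 0;
}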

How find Traversal Works

The find command uses depth-first search with these assumptions:

  • Start at root directory
  • Read directory entries sequentially
  • Recurse into subdirectories
  • Never revisit the same directory

With directory hard links, assumption #4 breaks. find descends into link-to-a as if it were a new directory, but it is actually re-entering dir-a.
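
A minimal sketch of such a traversal (a hypothetical walk() helper, not find's actual source; d_type is a Linux/BSD extension) makes the missing safeguard concrete: nothing records which inodes have already been visited, so a directory cycle would recurse until the path length or the stack runs out.

/* Naive depth-first traversal in the spirit of find.
 * Note: no record of visited inodes -- a directory cycle would
 * recurse until PATH_MAX or the stack is exhausted. */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

static void walk(const char *path)
{
    printf("%s\n", path);

    DIR *dir = opendir(path);
    if (!dir)
        return;

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the . and .. entries (see the next sections). */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        char child[PATH_MAX];
        snprintf(child, sizeof(child), "%s/%s", path, entry->d_name);

        if (entry->d_type == DT_DIR)   /* recurse into subdirectories */
            walk(child);
    }
    closedir(dir);
}

int main(int argc, char *argv[])
{
    walk(argc > 1 ? argv[1] : ".");
    return 0;
}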

Why No Cycle Detection?

The kernel doesn't track visited inodes during traversal, and cycle detection was never added, because:

  • Would require O(n) memory for every directory scan
  • Performance penalty on every filesystem operation
  • Tree structure (no cycles) is a fundamental assumption

The . and .. Exception

Every directory has a link count ≥ 2: one link from its entry in the parent directory and one from its own . entry, and each subdirectory's .. adds another. These are hard links! But they're special-cased:

// Traversal code explicitly skips these entries
if (strcmp(entry->d_name, ".") == 0 ||
    strcmp(entry->d_name, "..") == 0)
    continue;

User-created directory hard links can't be special-cased without knowing all possible names, making cycle detection impossible at the kernel level.


The Reference Counting Problem

Linux uses link counts to know when to free inodes. When a file's link count reaches zero, the kernel frees the data. Directory hard links create circular references that break this mechanism.

💡 Interactive Demo
Watch link counts update in real-time as you create the structure, then see what happens when you try to delete it.

Technical Breakdown

How Link Counting Works

Every inode maintains a link count (i_nlink in the kernel's inode structure, persisted on disk by filesystems such as ext4). The kernel updates this count on every hard link operation:

// Creating a hard link
link(source, target):
    inode = lookup_inode(source)
    inode->i_nlink++
    add_directory_entry(target, inode)
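
The effect of that pseudocode is observable from user space with link(2) and stat(2). A rough sketch, assuming a writable current directory and hypothetical file names:

/* Sketch: watch st_nlink change as a hard link is created. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static void show(const char *path)
{
    struct stat st;
    if (stat(path, &st) == 0)
        printf("%s: inode %lu, links %lu\n", path,
               (unsigned long)st.st_ino, (unsigned long)st.st_nlink);
}

int main(void)
{
    int fd = open("original", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) { perror("open"); return 1; }
    close(fd);

    show("original");                  /* links = 1 */

    if (link("original", "alias") != 0) { perror("link"); return 1; }
    show("original");                  /* links = 2, same inode as alias */
    show("alias");

    unlink("original");                /* clean up */
    unlink("alias");
    return 0;
}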

Deletion Process

When you delete a file, the kernel only removes the directory entry:

// Unlinking a file
unlink(path):
    inode = lookup_inode(path)
    remove_directory_entry(path)
    inode->i_nlink--
    if (inode->i_nlink == 0)
        free_inode(inode)   // Only then!
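
The unlink side, as a rough sketch with hypothetical names in a writable directory: the data outlives the first unlink because a second name still references the inode.

/* Sketch: unlink() removes a name; data persists until the last link goes. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "hello\n", 6);
    close(fd);

    link("data", "backup");           /* inode now has 2 links */

    unlink("data");                   /* remove one name: 2 -> 1 */

    struct stat st;
    if (stat("backup", &st) == 0)     /* still readable via "backup" */
        printf("backup: links %lu, size %lld\n",
               (unsigned long)st.st_nlink, (long long)st.st_size);

    unlink("backup");                 /* 1 -> 0: kernel frees the inode */
    return 0;
}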

The Circular Reference Problem

With directory hard links:

dir-a (inode 1000, links: 4)
├── . (self)
├── .. (parent reference)
└── dir-b
    └── link-to-a → inode 1000

rm -rf dir-a from the parent:
1. Remove the parent's entry for dir-a
2. links: 4 → 3 (., dir-b's .., and link-to-a remain)
3. links > 0, so DON'T free the inode
4. But dir-b is inside dir-a!
5. Can't reach dir-a to remove link-to-a
6. Orphaned: unreachable but not freed

Why This is Fatal

  • Filesystem leak: Inodes and data blocks permanently allocated
  • No recovery: Even fsck can't determine which directories are orphaned vs intentionally inaccessible
  • Cascading failure: As disk fills with orphaned directories, system becomes unstable

The . and .. Links

Directories already have link count ≥ 2:

mkdir dir-a:
    links = 2 (. from dir-a, entry from parent)

mkdir dir-a/dir-b:
    dir-a links = 3 (adds .. from dir-b)
    dir-b links = 2 (. and parent entry)

But . and .. are kernel-managed. User-created hard links add untracked circular references the kernel can't handle.
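
The counts above are easy to confirm with mkdir(2) and stat(2). A small sketch using hypothetical directory names; the exact values assume a traditional filesystem such as ext4 (btrfs, for example, reports directory link counts differently):

/* Sketch: observe how . and .. affect directory link counts. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static void show(const char *path)
{
    struct stat st;
    if (stat(path, &st) == 0)
        printf("%s: links %lu\n", path, (unsigned long)st.st_nlink);
}

int main(void)
{
    mkdir("dir-a", 0755);
    show("dir-a");             /* 2: entry in parent + its own .     */

    mkdir("dir-a/dir-b", 0755);
    show("dir-a");             /* 3: dir-b's .. now also points here */
    show("dir-a/dir-b");       /* 2 */

    rmdir("dir-a/dir-b");      /* clean up */
    rmdir("dir-a");
    return 0;
}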


Why Symlinks Work

Symbolic links can point to directories safely because they work fundamentally differently than hard links.

💡 Interactive Demo
Create the same loop structure with symlinks, then watch how find -L handles it. Note: The kernel doesn't "detect" the loop - it simply enforces a maximum depth of 40 symlink resolutions, which prevents infinite traversal.

Technical Differences

Symlinks are Separate Files

A symlink has its own inode and stores the target path as data:

// Hard link (forbidden for directories)
link("dir-a", "dir-b/link") → EPERM
    Both entries would point to the same inode

// Symbolic link (allowed)
symlink("../../dir-a", "dir-b/link")
    link has its own inode
    inode->data = "../../dir-a"
    dir-a's link count unchanged
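
A short sketch (writable current directory, hypothetical names) showing both halves: link(2) on a directory is rejected with EPERM, while symlink(2) succeeds and creates an inode of its own:

/* Sketch: hard-linking a directory fails; a symlink gets its own inode. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    mkdir("dir-a", 0755);

    /* Hard link to a directory: refused by the kernel. */
    if (link("dir-a", "hard-link") != 0)
        printf("link: %s\n", strerror(errno));    /* EPERM expected */

    /* Symbolic link: a separate inode whose data is the target path. */
    symlink("dir-a", "soft-link");

    struct stat dir_st, link_st;
    stat("dir-a", &dir_st);
    lstat("soft-link", &link_st);                 /* lstat: don't follow */
    printf("dir-a inode: %lu, soft-link inode: %lu\n",
           (unsigned long)dir_st.st_ino, (unsigned long)link_st.st_ino);

    unlink("soft-link");                          /* clean up */
    rmdir("dir-a");
    return 0;
}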

Path Resolution with Symlinks

The kernel tracks symlink resolution depth:

// In include/linux/namei.h (Linux kernel)
// Source: https://github.com/torvalds/linux/blob/master/include/linux/namei.h#L14
#define MAXSYMLINKS 40

resolve_symlink(path):
    depth = 0
    while is_symlink(path):
        if depth++ > MAXSYMLINKS:
            return -ELOOP   // Loop detected!
        path = readlink(path)
    return path
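
That limit is easy to trigger from user space; a sketch that builds a two-symlink cycle and lets open(2) run into it:

/* Sketch: a symlink cycle makes path resolution fail with ELOOP. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Two symlinks pointing at each other form a resolution cycle. */
    symlink("loop-b", "loop-a");
    symlink("loop-a", "loop-b");

    int fd = open("loop-a", O_RDONLY);
    if (fd < 0 && errno == ELOOP)
        printf("open: %s (gave up after ~40 resolutions)\n",
               strerror(errno));
    if (fd >= 0)
        close(fd);

    unlink("loop-a");        /* clean up */
    unlink("loop-b");
    return 0;
}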

Why find Works with Symlinks

Hard Links: Infinite Loop

// With hard link to directory (forbidden)
find dir-a:
    visit(dir-a)            // Enter dir-a
    visit(dir-b)            // Enter dir-b
    visit(link-to-a)        // Enter link-to-a: same inode as dir-a!
    visit(dir-a) AGAIN      // Infinite loop starts
    visit(dir-b) AGAIN
    visit(link-to-a) AGAIN
    ... forever ...

❌ HANGS FOREVER - No depth tracking

Symlinks: Graceful Detection

// With symlink (allowed)
find -L dir-a:
    visit(dir-a)         depth=0
    visit(dir-b)         depth=0
    visit(link-to-a)     depth=1 (symlink)
    visit(dir-a)         depth=1 (via symlink)
    visit(dir-b)         depth=1
    visit(link-to-a)     depth=2 (symlink)
    visit(dir-a)         depth=2 (via symlink)
    ...
    depth=40 → return -ELOOP

✅ Depth limit reached - Exits cleanly

Why This Works

  • Bounded depth: Maximum 40 symlink resolutions prevents infinite loops
  • Depth limiting, not loop detection: Kernel doesn't track visited paths - it simply counts symlink resolutions and stops at 40
  • Clear error: Returns -ELOOP (though the error message "loop detected" is somewhat misleading - it's really "depth limit exceeded")
  • Catches both loops and deep chains: A legitimate 50-link chain would also hit this limit
⚠️ Important Distinction: Depth Limiting vs Loop Detection

The loop exists from creation - the moment you create the symlink, the loop is there.

The kernel doesn't detect it - it doesn't track "I've been to dir-a before!"

Instead, it counts: depth = 1, 2, 3... 40. At 40, it stops with -ELOOP.

This means a legitimate 50-link chain (no loop) would also fail. The kernel prevents any deep symlink traversal, not just loops.
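
A sketch of that case: a straight chain of 50 symlinks, with no loop anywhere, still fails with ELOOP once resolution exceeds the 40-link budget (hypothetical chain-N names, writable current directory):

/* Sketch: a straight chain of 50 symlinks (no loop) also hits ELOOP,
 * because the kernel limits total symlink resolutions, not loops. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char name[32], target[32];

    /* chain-0 is a real file; chain-1 .. chain-50 each point one back. */
    int fd = open("chain-0", O_CREAT | O_WRONLY, 0644);
    close(fd);
    for (int i = 1; i <= 50; i++) {
        snprintf(name, sizeof(name), "chain-%d", i);
        snprintf(target, sizeof(target), "chain-%d", i - 1);
        symlink(target, name);
    }

    fd = open("chain-50", O_RDONLY);      /* must resolve 50 symlinks */
    if (fd < 0 && errno == ELOOP)
        printf("open chain-50: %s\n", strerror(errno));
    if (fd >= 0)
        close(fd);

    for (int i = 0; i <= 50; i++) {       /* clean up */
        snprintf(name, sizeof(name), "chain-%d", i);
        unlink(name);
    }
    return 0;
}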

Why rm Works with Symlinks

Hard Links: Orphaned Directories

// With hard link (forbidden)
dir-a (inode 1000, links: 4)
├── . (self)
├── .. (parent)
├── dir-b (which has ..)
└── dir-b/link-to-a → inode 1000

rm -rf dir-a:
    remove_entry("dir-a")
    dir-a links: 4 → 3 (., link-to-a, and dir-b/.. remain)
    if links == 0: free_inode()   // NOT EXECUTED

❌ dir-a NOT freed (links = 3)
❌ dir-b still inside dir-a
❌ Can't reach link-to-a to remove it
❌ ORPHANED: Unreachable but not freed
❌ FILESYSTEM LEAK

Symlinks: Clean Deletion

// With symlink (allowed)
dir-a (inode 1000, links: 2)
├── . (self)
└── .. (parent)

link-to-a (inode 5000, separate!)
    data = "../../dir-a"

rm link-to-a:
    inode = lookup("link-to-a")
    remove_entry("link-to-a")
    inode 5000->i_nlink--          // Only the symlink's inode!
    if inode->i_nlink == 0:
        free_inode(5000)           // Symlink freed

✅ link-to-a freed completely
✅ dir-a completely unaffected
✅ No circular reference
✅ No leak
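
A sketch of that clean-deletion path (hypothetical names, writable current directory): removing the symlink touches only the symlink's own inode, and the target directory's link count never changes.

/* Sketch: deleting a symlink never touches the target's link count. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static unsigned long links(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (unsigned long)st.st_nlink : 0;
}

int main(void)
{
    mkdir("dir-a", 0755);
    symlink("dir-a", "link-to-a");

    printf("before rm: dir-a links = %lu\n", links("dir-a"));  /* e.g. 2 */

    unlink("link-to-a");             /* frees only the symlink's inode */

    printf("after  rm: dir-a links = %lu\n", links("dir-a"));  /* unchanged */

    rmdir("dir-a");                  /* clean up */
    return 0;
}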

Key Differences

  • Independent inodes: Symlink has its own inode (5000), separate from target (1000)
  • No link count impact: Deleting symlink doesn't touch target's link count
  • Path-based reference: Symlink stores path string, not inode pointer
  • Clean separation: Removing a symlink is just removing one file, period
⚠️ Critical Insight
Hard links create shared inode ownership (circular refs). Symlinks create path references (no ownership). This fundamental difference makes symlinks safe for directories.