Friday, October 23, 2009

Linux File System Internal Structure

To ease the management of files, a file system logically divides the disk into small units called blocks. A block is the smallest unit that can be allocated, and each block in the file system is either allocated or free.

Some file systems also support the concept of fragments, which are sub-blocks: a block can be split into smaller pieces so that a small file does not have to occupy a whole block.
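The block size of a mounted file system can be inspected from user space. Here is a minimal Python sketch, assuming a Unix-like system where os.statvfs is available; the path "." is only an example. On many Linux file systems the reported fragment size simply equals the block size.

```python
import os

# Query the file system that holds the current directory (example path).
info = os.statvfs(".")

block_size = info.f_bsize      # preferred file system block size, in bytes
fragment_size = info.f_frsize  # fragment (fundamental block) size, in bytes
total_blocks = info.f_blocks   # total size, counted in fragment-size units
free_blocks = info.f_bfree     # units that are currently free

print("block size:   ", block_size)
print("fragment size:", fragment_size)
print("total blocks: ", total_blocks)
print("free blocks:  ", free_blocks)
```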

File systems such as ext2 and ext3 group a fixed number of sequential blocks together into units called block groups.


What are blocks used for?


Blocks are used for two different purposes:
1. Most blocks store user data (files).
2. Some blocks in every file system store the file system’s metadata.

Metadata describes the structure of the file system. The most common metadata structures are the boot block, the superblock and the inode. The following paragraphs describe each of them.

The Boot Block
The boot block is usually a part of the disk label, a special set of blocks containing information on the disk layout. The boot block holds the loader to boot the operating system.

Superblock
File systems come in different types, such as ext2 and ext3. Each file system also has a size, such as 5 GB or 10 GB, and a status, such as its mount state. All of this is recorded in the superblock, which contains information about the file system such as:
· File system type
· Size
· Status
· Information about other metadata structures

If this information is lost, you are in trouble (data loss), so ext2 and ext3 keep multiple redundant copies of the superblock in every file system. This is very important in emergency situations: for example, you can use a backup copy to restore a damaged primary superblock.

The following command displays the primary and backup superblock locations on /dev/hda2:

dumpe2fs /dev/hda2 | grep -i superblock
Primary superblock at 0, Group descriptors at 1-3
Backup superblock at 32768, Group descriptors at 32769-32771
Backup superblock at 98304, Group descriptors at 98305-98307
Backup superblock at 163840, Group descriptors at 163841-163843
Backup superblock at 229376, Group descriptors at 229377-229379
Backup superblock at 294912, Group descriptors at 294913-294915
Backup superblock at 819200, Group descriptors at 819201-819203
Backup superblock at 884736, Group descriptors at 884737-884739
Backup superblock at 1605632, Group descriptors at 1605633-1605635
Backup superblock at 2654208, Group descriptors at 2654209-2654211
Backup superblock at 4096000, Group descriptors at 4096001-4096003
Backup superblock at 7962624, Group descriptors at 7962625-7962627
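
The backup locations in the listing above follow a pattern. Here is a minimal Python sketch, assuming the file system uses the ext2/ext3 sparse_super feature (superblock copies are kept only in block groups 0, 1 and powers of 3, 5 and 7) and has 32768 blocks per group, which is what the listing suggests. The group count of 244 is a hypothetical value chosen only for illustration.

```python
def backup_groups(group_count):
    """Return the block groups that hold a superblock copy under sparse_super."""
    groups = {0, 1}
    for base in (3, 5, 7):
        power = base
        while power < group_count:
            groups.add(power)
            power *= base
    # Drop group 1 (or anything else) if the file system is too small.
    return sorted(g for g in groups if g < group_count)

BLOCKS_PER_GROUP = 32768  # assumption inferred from the dumpe2fs listing
GROUP_COUNT = 244         # hypothetical number of block groups

locations = [g * BLOCKS_PER_GROUP for g in backup_groups(GROUP_COUNT)]
for block in locations:
    kind = "Primary" if block == 0 else "Backup"
    print(f"{kind} superblock at {block}")
```

Under these assumptions the computed block numbers line up with the ones dumpe2fs reports.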

Inode

The inode (index node) is a fundamental concept in the Linux file system. Each object in the file system is represented by an inode.

But what are the objects? Let us try to understand it in simple words. Each and every file under Linux (and UNIX) has the following attributes:
=> File type (executable, block special etc)
=> Permissions (read, write etc)
=> Owner
=> Group
=> File Size
=> File access, change and modification time
=> File deletion time
=> Number of hard links
=> Extended attributes such as append-only or immutable (no one, including the root user, can delete the file)
=> Access Control Lists (ACLs)

All of the above information is stored in an inode. In short, the inode identifies the file and its attributes (as above). Each inode is identified by an inode number that is unique within the file system; the inode number is also known as the index number. An inode is the complete file except for its name, which lives in a directory entry, and its data. Then where does the data go?
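
Most of these attributes can be read back from the inode with a single system call. Here is a minimal Python sketch using os.stat; the file name 1.c and its 80-byte content are made up for the example, and the file is created in a temporary directory.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "1.c")  # hypothetical file name
    with open(path, "w") as handle:
        handle.write("x" * 80)           # 80 bytes of dummy content

    info = os.stat(path)                 # reads the attributes kept in the inode
    inode_number = info.st_ino
    file_size = info.st_size
    link_count = info.st_nlink
    mode_string = stat.filemode(info.st_mode)  # file type and permissions

    print("inode:      ", inode_number)
    print("type/perms: ", mode_string)
    print("owner/group:", info.st_uid, info.st_gid)
    print("size:       ", file_size)
    print("hard links: ", link_count)
    print("modified:   ", time.ctime(info.st_mtime))
```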

Data block
Data blocks contain the file data. Since inodes are of fixed size, there is an upper limit to the number of data blocks that can be listed directly in the inode. When a file needs more blocks than the inode can list directly, the additional block numbers are stored in an indirect block, and the inode points to that indirect block. When one indirect block is no longer enough, the inode points to a double-indirect block, which holds a list of indirect blocks.
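
To make the direct, indirect and double-indirect scheme concrete, here is a small Python sketch of the arithmetic. The numbers are assumptions taken from the classic ext2 layout (12 direct pointers in the inode, 4-byte block numbers, 4096-byte blocks); other file systems and block sizes give different limits.

```python
def locate_block(index, direct=12, block_size=4096, pointer_size=4):
    """Say which level of the inode's block list covers logical block `index`."""
    per_block = block_size // pointer_size  # block numbers per indirect block
    if index < direct:
        return "direct"
    index -= direct
    if index < per_block:
        return "indirect"
    index -= per_block
    if index < per_block ** 2:
        return "double-indirect"
    return "beyond double-indirect"

def max_blocks(direct=12, block_size=4096, pointer_size=4):
    """Blocks reachable through direct, indirect and double-indirect pointers."""
    per_block = block_size // pointer_size
    return direct + per_block + per_block ** 2

for block_index in (0, 11, 12, 1035, 1036):
    print(block_index, "->", locate_block(block_index))
print("reachable bytes:", max_blocks() * 4096)
```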

A few Q & A on file systems:

How do you see the inode number of a file?
a. Using ls -i <file name>
[root@localhost raja]# ls -i 1.c
3928799 1.c
b. Using stat <file name>
[root@localhost raja]# stat 1.c
File: `1.c'
Size: 80 Blocks: 16 IO Block: 4096 regular file
Device: 302h/770d Inode: 3928799 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 501/ raja) Gid: ( 501/ raja)
Access: 2008-01-02 15:08:28.000000000 +0530
Modify: 2008-01-02 14:34:56.000000000 +0530
Change: 2008-01-02 16:56:52.000000000 +0530
One practical application of inode numbers is deleting files whose names contain special characters.
Let's say you created a file named "a*b, which contains the special characters " and *.
[root@localhost raja]# touch \"a*b
(Try deleting the file using rm command and see what happens?)
[root@localhost raja]# ls -il
3928808 -rw-r--r-- 1 root root 0 Jan 3 15:10 "a*b

Now, to delete the file "a*b, use the command below:
find . -inum 3928808 -exec rm -i {} \;

But where does this really help?
Some operating systems allow you to create files whose names contain characters that are awkward to type in a Linux shell. When you mount such disks in Linux, one way to delete those files is by using the inode number.
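
The same idea can be scripted. Below is a minimal Python sketch of deleting by inode number; remove_by_inode is a hypothetical helper written for this example, and unlike the find command it only looks at a single directory and does not ask for confirmation.

```python
import os
import tempfile

def remove_by_inode(directory, inode_number):
    """Delete every entry in `directory` whose inode number matches."""
    with os.scandir(directory) as scanner:
        entries = list(scanner)          # snapshot before deleting anything
    removed = []
    for entry in entries:
        if entry.inode() == inode_number:
            os.remove(entry.path)
            removed.append(entry.name)
    return removed

with tempfile.TemporaryDirectory() as workdir:
    awkward = os.path.join(workdir, '"a*b')  # name with special characters
    open(awkward, "w").close()

    target = os.stat(awkward).st_ino         # look the inode number up once
    removed = remove_by_inode(workdir, target)
    still_there = os.path.exists(awkward)
    print("removed:", removed)
```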

Why is it not possible to create hard links across file system boundaries?

A hard link to a file, say orig, can be created with the command ln orig link.
Now both names, orig and link, refer to the same inode number, say 3928808
(i.e. they both refer to the same inode and therefore to the same data blocks). This works when both orig and link are located in the same file system.

Inode numbers are unique only within a single file system. If orig is located in one file system (ext2) and you try to create the link in another file system (ext3), the new directory entry would have to store inode number 3928808, but in the ext3 file system that number refers to a different inode or to none at all. A directory entry has no way to say which file system an inode number belongs to, so the link is refused.
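
The shared inode is easy to observe. Here is a minimal Python sketch, assuming a file system that supports hard links; the names orig and link follow the example above, and both are created in the same temporary directory. When the two paths are on different file systems, os.link raises OSError with errno.EXDEV instead.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    orig = os.path.join(workdir, "orig")
    link = os.path.join(workdir, "link")
    with open(orig, "w") as handle:
        handle.write("hello\n")

    os.link(orig, link)          # equivalent to: ln orig link

    orig_info = os.stat(orig)
    link_info = os.stat(link)
    same_inode = orig_info.st_ino == link_info.st_ino
    same_device = orig_info.st_dev == link_info.st_dev
    link_count = orig_info.st_nlink  # counts both names

    print("same inode: ", same_inode)
    print("same device:", same_device)
    print("link count: ", link_count)
```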
