
Automatic recovery

At startup, ddumbfs searches for the file .autofsck. The presence of this file means that ddumbfs did not shut down properly and that the filesystem must be checked. ddumbfs then starts a check similar to the fsckddumbfs command with option -r. This check is fast and appropriate for this situation.
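The marker-file test described above can be sketched in a few lines of Python. This is only an illustration of the idea, not ddumbfs code: the function name `needs_check` and the constant `AUTOFSCK_MARKER` are hypothetical, and only the file name .autofsck comes from the text.

```python
import os
import tempfile

AUTOFSCK_MARKER = ".autofsck"  # file name taken from the text; constant name is hypothetical


def needs_check(parent_dir):
    """Return True when the marker left by an unclean shutdown is present."""
    return os.path.exists(os.path.join(parent_dir, AUTOFSCK_MARKER))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        print(needs_check(parent))  # directory is empty, no marker yet
        open(os.path.join(parent, AUTOFSCK_MARKER), "w").close()
        print(needs_check(parent))  # marker file now exists
```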

Manual recovery

ddumbfs is released with the powerful fsckddumbfs. This tool does everything possible to detect and repair errors. Errors that cannot be repaired are reported to the user through the special file .ddumbfs/corrupted.txt.

Information is stored in three main containers:

  • the index: This is a sparse hash table made of nodes. Each node is composed of the position and the hash of a block in the block file. The index also manages a list of free blocks.
  • the files: Each file is the sequence of nodes for the blocks that compose a user file. The nodes in a file are identical to the nodes in the index. Information in these nodes can be used to rebuild the index without accessing the block file.
  • the block file: This is where blocks are stored one after the other.

These structures hold a lot of redundant information. This redundancy helps to improve the access speed and the reliability of the system. fsckddumbfs cross-checks all this information to detect anomalies and fix them. The index can be quickly and completely rebuilt from the file nodes.

fsckddumbfs has three main modes of operation:
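To illustrate why the index can be rebuilt from the file nodes alone, here is a minimal conceptual sketch in Python. It is not the fsckddumbfs implementation: the `Node` type, the `rebuild_index` function and the use of a dictionary as the index are assumptions made for the example. The only idea taken from the text is that every node stores both the address and the hash of a block.

```python
from collections import namedtuple

# A node pairs a block address with the hash of that block (as described above).
Node = namedtuple("Node", ["addr", "hash"])


def rebuild_index(files):
    """Rebuild a hash -> address mapping using only the nodes stored in files.

    `files` maps a file name to its list of nodes. The block file is never read.
    Returns the mapping and a list of (file name, node) pairs whose address
    disagrees with an address already recorded for the same hash.
    """
    index = {}
    conflicts = []
    for name, nodes in files.items():
        for node in nodes:
            known = index.setdefault(node.hash, node.addr)
            if known != node.addr:
                # Same hash seen at two addresses: flag it for a deeper check.
                conflicts.append((name, node))
    return index, conflicts


if __name__ == "__main__":
    files = {
        "file1": [Node(2, b"aa"), Node(5, b"bb")],
        "file2": [Node(2, b"aa"), Node(9, b"bb")],
    }
    index, conflicts = rebuild_index(files)
    print(index)
    print(conflicts)
```

A conflict in this toy model is the kind of anomaly that only a pass over the block file itself could settle, which is presumably what the heavy modes are for.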

  • check: check in read only mode.
  • repair: check and repair the existing index.
  • rebuild: erase the index and recreate a new one using data from file nodes or from the block file itself.

Each of these modes has a light (lowercase) and a heavy (uppercase) version. The heavy version reads all the blocks of the block file to check, repair or rebuild the index.

Read the very complete fsckddumbfs manual for more information.

How it works

I have simulated an unexpected shutdown to show how ddumbfs handles this situation.

I was copying three files of 1 GB each to the filesystem when I powered down the virtual machine.

After a reboot, the Linux distribution checked and mounted all conventional filesystems, including the underlying filesystem used by ddumbfs.

I now mount the crashed filesystem and explain the lines of the log:

  [root@cos6-x64 ddumbfsC6]# src/ddumbfs /ddumbfs/ -o parent=/l0/ddumbfs/
    file_header_size 16
    hash TIGER
    hash_size 24
    block_size 4096
    index_block_size 4096
    node_overflow 1.30
    reuse_asap 0
    partition_size 10737418240
    block_count 2621440
    addr_size 3
    node_size 27
    node_count 3408630
    node_block_count 22469
    freeblock_offset 4096
    freeblock_size 327680
    node_offset 331776
    index_size 92364800
    index_block_count 22550
    root_directory ddfsroot
    block_filename /dev/sdb3
    index_filename /dev/sdb2
hash: TIGER
direct_io: 1 enable
writer pool: 2 cpus
root directory: /l0/ddumbfs/ddfsroot
blockfile: /dev/sdb3
indexfile: /dev/sdb2
index locked into memory: 88.1Mo
09:49:11 INF check filesystem /l0/ddumbfs

ddumbfs has detected the file .autofsck in the parent directory and automatically starts the appropriate filesystem check.
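The parameters printed at mount time are related to each other by simple arithmetic. The sketch below checks relationships that hold for the values in the log above. They are observations about this particular log, not necessarily the formulas ddumbfs uses internally; in particular, reading the free-block area as one bit per block is an assumption, and the function name `check_layout` is hypothetical.

```python
# Values copied from the mount log above.
params = {
    "hash_size": 24, "block_size": 4096, "index_block_size": 4096,
    "partition_size": 10737418240, "block_count": 2621440,
    "addr_size": 3, "node_size": 27, "node_count": 3408630,
    "node_block_count": 22469, "freeblock_offset": 4096,
    "freeblock_size": 327680, "node_offset": 331776,
    "index_size": 92364800, "index_block_count": 22550,
}


def check_layout(p):
    """Return a dict of named consistency checks (True means the check holds)."""
    node_area = p["node_block_count"] * p["index_block_size"]
    return {
        "block_count": p["block_count"] == p["partition_size"] // p["block_size"],
        "node_size": p["node_size"] == p["addr_size"] + p["hash_size"],
        # Every block address must fit in addr_size bytes.
        "addr_fits": p["block_count"] < 256 ** p["addr_size"],
        # Assumption: the free-block area is a bitmap, one bit per block.
        "freeblock_size": p["freeblock_size"] == p["block_count"] // 8,
        "node_offset": p["node_offset"] == p["freeblock_offset"] + p["freeblock_size"],
        "node_count": p["node_count"] == node_area // p["node_size"],
        "index_size": p["index_size"] == p["node_offset"] + node_area,
        "index_block_count": p["index_block_count"] == p["index_size"] // p["index_block_size"],
    }


if __name__ == "__main__":
    for name, ok in check_layout(params).items():
        print(name, ok)
```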

09:49:12 INF Repair node order, fixed 0 errors.

All nodes in the index are at the expected place. A crash should not disturb the node order, but all further tests expect the index to be consistent. Because the index has not been flushed, some data can be on the filesystem but not in the index. The autocheck and its manual equivalent fsckddumbfs read hashes from all files and update the index when possible. Here, a lot of blocks have to be hashed to update the index.

09:49:12 INF Update index from files.
09:49:13 INF calculate hash for block addr=2
09:49:13 INF calculate hash for block addr=1625
09:49:13 INF calculate hash for block addr=1628
09:49:13 INF Read 3 files in 1.0s.
09:49:13 INF 1478 blocks used in files.
09:49:13 INF 1103 blocks have been added to index.

fsckddumbfs has registered 1103 blocks that were referenced by files but not yet in the index. To increase performance, the index is held in memory or in cache and is not flushed to disk too often. This is not a problem, because the data can be recovered from the files themselves.

09:49:13 INF ddfs_load_usedblocks
09:49:14 INF Check also recently added blocks: 1668.

On the other hand, some blocks could have been added to the index and referenced by files but not yet written to the block file. There are 1668 blocks that have been added since the last checkpoint, and all of them must be checked. Regular checkpoints limit the number of blocks to check at reboot time:

09:49:14 INF 1670 blocks used in nodes.
09:49:14 INF 1668 suspect blocks in nodes.
09:49:14 INF Resolve Index conflicts.
09:49:14 INF 0 nodes fixed.

Everything was OK; the index is now clean and should match what is in the block file.

fsckddumbfs now checks and updates file consistency.

09:49:14 INF Fix files.
09:49:14 WAR F s  /l0/ddumbfs/ddfsroot/file1
09:49:14 WAR F s  /l0/ddumbfs/ddfsroot/file3
09:49:14 WAR  Cs  /l0/ddumbfs/ddfsroot/file2
09:49:14 INF Fixed:2  Corrupted:1  Total:3 files in 0.0s.

Two files had an invalid size, but fsckddumbfs has Fixed the problem. The last file is Corrupted and also had a bad size. The size problem can be fixed, but the file is now known as Corrupted; it is examined further below.

09:49:14 INF Deleted 193 useless nodes.

Some blocks were registered in the index but not yet written nor used by files. Such useless nodes can also come from a previous file deletion, because the index is not updated when files are deleted; to free the space you must start a reclaim procedure. These nodes have been removed, leaving the following totals:

09:49:14 INF blocks in use: 1477   blocks free: 2619963.

This line is self-explanatory.

Now take a look at what we can see on the filesystem. The files are smaller than 1 GB, as expected.

[root@cos6-x64 ddumbfsC6]# ll /ddumbfs/
total 6056
-rw-r--r--. 1 root root 2015232 Oct 14 09:49 file1
-rw-r--r--. 1 root root 2015232 Oct 14 09:49 file2
-rw-r--r--. 1 root root 2170880 Oct 14 09:49 file3

For a summary of which files are corrupted, look at the corrupted.txt file, which shows the same status lines as seen above.

[root@cos6-x64 ddumbfsC6]# cat /ddumbfs/.ddumbfs/corrupted.txt
F s  /l0/ddumbfs/ddfsroot/file1
F s  /l0/ddumbfs/ddfsroot/file3
 Cs  /l0/ddumbfs/ddfsroot/file2
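If you want to process these status lines in a script, the following Python sketch shows one way to do it. The format is inferred only from the sample above: everything before the first "/" is treated as the flag field, with F read as Fixed, C as Corrupted and s as a size problem. The function name `parse_status_line` is hypothetical, and the real format may contain flags not shown here.

```python
def parse_status_line(line):
    """Split a status line into its flags and path.

    Assumption: paths are absolute, so the flag field is everything
    before the first '/' character.
    """
    flags, sep, rest = line.partition("/")
    return {
        "fixed": "F" in flags,
        "corrupted": "C" in flags,
        "size_problem": "s" in flags,
        "path": sep + rest.rstrip(),
    }


if __name__ == "__main__":
    sample = [
        "F s  /l0/ddumbfs/ddfsroot/file1",
        "F s  /l0/ddumbfs/ddfsroot/file3",
        " Cs  /l0/ddumbfs/ddfsroot/file2",
    ]
    for entry in map(parse_status_line, sample):
        if entry["corrupted"]:
            print("corrupted:", entry["path"])
```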

cpddumbfs is a tool able to upload files to, or download files from, an offline ddumbfs volume. Download can also be used while the filesystem is online, without risk for it. Do not expect a consistent result if you are writing to the file at the same time! Option -c checks the file against the hashes stored in the ddumbfs filesystem and verifies its consistency.

[root@cos6-x64 ddumbfsC6]# src/cpddumbfs -c /l0/ddumbfs/ddfsroot/file1 /dev/null
[root@cos6-x64 ddumbfsC6]# src/cpddumbfs -c /l0/ddumbfs/ddfsroot/file3 /dev/null

OK means file1 and file3 are consistent. Of course, they are incomplete because of the unexpected shutdown!

[root@cos6-x64 ddumbfsC6]# src/testddumbfs -o C -B 4096 -S 1024M -f -m 0x0 -s 1 /ddumbfs/file1
difference in block starting at: 2015232
[root@cos6-x64 ddumbfsC6]# src/testddumbfs -o C -B 4096 -S 1024M -f -m 0x0 -s 3 /ddumbfs/file3
difference in block starting at: 2170880

Comparing with the original, I can see that the written data match up to the last byte. testddumbfs is a tool that generates big random files and then allows comparing them. The advantage is that such big files do not need to be kept at hand for testing. Trust the syntax and the appropriate usage by the author :-)

Now the corrupted file!

[root@cos6-x64 ddumbfsC6]# src/cpddumbfs -c /l0/ddumbfs/ddfsroot/file2 /dev/null
   416      1 e38bf882357c4a0b err

ERR means some blocks do not match the expected ones. Reading block 416, at offset 416*4k=1703936, will return a block full of zeros. This is the default behavior. The reference has been written to the file, but the block was still in cache when the server crashed.
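The offset arithmetic, and the effect of a missing block on a read, can be sketched as follows. This is a toy model, not how ddumbfs reads data: the function names, the dictionary used as a block store and the sample hashes are all assumptions made for the example. Only the block size of 4096 bytes and the zero-filled result come from the text.

```python
BLOCK_SIZE = 4096  # block_size reported in the mount log


def block_offset(block_number, block_size=BLOCK_SIZE):
    """Byte offset of a block inside a file."""
    return block_number * block_size


def read_file_blocks(hashes, store, block_size=BLOCK_SIZE):
    """Assemble file content from block hashes.

    `store` maps a hash to block data. A hash that is missing from the
    store yields a block full of zeros (toy model of the default behavior).
    """
    out = bytearray()
    for h in hashes:
        out += store.get(h, bytes(block_size))
    return bytes(out)


if __name__ == "__main__":
    print(block_offset(416))
    store = {b"h0": b"\x01" * BLOCK_SIZE}  # block b"h1" was never written
    data = read_file_blocks([b"h0", b"h1"], store)
    print(data[BLOCK_SIZE:] == bytes(BLOCK_SIZE))
```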

If you are lucky, you will copy another file, or make another backup, that contains an identical block; the next file check will then re-link the corrupted node to the new block, and the file will be removed from the corrupted list. If not, you must reload the file from the source or delete it. You can also copy the file: the missing block will be replaced by a block full of zeros. But keep in mind that this new file is corrupted.