Don't like the silent dd on Unix? Let's add a progress bar to it (using pv)

dd is not the most efficient tool, but it has been widely used for many years and still is, especially for “burning” disk images for the famous Raspberry Pi and related boards. What I dislike most about dd is its silence. It makes me worried: there is no way to see the progress, and if the copy takes too long we start to wonder whether the computer has hung, the SD card is broken, or something else has gone wrong … until the process finally finishes. That is a very bad experience.

I found pv a while ago and have been using it since. It is the Pipe Viewer, a terminal-based tool for monitoring the progress of data through a pipeline. It is not that famous or widely used, so I would like to share it and talk about it. Let's get started!

pv is OSI (Open Source Initiative, not the Open Systems Interconnection network model) certified open source software. Here is its homepage:

The latest release at the time of writing is v1.6.0. The version in Ubuntu 14.04 LTS is v1.2.0, which still works well. In this post I use Raspbian Jessie, February 2016 version (release date: 2016-02-09), as the example image, and writing a raw image to a microSD card as the scenario.

Traditionally, we use dd like this (dd in, dd out, in silence):

$ sudo dd if=./2016-02-09-raspbian-jessie.img of=/dev/sdd bs=10M
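
As a minimal sketch of the idea, pv can read the image and feed it to dd through a pipe, so progress is shown while dd does the writing. The file names below (demo.img, demo_copy.img) are hypothetical scratch files standing in for the real Raspbian image and /dev/sdd, so the sketch can be tried without sudo or an SD card. It assumes GNU dd and falls back to plain dd when pv is not installed.

```shell
#!/bin/sh
# Scratch files (hypothetical names) stand in for the real image and /dev/sdd.
src=./demo.img
dst=./demo_copy.img

# Create an 8 MiB test "image" so the example is self-contained.
dd if=/dev/zero of="$src" bs=1M count=8 2>/dev/null

# pv reads the source and reports progress on stderr while passing the data on;
# dd only has to write what it receives on stdin.
if command -v pv >/dev/null 2>&1; then
    pv "$src" | dd of="$dst" bs=1M 2>/dev/null
else
    # Assumption: without pv we still copy, just silently as before.
    dd if="$src" of="$dst" bs=1M 2>/dev/null
fi

# Verify that the copy is byte-for-byte identical to the source.
cmp "$src" "$dst" && echo "copy OK"
```

For a real card the same pipeline would look like `pv ./2016-02-09-raspbian-jessie.img | sudo dd of=/dev/sdd bs=10M`, with the device name double-checked first. Because pv is given the file directly, it knows the total size and can show a percentage and an ETA.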

How to quickly create empty files / large files on Linux

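When setting up swap space or running tests, we sometimes need to create a very large file (an empty file, in this case). The most common approach is probably dd: for example, read blank data from /dev/zero and write it to our destination file ./dd_1G, 1 MB at a time, 1024 blocks in total, which produces a 1 GB file: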

$ time dd if=/dev/zero of=./dd_1G bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.92408 s, 181 MB/s

real 0m6.052s
user 0m0.000s
sys 0m0.660s

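On my machine this took about 6 seconds, and bigger files such as 10G or 100G take even longer. For an empty file, dd actually has a faster way: use seek to jump straight to the target position: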

$ time dd of=./dd_1G_fast bs=1 count=0 seek=1G
0+0 records in
0+0 records out
0 bytes (0 B) copied, 3.4469e-05 s, 0.0 kB/s

real 0m0.001s
user 0m0.000s
sys 0m0.001s

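0.001 seconds … You can think of it as dd setting only the start and end points of the file and skipping everything in between, which is why it is so fast (the result is a sparse file). However, not every tool accepts a file produced this way. It cannot be used as swap, for instance; you will get: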

$ sudo swapon /swap
swapon: /swap: skipping - it appears to have holes.

See the swapon manpage:

You should not use swapon on a file with holes. Swap over NFS may not work.
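
To see what swapon is complaining about, one rough check is to compare a file's apparent size with the space actually allocated for it. The sketch below assumes GNU coreutils (`stat -c` with `%s`, `%b` and `%B`) and uses small hypothetical scratch files. A file whose allocated bytes are smaller than its size very likely has holes, though the exact numbers depend on the filesystem.

```shell
#!/bin/sh
# Hypothetical scratch files: one fully written, one created with seek only.
dd if=/dev/zero of=./full_4M bs=1M count=4 2>/dev/null
dd of=./sparse_4M bs=1 count=0 seek=4M 2>/dev/null

# Print apparent size and allocated bytes, and flag files that look sparse.
report() {
    size=$(stat -c %s "$1")                                # apparent size in bytes
    alloc=$(( $(stat -c %b "$1") * $(stat -c %B "$1") ))   # blocks * bytes per block
    if [ "$alloc" -lt "$size" ]; then
        echo "$1: size=$size allocated=$alloc (has holes)"
    else
        echo "$1: size=$size allocated=$alloc"
    fi
}

report ./full_4M
report ./sparse_4M
```

Both files report the same size, which is why ls -l cannot tell them apart; only the allocated figure shows that the seek-created file has almost nothing behind it.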

Besides dd, we can also consider fallocate / truncate:

$ time fallocate -l 1G fallocate_1G
real 0m0.002s
user 0m0.000s
sys 0m0.002s

$ time truncate -s 1G truncate_1G
real 0m0.001s
user 0m0.000s
sys 0m0.001s

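A file produced by truncate runs into the same swap rejection described above. fallocate does not have this problem, but it may not work on every filesystem. Known to be supported: btrfs, ext4, ocfs2 and xfs. Running it on ext3 gives: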

fallocate: /swap: fallocate failed: Operation not supported
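
Because fallocate can fail like this, a script that has to work on different filesystems could try fallocate first and fall back to writing zeros with dd. This is only a sketch: the function name make_file and the file name ./prealloc_8M are made up for illustration, and the size is given in MiB to keep the dd arithmetic simple.

```shell
#!/bin/sh
# make_file PATH SIZE_MIB (hypothetical helper, not a standard tool):
# preallocate with fallocate when possible, otherwise write zeros with dd.
make_file() {
    path=$1
    mib=$2
    if command -v fallocate >/dev/null 2>&1 &&
       fallocate -l "${mib}M" "$path" 2>/dev/null; then
        echo "created $path with fallocate"
    else
        # Slower, but works on filesystems without fallocate support (e.g. ext3).
        dd if=/dev/zero of="$path" bs=1M count="$mib" 2>/dev/null
        echo "created $path with dd"
    fi
}

make_file ./prealloc_8M 8
```

Trying fallocate first keeps the common case fast, while the dd branch costs time but writes every block, so neither path leaves holes in the file.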

If speed is not a concern, there is one more tool: xfs_mkfile. On Debian / Ubuntu based GNU/Linux it can be installed with apt-get install xfsprogs. Although it is an xfs utility, it can also be used on other filesystems:

$ time xfs_mkfile 1024m xfs_mkfile_1G
real 0m10.810s
user 0m0.000s
sys 0m0.110s

How do the resulting files differ? Judging by size and checksum alone, they look identical. They only behave differently in use, as described above.

peter ~/test $ ls -l
total 4194320
-rw------- 1 peter peter 1073741824 Jan 20 17:13 dd_1G
-rw------- 1 peter peter 1073741824 Jan 20 17:13 dd_1G_2
-rw------- 1 peter peter 1073741824 Jan 20 17:16 dd_1G_fast
-rw------- 1 peter peter 1073741824 Jan 20 17:13 fallocate_1G
-rw------- 1 peter peter 1073741824 Jan 20 17:13 truncate_1G
-rw------- 1 peter peter 1073741824 Jan 20 17:14 xfs_mkfile_1G

peter ~/test $ md5sum *
cd573cfaace07e7949bc0c46028904ff dd_1G
cd573cfaace07e7949bc0c46028904ff dd_1G_2
cd573cfaace07e7949bc0c46028904ff dd_1G_fast
cd573cfaace07e7949bc0c46028904ff fallocate_1G
cd573cfaace07e7949bc0c46028904ff truncate_1G
cd573cfaace07e7949bc0c46028904ff xfs_mkfile_1G

Only du -k --apparent-size and du -k together reveal the difference; try it if you are interested. In conclusion, fallocate is the faster and more practical way to create a swap file.
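
For anyone who wants to try the du comparison, here is a small sketch. It assumes GNU du and truncate, and uses 4 MiB hypothetical scratch files instead of the 1 GB ones above. The allocated figures depend on the filesystem, so no exact output is shown.

```shell
#!/bin/sh
# Hypothetical scratch files: written zeros vs. a sparse file made by truncate.
dd if=/dev/zero of=./du_full_4M bs=1M count=4 2>/dev/null
truncate -s 4M ./du_sparse_4M

# Apparent size in KiB: the same view of the file that ls -l and md5sum have.
du -k --apparent-size ./du_full_4M ./du_sparse_4M

# Disk usage in KiB: what is really allocated on disk.
du -k ./du_full_4M ./du_sparse_4M
```

The first command reports 4096 for both files, while the second should report a much smaller number for the sparse one.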