
ZFS

The Last Word In File Systems


Jeff Bonwick
Bill Moore

(modified by yupu for CS537)

Trouble with Existing Filesystems




No defense against silent data corruption
    Any defect in disk, controller, cable, driver, laser, or firmware can
    corrupt data silently; like running a server without ECC memory

Brutal to manage
    Labels, partitions, volumes, provisioning, grow/shrink, /etc files...
    Lots of limits: filesystem/volume size, file size, number of files,
    files per directory, number of snapshots...
    Different tools to manage file, block, iSCSI, NFS, CIFS...
    Not portable between platforms (x86, SPARC, PowerPC, ARM...)

Dog slow
    Linear-time create, fat locks, fixed block size, naïve prefetch,
    dirty region logging, painful RAID rebuilds, growing backup time

ZFS Objective

End the Suffering

Figure out why storage has gotten so complicated

Blow away … years of obsolete assumptions

Design an integrated system from scratch

ZFS Overview


Pooled storage
    Completely eliminates the antique notion of volumes
    Does for storage what VM did for memory

Transactional object system
    Always consistent on disk; no fsck, ever
    Universal: file, block, iSCSI, swap...

Provable end-to-end data integrity
    Detects and corrects silent data corruption
    Historically considered too expensive; no longer true

Simple administration
    Concisely express your intent

FS/Volume Model vs. Pooled Storage


Traditional Volumes
    Abstraction: virtual disk
    Partition/volume for each FS
    Grow/shrink by hand
    Each FS has limited bandwidth
    Storage is fragmented, stranded
    [Diagram: three filesystems (FS), each on its own Volume]

ZFS Pooled Storage
    Abstraction: malloc/free
    No partitions to manage
    Grow/shrink automatically
    All bandwidth always available
    All storage in the pool is shared
    [Diagram: three ZFS filesystems sharing one Storage Pool]
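
The malloc/free analogy can be illustrated with a toy model. This is a minimal sketch, not ZFS code: the `Pool` class and its `add_device`, `alloc`, and `free` methods are hypothetical names, and capacity is tracked as a plain block count.

```python
class Pool:
    """Toy storage pool: filesystems draw blocks from one shared free count."""

    def __init__(self):
        self.free_blocks = 0
        self.used = {}  # filesystem name -> blocks currently held

    def add_device(self, blocks):
        # Adding a device grows the pool; every filesystem sees the new space.
        self.free_blocks += blocks

    def alloc(self, fs, blocks):
        # Like malloc: succeed only if the shared pool has enough space.
        if blocks > self.free_blocks:
            raise MemoryError(f"pool exhausted: {fs} wanted {blocks}")
        self.free_blocks -= blocks
        self.used[fs] = self.used.get(fs, 0) + blocks

    def free(self, fs, blocks):
        # Like free: space returns to the pool for any filesystem to use.
        if blocks > self.used.get(fs, 0):
            raise ValueError(f"{fs} does not hold {blocks} blocks")
        self.used[fs] -= blocks
        self.free_blocks += blocks


pool = Pool()
pool.add_device(100)
pool.alloc("home", 70)       # one filesystem may use most of the pool
pool.alloc("projects", 30)
pool.free("home", 50)        # freed space is immediately shared
pool.alloc("projects", 50)   # "grows" without any resize step
```

No filesystem has a size of its own here; the only limit is the free space left in the shared pool.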

Copy-On-Write
Initial block tree

COW some blocks

COW indirect blocks

Rewrite uberblock (atomic)
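
The four copy-on-write steps on this slide can be sketched with a small in-memory tree. This is an illustrative model only: the `Block` class, the `cow_write` function, and the dictionary standing in for the uberblock are assumptions, not ZFS structures.

```python
class Block:
    """Block treated as immutable: a leaf with data or an indirect block."""

    def __init__(self, data=None, children=()):
        self.data = data
        self.children = tuple(children)


def cow_write(node, path, new_data):
    """Return a new tree with the leaf at `path` replaced; never modify `node`."""
    if not path:
        return Block(data=new_data)          # new copy of the leaf block
    index = path[0]
    children = list(node.children)
    children[index] = cow_write(node.children[index], path[1:], new_data)
    return Block(children=children)          # new copy of the indirect block


# Initial block tree: two indirect blocks and four leaves.
leaves = [Block(data=d) for d in ("a", "b", "c", "d")]
old_root = Block(children=[Block(children=leaves[:2]), Block(children=leaves[2:])])

# COW the changed leaf, then the indirect blocks on the path to it.
new_root = cow_write(old_root, [1, 0], "C")

# A single reference swap stands in for the atomic uberblock rewrite.
uberblock = {"root": old_root}
uberblock["root"] = new_root
```

Until the final swap, every reader still sees the complete old tree, and unchanged subtrees are shared between the old and new versions.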

Transactional Object System




Everything is an object in ZFS
    e.g., files, directories
    System calls => modifications on objects

Transaction
    Every high-level operation is a transaction

Transaction Group
    Transactions are added to a transaction group
    A transaction group is committed to disk periodically as a whole
    Either succeed or fail
    Always consistent on disk
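
The all-or-nothing behavior of a transaction group can be sketched as follows. This is a toy model under stated assumptions: `ObjectStore`, `submit`, and `commit_group` are hypothetical names, and a `None` value is used as an artificial stand-in for a failure during commit.

```python
import copy


class ObjectStore:
    """Toy object store that commits transaction groups all-or-nothing."""

    def __init__(self):
        self.on_disk = {}      # committed state: object id -> value
        self.open_group = []   # transactions waiting for the next commit

    def submit(self, transaction):
        # A transaction is a list of (object_id, new_value) modifications.
        self.open_group.append(transaction)

    def commit_group(self):
        # Apply the whole group to a private copy, then publish it in one step.
        staged = copy.deepcopy(self.on_disk)
        group, self.open_group = self.open_group, []
        try:
            for transaction in group:
                for object_id, value in transaction:
                    if value is None:  # artificial failure trigger
                        raise ValueError(f"invalid write to {object_id}")
                    staged[object_id] = value
        except ValueError:
            return False       # group failed: on_disk is untouched
        self.on_disk = staged  # group succeeded: state changes atomically
        return True


store = ObjectStore()
store.submit([("file1", "hello"), ("dir1", ["file1"])])
ok = store.commit_group()

store.submit([("file2", "x")])
store.submit([("file3", None)])
failed = store.commit_group()
```

After the second commit fails, `file2` is not visible either: the committed state is the result of the first group only.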

Trends in Storage Integrity




Uncorrectable bit error rates have stayed roughly constant
    1 in … bits (~… TB) for desktop-class drives
    1 in … bits (~… TB) for enterprise-class drives (allegedly)
    Bad sector every … TB in practice (desktop and enterprise)

Drive capacities doubling every … months

Number of drives per deployment increasing

Rapid increase in error rates

Both silent and noisy data corruption becoming more common

Cheap flash storage will only accelerate this trend
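
An error rate quoted as one uncorrectable error per N bits read can be converted into a data volume, which makes the trend concrete. This is a minimal arithmetic sketch; the rate of 1 in 10^14 bits is an assumed, illustrative figure of the kind quoted on drive data sheets, not a value taken from this slide, and the function names are hypothetical.

```python
def bits_to_terabytes(bits):
    """Convert a count of bits to decimal terabytes (1 TB = 10**12 bytes)."""
    return bits / 8 / 10**12


def expected_errors(data_terabytes, bits_per_error):
    """Expected number of uncorrectable errors when reading the given volume."""
    bits_read = data_terabytes * 10**12 * 8
    return bits_read / bits_per_error


# Assumed error rate for illustration only (not read from the slide).
ASSUMED_BITS_PER_ERROR = 10**14

# Data volume read, on average, per uncorrectable error.
tb_per_error = bits_to_terabytes(ASSUMED_BITS_PER_ERROR)

# Expected errors when reading a hypothetical 100 TB deployment once.
errors_per_full_read = expected_errors(100, ASSUMED_BITS_PER_ERROR)
```

With a constant per-bit rate, the expected error count grows in direct proportion to the amount of data read, so larger drives and more drives per deployment both push it up.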

End-to-End Data Integrity in ZFS


Disk Block Checksums
    Checksum stored with data block
    Any self-consistent block will pass
    Can't detect stray writes
    Inherent FS/volume interface limitation
    [Diagram: each address points to a data block stored together with its checksum]
    Disk checksum only validates media
        Detected: bit rot
        Not detected: phantom writes, misdirected reads and writes,
        DMA parity errors, driver bugs, accidental overwrite

ZFS Data Authentication
    Checksum stored in parent block pointer
    Fault isolation between data and checksum
    Checksum hierarchy forms self-validating Merkle tree
    [Diagram: each block pointer holds the address and checksum of its child, down to the data blocks]
    ZFS validates the entire I/O path
        Detected: bit rot, phantom writes, misdirected reads and writes,
        DMA parity errors, driver bugs, accidental overwrite
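
Storing each checksum in the parent block pointer can be sketched with a dictionary standing in for the disk. This is an illustrative model, not the ZFS on-disk format: `write_block`, `read_block`, the pointer layout, and the choice of SHA-256 are all assumptions.

```python
import hashlib


def checksum(payload):
    return hashlib.sha256(payload).hexdigest()


disk = {}  # address -> raw bytes, standing in for physical blocks


def write_block(address, payload):
    """Write a block and return the pointer its parent would store."""
    disk[address] = payload
    return {"address": address, "checksum": checksum(payload)}


def read_block(pointer):
    """Read via a parent pointer and verify against the checksum held there."""
    payload = disk[pointer["address"]]
    if checksum(payload) != pointer["checksum"]:
        raise IOError(f"checksum mismatch at address {pointer['address']}")
    return payload


# Build a two-level tree: the parent block is made of its children's pointers.
leaf_a = write_block(10, b"alpha")
leaf_b = write_block(11, b"beta")
parent_payload = repr([leaf_a, leaf_b]).encode()
root = write_block(1, parent_payload)   # root pointer covers the whole tree

# Simulate a misdirected write: well-formed data lands on the wrong address.
disk[10] = b"stray but well-formed data"
```

Reading `leaf_b` still succeeds, while `read_block(leaf_a)` raises an error. A checksum stored next to the data would have been overwritten along with it, so the stray block would look self-consistent and pass.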


Ditto Blocks


Data replication above and beyond mirror/RAID-Z

Each logical block can have up to three physical blocks
    Different devices whenever possible
    Different places on the same device otherwise (e.g. laptop drive)

All ZFS metadata stored in multiple copies
    Small cost in latency and bandwidth (metadata is a small fraction of data)

Explicitly settable for precious user data

Detects and corrects silent data corruption
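
How extra copies allow both detection and correction can be sketched as follows. This is a toy model under stated assumptions: devices are plain Python lists, `write_ditto` and `read_ditto` are hypothetical names, and the round-robin placement is a simplification of "different devices whenever possible".

```python
import hashlib


def checksum(payload):
    return hashlib.sha256(payload).hexdigest()


def write_ditto(devices, payload, copies):
    """Store `copies` replicas, preferring a different device for each one."""
    locations = []
    for i in range(copies):
        device = devices[i % len(devices)]   # wraps to the same device if needed
        device.append(payload)
        locations.append((device, len(device) - 1))
    return {"locations": locations, "checksum": checksum(payload)}


def read_ditto(pointer):
    """Return the first replica whose checksum matches, repairing bad ones."""
    good = None
    for device, offset in pointer["locations"]:
        if checksum(device[offset]) == pointer["checksum"]:
            good = device[offset]
            break
    if good is None:
        raise IOError("all copies are corrupt")
    for device, offset in pointer["locations"]:
        device[offset] = good                # rewrite damaged replicas
    return good


devices = [[], []]                            # two toy devices (lists of blocks)
pointer = write_ditto(devices, b"metadata", copies=3)
devices[0][0] = b"garbage"                    # silent corruption of one replica
```

Calling `read_ditto(pointer)` skips the damaged replica, returns the intact data, and rewrites the bad copy from the good one.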

Creating Pools and Filesystems




Create a mirrored pool named "tank"

# zpool create tank mirror c2d0 c3d0

Create home directory filesystem, mounted at /export/home

# zfs create tank/home
# zfs set mountpoint=/export/home tank/home

Create home directories for several users
Note: automatically mounted at /export/home/{ahrens,bonwick,billm} thanks to inheritance

# zfs create tank/home/ahrens
# zfs create tank/home/bonwick
# zfs create tank/home/billm

Add more space to the pool

# zpool add tank mirror c4d0 c5d0
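
The inheritance noted on this slide can be modeled in a few lines. This is a simplified sketch of the idea, not how the `zfs` tool is implemented: the `mountpoint` function and the `explicit` dictionary are hypothetical, and the default of `/` plus the pool name for a pool root is an assumption of the model.

```python
def mountpoint(dataset, explicit):
    """Resolve a dataset's mountpoint from explicitly set values on ancestors."""
    if dataset in explicit:
        return explicit[dataset]
    if "/" not in dataset:
        return "/" + dataset                 # assumed default for a pool root
    parent, child = dataset.rsplit("/", 1)
    return mountpoint(parent, explicit) + "/" + child


explicit = {"tank/home": "/export/home"}     # mirrors the `zfs set` command above
users = ["ahrens", "bonwick", "billm"]
resolved = {u: mountpoint(f"tank/home/{u}", explicit) for u in users}
```

Only `tank/home` carries an explicit setting; each child dataset derives its path from the nearest ancestor that has one.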
