Lustre Operations Manual

Cluster File Systems

First Edition (March 31, 2004)

This publication is intended to help customers and partners of Cluster File Systems, Inc. (CFS) who install, configure, and administer Lustre.

The information contained in this document has not been submitted to any formal CFS test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by CFS for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

Comments may be addressed to:

Cluster File Systems, Inc.

110 Capen Street

Medford MA 02155-4230

Copyright Cluster File Systems, Inc. 2004. All rights reserved.

Use or disclosure is subject to restrictions.

Duplication of this manual is prohibited.

Contents

1 Prerequisites ... 9
1.1 Lustre Version Selection ... 9
1.1.1 How To Get Lustre ... 9
1.1.2 Supported Configurations ... 9
1.2 Using a pre-packaged Lustre release ... 9
1.2.1 Choosing a Pre-packaged kernel ... 10
1.2.2 Lustre Tools ... 10
1.2.3 Building Other Modules Against the Lustre kernel ... 11
1.2.4 Other Required Software ... 11
1.3 Building From Source ... 12
1.3.1 Building Your Own kernel ... 13
1.3.2 Building Lustre ... 14
1.3.3 Environment Requirements ... 16
1.4 LDAP ... 16
1.4.1 Installing LDAP Packages ... 16
1.4.2 Updating slapd.conf ... 17
1.4.3 Specifying Password Location ... 17
1.4.4 Caveats ... 17
1.4.5 Using LDAP to configure the cluster ... 18
1.5 Installing Lustre-Manager ... 18
1.5.1 Dependencies ... 18
1.5.2 Installing the reporting client daemon (LMD) ... 20
1.5.3 Installing the management client (LMM) ... 21

2 Creating a New File System ... 23
2.1 What do you need to know to setup Lustre? ... 23
2.1.1 Architecture Refresher ... 23
2.1.2 Sizing Your Nodes ... 24
2.1.3 High Availability ... 25
2.1.4 Total Usable Storage ... 26
2.2 Disk Layout ... 26
2.2.1 Basics ... 26
2.2.2 Lustre on RAID ... 27
2.2.3 Logical Volume Manager (LVM) ... 28
2.3 Counting Your Object Storage Servers/Targets ... 29
2.3.1 Peak Bandwidth ... 29
2.3.2 Total Storage Capacity ... 29
2.3.3 When Your Best Isn’t Good Enough ... 30
2.4 File Striping ... 30
2.4.1 Advantages of Striping ... 30
2.4.2 Disadvantages of Striping ... 31
2.4.3 Stripe Size ... 31
2.4.4 Choosing OSTs ... 32
2.5 Using Lustre-Manager ... 32
2.5.1 Basic Multi-node Setup ... 32
2.5.2 Basic Service Management ... 35
2.5.3 Large Parallel I/O Configuration ... 36
2.5.4 Configuring for Failover ... 38
2.5.5 LDAP ... 38
2.5.6 Multinet and Routing ... 38
2.5.7 Configuration Pitfalls ... 38
2.6 Failover Example ... 38
2.6.1 Shared Storage ... 38
2.6.2 Configuring With Failover Manager ... 39
2.6.3 Pairwise Config ... 39
2.6.4 Passive/active Failover ... 39
2.6.5 Active/Active Failover ... 39
2.6.6 N-way Failover ... 40
2.6.7 The Default Lustre Upcall ... 40
2.6.8 Testing Failover ... 40
2.7 Client Configuration ... 40
2.7.1 Automatic Client Mounting via fstab ... 40
2.8 Validation and Light Testing ... 41
2.8.1 Lustre Throughput Tests ... 41
2.9 Configuration, Under the Hood ... 41
2.9.1 Automatic Service Stopping and Starting ... 42
2.9.2 File System Parameters ... 43
2.9.3 Upcall Generation and Configuration ... 43
2.9.4 Log Levels and Timeouts ... 43
2.10 Striping Tools ... 43
2.10.1 Per-File ... 44
2.10.2 Per-Directory ... 44
2.10.3 Inspecting Stripe Settings ... 44
2.10.4 Finding Files on a Given OST ... 44
2.10.5 Examples ... 44

3 Configuring Monitoring ... 46
3.1 Basic monitoring ... 46
3.1.1 System Health ... 46
3.1.2 Current Load ... 46
3.1.3 Bandwidth/Disk/CPU ... 46
3.1.4 OST performance monitoring with LMT ... 47
3.1.5 Lustre Operation/RPC Rate ... 47
3.2 Integrating with other monitoring ... 47
3.2.1 Configuring System Log ... 47
3.2.2 Logging from the Upcall ... 48
3.2.3 SNMP ... 48

4 Health Checking and Troubleshooting ... 49
4.1 File system consistency ... 49
4.2 E2fsck ... 49
4.2.1 Supported e2fsck Releases ... 50
4.3 lfsck ... 50
4.3.1 What is lfsck? ... 51
4.3.2 When To Run lfsck ... 51
4.3.3 What if I don’t run lfsck? ... 51
4.3.4 Using lfsck ... 51

 

4.4 Validation of Configuration

.

.

.

.

.

.

.

.

.

.