GPFS performance and Infiniband:
Linux disk read-ahead optimization:
for i in `ls /dev/mapper/ | grep -v control | grep -v Vol`;do blockdev --setra 1024 /dev/mapper/$i;done
for i in `ls /dev/mapper/ | grep -v control | grep -v Vol`;do blockdev --getra /dev/mapper/$i;done
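The blockdev read-ahead value does not survive a reboot. One common way to make it persistent (a sketch, assuming the same /dev/mapper naming filter as above) is to append the set-loop to /etc/rc.local:
echo 'for i in `ls /dev/mapper/ | grep -v control | grep -v Vol`;do blockdev --setra 1024 /dev/mapper/$i;done' >> /etc/rc.local
chmod +x /etc/rc.local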
mmchconfig prefetchThreads=96
mmchconfig worker1Threads=144
mmchconfig maxMBpS=3200
mmchconfig pagepool=4096M
(verbsPorts is configured below, after identifying the IB ports with ibstatus.)
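To confirm that these tuning values were propagated, the active configuration can be listed with mmlsconfig; note that a pagepool change normally takes effect only after the GPFS daemon is restarted on the affected nodes. For example:
mmlsconfig pagepool
mmlsconfig | grep -E 'prefetchThreads|worker1Threads|maxMBpS'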
[root@node009 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0030:4800:4413:0001
base lid: 0x4b
sm lid: 0x35
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)
Infiniband device 'mlx4_0' port 2 status:
default gid: fe80:0000:0000:0000:0030:4800:4413:0002
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)
mmchconfig verbsPorts=\
mmchconfig verbsRdma=enable
Once these two parameters are set, a node that has these ports available uses them as soon as the GPFS daemon has started. GPFS decides to use the IB RDMA connection instead of TCP/IP before evaluating the subnets parameter: if an IB RDMA connection is available on a node, it is used; only if no IB ports are available on a node do the subnets processing rules take effect. For enhanced performance, when multiple verbsPorts are defined on a node, GPFS multiplexes the traffic over all of the available ports.
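As an illustration (the node class name nsdnodes is a placeholder, and port 2 is shown DOWN in the ibstatus output above), both ports of a dual-port HCA could be listed so GPFS spreads RDMA traffic across them:
mmchconfig verbsPorts="mlx4_0/1 mlx4_0/2" -N nsdnodes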
[root@node009 ~]# mmchconfig verbsPorts=mlx4_0
mmchconfig: Command successfully completed
mmchconfig: 6027-1371 Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.
[root@node009 ~]# mmchconfig verbsRdma=enable
mmchconfig: Command successfully completed
mmchconfig: 6027-1371 Propagating the cluster configuration data to all affected nodes. This is an asynchronous process.
[root@node009 ~]# mmfsadm test verbs config
VERBS RDMA Configuration:
Status : started
Start time : Thu May 5 11:47:45 2011
mmfs verbsRDMA : enable
mmfs verbsPorts : mlx4_0
mmfs verbsRdmasPerNode : 0
mmfs verbsRdmasPerConnection : 8
mmfs verbsRdmaMinBytes : 8192
mmfs verbsRdmaMaxSendBytes : 16777216
mmfs verbsRdmaTimeout : 14
mmfs verbsLibName : libibverbs.so
ibv_fork_support : true
Max connections : 65536
Max RDMA size : 16777216
Max RDMAs per node max : 32
Max RDMAs per node curr : 32
Number of Devices opened : 1
Device : mlx4_0
vendor_id : 713
Device vendor_part_id : 25418
Device mem register chunk : 8589934592 (0x200000000)
Device max_sge : 28
Device max_qp_wr : 16351
Device max_qp_rd_atom : 16
Max RDMAs per conn max : 32
Max RDMAs per conn curr : 8
Open Connect Ports : 1
Connect port : mlx4_0/1
lid : 75
state : IBV_PORT_ACTIVE
path_mtu : IBV_MTU_2048
[root@node009 ~]# mmfsadm test verbs conn
NSD Client Connections:
cl nidx destination status curr RW peak RW file RDs file WRs file RD KB file WR KB idx cookie
-- ----- --------------- --------------- --------- --------- ---------- ---------- ----------- ----------- ---- ------
NSD Server Connections:
cl nidx destination status curr rdma wait rdma rdma RDs rdma WRs rdma RDs KB rdma WRs KB idx cookie
-- ----- --------------- --------------- --------- --------- ---------- ---------- ----------- ----------- ---- ------
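The connection tables are empty here because no file system traffic has used RDMA yet; entries appear once a GPFS file system is mounted and NSD I/O is flowing. A quick sanity check that the RDMA layer started is to look for the VERBS RDMA messages in the GPFS log (default /var/adm/ras/mmfs.log.latest):
grep -i 'verbs rdma' /var/adm/ras/mmfs.log.latest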
Adding extra subnets (optional):
mmchconfig subnets=\
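As an illustration only (the subnets below are placeholders, not values from this cluster), subnets takes an ordered, space-separated list of IP subnets that GPFS should prefer for daemon traffic when RDMA is not used:
mmchconfig subnets="10.10.0.0 192.168.1.0"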
Setting up coexistence with the LUSTRE file system:
Installing the LUSTRE server packages replaces the running operating system kernel; unless the GPFS build configuration file is changed to match, GPFS will not start. The configuration file is named site.mcr. This procedure has been verified to work with LUSTRE 1.8 and LUSTRE 2.2.
1. Install the LUSTRE server packages (there are several), together with the devel and kernel headers packages.
2. Reboot and boot into the LUSTRE kernel.
3. Install GPFS.
4. Replace the build configuration file /usr/lpp/mmfs/src/config/site.mcr.
5. A configuration file may also exist at /usr/lpp/mmfs/src/shark/config/site.mcr; replace it in the same way.
6. Once the GPFS portability layer compiles cleanly, the server side is done (a build sketch follows the lists below).

Remaining work:
1. Configure passwordless SSH from the I/O nodes to the clients, including I/O node to client and client to itself.
2. Edit /etc/hosts to add the I/O nodes and the client nodes.
3. Install GPFS on the client nodes; note whether a client runs the LUSTRE kernel (the GPFS build procedure differs).
4. Configure the IB ports used by the clients (mmchconfig verbsPorts="…" -N clientnodes).
5. Installation is complete; testing can begin.
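For steps 4-6 of the server-side list above, a rough sketch of the manual portability-layer build (the exact macros in site.mcr vary by GPFS release and kernel, so treat the edited values as placeholders):
cd /usr/lpp/mmfs/src
vi config/site.mcr     # set LINUX_DISTRIBUTION, LINUX_KERNEL_VERSION and KERNEL_HEADER_DIR to match the running LUSTRE kernel
make World
make InstallImages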
Basic GPFS uninstall procedure
1. On all nodes, umount /vol_data
2. mmdelfs vol_data (vol_data obtained from mmlsnsd)
3. mmdelnsd gpfs1nsd (gpfsXnsd obtained from mmlsnsd; see the note after this list for removing several NSDs at once)
4. mmshutdown -a
5. mmdelnode -N nodeX (nodeX obtained from mmgetstate -a)
6. To remove all configuration completely: rm -f /var/mmfs/gen/*
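When several NSDs exist, they can be listed first and then removed in one call (a sketch; the NSD names below are placeholders, use the names mmlsnsd reports on the actual cluster):
mmlsnsd -X
mmdelnsd "gpfs1nsd;gpfs2nsd"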

