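Installing an InfiniBand card on Ubuntu 14.04
-
- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Installing an InfiniBand card on Ubuntu 14.04
I downloaded MLNX_OFED_LINUX-3.3-1.0.4.0-ubuntu14.04-x86_64 from the Mellanox website. My system is Ubuntu 14.04.3 with kernel 3.19.0. After installation, the following problem appeared: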
Attempting to perform Firmware update...
The firmware for this device is not distributed inside Mellanox driver: 04:00.0 (PSID: FJT0D90200009)
To obtain firmware for this device, please contact your HW vendor.
Failed to update Firmware.
See /tmp/OFED.22244.logs/fw_update.log
Device (04:00.0):
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
Link Width: x4 ( WARNING - device supports x8 )
PCI Link Speed: 5GT/s
Running ibstat shows the card's State as Down.
What could be the cause?
-
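- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
1. https://www.mellanox.com/mellanox (official site, InfiniBand cards)
2. What matters is the brand and model of the chip on the card.
2-1. Copy the command below, paste it into a terminal, and run it:
sudo lshw -numeric -class network
Then select all of the output, copy it, and paste it here.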
-
- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
admin1@admin:~$ sudo lshw -numeric -class network
[sudo] password for admin1:
*-network
description: Ethernet interface
product: Intel Corporation [8086:15B7]
vendor: Intel Corporation [8086]
physical id: 1f.6
bus info: pci@0000:00:1f.6
logical name: eth0
version: 31
serial: 4c:cc:6a:14:d0:d4
size: 100Mbit/s
capacity: 1Gbit/s
width: 32 bits
clock: 33MHz
capabilities: pm msi bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=2.3.2-k duplex=full firmware=0.8-4 ip=10.103.246.118 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
resources: irq:126 memory:dff00000-dff1ffff
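The output is shown above. My machine currently has two network cards, one Ethernet card and one HCA card, and I want to get the HCA card working.
Card S/N: MT1146X03829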
P/N: MHQA19-XTR
-
- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
> product: Intel Corporation [8086:15B7]
> configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=2.3.2-k duplex=full firmware=0.8-4 ip=10.103.246.118 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
1. http://pciids.sourceforge.net/v2.2/pci.ids
8086 Intel Corporation
15b7 Ethernet Connection (2) I219-LM
2. Try downloading e1000e-3.3.3.tar.gz from this URL and installing it:
https://sourceforge.net/projects/e1000/ ... ble/3.3.3/
Initial support for the following devices:
* Ethernet Connection (4) I219-LM
* Ethernet Connection (4) I219-V
* Ethernet Connection (5) I219-LM
* Ethernet Connection (5) I219-V
-
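- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
> Try downloading e1000e-3.3.3.tar.gz from this URL and installing it
The first URL does not open. I downloaded e1000e-3.3.3.tar.gz and installed it, but it does not seem to have anything to do with the InfiniBand card. I think sudo lshw -numeric -class network shows the information for my already configured Ethernet card, and the InfiniBand card, whose driver did not load, is not listed. Is that right?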
-
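- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
> I think sudo lshw -numeric -class network shows the information for my already configured Ethernet card, and the InfiniBand card, whose driver did not load, is not listed. Is that right?
Copy the commands below, paste them into a terminal, and run them:
1. uname -a
2. sudo lspci -knn
Then select all of the output, copy it, and paste it here,
so we can see whether the hardware was detected.
If it is a USB card, please post again.
> The first URL does not open.
sudo cat /usr/share/misc/pci.ids
That file also contains a copy of the list.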
-
- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
> Copy the commands below, paste them into a terminal, and run them
admin1@admin:~$ sudo uname -a
[sudo] password for admin1:
Linux admin 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
admin1@admin:~$ sudo lspci -knn
00:00.0 Host bridge [0600]: Intel Corporation Device [8086:191f] (rev 07)
Subsystem: Lenovo Device [17aa:30bd]
00:01.0 PCI bridge [0604]: Intel Corporation Device [8086:1901] (rev 07)
Kernel driver in use: pcieport
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:a12f] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
Kernel driver in use: xhci_hcd
00:16.0 Communication controller [0780]: Intel Corporation Device [8086:a13a] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
00:16.3 Serial controller [0700]: Intel Corporation Device [8086:a13d] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
Kernel driver in use: serial
00:17.0 SATA controller [0106]: Intel Corporation Device [8086:a102] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
Kernel driver in use: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a116] (rev f1)
Kernel driver in use: pcieport
00:1d.0 PCI bridge [0604]: Intel Corporation Device [8086:a118] (rev f1)
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:a146] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:a121] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:a170] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
Kernel driver in use: snd_hda_intel
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:a123] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:15b7] (rev 31)
Subsystem: Lenovo Device [17aa:30bd]
Kernel driver in use: e1000e
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250] [1002:6610] (rev 87)
Subsystem: Bitland(ShenZhen) Information Technology Co., Ltd. Device [1642:3f09]
Kernel driver in use: radeon
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
Subsystem: Bitland(ShenZhen) Information Technology Co., Ltd. Device [1642:aab0]
Kernel driver in use: snd_hda_intel
02:00.0 PCI bridge [0604]: Integrated Technology Express, Inc. Device [1283:8893] (rev 41)
04:00.0 InfiniBand [0c06]: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c] (rev b0)
Subsystem: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c]
Kernel driver in use: mlx4_core
-
- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
> 04:00.0 InfiniBand [0c06]: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c] (rev b0)
> Subsystem: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c]
> Kernel driver in use: mlx4_core
1. http://pciids.sourceforge.net/v2.2/pci.ids
sudo cat /usr/share/misc/pci.ids
15b3 Mellanox Technologies
673c MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
2. The driver module currently in use is mlx4_core.
2-1. It requires IOMMU support.
2-2. For details see
https://github.com/HSAFoundation/HSA-Dr ... D/issues/6
and the articles linked from it.
There are reports of both successes and failures.
3. The card appears in the output of sudo lspci -knn, which means the hardware was detected.
It does not appear in the output of sudo lshw -numeric -class network, which means it is not being driven.
4. Latest driver
http://www.mellanox.com/page/products_d ... nux_driver
Mellanox EN Driver for Linux
MLNX_EN Download Center
3.4-1.0.0.3
Ubuntu 14.04
5. User manual
http://www.mellanox.com/related-docs/us ... manual.pdf
www.mellanox.com
ConnectX-2 EN Dual Port SFP+ Ethernet
Adapter Card User Manual
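The check in item 3 (is the device visible on the PCI bus, and which kernel driver is bound to it) can also be done mechanically. Below is a minimal Python sketch, not part of the original thread, that reads `lspci -knn` text and reports the vendor:device ID and the bound driver for each PCI address. The function name `parse_lspci_knn` and the sample text are illustrative assumptions; the sample lines are copied from the output posted earlier.

```python
import re

# A PCI address such as "04:00.0" at the start of a device line.
ADDR = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+(.*)$")
# A numeric vendor:device pair such as "[15b3:673c]".
PCI_ID = re.compile(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]")
DRIVER = re.compile(r"Kernel driver in use:\s*(\S+)")


def parse_lspci_knn(text):
    """Map each PCI address to its vendor:device ID and bound driver (or None)."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = ADDR.match(line)
        if match:
            ids = PCI_ID.findall(match.group(2))
            current = {"id": ids[-1] if ids else None, "driver": None}
            devices[match.group(1)] = current
            continue
        driver = DRIVER.search(line)
        if driver and current is not None:
            current["driver"] = driver.group(1)
    return devices


# Sample lines taken from the lspci output in this thread.
SAMPLE = """\
02:00.0 PCI bridge [0604]: Integrated Technology Express, Inc. Device [1283:8893] (rev 41)
04:00.0 InfiniBand [0c06]: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c] (rev b0)
Subsystem: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c]
Kernel driver in use: mlx4_core
"""

for address, info in sorted(parse_lspci_knn(SAMPLE).items()):
    print(address, info["id"], info["driver"])
```

For the card in this thread the sketch reports ID 15b3:673c with driver mlx4_core at 04:00.0, so the PCI device does have a kernel driver bound even though lshw does not list a network interface for it.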
-
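- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
IOMMU is for AMD, and my CPU is an Intel i7, so that does not apply.
> 4. Latest driver
Then I installed the latest driver. The installation seems to have succeeded, without the earlier error: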
Selecting previously unselected package mlnx-fw-updater.
(Reading database ... 170968 files and directories currently installed.)
Preparing to unpack .../mlnx-fw-updater_3.4-1.0.0.0_amd64.deb ...
Unpacking mlnx-fw-updater (3.4-1.0.0.0) ...
Setting up mlnx-fw-updater (3.4-1.0.0.0) ...
Added 'RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX2
Part Number: MHQH19B-XTR_A1-A3
Description: ConnectX-2 VPI adapter card; single-port 40Gb/s QSFP; PCIe2.0 x8 5.0GT/s; tall bracket; RoHS R6
PSID: MT_0D90110009
PCI Device Name: 04:00.0
Port1 GUID: 0002c903000d37fb
Port2 MAC: 0002c90d37fb
Versions: Current Available
FW 2.9.1000 2.9.1000
Status: Up to date
Log File: /tmp/mlnx-en.2762.logs/fw_update.log
Device (04:00.0):
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
Link Width: x4 ( WARNING - device supports x8 )
PCI Link Speed: 5GT/s
Installation passed successfully
To load the new driver, run:
/etc/init.d/openibd restart
But after configuring the interface:
admin1@admin:/mnt$ sudo ifup ib0
RTNETLINK answers: File exists
Failed to bring up ib0.
admin1@admin:/mnt$ ibstat
CA 'mlx4_0'
CA type: MT26428
Number of ports: 1
Firmware version: 2.9.1000
Hardware version: b0
Node GUID: 0x0002c903000d37fa
System image GUID: 0x0002c903000d37fd
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 40
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02510868
Port GUID: 0x0002c903000d37fb
Link layer: InfiniBand
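And ifconfig shows ib0 like this:
ib0 Link encap:UNSPEC HWaddr A0-00-02-10-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:172.1.1.201 Bcast:172.1.1.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:4092 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Is the problem in what I wrote to the interfaces file, or is the driver still not working?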
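One detail in the ifconfig output above is worth a closer look: the flags line for ib0 reads `UP BROADCAST MULTICAST`, with no `RUNNING`. In the old net-tools format, `RUNNING` is shown when the driver reports the link as operational, so its absence usually means the interface is configured but cannot carry traffic yet. A minimal Python sketch of that check follows; it is not from the original thread, and `interface_flags` is an illustrative name.

```python
def interface_flags(ifconfig_block):
    """Return the flag words that precede 'MTU:' in one net-tools ifconfig block."""
    for line in ifconfig_block.splitlines():
        words = line.split()
        for index, word in enumerate(words):
            if word.startswith("MTU:"):
                return set(words[:index])
    return set()


# Flags lines as they appear in the ifconfig output posted in this thread.
IB0_FLAGS_LINE = "UP BROADCAST MULTICAST MTU:4092"
ETH0_FLAGS_LINE = "UP BROADCAST RUNNING MULTICAST MTU:1500"

for name, line in (("ib0", IB0_FLAGS_LINE), ("eth0", ETH0_FLAGS_LINE)):
    flags = interface_flags(line)
    status = "operational" if "RUNNING" in flags else "configured, link not operational"
    print(name, sorted(flags), status)
```

This fits the ibstat output in the same post, where the port State is Initializing and not Active.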
-
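- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
> IOMMU is for AMD, and my CPU is an Intel i7, so that does not apply.
1. Check whether your Intel i7 model is included in these lists:
1-1. https://en.wikipedia.org/wiki/List_of_I ... g_hardware
List of IOMMU-supporting hardware
1-2. https://en.wikipedia.org/wiki/List_of_I ... ntel_based
Intel based
List of Intel and Intel-based hardware that supports VT-d (Intel Virtualization Technology for Directed I/O)
> I installed the latest driver. The installation seems to have succeeded, without the earlier error.
2. Copy the commands below, paste them into a terminal, and run them:
2-1. sudo lshw -numeric -class network
2-2. sudo ifconfig -a
2-3. sudo cat /etc/network/interfaces
2-4. sudo cat /etc/NetworkManager/NetworkManager.conf
2-5. sudo nmcli dev status
Then select all of the output, copy it, and paste it here.
The main purpose of these commands is to see whether the card is now being driven properly.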
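The five commands above can be gathered in one pass, which makes it easier to paste everything in a single reply. The following is a hedged Python sketch, not something from the original thread: `collect_outputs` and `print_report` are illustrative names, `sudo` is deliberately left out of the command list (an assumption of this sketch; run the script as root if a command needs it), and `subprocess.Popen` is used because it works on the Python 3.4 that Ubuntu 14.04 ships.

```python
import subprocess
import sys

# The commands requested above, as argv lists (without sudo, see the note above).
COMMANDS = [
    ["lshw", "-numeric", "-class", "network"],
    ["ifconfig", "-a"],
    ["cat", "/etc/network/interfaces"],
    ["cat", "/etc/NetworkManager/NetworkManager.conf"],
    ["nmcli", "dev", "status"],
]


def collect_outputs(commands):
    """Run each command and return {command line: combined stdout/stderr text}."""
    report = {}
    for argv in commands:
        key = " ".join(argv)
        try:
            proc = subprocess.Popen(
                argv,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            output, _ = proc.communicate()
        except OSError as exc:
            # Raised, for example, when the program is not installed.
            output = "(could not run: {})".format(exc)
        report[key] = output
    return report


def print_report(report, out=sys.stdout):
    """Write each command followed by its output, ready to paste into a post."""
    for command, output in report.items():
        out.write("$ {}\n{}\n".format(command, output))


# Usage (left commented out so that nothing runs on import):
#   print_report(collect_outputs(COMMANDS))
```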
-
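- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
> 1. Check whether your Intel i7 model is included in these lists
My CPU is an i7 6700; the list contains the i7 6700K.
> 2. Copy the commands below, paste them into a terminal, and run them
admin1@admin:~$ sudo lshw -numeric -class network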
*-network
description: Ethernet interface
product: Intel Corporation [8086:15B7]
vendor: Intel Corporation [8086]
physical id: 1f.6
bus info: pci@0000:00:1f.6
logical name: eth0
version: 31
serial: 4c:cc:6a:14:d0:d4
size: 100Mbit/s
capacity: 1Gbit/s
width: 32 bits
clock: 33MHz
capabilities: pm msi bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=2.3.2-k duplex=full firmware=0.8-4 ip=10.103.246.118 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
resources: irq:126 memory:dff00000-dff1ffff
admin1@admin:~$ sudo ifconfig -a
eth0 Link encap:Ethernet HWaddr 4c:cc:6a:14:d0:d4
inet addr:10.103.246.118 Bcast:10.103.255.255 Mask:255.255.240.0
inet6 addr: fe80::4ecc:6aff:fe14:d0d4/64 Scope:Link
inet6 addr: 2001:da8:215:3f0:4c52:2567:dabb:5fe2/64 Scope:Global
inet6 addr: 2001:da8:215:3f0:4ecc:6aff:fe14:d0d4/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:29847 errors:0 dropped:0 overruns:0 frame:0
TX packets:4451 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9172674 (9.1 MB) TX bytes:737526 (737.5 KB)
Interrupt:16 Memory:dff00000-dff20000
ib0 Link encap:UNSPEC HWaddr A0-00-02-10-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:172.1.1.201 Bcast:172.1.1.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:4092 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:1211 errors:0 dropped:0 overruns:0 frame:0
TX packets:1211 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:116360 (116.3 KB) TX bytes:116360 (116.3 KB)
admin1@admin:~$ sudo cat /etc/network/interfaces
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback
auto ib0
iface ib0 inet static
address 172.1.1.201
netmask 255.255.255.0
gateway 172.1.1.1
admin1@admin:~$ sudo cat /etc/NetworkManager/NetworkManager.conf
[main]
plugins=ifupdown,keyfile,ofono
dns=dnsmasq
[ifupdown]
managed=false
admin1@admin:~$ sudo nmcli dev status
DEVICE TYPE STATE
eth0 802-3-ethernet connected
ib0 infiniband unmanaged
-
- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
sudo cat /etc/network/interfaces
auto ib0
iface ib0 inet static
address 172.1.1.201
netmask 255.255.255.0
gateway 172.1.1.1
sudo cat /etc/NetworkManager/NetworkManager.conf
managed=false
sudo nmcli dev status
ib0 infiniband unmanaged
> ib0 Link encap:UNSPEC HWaddr A0-00-02-10-FE-80-00-00-00-00-00-00-00-00-00-00
> inet addr:172.1.1.201 Bcast:172.1.1.255 Mask:255.255.255.0
> UP BROADCAST MULTICAST MTU:4092 Metric:1
> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:256
> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
1. ib0 is driven successfully; its IPv4 address is 172.1.1.201.
But at the moment no traffic is going through ib0.
1-1. Copy the command below, paste it into a terminal, and run it:
sudo route -nv
Then select all of the output, copy it, and paste it here.
We need to find out why no traffic is using ib0.
> sudo cat /etc/network/interfaces
> auto ib0
> iface ib0 inet static
> address 172.1.1.201
> netmask 255.255.255.0
> gateway 172.1.1.1
2. Where did you get this address, 172.1.1.201,
and this gateway, 172.1.1.1?
Please provide the URL of the material you followed.
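The question about where the address and gateway came from can be backed by two quick checks on the static configuration: whether the gateway lies inside the interface's own subnet, and whether the address falls in a private range (for 172.x.x.x the RFC 1918 block is 172.16.0.0/12). Below is a minimal Python sketch using the standard `ipaddress` module; it is not from the original thread and `check_static_config` is an illustrative name.

```python
import ipaddress


def check_static_config(address, netmask, gateway):
    """Summarise a static IPv4 stanza: its network, gateway reachability, address range."""
    iface = ipaddress.ip_interface("{}/{}".format(address, netmask))
    gw = ipaddress.ip_address(gateway)
    return {
        "network": str(iface.network),
        "gateway_on_link": gw in iface.network,
        "private_range": iface.ip.is_private,
    }


# Values from the ib0 stanza in /etc/network/interfaces quoted above.
result = check_static_config("172.1.1.201", "255.255.255.0", "172.1.1.1")
for key in sorted(result):
    print(key, "=", result[key])
```

For these values the gateway is on-link, but 172.1.1.201 is outside the private 172.16.0.0/12 block. That is harmless on an isolated InfiniBand fabric, though it can clash with real Internet addresses if the host also routes traffic elsewhere.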
-
- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
> 1. ib0 is driven successfully; its IPv4 address is 172.1.1.201.
admin1@admin:~$ ping 172.1.1.201
PING 172.1.1.201 (172.1.1.201) 56(84) bytes of data.
64 bytes from 172.1.1.201: icmp_seq=1 ttl=64 time=0.024 ms
64 bytes from 172.1.1.201: icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from 172.1.1.201: icmp_seq=3 ttl=64 time=0.038 ms
64 bytes from 172.1.1.201: icmp_seq=4 ttl=64 time=0.036 ms
64 bytes from 172.1.1.201: icmp_seq=5 ttl=64 time=0.043 ms
64 bytes from 172.1.1.201: icmp_seq=6 ttl=64 time=0.042 ms
64 bytes from 172.1.1.201: icmp_seq=7 ttl=64 time=0.042 ms
64 bytes from 172.1.1.201: icmp_seq=8 ttl=64 time=0.041 ms
^C
--- 172.1.1.201 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 6999ms
rtt min/avg/max/mdev = 0.024/0.038/0.044/0.009 ms
The ping now succeeds, but
admin1@admin:~$ sudo ibstat
CA 'mlx4_0'
CA type: MT26428
Number of ports: 1
Firmware version: 2.9.1000
Hardware version: b0
Node GUID: 0x0002c903000d37fa
System image GUID: 0x0002c903000d37fd
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 40
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02510868
Port GUID: 0x0002c903000d37fb
Link layer: InfiniBand
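The State shown is Initializing; normally it should be Active, right?
Also:
sudo cat /etc/NetworkManager/NetworkManager.conf
managed=false
sudo nmcli dev status
ib0 infiniband unmanaged
What does this mean?
Is it related to the fact that this machine is also using another Ethernet card to get online?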
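On the `State: Initializing` question above: in InfiniBand, a port whose physical state is LinkUp but whose logical state stays at Initializing has a working cable link but has not yet been configured by a subnet manager (typically OpenSM on one host, or a managed switch). `Base lid: 0` and `SM lid: 0` in the same output point the same way. The Python sketch below is not from the original thread; `parse_ibstat_ports` and `diagnose` are illustrative names, and the sample text is the ibstat output posted above.

```python
def parse_ibstat_ports(text):
    """Return {port number: {field: value}} from ibstat output for a single CA."""
    ports = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Port ") and line.endswith(":"):
            # A header such as "Port 1:" opens a new port section.
            current = {}
            ports[int(line[5:-1])] = current
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return ports


def diagnose(port):
    """Give a rough, hedged reading of one port's state fields."""
    state = port.get("State")
    physical = port.get("Physical state")
    if state == "Active":
        return "active"
    if state == "Initializing" and physical == "LinkUp":
        return "link up, waiting for a subnet manager"
    return "link not up"


SAMPLE = """\
CA 'mlx4_0'
CA type: MT26428
Number of ports: 1
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 40
Base lid: 0
LMC: 0
SM lid: 0
Port GUID: 0x0002c903000d37fb
Link layer: InfiniBand
"""

for number, fields in sorted(parse_ibstat_ports(SAMPLE).items()):
    print("port", number, "->", diagnose(fields))
```

If this reading applies, the port should move to Active once a subnet manager is running somewhere on the fabric; the thread itself does not confirm whether one was present.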
-
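- Forum moderator
- Posts: 18279
- Registered: 2009-08-04 16:33
Re: Installing an InfiniBand card on Ubuntu 14.04
> sudo cat /etc/NetworkManager/NetworkManager.conf
> managed=false
> sudo nmcli dev status
> ib0 infiniband unmanaged
> What does this mean?
> Is it related to the fact that this machine is also using another Ethernet card to get online?
viewtopic.php?f=48&t=481016&p=3177518&h ... f#p3177518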
-
- Posts: 22
- Registered: 2016-10-26 16:06
- System: ubuntu 14.04
Re: Installing an InfiniBand card on Ubuntu 14.04
> sudo cat /etc/NetworkManager/NetworkManager.conf
> managed=false
> sudo nmcli dev status
Understood. I can now ping 172.1.1.201 successfully, but 172.1.1.203 does not work. What is the reason?
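A note on why the two pings behave differently: 172.1.1.201 is ib0's own address, so the kernel answers that ping locally without sending anything over the InfiniBand link (the 0.02 to 0.04 ms round-trip times are consistent with that). 172.1.1.203 is a different host in the same /24, so reaching it needs a working link. The Python sketch below makes the distinction explicit; it is not from the original thread and `classify_target` is an illustrative name.

```python
import ipaddress


def classify_target(local_cidr, target):
    """Classify a ping target relative to one local interface address."""
    iface = ipaddress.ip_interface(local_cidr)
    addr = ipaddress.ip_address(target)
    if addr == iface.ip:
        return "local"    # answered by this host's own stack
    if addr in iface.network:
        return "on-link"  # same subnet, sent directly out of the interface
    return "routed"       # different subnet, goes via a gateway


# ib0 as configured in this thread: 172.1.1.201 with netmask 255.255.255.0.
for target in ("172.1.1.201", "172.1.1.203"):
    print(target, "->", classify_target("172.1.1.201/24", target))
```

A successful ping to 172.1.1.201 therefore says little about the link itself; the ping to 172.1.1.203 is the one that exercises ib0, and it can only work once the port reaches the Active state.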