Commit 16b3aca

Update ROCmInstall.md
1 parent 6ba9dbe commit 16b3aca

1 file changed

+93
-55
lines changed

ROCmInstall.md

@@ -9,73 +9,73 @@ The ROCm Platform brings a rich foundation to advanced computing by seamlessly
99

1010
#### Supported CPUs
1111

12+
Starting with ROCm 1.8, we have relaxed the requirements for PCIe Atomics on Vega 10 (GFX9) GPUs, and we have similarly opened up more options for number of PCIe lanes. With this release, these GFX9 GPUs can support CPUs without PCIe Atomics and, for example, run on PCIe Gen2 x1 lanes. To enable this option, please set the environment variable `HSA_ENABLE_SDMA=0`.
1213

13-
Starting with ROCm 1.8 we have relaxed the use of PCIe Atomics and also PCIe lane choice for Vega10/GFX9 class GPU. So now you can support CPU without PCIe Atomics and also use Gen2 x1 lanes.
14-
15-
Currently our GFX8 GPU's (Fiji & Polaris family) still need to use PCIe Gen 3 and PCIe Atomics, but are looking at relaxing this in a future release, once we have fully tested firmware.
16-
14+
Currently, our GFX8 GPUs (Fiji & Polaris family) still need to use PCIe Gen 3 and PCIe Atomics, but we are looking at relaxing this in a future release, once we have fully tested firmware.
1715

1816
Current CPUs which support PCIe Gen3 + PCIe Atomics are:
1917
* AMD Ryzen CPUs;
2018
* AMD EPYC CPUs;
2119
* Intel Xeon E7 v3 or newer CPUs;
2220
* Intel Xeon E5 v3 or newer CPUs;
23-
* Intel Xeon E3 v3 or newer CPUs;
21+
* Intel Xeon E3 v3 or newer CPUs;
2422
* Intel Core i7 v4, Core i5 v4, Core i3 v4 or newer CPUs (i.e. Haswell family or newer).
2523

26-
For Fiji and Polaris GPU's the ROCm platform leverages PCIe Atomics (Fetch and Add, Compare and Swap,
24+
For Fiji and Polaris GPUs, the ROCm platform leverages PCIe Atomics (Fetch and Add, Compare and Swap,
2725
Unconditional Swap, AtomicOp Completion).
2826
PCIe Atomics are only supported on PCIe Gen3 enabled CPUs and PCIe Gen3 switches like
29-
Broadcom PLX. When you install your GPUs make sure you install them in a fully
27+
Broadcom PLX. When you install your GPUs, make sure you install them in a fully
3028
PCIe Gen3 x16 or x8, x4 or x1 slot attached either directly to the CPU's Root I/O
3129
controller or via a PCIe switch directly attached to the CPU's Root I/O
32-
controller. In our experience many issues stem from trying to use consumer
30+
controller. In our experience, many issues stem from trying to use consumer
3331
motherboards which provide physical x16 connectors that are electrically
34-
connected as e.g. PCIe Gen2 x4 connected via the
35-
Southbridge PCIe I/O controller.
32+
connected as e.g. PCIe Gen2 x4, PCIe slots connected via the
33+
Southbridge PCIe I/O controller, or PCIe slots connected through a PCIe switch that does
34+
not support PCIe atomics.
3635

37-
38-
Experimental support for our GFX7 GPUs Radeon R9 290, R9 390, AMD FirePro S9150, S9170 note they do not support or
39-
take advantage of PCIe Atomics. However, we still recommend that you use a CPU
40-
from the list provided above.
36+
Experimental support for our Hawaii (GFX7) GPUs (Radeon R9 290, R9 390, FirePro W9100, S9150, S9170)
37+
does not require or take advantage of PCIe Atomics. However, we still recommend that you use a CPU
38+
from the list provided above for compatibility purposes.
4139

4240
#### Not supported or very limited support under ROCm
4341
###### Limited support
4442

45-
46-
* With ROCm 1.8 and Vega10 it should support PCIe Gen2 enabled CPUs such as the AMD Opteron, Phenom, Phenom II, Athlon, Athlon X2, Athlon II and older Intel Xeon and Intel Core Architecture and Pentium CPUs. But we have done very limited testing. Since our test farm today has been catering to CPU listed above. This is where we need community support.
47-
* Thunderbolt 1,2 and 3 enabled breakout boxes GPU's should now be able to work with ROCm. Thunderbolt 1 and 2 are PCIe Gen2 based. But we have done no testing on this config and would need comunity support do limited access to this type of equipment
43+
* ROCm 1.8 and Vega10 should support PCIe Gen2 enabled CPUs such as the AMD Opteron, Phenom, Phenom II, Athlon, Athlon X2, Athlon II and older Intel Xeon and Intel Core Architecture and Pentium CPUs. However, we have done very limited testing on these configurations, since our test farm has been catering to the CPUs listed above. This is where we need community support; if you find problems on such setups, please report these issues.
44+
* Thunderbolt 1, 2, and 3 enabled breakout boxes should now be able to work with ROCm. Thunderbolt 1 and 2 are PCIe Gen2 based, and thus are only supported with GPUs that do not require PCIe Gen 3 atomics (i.e. Vega 10). However, we have done no testing on this configuration and would need community support due to limited access to this type of equipment.
4845

4946
###### Not supported
5047

51-
52-
* We also do not support AMD Carrizo and Kaveri APU as host for compliant dGPU attachments.
53-
* Thunderbolt 1 and 2 enabled GPU's are not supported by ROCm. Thunderbolt 1 & 2 are PCIe Gen2 based.
54-
* AMD Carrizo based APUs have limited support due to OEM & ODM's choices when it comes to some key configuration parameters. On point, we have observed that Carrizo laptops, AIOs and desktop systems showed inconsistencies in exposing and enabling the System BIOS parameters required by the ROCm stack. Before purchasing a Carrizo system for ROCm, please verify that the BIOS provides an option for enabling IOMMUv2. If this is the case, the final requirement is associated with correct CRAT table support - please inquire with the OEM about the latter.
55-
* AMD Merlin/Falcon Embedded System is also not currently supported by the public repo.
48+
* We do not support GFX8-class GPUs (Fiji, Polaris, etc.) on CPUs that do not have PCIe Gen 3 with PCIe atomics.
49+
* As such, we do not support AMD Carrizo and Kaveri APUs as hosts for such GPUs.
50+
* Thunderbolt 1 and 2 enabled breakout boxes are not supported with GFX8 GPUs on ROCm. Thunderbolt 1 & 2 are PCIe Gen2 based.
51+
* AMD Carrizo based APUs have limited support due to OEM & ODM's choices when it comes to some key configuration parameters. In particular, we have observed that Carrizo laptops, AIOs, and desktop systems showed inconsistencies in exposing and enabling the System BIOS parameters required by the ROCm stack. Before purchasing a Carrizo system for ROCm, please verify that the BIOS provides an option for enabling IOMMUv2 and that the system BIOS properly exposes the correct CRAT table - please inquire with the OEM about the latter.
52+
* AMD Merlin/Falcon Embedded System is not currently supported by the public repo.
5653
* AMD Raven Ridge APUs are currently not supported
5754

55+
### New features to ROCm 1.8.3
56+
57+
* ROCm 1.8.3 is a minor update meant to fix compatibility issues on Ubuntu releases running kernel 4.15.0-33
5858

59-
### New features to ROCm 1.8.1
59+
### New features as of ROCm 1.8.2
6060

6161
#### DKMS driver installation
6262

6363
* Debian packages are provided for DKMS on Ubuntu
6464
* RPM packages are provided for CentOS/RHEL 7.4 and 7.5 support
6565
* See the [ROCT-Thunk-Interface](https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/roc-1.8.x) and [ROCK-Kernel-Driver](https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/tree/roc-1.8.x) for additional documentation on driver setup
6666

67-
#### New distribution suppport
67+
#### New distribution support
6868

6969
* Binary package support for Ubuntu 16.04
70-
* Binary package support for CentoOS 7.4 and 7.5
70+
* Binary package support for CentOS 7.4 and 7.5
7171
* Binary package support for RHEL 7.4 and 7.5
7272

7373
#### Improved OpenMPI via UCX support
7474

7575
* UCX support for OpenMPI
7676
* ROCm RDMA
7777

78-
### The latest ROCm platform - ROCm 1.8.1
78+
### The latest ROCm platform - ROCm 1.8.3
7979

8080
The latest tested version of the drivers, tools, libraries and source code for
8181
the ROCm platform have been released and are available under the roc-1.8.x or rocm-1.8.x tag
@@ -92,7 +92,7 @@ of the following GitHub repositories:
9292
* [atmi](https://github.com/RadeonOpenCompute/atmi/tree/0.3.7)
9393

9494
Additionally, the following mirror repositories that support the HCC compiler
95-
are also available on GitHub, and frozen for the rocm-1.8.0 release:
95+
are also available on GitHub, and frozen for the rocm-1.8.3 release:
9696

9797
* [llvm](https://github.com/RadeonOpenCompute/llvm/tree/roc-1.8.x)
9898
* [lld](https://github.com/RadeonOpenCompute/lld/tree/roc-1.8.x)
@@ -101,14 +101,14 @@ are also available on GitHub, and frozen for the rocm-1.8.0 release:
101101

102102
#### Supported Operating Systems - New operating systems available
103103

104-
The ROCm 1.8.1 platform has been tested on the following operating systems:
104+
The ROCm 1.8.3 platform has been tested on the following operating systems:
105105
* Ubuntu 16.04
106106
* CentOS 7.4 & 7.5 (using devtoolset-7 runtime support)
107107
* RHEL 7.4 & 7.5 (using devtoolset-7 runtime support)
108108

109109
### Installing from AMD ROCm repositories
110110

111-
AMD is hosting both Debian and RPM repositories for the ROCm 1.8.1 packages at this time.
111+
AMD is hosting both Debian and RPM repositories for the ROCm 1.8.3 packages at this time.
112112

113113
The packages in the Debian repository have been signed to ensure package integrity.
114114

@@ -122,26 +122,22 @@ sudo apt dist-upgrade
122122
sudo apt install libnuma-dev
123123
sudo reboot
124124
```
125-
##### Optional: Upgrade to 4.13 kernel
126125

127-
Although not required, it is recommended as of ROCm 1.8.1 that the system's kernel is upgraded to the latest 4.13 version available:
128-
129-
```shell
130-
sudo apt install linux-headers-4.13.0-32-generic linux-image-4.13.0-32-generic linux-image-extra-4.13.0-32-generic linux-signed-image-4.13.0-32-generic
131-
sudo reboot
132-
```
133126
##### Add the ROCm apt repository
134127

135128
For Debian based systems, like Ubuntu, configure the Debian ROCm repository as
136129
follows:
137130

138131
```shell
139132
wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
140-
sudo sh -c 'echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list'
133+
echo 'deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
141134
```
142-
The gpg key might change, so it may need to be updated when installing a new release. If the key signature verification fails when you attempt to update, please re-add the key from ROCm apt repository. The current rocm.gpg.key is not avialable in a standard key ring distribution, but has the following sha1sum hash:
135+
The gpg key might change, so it may need to be updated when installing a new release.
136+
If the key signature verification fails while updating, please re-add the key from the
137+
ROCm apt repository. The current rocm.gpg.key is not available in a standard key ring
138+
distribution, but has the following sha1sum hash:
143139

144-
`f7f8147431c75e505c58a6f3a3548510869357a6 rocm.gpg.key`
140+
f7f8147431c75e505c58a6f3a3548510869357a6 rocm.gpg.key
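
One way to check a downloaded copy of the key against the hash above before adding it is sketched here. The helper name `verify_sha1` is hypothetical and not part of ROCm; the sketch assumes `sha1sum` and `awk` are installed and that the key has been saved locally.

```shell
# Hypothetical helper (not part of ROCm): succeeds only when the
# SHA-1 digest of the given file equals the expected hex string.
verify_sha1() {
    expected="$1"
    file="$2"
    actual=$(sha1sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```

For example, after saving the key as `rocm.gpg.key`, running `verify_sha1 f7f8147431c75e505c58a6f3a3548510869357a6 rocm.gpg.key && sudo apt-key add rocm.gpg.key` would add the key only if the hash matches.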
145141

146142
##### Install
147143

@@ -172,28 +168,47 @@ To add yourself to the video group you will need the sudo password and can use t
172168
sudo usermod -a -G video $LOGNAME
173169
```
174170

171+
You may want to ensure that any future users you add to your system are put into the "video" group by default. To do that, you can run the following commands:
172+
```shell
173+
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
174+
echo 'EXTRA_GROUPS=video' | sudo tee -a /etc/adduser.conf
175+
```
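
To confirm that a user actually ended up in the "video" group, a check along the following lines may help. This is a minimal sketch: the helper name `in_group` is hypothetical, and it reports membership according to the system group database, which may differ from the groups of an already running login session.

```shell
# Hypothetical helper: succeeds if the named user belongs to the named group.
in_group() {
    user="$1"
    group="$2"
    # id -nG prints the user's group names separated by spaces;
    # split them onto separate lines and look for an exact match.
    id -nG "$user" | tr ' ' '\n' | grep -qxF -- "$group"
}

if in_group "$(id -un)" video; then
    echo "current user is in the video group"
else
    echo "current user is NOT in the video group"
fi
```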
176+
175177
Once complete, reboot your system.
176178

177-
Upon Reboot run
179+
Upon reboot, run the following commands to verify that the ROCm installation was successful. If you see your GPUs listed by both of these commands, you should be ready to go!
178180
```shell
179-
rocminfo
180-
clinfo
181+
/opt/rocm/bin/rocminfo
182+
/opt/rocm/opencl/bin/x86_64/clinfo
181183
```
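
If either command is reported as not found, it can be useful to first confirm that the binaries exist at the paths used above. The following is a minimal sketch; the helper name `check_tools` is hypothetical and only tests that each path is executable, not that the tool runs correctly.

```shell
# Hypothetical helper: prints every listed path that is not executable
# and fails if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if [ ! -x "$tool" ]; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

if check_tools /opt/rocm/bin/rocminfo /opt/rocm/opencl/bin/x86_64/clinfo; then
    echo "ROCm tools found"
fi
```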
182184

183-
If you have an [Install Issue ](https://rocm.github.io/install_issues.html) please read this FAQ .
185+
Note that, to make running ROCm programs easier, you may wish to put the ROCm libraries in your LD_LIBRARY_PATH environment variable and the ROCm binaries in your PATH.
186+
```shell
187+
echo 'export LD_LIBRARY_PATH=/opt/rocm/opencl/lib/x86_64:/opt/rocm/hsa/lib:$LD_LIBRARY_PATH' | sudo tee -a /etc/profile.d/rocm.sh
188+
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/x86_64' | sudo tee -a /etc/profile.d/rocm.sh
189+
```
190+
191+
If you have an [Install Issue](https://rocm.github.io/install_issues.html), please read this FAQ.
184192

185-
##### For Vega10 Users who want to run ROCm without supporting PCIe atomic support must set HSA_ENABLE_SDMA=0
193+
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
186194

187-
Currently with Vega10 GPUs to disable PCIe atomics support in ROCm, you need to turn off SDMA functionality.
195+
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
188196

189197
```shell
190198
export HSA_ENABLE_SDMA=0
191199
```
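
The `export` above only affects the current shell session. A minimal sketch for making the setting persistent is shown here, assuming a startup file such as `~/.profile` is read by your login shell; the helper name `append_once` is hypothetical.

```shell
# Hypothetical helper: appends a line to a file only if that exact
# line is not already present, so repeated runs do not duplicate it.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, `append_once 'export HSA_ENABLE_SDMA=0' "$HOME/.profile"` would add the setting once, however many times it is run.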
192200

201+
###### Performing an OpenCL-only Installation of ROCm
202+
203+
Some users may want to install a subset of the full ROCm installation. In particular, if you are trying to install on a system with a limited amount of storage space, or which will only run a small collection of known applications, you may want to install only the packages that are required to run OpenCL applications. To do that, you can run the following installation command **instead** of the command to install `rocm-dkms`.
204+
205+
```shell
206+
sudo apt-get install dkms rock-dkms rocm-opencl
207+
```
193208

194209
###### Upon restart, to test your OpenCL instance
195210

196-
Build and run Hello World OCL app.
211+
Build and run Hello World OCL app.
197212

198213
HelloWorld sample:
199214

@@ -243,7 +258,8 @@ If you installed any of the ROCm pre-release packages from github, they will
243258
need to be manually un-installed:
244259

245260
```shell
246-
sudo apt purge libhsakmt
261+
sudo apt purge hsakmt-roct
262+
sudo apt purge hsakmt-roct-dev
247263
sudo apt purge compute-firmware
248264
sudo apt purge $(dpkg -l | grep 'kfd\|rocm' | grep linux | grep -v libc | awk '{print $2}')
249265
```
@@ -270,7 +286,7 @@ system with the RHEL subscription server and attaching to a pool id.
270286
Second, enable the following repositories:
271287

272288
```shell
273-
sudo subscription-manager repos --enable rhel-7-server-rhscl-rpms
289+
sudo subscription-manager repos --enable rhel-server-rhscl-7-rpms
274290
sudo subscription-manager repos --enable rhel-7-server-optional-rpms
275291
sudo subscription-manager repos --enable rhel-7-server-extras-rpms
276292
```
@@ -289,9 +305,9 @@ https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/
289305

290306
Note that devtoolset-7 is a Software Collections package, and is not supported by AMD.
291307

292-
#### Prepare CentOS/RHEL 7.4 for DKMS Install
308+
#### Prepare CentOS/RHEL 7.4 or 7.5 for DKMS Install
293309

294-
Installing kernel drivers on CentOS/RHEL 7.4 requires dkms tool being installed:
310+
Installing kernel drivers on CentOS/RHEL 7.4/7.5 requires the dkms tool to be installed:
295311

296312
```shell
297313
sudo yum install -y epel-release
@@ -339,14 +355,22 @@ Current release supports up to CentOS/RHEL 7.4 and 7.5. Users should update to t
339355
```shell
340356
sudo yum update
341357
```
342-
##### For Vega10 Users who want to run ROCm without supporting PCIe atomic support must set HSA_ENABLE_SDMA=0
358+
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
343359

344-
Currently with Vega10 GPUs to disable PCIe atomics support in ROCm, you need to turn off SDMA functionality.
360+
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
345361

346362
```shell
347363
export HSA_ENABLE_SDMA=0
348364
```
349365

366+
###### Performing an OpenCL-only Installation of ROCm
367+
368+
Some users may want to install a subset of the full ROCm installation. In particular, if you are trying to install on a system with a limited amount of storage space, or which will only run a small collection of known applications, you may want to install only the packages that are required to run OpenCL applications. To do that, you can run the following installation command **instead** of the command to install `rocm-dkms`.
369+
370+
```shell
371+
sudo yum install rock-dkms rocm-opencl
372+
```
373+
350374
#### Compiling applications using hcc, hip, etc.
351375

352376
To compile applications or samples, please use gcc-7.2 provided by the devtoolset-7 environment.
@@ -367,7 +391,7 @@ sudo yum autoremove rocm-dkms
367391

368392
##### If you Plan to Run with X11 - we are seeing X freezes under load
369393

370-
ROCm 1.8.1 a kernel parameter noretry has been set to 1 to improve overall system performance. However it has been proven to bring instability to graphics driver shipped with Ubuntu. This is an ongoing issue and we are looking into it.
394+
In ROCm 1.8.3, the kernel parameter 'noretry' has been set to 1 to improve overall system performance. However, it has been shown to cause instability in the graphics driver shipped with Ubuntu. This is an ongoing issue and we are looking into it.
371395

372396
Until it is resolved, please try working around it by changing the noretry bit to 0.
373397

@@ -385,9 +409,9 @@ Once it's done, run sudo update-initramfs -u. Reboot and verify /sys/module/amdk
385409

386410
##### If you are using hipCaffe Alexnet training on ImageNet - we are seeing sporadic hangs of hipCaffe during training
387411

388-
##### Vega10 Users who want to run ROCm without supporting PCIe atomic support must set HSA_ENABLE_SDMA=0
412+
###### Vega10 users who want to run ROCm on a system that does not support PCIe atomics must set HSA_ENABLE_SDMA=0
389413

390-
Currently with Vega10 GPUs to disable PCIe atomics support in ROCm, you need to turn off SDMA functionality.
414+
Currently, if you want to run ROCm on a Vega10 GPU (GFX9) on a system without PCIe atomics, you must turn off SDMA functionality.
391415

392416
```shell
393417
export HSA_ENABLE_SDMA=0
@@ -420,3 +444,17 @@ curl https://storage.googleapis.com/git-repo-downloads/repo > ~/bin/repo
420444
chmod a+x ~/bin/repo
421445
```
422446
Note: make sure ~/bin exists and it is part of your PATH
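
A minimal sketch of the PATH check mentioned in the note follows. The helper name `path_contains` is hypothetical and it compares entries literally, so `~/bin` must be passed in expanded form such as `"$HOME/bin"`.

```shell
# Hypothetical helper: succeeds if the given directory is an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, after `mkdir -p ~/bin`, running `path_contains "$HOME/bin" || export PATH="$HOME/bin:$PATH"` would add the directory for the current session only when it is missing.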
447+
448+
#### Cloning the code
449+
450+
```shell
451+
mkdir ROCm && cd ROCm
452+
repo init -u https://github.com/RadeonOpenCompute/ROCm.git -b roc-1.8.3
453+
repo sync
454+
```
455+
This series of commands will pull all of the open source code associated with
456+
the ROCm 1.8 release. Please ensure that ssh-keys are configured for the
457+
target machine on GitHub for your GitHub ID.
458+
459+
* OpenCL Runtime and Compiler will be submitted to the Khronos Group, prior to
460+
the final release, for conformance testing.
