Commit 50545b1

Update CPU and OS support and document DYNAMIC_ARCH option in README.md
prompted by #2388
1 parent 71faa1c commit 50545b1

1 file changed

Lines changed: 33 additions & 8 deletions

README.md

@@ -26,6 +26,8 @@ You can download them from [file hosting on sourceforge.net](https://sourceforge
2626

2727
Download from project homepage, https://xianyi.github.com/OpenBLAS/, or check out the code
2828
using Git from https://github.com/xianyi/OpenBLAS.git.
29+
Build-time parameters can be chosen in Makefile.rule; see there for a short description of each option.
30+
Most can also be given directly on the make or cmake command line.
2931

3032
### Dependencies
3133

@@ -101,16 +103,16 @@ The default installation directory is `/opt/OpenBLAS`.
101103

102104
## Supported CPUs and Operating Systems
103105

104-
Please read `GotoBLAS_01Readme.txt`.
106+
Please read `GotoBLAS_01Readme.txt` for older CPU models already supported by the 2010 GotoBLAS.
105107

106108
### Additional supported CPUs
107109

108110
#### x86/x86-64
109111

110112
- **Intel Xeon 56xx (Westmere)**: Used GotoBLAS2 Nehalem codes.
111113
- **Intel Sandy Bridge**: Optimized Level-3 and Level-2 BLAS with AVX on x86-64.
112-
- **Intel Haswell**: Optimized Level-3 and Level-2 BLAS with AVX2 and FMA on x86-64.
113-
- **Intel Skylake**: Optimized Level-3 and Level-2 BLAS with AVX512 and FMA on x86-64.
114+
- **Intel Haswell**: Optimized Level-3 and Level-2 BLAS with AVX2 and FMA on x86-64.
115+
- **Intel Skylake-X**: Optimized Level-3 and Level-2 BLAS with AVX512 and FMA on x86-64.
114116
- **AMD Bobcat**: Used GotoBLAS2 Barcelona codes.
115117
- **AMD Bulldozer**: x86-64 ?GEMM FMA4 kernels. (Thanks to Werner Saar)
116118
- **AMD PILEDRIVER**: Uses Bulldozer codes with some optimizations.
@@ -129,8 +131,15 @@ Please read `GotoBLAS_01Readme.txt`.
129131

130132
#### ARM64
131133

132-
- **ARMv8**: Experimental
133-
- **ARM Cortex-A57**: Experimental
134+
- **ARMv8**: Basic ARMV8 with small caches, optimized Level-3 and Level-2 BLAS
135+
- **Cortex-A53**: same as ARMV8 (different cpu specifications)
136+
- **Cortex A57**: Optimized Level-3 and Level-2 functions
137+
- **Cortex A72**: same as A57 (different cpu specifications)
138+
- **Cortex A73**: same as A57 (different cpu specifications)
139+
- **Falkor**: same as A57 (different cpu specifications)
140+
- **ThunderX**: Optimized some Level-1 functions
141+
- **ThunderX2T99**: Optimized Level-3 BLAS and parts of Levels 1 and 2
142+
- **TSV110**: Optimized some Level-3 helper functions
134143

135144
#### PPC/PPC64
136145

@@ -139,18 +148,34 @@ Please read `GotoBLAS_01Readme.txt`.
139148

140149
#### IBM zEnterprise System
141150

142-
- **Z13**: Optimized Level-3 BLAS and Level-1,2 (double precision)
143-
- **Z14**: Optimized Level-3 BLAS and Level-1,2 (single precision)
151+
- **Z13**: Optimized Level-3 BLAS and Level-1,2
152+
- **Z14**: Optimized Level-3 BLAS and (single precision) Level-1,2
153+
154+
### Support for multiple targets in a single library
155+
156+
OpenBLAS can be built for multiple targets with runtime detection of the target cpu by specifying DYNAMIC_ARCH=1 in Makefile.rule, on the gmake command line or as -DDYNAMIC_ARCH=TRUE in cmake.
157+
For **x86_64**, the list of targets this activates contains Prescott, Core2, Nehalem, Barcelona, Sandybridge, Bulldozer, Piledriver, Steamroller, Excavator, Haswell, Zen, SkylakeX. For cpu generations not included in this list, the corresponding older model is used. If you also specify DYNAMIC_OLDER=1, specific support for Penryn, Dunnington, Opteron, Opteron/SSE3, Bobcat, Atom and Nano is added. Finally there is an option DYNAMIC_LIST that allows you to specify an individual list of targets to include instead of the default.
158+
DYNAMIC_ARCH is also supported on **x86**, where it translates to Katmai, Coppermine, Northwood, Prescott, Banias,
159+
Core2, Penryn, Dunnington, Nehalem, Athlon, Opteron, Opteron_SSE3, Barcelona, Bobcat, Atom and Nano.
160+
On **ARMV8**, it enables support for CortexA53, CortexA57, CortexA72, CortexA73, Falkor, ThunderX, ThunderX2T99, TSV110 as well as generic ARMV8 cpus.
161+
For **POWER**, the list encompasses POWER6, POWER8 and POWER9, on **ZARCH** it comprises Z13 and Z14.
162+
The TARGET option can be used in conjunction with DYNAMIC_ARCH=1 to specify which cpu model should be assumed for all the
163+
common code in the library; usually you will want to set this to the oldest model you expect to encounter.
164+
Please note that it is not possible to combine support for different architectures, so no combined 32 and 64 bit or x86_64 and arm64 in the same library.
144165

145166
### Supported OS
146167

147168
- **GNU/Linux**
148169
- **MinGW or Visual Studio (CMake)/Windows**: Please read <https://github.com/xianyi/OpenBLAS/wiki/How-to-use-OpenBLAS-in-Microsoft-Visual-Studio>.
149-
- **Darwin/macOS**: Experimental. Although GotoBLAS2 supports Darwin, we are not macOS experts.
170+
- **Darwin/macOS/OSX/iOS**: Experimental. Although GotoBLAS2 already supports Darwin, we are not OSX/iOS experts.
150171
- **FreeBSD**: Supported by the community. We don't actively test the library on this OS.
151172
- **OpenBSD**: Supported by the community. We don't actively test the library on this OS.
173+
- **NetBSD**: Supported by the community. We don't actively test the library on this OS.
152174
- **DragonFly BSD**: Supported by the community. We don't actively test the library on this OS.
153175
- **Android**: Supported by the community. Please read <https://github.com/xianyi/OpenBLAS/wiki/How-to-build-OpenBLAS-for-Android>.
176+
- **AIX**: Supported on PPC up to POWER8
177+
- **Haiku**: Supported by the community. We don't actively test the library on this OS.
178+
- **SunOS**: Supported by the community. We don't actively test the library on this OS.
154179

155180
## Usage
156181

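As a usage note on the multi-target paragraph added in this change, the options could be passed on the command line roughly as follows. This is a minimal sketch under several assumptions: `TARGET=NEHALEM` is only an illustrative baseline, the `run` helper is a hypothetical dry-run wrapper (not part of OpenBLAS) that prints each command unless `RUN=1` is set, `make` is assumed to be GNU make, and the cmake line assumes a build directory one level below the source tree and that `-DTARGET` is accepted in the same way as `-DDYNAMIC_ARCH`.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless RUN=1 is set.
# Use RUN=1 only from inside an OpenBLAS source checkout.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Multi-target build; TARGET sets the baseline for the common code
# (NEHALEM is an illustrative choice, not a recommendation).
run make DYNAMIC_ARCH=1 TARGET=NEHALEM

# Same idea, additionally enabling the older x86_64 models
# (Penryn, Dunnington, Opteron, Bobcat, Atom, Nano).
run make DYNAMIC_ARCH=1 DYNAMIC_OLDER=1

# CMake form of the first build, assumed to run from a build subdirectory.
run cmake -DDYNAMIC_ARCH=TRUE -DTARGET=NEHALEM ..
```

Printing the commands first makes it easy to check the flag combination before starting a long build. Following the README text, `TARGET` would normally name the oldest cpu model the library is expected to run on.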