Changes between Version 46 and Version 47 of WikiStart

Timestamp:
10/18/13 23:20:04 (4 years ago)
Author:
hfinkel
Comment:

--

  • WikiStart

    v46 v47  
    3939 
    4040If you're using bgclang, please subscribe to the mailing list: [http://lists.alcf.anl.gov/mailman/listinfo/llvm-bgq-discuss] 
     41 
     42== General usage == 
     43 
     44bgclang command-line argument handling is designed to be similar to gcc's command-line argument handling, and where possible, bgclang tries to support the same flags. 
     45 
     46Like bgxlc and powerpc64-bgq-linux-gcc, bgclang defaults to static linking. If you pass the -dynamic flag (or the -shared flag), then dynamic linking will be used instead. As with gcc, when compiling objects intended to become part of a shared library or dynamically-linked executable, you should probably also pass the -fPIC flag. In general, use of dynamically-linked executables and shared libraries is discouraged on the BG/Q. 
     47 
     48== OpenMP == 
     49 
     50bgclang fully supports the OpenMP 3.1 specification, and features from OpenMP 4 are currently being added. To enable OpenMP support, pass the -fopenmp flag when both compiling and linking. 
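
As a minimal sketch of the kind of code that -fopenmp affects, the following C function sums an array using a parallel-for reduction. The function name sum_array is hypothetical and chosen only for illustration. The pragma is guarded by the standard _OPENMP macro, so when the file is compiled without -fopenmp the loop simply runs serially and returns the same result.

```c
#include <stddef.h>

/* Hypothetical helper: returns the sum of n integers.
   With -fopenmp, iterations are divided among threads and the
   per-thread partial sums are combined by the reduction clause.
   Without -fopenmp, _OPENMP is undefined and the loop is serial. */
long sum_array(const int *data, size_t n)
{
    long total = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:total)
#endif
    for (long i = 0; i < (long)n; ++i)
        total += data[i];
    return total;
}
```

Calling sum_array on an array holding 1 through 5 returns 15 whether or not OpenMP is enabled; only the number of threads doing the work changes.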
     51 
     52Note that bgclang's OpenMP runtime library (derived from Intel's open-source implementation) is different from that used by powerpc64-bgq-linux-gcc and bgxlc, and linking the OpenMP runtime library from either of those two compilers with an application compiled with bgclang -fopenmp will likely result in runtime failures. 
     53 
     54== Fast-math optimizations == 
     55 
     56bgclang supports a number of fast-math optimizations, enabled by passing -ffast-math, which increase performance but violate the IEEE 754 floating-point standard. -ffast-math is conceptually similar to IBM's -qnostrict compiler flag. 
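
One illustration of why such optimizations are not IEEE-conforming: floating-point addition is not associative, and reassociation is among the transformations that fast-math modes in Clang-based compilers permit. The sketch below uses two hypothetical functions, sum_left and sum_right, that compute the same mathematical sum in different orders. Whether bgclang actually reorders any particular expression is not something this example demonstrates.

```c
/* Evaluates (a + b) + c strictly left to right. */
double sum_left(double a, double b, double c)
{
    double t = a + b;
    return t + c;
}

/* Evaluates a + (b + c): a reassociated form that a fast-math
   optimizer is allowed to substitute for the version above. */
double sum_right(double a, double b, double c)
{
    double t = b + c;
    return a + t;
}
```

When compiled without -ffast-math, sum_left(1e20, -1e20, 1.0) yields 1.0, while sum_right with the same arguments yields 0.0, because 1.0 is lost when added to -1e20. Code that depends on a specific evaluation order should therefore be checked carefully before enabling -ffast-math.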
     57 
     58== Vector (QPX) intrinsics and math functions == 
     59 
     60bgclang supports the same QPX vector intrinsics (vec_add, etc.) as IBM's compiler, and it understands the vector4double type. No special flags or header files are required to enable this support. 
     61 
     62bgclang also comes with a vector math library (derived from Naoki Shibata's SLEEF library). To use this library, include the qpxmath.h header. The bgclang wrapper scripts automatically handle linking to the vector math library, so no special linking flags are required. 
     63 
     64{{{ 
     65#include <qpxmath.h> 
     66}}} 
     67 
     68The following functions are available (the functions with the _u1 suffix have no more than 1 ulp error): 
     69 
     70{{{ 
     71vector4double xldexp(vector4double x, const int *q); 
     72void xilogb(vector4double d, int *l); 
     73 
     74vector4double xsin(vector4double d); 
     75vector4double xcos(vector4double d); 
     76void xsincos(vector4double d, vector4double *ds, vector4double *dc); 
     77vector4double xtan(vector4double d); 
     78vector4double xasin(vector4double s); 
     79vector4double xacos(vector4double s); 
     80vector4double xatan(vector4double s); 
     81vector4double xatan2(vector4double y, vector4double x); 
     82vector4double xlog(vector4double d); 
     83vector4double xexp(vector4double d); 
     84vector4double xpow(vector4double x, vector4double y); 
     85 
     86vector4double xsinh(vector4double d); 
     87vector4double xcosh(vector4double d); 
     88vector4double xtanh(vector4double d); 
     89vector4double xasinh(vector4double s); 
     90vector4double xacosh(vector4double s); 
     91vector4double xatanh(vector4double s); 
     92 
     93vector4double xcbrt(vector4double d); 
     94 
     95vector4double xexp2(vector4double a); 
     96vector4double xexp10(vector4double a); 
     97vector4double xexpm1(vector4double a); 
     98vector4double xlog10(vector4double a); 
     99vector4double xlog1p(vector4double a); 
     100 
     101vector4double xsin_u1(vector4double d); 
     102vector4double xcos_u1(vector4double d); 
     103void xsincos_u1(vector4double d, vector4double *ds, vector4double *dc); 
     104vector4double xtan_u1(vector4double d); 
     105vector4double xasin_u1(vector4double s); 
     106vector4double xacos_u1(vector4double s); 
     107vector4double xatan_u1(vector4double s); 
     108vector4double xatan2_u1(vector4double y, vector4double x); 
     109vector4double xlog_u1(vector4double d); 
     110vector4double xcbrt_u1(vector4double d); 
     111}}} 
     112 
     113Single-precision versions are also available. They are named like the double-precision variants, with an 'f' appended to the base name (before any _u1 suffix): 
     114{{{ 
     115... 
     116vector4double xsinf(vector4double d); 
     117vector4double xcosf(vector4double d); 
     118... 
     119vector4double xsinf_u1(vector4double d); 
     120vector4double xcosf_u1(vector4double d); 
     121... 
     122}}} 
     123 
     124In addition, you can use IBM's SIMD MASS library by including the appropriate header and linking with -lmass_simd. Compared to IBM's SIMD MASS library, bgclang's vector math functions tend to be slower but more accurate. 
     125 
     126For convenience, if you define QPXMATH_MASS_SIMD_FUNCTIONS before including the qpxmath.h header, aliases will also be defined for libmass_simd function names (sind4, etc.). Note, however, that libmass_simd provides some functions not provided by bgclang's vector math library. Also, bgclang's vector math library provides vectorized ldexp and ilogb functions (which libmass_simd does not provide). 
     127 
     128== Autovectorization == 
     129 
     130bgclang's autovectorization support is enabled by default with the optimization flag -O3. bgclang uses two types of autovectorization: loop autovectorization (which can be disabled using -fno-vectorize) and SLP autovectorization, which vectorizes non-loop code (and can be disabled using -fno-slp-vectorize). 
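
To make the distinction concrete, here is a small sketch of code shapes that are typical candidates for each kind of autovectorization. The function names saxpy and add4 are hypothetical, and whether the compiler actually vectorizes them depends on its cost model. The restrict qualifiers tell the compiler that the arrays do not overlap, which generally makes vectorization easier to prove safe.

```c
#include <stddef.h>

/* Loop candidate: every iteration is independent of the others. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* SLP candidate: four independent statements of the same shape,
   with no loop, that could be merged into one vector operation. */
void add4(const double *restrict a, const double *restrict b,
          double *restrict out)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Both functions produce the same results whether or not they are vectorized, so compiling once at -O3 and once with -fno-vectorize or -fno-slp-vectorize is a simple way to compare performance.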
     131 
     132bgclang can currently transform calls to the following standard library (libm) math functions into calls to its vector math library as part of the autovectorization process: acos, acosh, asin, asinh, atan, atan2, atanh, cbrt, cos, cosh, exp, exp10, exp2, expm1, log, log10, log1p, pow, sin, sinh, tan, tanh, along with the single-precision versions. sqrt (and division) are also handled, but only with -ffast-math. For sin, cos, tan, asin, acos, atan, atan2, and log, faster (but slightly less accurate) variants are used with -ffast-math. 
     133 
     134== Memory bounds checking (address sanitizer) == 
     135 
     136bgclang supports a memory bounds-checking feature called address sanitizer. This feature, enabled by passing -fsanitize=address, instruments the compiled code and will report an error on out-of-bounds accesses to both stack and heap memory. 
     137 
     138When using address sanitizer, dynamic linking must be used (and bgclang will default to using dynamic linking when -fsanitize=address is passed). 
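
The following sketch shows the kind of bug this feature is meant to catch. The function read_element is hypothetical: it allocates four ints and returns the one at the requested index without any bounds check. Indices 0 through 3 are valid, and any larger index reads past the end of the heap block.

```c
#include <stdlib.h>

/* Hypothetical example: returns element idx of a 4-element heap array.
   Valid for idx 0..3; idx >= 4 is an out-of-bounds heap read. */
int read_element(size_t idx)
{
    int *buf = malloc(4 * sizeof *buf);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < 4; ++i)
        buf[i] = (int)(i * 10);
    int value = buf[idx];   /* no bounds check on idx */
    free(buf);
    return value;
}
```

Without instrumentation, a call such as read_element(4) has undefined behavior and may silently return garbage. When the code is built with -fsanitize=address, that access is the sort of error the instrumentation is expected to report at run time.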
     139 
     140== Link-time optimization (LTO) == 
     141 
     142LTO is a powerful feature of bgclang and its associated toolchain which enables the compiler to perform additional global optimizations, such as function inlining, as part of the final linking process. This can be expensive in terms of compile time, but can yield significant runtime performance gains. 
     143 
     144To use LTO you must pass the -flto flag to bgclang, both when compiling and also when linking. In addition, because the object file produced by bgclang when using LTO is in a custom format, special tools are necessary in order to: 
     145 
     146 - Combine such object files into static archives: use bgclang-ar (or equivalently powerpc64-bgq-linux-clang-ar) instead of ar (or powerpc64-bgq-linux-ar). Failure to do so will result in errors when attempting to use the static archives during linking; specifically this error (which is misleading in this context): 
     147 
     148{{{ 
     149  error adding symbols: Archive has no index; run ranlib to add one 
     150}}} 
     151 
     152 - Inspect the symbols defined in such object files: use bgclang-nm (or equivalently powerpc64-bgq-linux-clang-nm) instead of nm (or powerpc64-bgq-linux-nm). 
     153 
     154bgclang's LTO capability is currently experimental. There are known issues with how debugging data is handled, and you might run into problems using -flto and -g together. We're currently working on fixing these issues. 
     155 
     156== FAQ == 
     157 
     158=== Why do I receive linking errors complaining about multiple definitions of inline functions? === 
     159 
     160The source code you're compiling probably assumes the GNU semantics for the inline keyword, and not those defined by the C99 standard. Compile your code with the -fgnu89-inline flag to force bgclang to use the non-standard GNU semantics. 
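
For background, the sketch below shows a header pattern that commonly triggers this error, together with a form that behaves the same under both sets of rules. The function name square is hypothetical. Rewriting the source in this way is an alternative to the flag, offered here only as an illustration for code you are free to modify.

```c
/* GNU89-style header code that can cause the error (kept in a comment
   so that this file compiles under either set of rules):

       extern inline int square(int x) { return x * x; }

   Under GNU89 rules this is used only for inlining and no external
   symbol is emitted. Under C99 rules every source file that includes
   it emits an external definition, and the linker then sees several. */

/* static inline gives each source file its own private copy and means
   the same thing under both GNU89 and C99 rules. */
static inline int square(int x)
{
    return x * x;
}
```

With the static inline form, square(3) evaluates to 9 in every source file that includes the header, and no symbol named square is exported to the linker.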
     161 
     162=== Linking code compiled with bgclang++ together with code compiled with bgclang++11 does not work, why? === 
     163 
     164Code compiled using bgclang++ uses the same libstdc++ standard template library (STL) implementation as the system-default GNU powerpc64-bgq-linux-g++ compiler. This provides compatibility with C++ libraries, including some system libraries, compiled with the GNU toolchain. This STL implementation, however, cannot provide a conforming C++11 programming environment, and so bgclang++11 uses an up-to-date STL implementation derived from LLVM's libc++. Unfortunately, this STL implementation is incompatible with libstdc++, and so linking errors will result for functions that use STL objects as part of their signatures (i.e. parameter or return types). 
     165 
     166=== I'd like to use bgclang's OpenMP support and also link against code that uses IBM's OpenMP implementation (such as IBM's SMP ESSL library). Can I do that? === 
     167 
     168No. Unfortunately, both bgclang's OpenMP library and IBM's OpenMP library define functions of the same name, and using both at the same time is not generally possible. That having been said, if you're willing to play games with how your application is linked, it might be possible, and you should ask for advice on the mailing list. 
     169 
     170=== Is there a corresponding Fortran compiler available? === 
     171 
     172No, not yet. This is also being worked on. 
    41173 
    42174== Repository Information ==