|
|
Boost.Numeric.Bindings.ATLAS |
``ATLAS is an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. The production of such software for machines ranging from desktop workstations to embedded processors can be a tedious and time consuming task. ATLAS has been designed to automate much of this process. We concentrate our efforts on the widely used linear algebra kernels called the Basic Linear Algebra Subroutines (BLAS).'' [BLAS FAQ: What is ATLAS?]
ATLAS Bindings Library provides generic, higher level interface to ATLAS C interface. By `generic' we mean that bindings can be used with various vector and matrix classes, and by `higher level' that some complexities of the C interface are hidden. For example, to compute the Euclidean norm of a vector, ATLAS provides four functions:
float cblas_snrm2 (const int N, const float *X, const int incX); double cblas_dnrm2 (const int N, const double *X, const int incX); float cblas_scnrm2 (const int N, const void *X, const int incX); double cblas_dznrm2 (const int N, const void *X, const int incX);for
float, double, float
complex and double complex vectors. Parameters of these
functions are the length of the vector X, pointer[1]
to the first element of X and the so-called stride of X.
This interface is rather flexible: functions can be used not only for
`complete' vectors, but also for vector ranges (if pointer points to
the first element of a range) and vector slices (if pointer points to
the first element of a slice and stride is not one); but it also leads
to long parameter/argument lists.
Vectors, vector ranges and vector slices in C++, being object of
classes, usually `carry' all required information and there is no need
to supply them explicitly. Furthermore, overloading makes the encoding of
vector component types in function names unnecessary. Using templates
and traits classes,
function can be written to manipulate various vector types:
#include <iostream>
#include <complex>
#include <boost/numeric/bindings/traits/std_vector.hpp>
#include <boost/numeric/bindings/traits/std_valarray.hpp>
#include <boost/numeric/bindings/traits/ublas_vector.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/bindings/atlas/cblas1.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
int main() {
std::vector<float> sv (8);
// initialisation...
std::cout << atlas::nrm2 (sv) << std::endl;
std::valarray<std::complex<double> > va (10);
// ...
std::cout << atlas::nrm2 (va) << std::endl;
ublas::vector<double> uv (18);
// ...
std::cout << atlas::nrm2 (uv) << std::endl;
ublas::vector_range<ublas::vector<double> > ur (uv, ublas::range (3, 15));
std::cout << atlas::nrm2 (ur) << std::endl;
ublas::vector_slice<ublas::vector<double> > us (uv, ublas::slice (1, 3, 5));
std::cout << atlas::nrm2 (us) << std::endl;
ublas::vector_slice<ublas::vector_range<ublas::vector<double> > >
usr (ur, ublas::slice (2, 2, 4));
std::cout << atlas::nrm2 (usr) << std::endl;
ublas::matrix<double> um (5, 5);
// ...
ublas::matrix_row<ublas::matrix<double> > umr (um, 1);
std::cout << atlas::nrm2 (umr) << std::endl;
ublas::matrix_column<ublas::matrix<double> > umc (um, 2);
std::cout << atlas::nrm2 (umc) << std::endl;
}
ATLAS also contains some LAPACK routines for solving systems of simultaneous linear equations.
All ATLAS bindings header files are in the directory boost/numeric/bindings/atlas.
All `public' ATLAS bindings functions are in the namespace boost::numeric::bindings::atlas.
cblas_inc.hpp and clapack_inc.hpp,
which only include ATLAS headers cblas.h and
clapack.h[2] respectively, are therefore introduced.
You should edit these two files providing your paths.
All other files then include these two files, so there is no need to
edit other files.
boost/numeric/bindings/atlas/cblas1.hpp.
Namespace is boost::numeric::bindings::atlas.
Level 1 BLAS do vector-vector operations.
In the following table x, y are vectors,
a, b, c are scalars and i,
k integer indices.
| Expression | Semantics |
|---|---|
set(a,x) | xi <- a
for all i [1]
|
copy(x,y) | y <- x
|
swap(x,y) | x <-> y
|
scal(a,x) | x <- a x
|
axpy(a,x,y) | y <- a x + y
|
xpy(x,y) | y <- x + y [2]
|
axpby(a,x,b,y) | y <- a x + b y [1]
|
dot(x,y) | returns xT y [3]
|
dsdot(x,y) | returns xT y [4]
|
sdsdot(x,y) | returns a + xT y [4]
|
dotu(x,y) | returns xT y [5]
|
dotu(x,y,c) | c <- xT y [5]
|
dotc(x,y) | returns xH y [5]
|
dotc(x,y,c) | c <- xH y [5]
|
nrm2(x) | returns ||x||2
|
asum(x) | returns ||re(x)||1 + ||im(x)||1
|
iamax(x) | returns 1st k such that
|re(xk)| + |im(xk)|
= max (|re(xi)| + |im(xi)|)
|
[1] ATLAS extension (i.e. not mandated by the BLAS standard)
[2] ATLAS bindings extension
[3] in BLAS/ATLAS float and double vectors
only, complex vectors ATLAS bindings extension (same as dotu())
[4] float vectors only, with double accumulation
[5] complex types only
boost/numeric/bindings/atlas/cblas2.hpp.
Namespace is boost::numeric::bindings::atlas.
Level 2 BLAS do matrix-vector operations.
In the following table a, b are scalars,
x, y are vectors and A is a
matrix. Trans can be CblasNoTrans,
CblasTrans or CblasConjTrans, denoting
A, AT or AH,
respectively. Uplo can be CblasUpper or
CblasLower, denoting that leading
upper or lower, respectively, triangular part of A contains
the upper or lower triangular part of the
symmetric/Hermitian matrix (and that the lower or upper triangular
part, respectively, is not referenced).
| Expression | Semantics | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gemv(Trans,a,A,x,b,y) |
[1] It is assumed that matrix traits class can determine whether upper
or lower triangular part of A contains the corresponding
part of the symmetric/Hermitian matrix (e.g. traits for
ublas::symmetric_adaptor<>
and ublas::hermitian_adaptor<>).
[2] in BLAS/ATLAS float and double vectors
only, complex vectors ATLAS bindings extension (same as geru())
[3] complex types only
boost/numeric/bindings/atlas/cblas3.hpp.
Namespace is boost::numeric::bindings::atlas.
Level 3 BLAS do matrix-matrix operations.
In the following table a, b are scalars
(with a' denoting conj(a))
and A, B, C are matrices. For
Trans, TransA, TransB see
Trans in 2.3; also,
for Uplo see 2.3.
Side can be CblasLeft or CblasRight,
specifying whether the symmetric/Hermitian matrix A
appears on the left or right in the multiplication,
i.e. AB or BA.
| Expression | Semantics | ||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gemm(TransA,TransB,a,A,B,b,C) |
[1] It is assumed that matrix traits class can determine whether upper
or lower triangular part of A contains the corresponding
part of the symmetric/Hermitian matrix (e.g. traits for
ublas::symmetric_adaptor<>)
and ublas::hermitian_adaptor<>).
[2] Requires proper traits classes and therefore does not work with all compilers.
boost/numeric/bindings/atlas/clapack.hpp.
Namespace is boost::numeric::bindings::atlas.
ATLAS contains only few LAPACK routines: for the solutions of systems of simultaneous linear equations with general and with symmetric or Hermitian positive definite coefficient matrices[3].
ATLAS functions can be applied to matrices with both column major and
row major storage orders. But there is a `catch' with row major
solves and factorisations. To quote from
ATLAS FAQ:
``Most users are confused by the row major factorization and related solves. The right-hand side vectors are probably the biggest source of confusion. The RHS array does not represent a matrix in the mathematical sense, it is instead a pasting together of the various RHS into one array for calling convenience. As such, RHS vectors are always stored contiguously, regardless of the row/col major that is chosen.''
That is, if B is row-major, it should be nrhs-by-n,
and RHS vectors should be its rows, not columns; also, solution
vectors will be its rows.
For more information about LAPACK routines see LAPACK Users' Guide. Also, see LAPACK bindings.
gesv
A X = B, where A is general n-by-n
matrix and X and B are
n-by-nrhs matrices (nrhs denotes the number of
right-hand side vectors). A is overwritten with
the appropriate LU factorization and B is
overwritten with the solution X.
getrf
A using partial
pivoting. Factorisation is A = PLU, where:
A is column major, P is a row-permutation
matrix, L is lower triangular with unit diagonal elements
(lower trapezoidal if m>n), and U is upper
triangular (upper trapezoidal if m<n);
A is row major, P is a
column-permutation matrix, L is lower triangular (lower
trapezoidal if m>n), and U is upper triangular
with unit diagonals (upper trapezoidal if m<n).
A is overwritten with the LU factorization.
getrs
AX = B (or ATX = B or AHX = B)
with a general n-by-n matrix A using the
LU factorization previously computed by getrf().
B is overwritten with the solution X.
getri
A
using the LU factorization previously computed by getrf().
Factors LU are overwritten with A-1.
In the following table A, B, L, U are matrices,
ipiv is a vector of pivot indices (representing
premutation matrix P),
Trans can be CblasNoTrans,
CblasTrans or CblasConjTrans, denoting
A, AT or AH,
respectively, and ierr is `diagnostic argument'
INFO (as described in
LAPACK Users' Guide).
| Expression | Semantics |
|---|---|
gesv(A,ipiv,B) | using AP=LU,
B <- A-1B, A <- LU,
ipiv <- P; returns ierr
|
gesv(A,B) | as above, with ipiv
allocated and deallocated internally
|
lu_solve(A,B) | same as gesv(A,B)
|
getrf(A,ipiv) | using AP=LU,
A <- LU, ipiv <- P; returns ierr
|
lu_factor(A,ipiv) | same as getrf(A,ipiv)
|
getrs(Trans,A,ipiv,B) |
B <- op(A)-1B assuming
A <- LU and ipiv <- P,
op (A) == A || AT || AH;
returns ierr
|
getrs(A,ipiv,B) | B <- A-1B assuming A <- LU and ipiv <- P;
returns ierr
|
lu_substitute(A,ipiv,B) | same as getrs(A,ipiv,B)
|
getri(A,ipiv) | A <- A-1 assuming A <- LU and ipiv <- P;
returns ierr
|
lu_invert(A,ipiv) | same as getri(A,ipiv)
|
Example. Solution to A X = B using
gesv(); A and B are column major:
#include <iostream>
#include <boost/numeric/bindings/atlas/clapack.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
int main() {
// system matrix A:
ublas::matrix<double, ublas::column_major> A(3,3);
A(0,0) = 1.; A(0,1) = 1.; A(0,2) = 1.;
A(1,0) = 2.; A(1,1) = 3.; A(1,2) = 1.;
A(2,0) = 1.; A(2,1) = -1.; A(2,2) = -1.;
std::cout << "A: " << A << std::endl;
// right-hand side matrix B:
ublas::matrix<double, ublas::column_major> B(3,1);
B(0,0) = 4.; B(1,0) = 9.; B(2,0) = -2.;
std::cout << "B: " << B << std::endl;
// solve system:
atlas::gesv (A, B);
// B now contains solution:
std::cout << "X: " << B << std::endl;
}
Output is:
A: [3,3]((1,1,1),(2,3,1),(1,-1,-1)) B: [3,1]((4),(9),(-2)) X: [3,1]((1),(2),(1))Same system, but with
A and B row major:
#include <iostream>
#include <boost/numeric/bindings/atlas/clapack.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
int main() {
// system matrix A:
ublas::matrix<double, ublas::row_major> A(3,3);
A(0,0) = 1.; A(0,1) = 1.; A(0,2) = 1.;
A(1,0) = 2.; A(1,1) = 3.; A(1,2) = 1.;
A(2,0) = 1.; A(2,1) = -1.; A(2,2) = -1.;
std::cout << "A: " << A << std::endl;
// right-hand side matrix B:
ublas::matrix<double, ublas::row_major> B(1,3);
B(0,0) = 4.; B(0,1) = 9.; B(0,2) = -2.;
std::cout << "B: " << B << std::endl;
// solve system:
atlas::gesv (A, B);
std::cout << "X: " << B << std::endl;
}
Output is:
A: [3,3]((1,1,1),(2,3,1),(1,-1,-1)) B: [1,3]((4,9,-2)) X: [1,3]((1,2,1))
Example. Solution to A X = B using functions
getrf() and getrs():
#include <iostream>
#include <boost/numeric/bindings/atlas/clapack.hpp>
#include <boost/numeric/bindings/traits/std_vector.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
int main() {
// n -- no. of equations; nrhs -- no. of right-hand side vectors
ublas::matrix<double, ublas::column_major> A(n,n); // system matrix
ublas::matrix<double, ublas::column_major> B(n,nrhs); // right-hand side matrix
// fill A & B
std::vector<int> ipiv(n); // pivot vector
int ierr = atlas::getrf (A, ipiv); // factorise A
if (ierr == 0) {
atlas::getrs (A, ipiv, B); // solve from factorization
std::cout << "X: " << B << std::endl;
}
else
std::cout << "matrix is singular: ierr = " << ierr << std::endl;
}
BOOST_NUMERIC_BINDINGS_ATLAS_POTRF_BUG
in clapack.hpp, line 30 (before the inclusion of
clapack_overloads.hpp).
Symmetric/Hermitian matrix A should be represented by
n-by-n array; there are two possibilities:
A, and the strictly lower triangular part is not referenced;
A, and the strictly upper triangular part is not referenced.
ublas::symmetric_adaptor<>
and ublas::hermitian_adaptor<>,
traits mechanism can
automagickly determine which part of the matrix A is
stored.
posv
A X = B, where A is n-by-n
symmetric/Hermitian positive definite matrix and X and B are
n-by-nrhs matrices (nrhs denotes the number of
right-hand side vectors). Stored part of
A is overwritten with the factor U or
L from the Cholesky factorization
A = UTU or A = LLT
(or A = UHU or A = LLH)
and B is overwritten with the solution X.
potrf
A:
A = UTU or A = LLT
(or A = UHU or A =
LLH). Factors U or L
overwrite stored part of A.
potrs
AX = B with a symmetric or Hermitian positive definite n-by-n matrix A using the
Cholesky factorization previously computed by potrf().
B is overwritten with the solution X.
potri
A
using the Cholesky factorization previously computed by potrf().
Factors U or L are overwritten with
appropriate part of A-1.
In the following table A, B, L,
U are matrices and ierr is LAPACK's `diagnostic argument'
INFO.
Uplo can be CblasUpper or
CblasLower, denoting that leading
upper or lower, respectively, triangular part of A contains
the upper or lower triangular part of the
symmetric/Hermitian matrix (and that the lower or upper triangular
part, respectively, is not referenced).
| Expression | Semantics |
|---|---|
posv(Uplo,A,B) |
B <- A-1B, using
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr
|
posv(A,B) |
B <- A-1B, using
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr [1]
|
cholesky_solve(A,B) | same as posv(A,B)
|
potrf(Uplo,A) |
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr
|
potrf(A) |
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr [1]
|
cholesky_factor(A) | same as potrf(A)
|
potrs(Uplo,A,B) |
B <- A-1B assuming
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr
|
potrs(A,B) |
B <- A-1B assuming
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr [1]
|
cholesky_substitute(A,B) | same as potrs(A,B)
|
potri(Uplo,A) | A <- A-1 assuming
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr
|
potri(A) | A <- A-1 assuming
A = UTU or A = LLT
or A = UHU or A = LLH;
returns ierr [1]
|
cholesky_invert(A) | same as getri(A)
|
[1] It is assumed that matrix traits class can determine whether upper
or lower triangular part of A contains the corresponding
part of the symmetric/Hermitian matrix (e.g. traits for
ublas::symmetric_adaptor<>)
and ublas::hermitian_adaptor<>).
Example. Solution to A X = B, with A
Hermitian positive definite matrix (lower triangular part is stored):
#include <iostream>
#include <complex>
#include <boost/numeric/bindings/atlas/clapack.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/bindings/traits/ublas_hermitian.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
typedef std::complex<double> cmplx_t;
typedef ublas::matrix<cmplx_t, ublas::column_major> cm_t;
typedef ublas::hermitian_adaptor<cm_t, ublas::lower> herm_t;
int main() {
cm_t CA(3,3); // matrix (storage)
herm_t HA(CA); // hermitian adaptor
HA(0,0) = cmplx_t (25,0);
HA(1,0) = cmplx_t (-5,5);
HA(1,1) = cmplx_t (51,0);
HA(2,0) = cmplx_t (10,-5);
HA(2,1) = cmplx_t (4,6);
HA(2,2) = cmplx_t (71,0);
cm_t CB(3,2); // right-hand side
CB(0,0) = cmplx_t (60,-55);
CB(1,0) = cmplx_t (34,58);
CB(2,0) = cmplx_t (13,-152);
CB(0,1) = cmplx_t (70,10);
CB(1,1) = cmplx_t (-51,110);
CB(2,1) = cmplx_t (75,63);
atlas::posv (HA, CB); // solve system
std::cout << "X: " << CB << std::endl;
}
Example. Inverse of the symmetric positive definite matrix
A (upper triangular part is stored):
#include <iostream>
#include <boost/numeric/bindings/atlas/cblas3.hpp>
#include <boost/numeric/bindings/atlas/clapack.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/bindings/traits/ublas_hermitian.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
namespace atlas = boost::numeric::bindings::atlas;
typedef ublas::matrix<double> m_t;
typedef ublas::symmetric_adaptor<m_t, ublas::upper> symm_t;
int main() {
m_t A(5,5); // matrix (storage)
// A =
// [5 4 3 2 1]
// [0 5 4 3 2]
// [0 0 5 4 3]
// [0 0 0 5 4]
// [0 0 0 0 5]
symm_t SA(A); // symmetric adaptor
m_t A2(SA); // copy of SA (full symmetric)
atlas::potrf (SA); // factorise SA
atlas::potri (SA); // compute inverse
// RI is (almost) identity matrix:
m_t RI(5,5);
atlas::symm (A2, SA, RI);
std::cout << "I = A * A^(-1): " << RI << std::endl;
atlas::symm (SA, A2, RI);
std::cout << "I = A^(-1) * A: " << RI << std::endl;
}
(BLAS level 3 function symm()
computes AB or BA, where A is
symmetric (only upper or lower triangular part is referenced) and
B is general ('full') matrix -- in our example, as SA
is symmetric matrix, A2 must be `full'.)
[1]
C, prior to C99, didn't have complex data type. Parameters
of complex versions are therefore void* or const void*, and
it is assumed that ``a complex element consists of two consecutive
memory locations of the underlying data type (i.e., float
or double), where the first location contains the real
component, and the second contains the imaginary component''.
[2]
If you are using stable version of ATLAS prior to 3.4.2 or developer
version prior to 3.5.6 and if you didn't apply fix
described in ATLAS errata,
clapack_inc.hpp must include both atlas_enum.h
and clapack.h.
[3]
There are two more functions: xlauum() and xtrtri().
Currently, bindings for them are not provided.