LAMA
#include <LAMAInterface.hpp>
Public Member Functions | |
BLAS2Interface () | |
For future versions: left functions for hermitian matrices - hemv, hbmv, her, her2. | |
Data Fields | |
void(* | gemv )(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const IndexType m, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
gemv performs one of the matrix-vector operations | |
void(* | symv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
symv performs the matrix vector operation | |
void(* | trmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
trmv performs one of the matrix-vector operations | |
void(* | trsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
trsv solves a system of equations | |
void(* | gbmv )(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const int m, const int n, const int kl, const int ku, const T alpha, const T *A, const int lda, const T *x, const int incX, const T beta, T *y, const int incY, SyncToken *syncToken) |
gbmv performs one of the matrix-vector operations | |
void(* | sbmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const IndexType k, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
sbmv performs the matrix-vector operation | |
void(* | tbmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
tbmv performs one of the matrix-vector operations | |
void(* | tbsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
tbsv solves one of the systems of equations | |
void(* | ger )(const enum CBLAS_ORDER order, const IndexType m, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken) |
ger performs the general rank 1 operation | |
void(* | syr )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *A, const IndexType lda, SyncToken *syncToken) |
syr performs the symmetric rank 1 operation | |
void(* | syr2 )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken) |
syr2 performs the symmetric rank 2 operation | |
void(* | spmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *AP, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
spmv performs the matrix-vector operation | |
void(* | spr )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *AP, SyncToken *syncToken) |
spr performs the symmetric rank 1 operation | |
void(* | spr2 )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *AP, SyncToken *syncToken) |
spr2 performs the symmetric rank 2 operation | |
void(* | tpmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *AP, T *x, const IndexType incX, SyncToken *syncToken) |
tpmv performs one of the matrix-vector operations | |
void(* | tpsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *Ap, T *x, const IndexType incX, SyncToken *syncToken) |
tpsv solves one of the systems of equations |
lama::BLAS2Interface< T >::BLAS2Interface | ( | ) |
For future versions: left functions for hermitian matrices - hemv, hbmv, her, her2.
Default constructor, initializes variables with NULL
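The data fields of this class are function pointers that start out as NULL, so a caller has to make sure a pointer was filled in before calling through it. The following is a minimal sketch of that pattern. MiniBlas2Table, scaleLoop and scaleIfRegistered are hypothetical names invented for illustration; they are not part of LAMA, and how LAMA backends actually register their routines is an assumption here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a function-pointer table.
// It is NOT the LAMA class; all names below are illustrative only.
template <typename T>
struct MiniBlas2Table {
    // One data field of function-pointer type, analogous in spirit to gemv, symv, ...
    void (*scale)(int n, T alpha, T* x);

    // Mirrors the documented behaviour: pointers start out as NULL.
    MiniBlas2Table() : scale(NULL) {}
};

// A plain loop that a backend could register in the table.
template <typename T>
void scaleLoop(int n, T alpha, T* x) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        x[i] *= alpha;
    }
}

// Calls the registered routine if there is one and reports whether it ran.
template <typename T>
bool scaleIfRegistered(const MiniBlas2Table<T>& table, int n, T alpha, T* x) {
    if (table.scale == NULL) {
        return false;  // nothing registered yet
    }
    table.scale(n, alpha, x);
    return true;
}
```

A backend would assign its routine with something like `table.scale = &scaleLoop<double>;` before any call is made.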
void(* lama::BLAS2Interface< T >::gbmv)(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const int m, const int n, const int kl, const int ku, const T alpha, const T *A, const int lda, const T *x, const int incX, const T beta, T *y, const int incY, SyncToken *syncToken) |
gbmv performs one of the matrix-vector operations
y = alpha * op(A)* x + beta * y where op(A) = A or op(A) = AT
alpha and beta are scalars, and x and y are vectors. A is an m×n band matrix consisting of elements with kl subdiagonals and ku superdiagonals.
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | m | the number of rows of matrix A; m must be at least zero |
[in] | n | the number of columns of matrix A; n must be at least zero |
[in] | kl | the number of subdiagonals of matrix A; kl must be at least zero |
[in] | ku | the number of superdiagonals of matrix A; ku must be at least zero |
[in] | alpha | scalar multiplier applied to op(A) |
[in] | A | array of dimensions (lda,n). The leading part of array A must contain the band matrix A, supplied column by column, with the leading diagonal of the matrix in row ku+1 of the array, the first superdiagonal starting at position 2 in row ku, the first subdiagonal starting at position 1 in row ku+2, and so on. Elements in the array A that do not correspond to elements in the band matrix (such as the top left ku×ku triangle) are not referenced. |
[in] | lda | leading dimension of A; lda must be at least kl + ku + 1 |
[in] | x | array of length at least (1 + (n - 1) * abs(incX)) when trans == 'N' or 'n' and at least (1 + ( m - 1) * abs(incX)) otherwise |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | beta | scalar multiplier applied to vector y. If beta is zero, y is not read |
[in] | y | array of length at least (1 + (m - 1) * abs(incY)) when trans == 'N' or 'n' and at least (1 + ( n - 1) * abs(incY)) otherwise |
[out] | y | updated according to y = alpha * op(A) * x + beta * y |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
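The band layout described for A can be easier to follow with zero-based indices: the diagonal sits in array row ku, so element A(i,j) lands in array row ku + i - j of column j. The sketch below applies that mapping in a reference loop for the non-transposed case with unit strides. bandOffset and gbmvNoTrans are hypothetical helper names, not LAMA functions, and the sketch makes no claim about how the registered backends implement gbmv.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: zero-based offset of A(i,j) in column-major band storage.
inline std::size_t bandOffset(int i, int j, int ku, int lda) {
    return static_cast<std::size_t>(ku + i - j) +
           static_cast<std::size_t>(j) * static_cast<std::size_t>(lda);
}

// Reference loop for y = alpha * A * x + beta * y (no transpose, unit strides).
template <typename T>
void gbmvNoTrans(int m, int n, int kl, int ku, T alpha,
                 const std::vector<T>& band, int lda,
                 const std::vector<T>& x, T beta, std::vector<T>& y) {
    assert(m >= 0 && n >= 0 && kl >= 0 && ku >= 0);
    assert(lda >= kl + ku + 1);
    for (int i = 0; i < m; ++i) {
        T sum = T(0);
        // Only columns inside the band contribute to row i.
        const int jLo = std::max(0, i - kl);
        const int jHi = std::min(n - 1, i + ku);
        for (int j = jLo; j <= jHi; ++j) {
            sum += band[bandOffset(i, j, ku, lda)] * x[j];
        }
        // When beta is zero, y is not read.
        y[i] = (beta == T(0)) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

For a 3×3 tridiagonal matrix (kl = ku = 1, lda = 3), each column of the band array holds the superdiagonal entry, the diagonal entry and the subdiagonal entry, in that order.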
void( * lama::BLAS2Interface< T >::gemv)(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const IndexType m, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
gemv performs one of the matrix-vector operations
y = alpha * op(A) * x + beta * y where op(A) = A or op(A) = AT
alpha and beta are scalars, and x and y are vectors. A is an m×n matrix consisting of elements. Matrix A is stored in column-major format, and lda is the leading dimension of the two dimensional array in which A is stored.
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | m | the number of rows of matrix A; m must be at least zero |
[in] | n | the number of columns of matrix A; n must be at least zero |
[in] | alpha | scalar multiplier applied to op(A) |
[in] | A | array of dimensions (lda, n) if trans == 'N' or 'n', and of dimensions (lda, m) otherwise; lda must be at least max(1,m) if trans == 'N' or 'n' and at least max(1,n) otherwise |
[in] | lda | leading dimension of two-dimensional array used to store matrix A. |
[in] | x | array of length at least (1 + (n - 1) * abs(incX)) when trans == 'N' or 'n' else at least (1 + ( m - 1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | beta | scalar multiplier applied to vector y. If beta is zero, y is not read |
[in] | y | array of length at least (1 + (m - 1) * abs(incY)) when trans == 'N' or 'n' else at least (1 + ( n - 1) * abs(incY)) |
[in] | incY | the storage spacing between elements of y; incY must not be zero. |
[out] | y | updated according to y = alpha * op(A) * x + beta * y |
[in] | syncToken | allows asynchronous execution to be started |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
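As a rough illustration of the operation and of the vector length rule 1 + (n - 1) * abs(inc), here is a reference loop for a column-major m×n array. It is a sketch under stated assumptions: gemvReference and minVectorLength are hypothetical names, the array is always stored as m×n with lda at least max(1,m), only positive increments are handled, and the order and syncToken arguments of the real field are left out.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: minimum array length for an n-element strided vector.
inline int minVectorLength(int n, int inc) {
    assert(n >= 1 && inc != 0);
    return 1 + (n - 1) * std::abs(inc);
}

// Reference loop for y = alpha * op(A) * x + beta * y on a column-major
// m x n array A. Assumption of this sketch: positive increments only.
template <typename T>
void gemvReference(bool transpose, int m, int n, T alpha, const T* A, int lda,
                   const T* x, int incX, T beta, T* y, int incY) {
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    assert(incX > 0 && incY > 0);
    const int lenX = transpose ? m : n;  // length of the input vector
    const int lenY = transpose ? n : m;  // length of the output vector
    for (int r = 0; r < lenY; ++r) {
        T sum = T(0);
        for (int c = 0; c < lenX; ++c) {
            // op(A)(r,c) is A(r,c) without transpose and A(c,r) with it.
            const T a = transpose ? A[c + r * lda] : A[r + c * lda];
            sum += a * x[c * incX];
        }
        T& out = y[r * incY];
        // When beta is zero, y is not read.
        out = (beta == T(0)) ? alpha * sum : alpha * sum + beta * out;
    }
}
```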
void(* lama::BLAS2Interface< T >::ger)(const enum CBLAS_ORDER order, const IndexType m, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken) |
ger performs the general rank 1 operation
A = alpha * x * yT + A
where alpha is a scalar, x is an m-element vector, y is an n-element vector, and A is an m×n matrix consisting of elements. Matrix A is stored in column-major format, and lda is the leading dimension of the two-dimensional array used to store A.
[in] | m | specifies the number of rows of matrix A; m must be at least zero |
[in] | n | specifies the number of columns of matrix A; n must be at least zero |
[in] | alpha | scalar multiplier applied to x * yT |
[in] | x | array of length at least (1 + (m - 1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | y | array of length at least (1 + (n - 1) * abs(incY)) |
[in] | incY | the storage spacing between elements of y; incY must not be zero. |
[in] | A | array of dimensions (lda,n) |
[in] | lda | leading dimension of two-dimensional array used to store matrix A. |
[out] | A | updated according to A = alpha * x * yT + A |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
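The rank 1 update touches every entry of the m×n array once. A minimal reference loop, assuming column-major storage and positive increments, could look like the sketch below. gerReference is a hypothetical name, and the order and syncToken arguments of the real field are omitted.

```cpp
#include <cassert>

// Reference loop for A = alpha * x * y^T + A on a column-major m x n array.
// Assumption of this sketch: positive increments only.
template <typename T>
void gerReference(int m, int n, T alpha, const T* x, int incX,
                  const T* y, int incY, T* A, int lda) {
    assert(m >= 0 && n >= 0 && lda >= (m > 1 ? m : 1));
    assert(incX > 0 && incY > 0);
    for (int j = 0; j < n; ++j) {
        const T scaledY = alpha * y[j * incY];
        for (int i = 0; i < m; ++i) {
            // A(i,j) lives at offset i + j * lda.
            A[i + j * lda] += x[i * incX] * scaledY;
        }
    }
}
```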
void(* lama::BLAS2Interface< T >::sbmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const IndexType k, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
sbmv performs the matrix-vector operation
y = alpha * A * x + beta * y
where alpha and beta are scalars, and x and y are n-element vectors. A is an n×n symmetric band matrix with k superdiagonals and the same number of subdiagonals.
[in] | uplo | specifies whether the upper or lower triangular part of the symmetric band matrix A is being supplied. If uplo == 'U' or 'u', the upper triangular part is being supplied. If uplo == 'L' or 'l', the lower triangular part is being supplied. |
[in] | n | specifies the number of rows and the number of columns of the symmetric matrix A; n must be at least zero. |
[in] | k | specifies the number of superdiagonals of matrix A. Since the matrix is symmetric, this is also the number of subdiagonals; k must be at least zero. |
[in] | alpha | scalar multiplier applied to A * x |
[in] | A | array of dimensions (lda, n). When uplo == 'U' or 'u', the leading (k+1)×n part of array A must contain the upper triangular band of the symmetric matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. When uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band part of the symmetric matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array A is not referenced. |
[in] | lda | leading dimension of A; lda must be at least k+1 |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | beta | scalar multiplier applied to vector y. If beta is zero, y is not read |
[in] | y | array of length at least (1 + (n - 1) * abs(incY)). If beta is zero, y is not read. |
[in] | incY | storage spacing between elements of y; incY must not be zero. |
[out] | y | updated according to y = alpha * A * x + beta * y |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
void(* lama::BLAS2Interface< T >::spmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *AP, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
spmv performs the matrix-vector operation
y = alpha * A * x + beta * y
where alpha and beta are scalars, and x and y are n‐element vectors. A is a symmetric n×n matrix that consists of elements and is supplied in packed form.
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array AP. If uplo == 'U' or 'u', the upper triangular part of A is supplied in AP. If uplo == 'L' or 'l', the lower triangular part of A is supplied in AP. |
[in] | n | the number of rows and columns of matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to A * x |
[in] | AP | array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i <= j, A[i,j] is stored in AP[i+(j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i >= j, A[i,j] is stored in AP[i+((2*n-j-1)*j)/2]. Both formulas use zero-based i and j. |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | beta | scalar multiplier applied to vector y. If beta is zero, y is not read |
[in] | y | array of length at least (1 + (n - 1) * abs(incY)). If beta is zero, y is not read. |
[in] | incY | the storage spacing between elements of y; incY must not be zero. |
[out] | y | updated according to y = alpha * A * x + beta * y |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
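Packed storage keeps only one triangle, column by column, so each element's offset follows from counting the entries in the columns before it. The sketch below derives both offsets for zero-based i and j and uses the upper one in a reference matrix-vector product. packedUpper, packedLower and spmvUpper are hypothetical helper names, not LAMA functions. The lower offset is this sketch's own derivation: column j starts after n + (n-1) + ... + (n-j+1) earlier entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of A(i,j), i <= j, in upper packed storage (zero-based).
inline int packedUpper(int i, int j) {
    assert(0 <= i && i <= j);
    return i + (j * (j + 1)) / 2;
}

// Offset of A(i,j), i >= j, in lower packed storage (zero-based).
// Column j starts at j*n - j*(j-1)/2; the element sits (i - j) below that.
inline int packedLower(int i, int j, int n) {
    assert(0 <= j && j <= i && i < n);
    return i + ((2 * n - j - 1) * j) / 2;
}

// Reference product: returns alpha * A * x + beta * y for upper packed A.
template <typename T>
std::vector<T> spmvUpper(int n, T alpha, const std::vector<T>& AP,
                         const std::vector<T>& x, T beta, const std::vector<T>& y) {
    assert(n >= 0);
    std::vector<T> prod(static_cast<std::size_t>(n), T(0));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            const T a = AP[packedUpper(i, j)];
            prod[i] += a * x[j];
            if (i != j) {
                prod[j] += a * x[i];  // mirrored entry A(j,i) = A(i,j)
            }
        }
    }
    for (int i = 0; i < n; ++i) {
        // When beta is zero, y is not read.
        prod[i] = (beta == T(0)) ? alpha * prod[i] : alpha * prod[i] + beta * y[i];
    }
    return prod;
}
```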
void(* lama::BLAS2Interface< T >::spr)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *AP, SyncToken *syncToken) |
spr performs the symmetric rank 1 operation
A = alpha * x * xT + A
where alpha is a scalar, and x is an n‐element vector. A is a symmetric n×n matrix that consists of elements and is supplied in packed form
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array AP. If uplo == 'U' or 'u', the upper triangular part of A is supplied in AP. If uplo == 'L' or 'l', the lower triangular part of A is supplied in AP. |
[in] | n | the number of rows and columns of matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to x * xT |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | AP | array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i <= j, A[i,j] is stored in AP[i+(j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i >= j, A[i,j] is stored in AP[i+((2*n-j-1)*j)/2]. Both formulas use zero-based i and j. |
[out] | A | updated according to A = alpha * x * xT + A |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
void(* lama::BLAS2Interface< T >::spr2)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *AP, SyncToken *syncToken) |
spr2 performs the symmetric rank 2 operation
A = alpha * x * yT + alpha * y * xT + A
where alpha is a scalar, and x and y are n‐element vectors. A is a symmetric n×n matrix that consists of elements and is supplied in packed form
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', the upper triangular part of A may be referenced and the lower triangular part of A is inferred. If uplo == 'L' or 'l', the lower triangular part of A may be referenced and the upper triangular part of A is inferred. |
[in] | n | the number of rows and columns of matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to x * yT + y * xT |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | AP | array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i <= j, A[i,j] is stored in AP[i+(j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, if i >= j, A[i,j] is stored in AP[i+((2*n-j-1)*j)/2]. Both formulas use zero-based i and j. |
[out] | A | updated according to A = alpha * x * yT + alpha * y * xT + A |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
void(* lama::BLAS2Interface< T >::symv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken) |
symv performs the matrix vector operation
y = alpha * A * x + beta * y
where alpha and beta are scalars, and x and y are n-element vectors. A is a symmetric n×n matrix that is stored in either upper or lower storage mode.
[in] | uplo | specifies whether the upper or lower triangular part of the array A is referenced. If uplo == 'U' or 'u', the symmetric matrix A is stored in upper storage mode; only the upper triangular part of A is referenced while the lower triangular part of A is inferred. If uplo == 'L' or 'l', the symmetric matrix A is stored in lower storage mode; only the lower triangular part of A is referenced while the upper triangular part of A is inferred. |
[in] | n | specifies the number of rows and the number of columns of the symmetric matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to A * x |
[in] | A | array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix, and the strictly upper triangular part of A is not referenced. |
[in] | lda | leading dimension of A; lda must be at least max(1,n). |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | beta | scalar multiplier applied to the vector y. |
[in] | y | array of length at least ( 1 + (n-1) * abs(incY)). If beta is zero, y is not read. |
[in] | incY | storage spacing between elements of y; incY must not be zero. |
[out] | y | updated according to y = alpha * A * x + beta * y |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
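The phrase "only the lower triangular part of A is referenced while the upper triangular part is inferred" means that an entry above the diagonal is read from its mirror position below it. A small reference loop for lower storage mode, assuming column-major storage and unit strides, is sketched here. symvLower is a hypothetical name, not a LAMA function.

```cpp
#include <cassert>

// Reference loop for y = alpha * A * x + beta * y, reading only the
// lower triangle of a column-major n x n array (unit strides).
template <typename T>
void symvLower(int n, T alpha, const T* A, int lda, const T* x, T beta, T* y) {
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int i = 0; i < n; ++i) {
        T sum = T(0);
        for (int j = 0; j < n; ++j) {
            // Entries with i < j are inferred from the mirrored entry A(j,i).
            const T a = (i >= j) ? A[i + j * lda] : A[j + i * lda];
            sum += a * x[j];
        }
        // When beta is zero, y is not read.
        y[i] = (beta == T(0)) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

Because the strictly upper triangle is never read, it may hold arbitrary values.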
void(* lama::BLAS2Interface< T >::syr)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *A, const IndexType lda, SyncToken *syncToken) |
TODO: need geru ?!
syr performs the symmetric rank 1 operation
A = alpha * x * xT + A
where alpha is a scalar, x is an n-element vector, and A is an n×n symmetric matrix consisting of elements. A is stored in column-major format, and lda is the leading dimension of the two-dimensional array containing A.
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A is referenced. If uplo == 'L' or 'l', only the lower triangular part of A is referenced. |
[in] | n | the number of rows and columns of matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to x * xT |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | A | array of dimensions (lda, n). If uplo == 'U' or 'u', A contains the upper triangular part of the symmetric matrix, and the strictly lower triangular part is not referenced. If uplo == 'L' or 'l', A contains the lower triangular part of the symmetric matrix, and the strictly upper triangular part is not referenced. |
[in] | lda | leading dimension of the two-dimensional array containing A; lda must be at least max(1,n). |
[out] | A | updated according to A = alpha * x * xT + A |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
void(* lama::BLAS2Interface< T >::syr2)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken) |
syr2 performs the symmetric rank 2 operation
A = alpha * x * yT + alpha * y * xT + A,
where alpha is a scalar, x and y are n‐element vectors, A is an n×n symmetric matrix consisting of elements.
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A is referenced and the lower triangular part of A is inferred. If uplo == 'L' or 'l', only the lower triangular part of A is referenced and the upper triangular part of A is inferred. |
[in] | n | the number of rows and columns of matrix A; n must be at least zero. |
[in] | alpha | scalar multiplier applied to x * yT + y * xT |
[in] | x | array of length at least ( 1 + (n-1) * abs(incX)) |
[in] | incX | storage spacing between elements of x; incX must not be zero |
[in] | y | array of length at least ( 1 + (n-1) * abs(incY)) |
[in] | incY | storage spacing between elements of y; incY must not be zero. |
[in] | A | array of dimensions (lda, n). If uplo == 'U' or 'u', A contains the upper triangular part of the symmetric matrix, and the strictly lower triangular part is not referenced. If uplo == 'L' or 'l', A contains the lower triangular part of the symmetric matrix, and the strictly upper triangular part is not referenced. |
[in] | lda | leading dimension of A; lda must be at least max(1,n). |
[out] | A | updated according to A = alpha * x * yT + alpha * y * xT + A |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
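For the symmetric rank 2 update, only the triangle selected by uplo is written; the other triangle stays as it was. The sketch below shows this for the upper triangle with column-major storage and unit strides. syr2Upper is a hypothetical name used for illustration only.

```cpp
#include <cassert>

// Reference loop for A = alpha * x * y^T + alpha * y * x^T + A that
// updates only the upper triangle of a column-major n x n array.
template <typename T>
void syr2Upper(int n, T alpha, const T* x, const T* y, T* A, int lda) {
    assert(n >= 0 && lda >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            // The strictly lower triangle (i > j) is never touched.
            A[i + j * lda] += alpha * (x[i] * y[j] + y[i] * x[j]);
        }
    }
}
```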
void(* lama::BLAS2Interface< T >::tbmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
tbmv performs one of the matrix-vector operations
x = op(A) * x where op(A) = A or op(A) = AT
x is an n‐element vector, A is an n×n, unit or nonunit, upper or lower, triangular band matrix consisting of elements.
[in] | uplo | specifies whether the matrix A is an upper or lower triangular band matrix. If uplo == 'U' or 'u', A is an upper triangular band matrix. If uplo == 'L' or 'l', A is a lower triangular band matrix. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | diag | specifies whether or not matrix A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | k | specifies the number of superdiagonals or subdiagonals. If uplo == 'U' or 'u', k specifies the number of superdiagonals. If uplo == 'L' or 'l', k specifies the number of subdiagonals; k must be at least zero. |
[in] | A | array of dimension (lda, n). If uplo == 'U' or 'u', the leading (k+1)×n part of the array A must contain the upper triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. If uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array is not referenced. |
[in] | lda | leading dimension of A; lda must be at least k+1. |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated according to x = op(A) * x. |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
void(* lama::BLAS2Interface< T >::tbsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
tbsv solves one of the systems of equations
op(A) * x = b where op(A) = A or op(A) = AT
b and x are n‐element vectors, A is an n×n, unit or non‐unit, upper or lower, triangular band matrix with k+1 diagonals. No test for singularity or near‐singularity is included in this function. Such tests must be performed before calling this function.
[in] | uplo | specifies whether the matrix A is an upper or lower triangular band matrix. If uplo == 'U' or 'u', A is an upper triangular band matrix. If uplo == 'L' or 'l', A is a lower triangular band matrix. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | diag | specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular; that is, diagonal elements are not read and are assumed to be unity. If diag == 'N' or 'n', A is not assumed to be unit triangular. |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | k | specifies the number of superdiagonals or subdiagonals. If uplo == 'U' or 'u', k specifies the number of superdiagonals. If uplo == 'L' or 'l', k specifies the number of subdiagonals; k must be at least zero. |
[in] | A | array of dimension (lda, n). If uplo == 'U' or 'u', the leading (k+1)×n part of the array A must contain the upper triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. If uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array is not referenced. |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the n-element right-hand side vector b. On exit, it is overwritten with the solution vector x. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated to contain the solution vector x that solves op(A) * x = b. |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
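For a lower triangular band matrix, the solve is a forward substitution in which each row only looks back at most k entries. The sketch below uses the lower band layout from the parameter table with zero-based indices, so A(i,j) for i >= j sits in array row i - j of column j. tbsvLowerNoTrans is a hypothetical name, only the non-transposed case with unit stride is covered, and, as noted above, no singularity test is made.

```cpp
#include <cassert>

// Forward substitution for A * x = b with a lower triangular band matrix.
// On entry x holds b; on exit it holds the solution.
template <typename T>
void tbsvLowerNoTrans(bool unitDiag, int n, int k, const T* band, int lda, T* x) {
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    for (int i = 0; i < n; ++i) {
        T sum = x[i];
        const int jLo = (i - k > 0) ? i - k : 0;  // first column inside the band
        for (int j = jLo; j < i; ++j) {
            sum -= band[(i - j) + j * lda] * x[j];
        }
        // With a unit diagonal, the stored diagonal entries are not read.
        x[i] = unitDiag ? sum : sum / band[i * lda];
    }
}
```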
void(* lama::BLAS2Interface< T >::tpmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *AP, T *x, const IndexType incX, SyncToken *syncToken) |
tpmv performs one of the matrix-vector operations
x = op(A) * x, where op(A) = A or op(A) = AT
x is an n‐element vector, A is an n×n, unit or nonunit, upper or lower, triangular matrix consisting of elements.
[in] | uplo | specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | diag | specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular; If diag == 'N' or 'n', A is not assumed to be unit triangular. |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | AP | array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular matrix A, packed sequentially, column by column; that is, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular matrix A, packed sequentially, column by column; that is, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2]. Both formulas use zero-based i and j. |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated according to x = op(A)*x |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
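Because x is overwritten in place, the loop order matters: for an upper triangular matrix, row i only needs entries x[i], x[i+1], ..., so processing rows in ascending order never reads a value that was already overwritten. A minimal sketch for the upper, non-transposed case with unit stride follows. tpmvUpperNoTrans is a hypothetical name, and the packed offset assumes zero-based indices.

```cpp
#include <cassert>

// In-place x = A * x for an upper triangular matrix in packed storage.
template <typename T>
void tpmvUpperNoTrans(bool unitDiag, int n, const T* AP, T* x) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        // With a unit diagonal, the stored diagonal entry is not read.
        T sum = unitDiag ? x[i] : AP[i + (i * (i + 1)) / 2] * x[i];
        for (int j = i + 1; j < n; ++j) {
            sum += AP[i + (j * (j + 1)) / 2] * x[j];  // A(i,j), i < j
        }
        x[i] = sum;  // safe: later rows never read x[i]
    }
}
```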
void(* lama::BLAS2Interface< T >::tpsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *Ap, T *x, const IndexType incX, SyncToken *syncToken) |
tpsv solves one of the systems of equations
op(A) * x = b where op(A) = A or op(A) = AT
b and x are n‐element vectors, A is an n×n, unit or non‐unit, upper or lower, triangular matrix. No test for singularity or near‐singularity is included in this function. Such tests must be performed before calling this function.
[in] | uplo | specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C', or 'c', op(A) = AT. |
[in] | diag | specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular; diagonal elements are not read and are assumed to be unity. If diag == 'N' or 'n', A is not assumed to be unit triangular. |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | AP | array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular matrix A, packed sequentially, column by column; that is, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular matrix A, packed sequentially, column by column; that is, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2]. Both formulas use zero-based i and j. When diag == 'U' or 'u', the diagonal elements of A are not referenced and are assumed to be unity. |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the n-element right-hand side vector b. On exit, it is overwritten with the solution vector x. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated according to op(A) * x = b |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
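As an illustration of the packed layout and the solve described for tpsv, the following is a minimal reference sketch. It is not LAMA's implementation. The names packedUpperIndex, packedLowerIndex and packedTriangularSolve are hypothetical, indices are zero-based, and the sketch assumes op(A) = A and incX == 1. The lower-triangle offset is derived by counting the elements in the preceding columns: column j starts at offset j*n - j*(j-1)/2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of A[i,j] (i <= j) in column-by-column packed upper storage.
inline std::size_t packedUpperIndex(std::size_t i, std::size_t j)
{
    assert(i <= j);
    return i + (j * (j + 1)) / 2;
}

// Offset of A[i,j] (i >= j) in column-by-column packed lower storage.
inline std::size_t packedLowerIndex(std::size_t n, std::size_t i, std::size_t j)
{
    assert(j <= i && i < n);
    return i + ((2 * n - j - 1) * j) / 2;
}

// Solves A * x = b in place: on entry x holds b, on exit the solution.
// No singularity test is made, matching the documented contract.
template <typename T>
void packedTriangularSolve(bool upper, bool unitDiag, std::size_t n,
                           const std::vector<T>& ap, std::vector<T>& x)
{
    assert(ap.size() >= n * (n + 1) / 2);
    assert(x.size() >= n);
    if (upper)
    {
        // Back substitution, column-oriented: last unknown first.
        for (std::size_t j = n; j-- > 0;)
        {
            if (!unitDiag)
            {
                x[j] /= ap[packedUpperIndex(j, j)];
            }
            for (std::size_t i = 0; i < j; ++i)
            {
                x[i] -= x[j] * ap[packedUpperIndex(i, j)];
            }
        }
    }
    else
    {
        // Forward substitution, column-oriented: first unknown first.
        for (std::size_t j = 0; j < n; ++j)
        {
            if (!unitDiag)
            {
                x[j] /= ap[packedLowerIndex(n, j, j)];
            }
            for (std::size_t i = j + 1; i < n; ++i)
            {
                x[i] -= x[j] * ap[packedLowerIndex(n, i, j)];
            }
        }
    }
}
```

For n = 3, the lower-packed order is A[0,0], A[1,0], A[2,0], A[1,1], A[2,1], A[2,2], so the last diagonal element lands at offset 5, the final slot of the (n*(n+1))/2 = 6 element array.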
void(* lama::BLAS2Interface< T >::trmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
trmv performs one of the matrix-vector operations
x = op(A) * x, where op(A) = A or op(A) = A^T
x is an n‐element vector, A is an n×n, unit or non‐unit, upper or lower, triangular matrix with elements of type T.
[in] | uplo | specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T. |
[in] | diag | specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular. |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | A | array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular matrix, and the strictly upper triangular part of A is not referenced. When diag == 'U' or 'u', the diagonal elements of A are not referenced either, but are assumed to be unity. |
[in] | lda | leading dimension of A; lda must be at least max(1,n). |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated according to x = op(A) * x. |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
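The in-place update x = A * x described for trmv can be sketched as below. This is a reference illustration, not LAMA's implementation. The name referenceTrmv is hypothetical, and the sketch assumes column-major storage, op(A) = A and a positive incX. The traversal order is chosen so that no element of x is overwritten before it has been read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes x = A * x in place for a triangular A stored column-major
// with leading dimension lda. Only the selected triangle is read; with
// unitDiag the diagonal is not read either and is taken to be 1.
template <typename T>
void referenceTrmv(bool upper, bool unitDiag, std::size_t n,
                   const std::vector<T>& a, std::size_t lda,
                   std::vector<T>& x, std::size_t incX)
{
    assert(lda >= (n > 0 ? n : 1));
    assert(incX > 0); // assumption of this sketch
    if (n == 0)
    {
        return;
    }
    assert(a.size() >= lda * (n - 1) + n);
    assert(x.size() >= 1 + (n - 1) * incX);
    for (std::size_t step = 0; step < n; ++step)
    {
        // Upper: rows top to bottom (row i needs x[j] for j >= i).
        // Lower: rows bottom to top (row i needs x[j] for j <= i).
        const std::size_t i = upper ? step : n - 1 - step;
        T sum = unitDiag ? x[i * incX] : a[i + i * lda] * x[i * incX];
        const std::size_t jBegin = upper ? i + 1 : 0;
        const std::size_t jEnd = upper ? n : i;
        for (std::size_t j = jBegin; j < jEnd; ++j)
        {
            sum += a[i + j * lda] * x[j * incX];
        }
        x[i * incX] = sum;
    }
}
```

In a usage such as a = {2, 99, 1, 4} with lda = 2 and upper = true, the value 99 sits in the strictly lower triangle and never influences the result.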
void(* lama::BLAS2Interface< T >::trsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken) |
trsv solves a system of equations
op(A) * x = b, where op(A) = A or op(A) = A^T
b and x are n‐element vectors, A is an n×n, unit or non‐unit, upper or lower, triangular matrix with elements of type T. Matrix A is stored in column‐major format, lda is the leading dimension of the two‐dimensional array containing A.
No test for singularity or near‐singularity is included in this function. Such tests must be performed before calling this function.
[in] | uplo | specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A may be referenced. If uplo == 'L' or 'l', only the lower triangular part of A may be referenced. |
[in] | trans | specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T. |
[in] | diag | specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular. |
[in] | n | specifies the number of rows and columns of matrix A; n must be at least zero. |
[in] | A | array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular matrix, and the strictly upper triangular part of A is not referenced. |
[in] | lda | leading dimension of A; lda must be at least max(1,n). |
[in] | x | array of length at least (1 + (n-1)* abs(incX)). On entry, x contains the n-element, right-hand-side vector b. On exit, it is overwritten with the solution vector x. |
[in] | incX | specifies the storage spacing for elements of x; incX must not be zero. |
[out] | x | updated according to op(A) * x = b |
Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
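Because trsv performs no singularity test, a caller may want to inspect the diagonal first. The sketch below pairs such a check with a reference solve of op(A) * x = b. It is an illustration only, not LAMA's implementation. The names hasZeroOnDiagonal and referenceTrsv are hypothetical, storage is assumed column-major with incX == 1, and the check detects exact zeros only (a near-singularity test would need a tolerance chosen by the caller).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if any diagonal element of A is exactly zero, in which
// case a non-unit triangular solve would divide by zero.
template <typename T>
bool hasZeroOnDiagonal(std::size_t n, const std::vector<T>& a, std::size_t lda)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        if (a[i + i * lda] == T(0))
        {
            return true;
        }
    }
    return false;
}

// Solves op(A) * x = b in place: on entry x holds b, on exit the solution.
// 'upper' says which triangle of the array holds A; 'transpose' selects
// op(A) = A^T. Only the stored triangle is ever read.
template <typename T>
void referenceTrsv(bool upper, bool transpose, bool unitDiag, std::size_t n,
                   const std::vector<T>& a, std::size_t lda, std::vector<T>& x)
{
    assert(lda >= (n > 0 ? n : 1));
    assert(x.size() >= n);
    // Element (i, j) of op(A), read from the column-major array.
    auto opA = [&](std::size_t i, std::size_t j) {
        return transpose ? a[j + i * lda] : a[i + j * lda];
    };
    // Transposing an upper triangular matrix yields a lower one and vice versa.
    const bool opUpper = (upper != transpose);
    for (std::size_t step = 0; step < n; ++step)
    {
        // Upper op(A): back substitution; lower op(A): forward substitution.
        const std::size_t i = opUpper ? n - 1 - step : step;
        T sum = x[i];
        const std::size_t jBegin = opUpper ? i + 1 : 0;
        const std::size_t jEnd = opUpper ? n : i;
        for (std::size_t j = jBegin; j < jEnd; ++j)
        {
            sum -= opA(i, j) * x[j];
        }
        x[i] = unitDiag ? sum : sum / opA(i, i);
    }
}
```

A caller solving with a non-unit diagonal would run hasZeroOnDiagonal first and skip the solve when it returns true. With a unit diagonal the check is unnecessary, since the diagonal is never read.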