LAMA
lama::BLAS2Interface< T > Struct Template Reference

#include <LAMAInterface.hpp>

Public Member Functions

 BLAS2Interface ()
 Left for future versions: the functions for Hermitian matrices (hemv, hbmv, her, her2).

Data Fields

void(* gemv )(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const IndexType m, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)
 gemv performs one of the matrix-vector operations
void(* symv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)
 symv performs the matrix-vector operation
void(* trmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)
 trmv performs one of the matrix-vector operations
void(* trsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)
 trsv solves a system of equations
void(* gbmv )(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const int m, const int n, const int kl, const int ku, const T alpha, const T *A, const int lda, const T *x, const int incX, const T beta, T *y, const int incY, SyncToken *syncToken)
 gbmv performs one of the matrix-vector operations
void(* sbmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const IndexType k, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)
 sbmv performs the matrix-vector operation
void(* tbmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)
 tbmv performs one of the matrix-vector operations
void(* tbsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)
 tbsv solves one of the systems of equations
void(* ger )(const enum CBLAS_ORDER order, const IndexType m, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken)
 ger performs the rank 1 operation
void(* syr )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *A, const IndexType lda, SyncToken *syncToken)
 syr performs the symmetric rank 1 operation
void(* syr2 )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken)
 syr2 performs the symmetric rank 2 operation
void(* spmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *AP, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)
 spmv performs the matrix-vector operation
void(* spr )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *AP, SyncToken *syncToken)
 spr performs the symmetric rank 1 operation
void(* spr2 )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *AP, SyncToken *syncToken)
 spr2 performs the symmetric rank 2 operation
void(* tpmv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *AP, T *x, const IndexType incX, SyncToken *syncToken)
 tpmv performs one of the matrix-vector operations
void(* tpsv )(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *Ap, T *x, const IndexType incX, SyncToken *syncToken)
 tpsv solves one of the systems of equations

template<typename T>
struct lama::BLAS2Interface< T >


Constructor & Destructor Documentation

template<typename T >
lama::BLAS2Interface< T >::BLAS2Interface ( )

Left for future versions: the functions for Hermitian matrices (hemv, hbmv, her, her2).

Default constructor; initializes all function pointers with NULL.
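
The struct acts as a table of function pointers that each backend fills in. The following is a minimal sketch of that pattern, not the LAMA implementation. It assumes stand-in definitions for IndexType, SyncToken and CBLAS_ORDER (the real ones come from the LAMA and CBLAS headers), and MiniBLAS2Interface, hostGer and registerHost are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for types declared in the LAMA/CBLAS headers (assumptions for this sketch).
typedef int IndexType;
struct SyncToken;
enum CBLAS_ORDER { CblasRowMajor = 101, CblasColMajor = 102 };

// Hypothetical reduced interface: one function pointer instead of sixteen.
template<typename T>
struct MiniBLAS2Interface
{
    void ( *ger )( const enum CBLAS_ORDER order, const IndexType m, const IndexType n,
                   const T alpha, const T* x, const IndexType incX,
                   const T* y, const IndexType incY,
                   T* A, const IndexType lda, SyncToken* syncToken );

    // Mirrors the documented default constructor: every pointer starts as NULL.
    MiniBLAS2Interface() : ger( NULL ) {}
};

// Illustrative host routine: A = alpha * x * y^T + A, column-major only,
// positive increments only, executed synchronously (syncToken is ignored).
template<typename T>
void hostGer( const enum CBLAS_ORDER order, const IndexType m, const IndexType n,
              const T alpha, const T* x, const IndexType incX,
              const T* y, const IndexType incY,
              T* A, const IndexType lda, SyncToken* /* syncToken */ )
{
    assert( order == CblasColMajor && incX > 0 && incY > 0 && lda >= m );
    for ( IndexType j = 0; j < n; ++j )
    {
        for ( IndexType i = 0; i < m; ++i )
        {
            A[i + j * lda] += alpha * x[i * incX] * y[j * incY];
        }
    }
}

// A backend registers its routines by assigning the pointers.
template<typename T>
void registerHost( MiniBLAS2Interface<T>& blas2 )
{
    blas2.ger = &hostGer<T>;
}
```

A caller would construct the table, let a backend register itself, and check the pointer against NULL before dispatching through it, since unregistered entries stay NULL.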


Field Documentation

template<typename T>
void(* lama::BLAS2Interface< T >::gbmv)(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const int m, const int n, const int kl, const int ku, const T alpha, const T *A, const int lda, const T *x, const int incX, const T beta, T *y, const int incY, SyncToken *syncToken)

gbmv performs one of the matrix-vector operations

y = alpha * op(A) * x + beta * y, where op(A) = A or op(A) = A^T

alpha and beta are scalars, and x and y are vectors. A is an m×n band matrix with kl subdiagonals and ku superdiagonals.

Parameters:
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  m  the number of rows of matrix A; m must be at least zero
[in]  n  the number of columns of matrix A; n must be at least zero
[in]  kl  the number of subdiagonals of matrix A; kl must be at least zero
[in]  ku  the number of superdiagonals of matrix A; ku must be at least zero
[in]  alpha  scalar multiplier applied to op(A)
[in]  A  array of dimensions (lda, n). The leading (kl + ku + 1)×n part of array A must contain the band matrix A, supplied column by column, with the leading diagonal of the matrix in row ku+1 of the array, the first superdiagonal starting at position 2 in row ku, the first subdiagonal starting at position 1 in row ku+2, and so on. Elements in the array A that do not correspond to elements in the band matrix (such as the top left ku×ku triangle) are not referenced.
[in]  lda  leading dimension of A; lda must be at least kl + ku + 1
[in]  x  array of length at least (1 + (n - 1) * abs(incX)) when trans == 'N' or 'n' and at least (1 + (m - 1) * abs(incX)) otherwise
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  beta  scalar multiplier applied to vector y. If beta is zero, y is not read.
[in]  y  array of length at least (1 + (m - 1) * abs(incY)) when trans == 'N' or 'n' and at least (1 + (n - 1) * abs(incY)) otherwise
[in]  incY  storage spacing between elements of y; incY must not be zero
[out]  y  updated according to y = alpha * op(A) * x + beta * y

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
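
The band layout described for A is easiest to see with zero-based indices: element (i, j) of the matrix lives in row ku + i - j of column j of the array. The sketch below packs a dense column-major matrix into that layout and evaluates the op(A) = A case with unit increments. packBand and referenceGbmv are hypothetical helper names, IndexType is a stand-in typedef, and none of this is the backend code that the interface pointer refers to.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef int IndexType;  // stand-in for the LAMA index type (assumption)

// Copies the band of a dense column-major m x n matrix into band storage:
// element (i, j) goes to row ku + i - j of column j (zero-based).
template<typename T>
std::vector<T> packBand( const std::vector<T>& dense, IndexType m, IndexType n,
                         IndexType kl, IndexType ku )
{
    const IndexType lda = kl + ku + 1;
    std::vector<T> band( static_cast<std::size_t>( lda ) * n, T( 0 ) );
    for ( IndexType j = 0; j < n; ++j )
    {
        const IndexType first = std::max<IndexType>( 0, j - ku );
        const IndexType last = std::min<IndexType>( m - 1, j + kl );
        for ( IndexType i = first; i <= last; ++i )
        {
            band[( ku + i - j ) + j * lda] = dense[i + j * m];
        }
    }
    return band;
}

// Reference y = alpha * A * x + beta * y for op(A) = A and unit increments.
template<typename T>
void referenceGbmv( IndexType m, IndexType n, IndexType kl, IndexType ku, T alpha,
                    const T* A, IndexType lda, const T* x, T beta, T* y )
{
    assert( lda >= kl + ku + 1 );
    for ( IndexType i = 0; i < m; ++i )
    {
        // y is not read when beta is zero
        y[i] = ( beta == T( 0 ) ) ? T( 0 ) : beta * y[i];
    }
    for ( IndexType j = 0; j < n; ++j )
    {
        const IndexType first = std::max<IndexType>( 0, j - ku );
        const IndexType last = std::min<IndexType>( m - 1, j + kl );
        for ( IndexType i = first; i <= last; ++i )
        {
            y[i] += alpha * A[( ku + i - j ) + j * lda] * x[j];
        }
    }
}
```

Only the entries inside the band are touched, so the cost is proportional to n * (kl + ku + 1) rather than m * n.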

template<typename T>
void( * lama::BLAS2Interface< T >::gemv)(const enum CBLAS_ORDER order, const enum CBLAS_TRANSPOSE trans, const IndexType m, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)

gemv performs one of the matrix-vector operations

y = alpha * op(A) * x + beta * y, where op(A) = A or op(A) = A^T

alpha and beta are scalars, and x and y are vectors. A is an m×n matrix with elements of type T. Matrix A is stored in column-major format, and lda is the leading dimension of the two-dimensional array in which A is stored.

Parameters:
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  m  the number of rows of matrix A; m must be at least zero
[in]  n  the number of columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to op(A)
[in]  A  array of dimensions (lda, n); the leading m×n part must contain the matrix A
[in]  lda  leading dimension of the two-dimensional array used to store matrix A; for the column-major storage described above, lda must be at least max(1, m)
[in]  x  array of length at least (1 + (n - 1) * abs(incX)) when trans == 'N' or 'n', else at least (1 + (m - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  beta  scalar multiplier applied to vector y. If beta is zero, y is not read.
[in]  y  array of length at least (1 + (m - 1) * abs(incY)) when trans == 'N' or 'n', else at least (1 + (n - 1) * abs(incY))
[in]  incY  storage spacing between elements of y; incY must not be zero
[out]  y  updated according to y = alpha * op(A) * x + beta * y
[in]  syncToken  allows the operation to be started asynchronously

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
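
As a cross-check for the parameter rules above, here is a plain reference version of the operation for column-major storage. It is a sketch under stated assumptions: referenceGemv is a hypothetical name, IndexType and CBLAS_TRANSPOSE are stand-in definitions, T is a real type (so 'C' behaves like 'T'), and only positive increments are handled.

```cpp
#include <cassert>

typedef int IndexType;  // stand-in for the LAMA index type (assumption)
enum CBLAS_TRANSPOSE { CblasNoTrans = 111, CblasTrans = 112, CblasConjTrans = 113 };

// Reference y = alpha * op(A) * x + beta * y for a column-major m x n matrix A.
template<typename T>
void referenceGemv( CBLAS_TRANSPOSE trans, IndexType m, IndexType n, T alpha,
                    const T* A, IndexType lda, const T* x, IndexType incX,
                    T beta, T* y, IndexType incY )
{
    assert( lda >= ( m > 1 ? m : 1 ) && incX > 0 && incY > 0 );
    const bool noTrans = ( trans == CblasNoTrans );
    const IndexType lenY = noTrans ? m : n;  // length of the result vector
    const IndexType lenX = noTrans ? n : m;  // length of the input vector
    for ( IndexType r = 0; r < lenY; ++r )
    {
        T sum = T( 0 );
        for ( IndexType c = 0; c < lenX; ++c )
        {
            // op(A)(r, c) is A(r, c) without transposition and A(c, r) with it.
            const T a = noTrans ? A[r + c * lda] : A[c + r * lda];
            sum += a * x[c * incX];
        }
        T& out = y[r * incY];
        // y is not read when beta is zero
        out = ( beta == T( 0 ) ) ? alpha * sum : alpha * sum + beta * out;
    }
}
```

Note how the roles of m and n swap for x and y when op(A) = A^T, while the array A itself keeps its (lda, n) shape.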

template<typename T>
void(* lama::BLAS2Interface< T >::ger)(const enum CBLAS_ORDER order, const IndexType m, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken)

ger performs the rank 1 operation

A = alpha * x * y^T + A

where alpha is a scalar, x is an m-element vector, y is an n-element vector, and A is an m×n matrix with elements of type T. Matrix A is stored in column-major format, and lda is the leading dimension of the two-dimensional array used to store A.

Parameters:
[in]  m  specifies the number of rows of matrix A; m must be at least zero
[in]  n  specifies the number of columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to x * y^T
[in]  x  array of length at least (1 + (m - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  y  array of length at least (1 + (n - 1) * abs(incY))
[in]  incY  storage spacing between elements of y; incY must not be zero
[in]  A  array of dimensions (lda, n)
[in]  lda  leading dimension of the two-dimensional array used to store matrix A
[out]  A  updated according to A = alpha * x * y^T + A

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::sbmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const IndexType k, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)

sbmv performs the matrix-vector operation

y = alpha * A * x + beta * y

where alpha and beta are scalars, and x and y are n-element vectors. A is an n×n symmetric band matrix with elements of type T, with k superdiagonals and the same number of subdiagonals.

Parameters:
[in]  uplo  specifies whether the upper or lower triangular part of the symmetric band matrix A is being supplied. If uplo == 'U' or 'u', the upper triangular part is being supplied. If uplo == 'L' or 'l', the lower triangular part is being supplied.
[in]  n  specifies the number of rows and the number of columns of the symmetric matrix A; n must be at least zero
[in]  k  specifies the number of superdiagonals of matrix A. Since the matrix is symmetric, this is also the number of subdiagonals; k must be at least zero.
[in]  alpha  scalar multiplier applied to A * x
[in]  A  array of dimensions (lda, n). When uplo == 'U' or 'u', the leading (k+1)×n part of array A must contain the upper triangular band of the symmetric matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. When uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band of the symmetric matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array A is not referenced.
[in]  lda  leading dimension of A; lda must be at least k+1
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  beta  scalar multiplier applied to vector y. If beta is zero, y is not read.
[in]  y  array of length at least (1 + (n - 1) * abs(incY)). If beta is zero, y is not read.
[in]  incY  storage spacing between elements of y; incY must not be zero
[out]  y  updated according to y = alpha * A * x + beta * y

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::spmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *AP, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)

spmv performs the matrix-vector operation

y = alpha * A * x + beta * y

where alpha and beta are scalars, and x and y are n-element vectors. A is a symmetric n×n matrix with elements of type T that is supplied in packed form.

Parameters:
[in]  uplo  specifies whether the matrix data is stored in the upper or the lower triangular part of array AP. If uplo == 'U' or 'u', the upper triangular part of A is supplied in AP. If uplo == 'L' or 'l', the lower triangular part of A is supplied in AP.
[in]  n  the number of rows and columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to A * x
[in]  AP  array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2].
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  beta  scalar multiplier applied to vector y. If beta is zero, y is not read.
[in]  y  array of length at least (1 + (n - 1) * abs(incY)). If beta is zero, y is not read.
[in]  incY  storage spacing between elements of y; incY must not be zero
[out]  y  updated according to y = alpha * A * x + beta * y

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
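
Packed storage is the part of this routine that is easiest to get wrong, so a small sketch may help. The offsets below use zero-based indices and are derived by summing the lengths of the preceding columns (1, 2, ... for the upper triangle; n, n-1, ... for the lower triangle). packedUpper, packedLower and referenceSpmvUpper are hypothetical helper names, IndexType is a stand-in typedef, and unit increments are assumed.

```cpp
#include <cassert>

typedef int IndexType;  // stand-in for the LAMA index type (assumption)

// Zero-based position of A(i, j), i <= j, in upper packed storage.
inline IndexType packedUpper( IndexType i, IndexType j )
{
    assert( i <= j );
    return i + ( j * ( j + 1 ) ) / 2;
}

// Zero-based position of A(i, j), i >= j, in lower packed storage of order n.
inline IndexType packedLower( IndexType i, IndexType j, IndexType n )
{
    assert( i >= j && i < n );
    return i + ( ( 2 * n - j - 1 ) * j ) / 2;
}

// Reference y = alpha * A * x + beta * y with A given as its packed upper triangle.
template<typename T>
void referenceSpmvUpper( IndexType n, T alpha, const T* AP, const T* x, T beta, T* y )
{
    for ( IndexType i = 0; i < n; ++i )
    {
        T sum = T( 0 );
        for ( IndexType j = 0; j < n; ++j )
        {
            // Symmetry: A(i, j) == A(j, i), so always read from the stored triangle.
            const T a = ( i <= j ) ? AP[packedUpper( i, j )] : AP[packedUpper( j, i )];
            sum += a * x[j];
        }
        // y is not read when beta is zero
        y[i] = ( beta == T( 0 ) ) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

For n = 3 both helpers enumerate the six stored elements as positions 0 to 5, in column order, which is a quick sanity check for any packed-storage index formula.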

template<typename T>
void(* lama::BLAS2Interface< T >::spr)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *AP, SyncToken *syncToken)

spr performs the symmetric rank 1 operation

A = alpha * x * x^T + A

where alpha is a scalar, and x is an n-element vector. A is a symmetric n×n matrix with elements of type T that is supplied in packed form.

Parameters:
[in]  uplo  specifies whether the matrix data is stored in the upper or the lower triangular part of array AP. If uplo == 'U' or 'u', the upper triangular part of A is supplied in AP. If uplo == 'L' or 'l', the lower triangular part of A is supplied in AP.
[in]  n  the number of rows and columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to x * x^T
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  AP  array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2].
[out]  AP  updated according to A = alpha * x * x^T + A

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::spr2)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *AP, SyncToken *syncToken)

spr2 performs the symmetric rank 2 operation

A = alpha * x * y^T + alpha * y * x^T + A

where alpha is a scalar, and x and y are n-element vectors. A is a symmetric n×n matrix with elements of type T that is supplied in packed form.

Parameters:
[in]  uplo  specifies whether the upper or the lower triangular part of A is supplied in array AP. If uplo == 'U' or 'u', only the upper triangular part of A may be referenced and the lower triangular part of A is inferred. If uplo == 'L' or 'l', only the lower triangular part of A may be referenced and the upper triangular part of A is inferred.
[in]  n  the number of rows and columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to x * y^T + y * x^T
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  y  array of length at least (1 + (n - 1) * abs(incY))
[in]  incY  storage spacing between elements of y; incY must not be zero
[in]  AP  array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular part of the symmetric matrix A, packed sequentially, column by column. That is, with zero-based indices, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2].
[out]  AP  updated according to A = alpha * x * y^T + alpha * y * x^T + A

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::symv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *A, const IndexType lda, const T *x, const IndexType incX, const T beta, T *y, const IndexType incY, SyncToken *syncToken)

symv performs the matrix-vector operation

y = alpha * A * x + beta * y

where alpha and beta are scalars, and x and y are n-element vectors. A is a symmetric n×n matrix with elements of type T that is stored in either upper or lower storage mode.

Parameters:
[in]  uplo  specifies whether the upper or lower triangular part of the array A is referenced. If uplo == 'U' or 'u', the symmetric matrix A is stored in upper storage mode; only the upper triangular part of A is referenced while the lower triangular part of A is inferred. If uplo == 'L' or 'l', the symmetric matrix A is stored in lower storage mode; only the lower triangular part of A is referenced while the upper triangular part of A is inferred.
[in]  n  specifies the number of rows and the number of columns of the symmetric matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to A * x
[in]  A  array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular part of the symmetric matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular part of the symmetric matrix, and the strictly upper triangular part of A is not referenced.
[in]  lda  leading dimension of A; lda must be at least max(1, n)
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  beta  scalar multiplier applied to the vector y
[in]  y  array of length at least (1 + (n - 1) * abs(incY)). If beta is zero, y is not read.
[in]  incY  storage spacing between elements of y; incY must not be zero
[out]  y  updated according to y = alpha * A * x + beta * y

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
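
The statement that one triangle is referenced and the other inferred can be made concrete with a short reference version. This is a sketch, assuming unit increments, column-major storage, and stand-in definitions for IndexType and CBLAS_UPLO; referenceSymv is a hypothetical name.

```cpp
#include <cassert>

typedef int IndexType;  // stand-in for the LAMA index type (assumption)
enum CBLAS_UPLO { CblasUpper = 121, CblasLower = 122 };

// Reference y = alpha * A * x + beta * y that reads only the triangle named by uplo.
template<typename T>
void referenceSymv( CBLAS_UPLO uplo, IndexType n, T alpha, const T* A, IndexType lda,
                    const T* x, T beta, T* y )
{
    assert( lda >= ( n > 1 ? n : 1 ) );
    for ( IndexType i = 0; i < n; ++i )
    {
        T sum = T( 0 );
        for ( IndexType j = 0; j < n; ++j )
        {
            // Pick the copy of A(i, j) that lies in the referenced triangle.
            const bool stored = ( uplo == CblasUpper ) ? ( i <= j ) : ( i >= j );
            const T a = stored ? A[i + j * lda] : A[j + i * lda];
            sum += a * x[j];
        }
        // y is not read when beta is zero
        y[i] = ( beta == T( 0 ) ) ? alpha * sum : alpha * sum + beta * y[i];
    }
}
```

Filling the unreferenced triangle with an arbitrary sentinel value and checking that the result does not change is a simple way to test an implementation against this rule.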

template<typename T>
void(* lama::BLAS2Interface< T >::syr)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, T *A, const IndexType lda, SyncToken *syncToken)

TODO: need geru ?!

syr performs the symmetric rank 1 operation

A = alpha * x * x^T + A

where alpha is a scalar, x is an n-element vector, and A is an n×n symmetric matrix with elements of type T. A is stored in column-major format, and lda is the leading dimension of the two-dimensional array containing A.

Parameters:
[in]  uplo  specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A is referenced. If uplo == 'L' or 'l', only the lower triangular part of A is referenced.
[in]  n  the number of rows and columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to x * x^T
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  A  array of dimensions (lda, n). If uplo == 'U' or 'u', A contains the upper triangular part of the symmetric matrix, and the strictly lower triangular part is not referenced. If uplo == 'L' or 'l', A contains the lower triangular part of the symmetric matrix, and the strictly upper triangular part is not referenced.
[in]  lda  leading dimension of the two-dimensional array containing A; lda must be at least max(1, n)
[out]  A  updated according to A = alpha * x * x^T + A

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::syr2)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const IndexType n, const T alpha, const T *x, const IndexType incX, const T *y, const IndexType incY, T *A, const IndexType lda, SyncToken *syncToken)

syr2 performs the symmetric rank 2 operation

A = alpha * x * y^T + alpha * y * x^T + A

where alpha is a scalar, x and y are n-element vectors, and A is an n×n symmetric matrix with elements of type T.

Parameters:
[in]  uplo  specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A is referenced and the lower triangular part of A is inferred. If uplo == 'L' or 'l', only the lower triangular part of A is referenced and the upper triangular part of A is inferred.
[in]  n  the number of rows and columns of matrix A; n must be at least zero
[in]  alpha  scalar multiplier applied to x * y^T + y * x^T
[in]  x  array of length at least (1 + (n - 1) * abs(incX))
[in]  incX  storage spacing between elements of x; incX must not be zero
[in]  y  array of length at least (1 + (n - 1) * abs(incY))
[in]  incY  storage spacing between elements of y; incY must not be zero
[in]  A  array of dimensions (lda, n). If uplo == 'U' or 'u', A contains the upper triangular part of the symmetric matrix, and the strictly lower triangular part is not referenced. If uplo == 'L' or 'l', A contains the lower triangular part of the symmetric matrix, and the strictly upper triangular part is not referenced.
[in]  lda  leading dimension of A; lda must be at least max(1, n)
[out]  A  updated according to A = alpha * x * y^T + alpha * y * x^T + A

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
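
Because the update x * y^T + y * x^T is itself symmetric, only one triangle of A needs to be written. A minimal sketch of that idea follows, assuming unit increments and column-major storage; referenceSyr2 is a hypothetical name and the typedef and enum are stand-ins for the real LAMA/CBLAS declarations.

```cpp
#include <cassert>

typedef int IndexType;  // stand-in for the LAMA index type (assumption)
enum CBLAS_UPLO { CblasUpper = 121, CblasLower = 122 };

// Reference A = alpha * x * y^T + alpha * y * x^T + A, touching one triangle only.
template<typename T>
void referenceSyr2( CBLAS_UPLO uplo, IndexType n, T alpha, const T* x, const T* y,
                    T* A, IndexType lda )
{
    assert( lda >= ( n > 1 ? n : 1 ) );
    for ( IndexType j = 0; j < n; ++j )
    {
        // Rows 0..j of column j form the upper triangle, rows j..n-1 the lower one.
        const IndexType first = ( uplo == CblasUpper ) ? 0 : j;
        const IndexType last = ( uplo == CblasUpper ) ? j : n - 1;
        for ( IndexType i = first; i <= last; ++i )
        {
            A[i + j * lda] += alpha * ( x[i] * y[j] + y[i] * x[j] );
        }
    }
}
```

Entries in the other triangle keep whatever values they had before the call, which matches the "not referenced" wording in the parameter description.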

template<typename T>
void(* lama::BLAS2Interface< T >::tbmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)

tbmv performs one of the matrix-vector operations

x = op(A) * x, where op(A) = A or op(A) = A^T

x is an n-element vector, and A is an n×n, unit or non-unit, upper or lower, triangular band matrix with elements of type T.

Parameters:
[in]  uplo  specifies whether the matrix A is an upper or lower triangular band matrix. If uplo == 'U' or 'u', A is an upper triangular band matrix. If uplo == 'L' or 'l', A is a lower triangular band matrix.
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  diag  specifies whether or not matrix A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in]  n  specifies the number of rows and columns of matrix A; n must be at least zero
[in]  k  specifies the number of superdiagonals or subdiagonals. If uplo == 'U' or 'u', k specifies the number of superdiagonals. If uplo == 'L' or 'l', k specifies the number of subdiagonals; k must be at least zero.
[in]  A  array of dimensions (lda, n). If uplo == 'U' or 'u', the leading (k+1)×n part of the array A must contain the upper triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. If uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array is not referenced.
[in]  lda  leading dimension of A; lda must be at least k+1
[in]  x  array of length at least (1 + (n - 1) * abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector.
[in]  incX  specifies the storage spacing for elements of x; incX must not be zero
[out]  x  updated according to x = op(A) * x

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::tbsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const IndexType k, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)

tbsv solves one of the systems of equations

op(A) * x = b, where op(A) = A or op(A) = A^T

b and x are n-element vectors, and A is an n×n, unit or non-unit, upper or lower, triangular band matrix with k+1 diagonals. No test for singularity or near-singularity is included in this function. Such tests must be performed before calling this function.

Parameters:
[in]  uplo  specifies whether the matrix A is an upper or lower triangular band matrix. If uplo == 'U' or 'u', A is an upper triangular band matrix. If uplo == 'L' or 'l', A is a lower triangular band matrix.
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  diag  specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular; that is, diagonal elements are not read and are assumed to be unity. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in]  n  specifies the number of rows and columns of matrix A; n must be at least zero
[in]  k  specifies the number of superdiagonals or subdiagonals. If uplo == 'U' or 'u', k specifies the number of superdiagonals. If uplo == 'L' or 'l', k specifies the number of subdiagonals; k must be at least zero.
[in]  A  array of dimensions (lda, n). If uplo == 'U' or 'u', the leading (k+1)×n part of the array A must contain the upper triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row k+1 of the array, the first superdiagonal starting at position 2 in row k, and so on. The top left k×k triangle of the array A is not referenced. If uplo == 'L' or 'l', the leading (k+1)×n part of the array A must contain the lower triangular band matrix, supplied column by column, with the leading diagonal of the matrix in row 1 of the array, the first subdiagonal starting at position 1 in row 2, and so on. The bottom right k×k triangle of the array is not referenced.
[in]  lda  leading dimension of A; lda must be at least k+1
[in]  x  array of length at least (1 + (n - 1) * abs(incX)). On entry, x contains the n-element right-hand side vector b. On exit, it is overwritten with the solution vector x.
[in]  incX  specifies the storage spacing for elements of x; incX must not be zero
[out]  x  updated to contain the solution vector x that solves op(A) * x = b

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
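
For an upper triangular band matrix and op(A) = A, the solve is a back substitution that only visits the k superdiagonals. The sketch below shows this for unit increments. With zero-based indices, element (i, j) of the matrix sits in row k + i - j of column j of the array. referenceTbsvUpper is a hypothetical name, IndexType is assumed to be a signed integer type, and CBLAS_DIAG is a stand-in definition. As the description above states, no singularity check is performed.

```cpp
#include <algorithm>
#include <cassert>

typedef int IndexType;  // stand-in for the LAMA index type, assumed signed
enum CBLAS_DIAG { CblasNonUnit = 131, CblasUnit = 132 };

// Solves A * x = b in place for an upper triangular band matrix A with k superdiagonals.
// On entry x holds b; on exit it holds the solution.
template<typename T>
void referenceTbsvUpper( CBLAS_DIAG diag, IndexType n, IndexType k,
                         const T* A, IndexType lda, T* x )
{
    assert( lda >= k + 1 );
    for ( IndexType i = n - 1; i >= 0; --i )
    {
        T sum = x[i];
        // Only columns i+1 .. i+k of row i can be nonzero.
        const IndexType last = std::min<IndexType>( n - 1, i + k );
        for ( IndexType j = i + 1; j <= last; ++j )
        {
            sum -= A[( k + i - j ) + j * lda] * x[j];
        }
        // The diagonal is stored in array row k; it is not read for unit triangular A.
        x[i] = ( diag == CblasUnit ) ? sum : sum / A[k + i * lda];
    }
}
```

Rows are processed from last to first so that every x[j] used on the right-hand side is already a solved value.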

template<typename T>
void(* lama::BLAS2Interface< T >::tpmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *AP, T *x, const IndexType incX, SyncToken *syncToken)

tpmv performs one of the matrix-vector operations

x = op(A) * x, where op(A) = A or op(A) = A^T

x is an n-element vector, and A is an n×n, unit or non-unit, upper or lower, triangular matrix with elements of type T that is supplied in packed form.

Parameters:
[in]  uplo  specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix.
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  diag  specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in]  n  specifies the number of rows and columns of matrix A; n must be at least zero
[in]  AP  array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular matrix A, packed sequentially, column by column. That is, with zero-based indices, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular matrix A, packed sequentially, column by column. That is, with zero-based indices, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2].
[in]  x  array of length at least (1 + (n - 1) * abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector.
[in]  incX  specifies the storage spacing for elements of x; incX must not be zero
[out]  x  updated according to x = op(A) * x

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().

template<typename T>
void(* lama::BLAS2Interface< T >::tpsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *Ap, T *x, const IndexType incX, SyncToken *syncToken)

tpsv solves one of the systems of equations

op(A) * x = b, where op(A) = A or op(A) = A^T

b and x are n-element vectors, and A is an n×n, unit or non-unit, upper or lower, triangular matrix supplied in packed form. No test for singularity or near-singularity is included in this function. Such tests must be performed before calling this function.

Parameters:
[in]  uplo  specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix.
[in]  trans  specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in]  diag  specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular; diagonal elements are not read and are assumed to be unity. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in]  n  specifies the number of rows and columns of matrix A; n must be at least zero
[in]  AP  array with at least (n*(n+1))/2 elements. If uplo == 'U' or 'u', the array AP contains the upper triangular matrix A, packed sequentially, column by column. That is, with zero-based indices, if i <= j, A[i,j] is stored in AP[i + (j*(j+1))/2]. If uplo == 'L' or 'l', the array AP contains the lower triangular matrix A, packed sequentially, column by column. That is, with zero-based indices, if i >= j, A[i,j] is stored in AP[i + ((2*n-j-1)*j)/2]. When diag == 'U' or 'u', the diagonal elements of A are not referenced and are assumed to be unity.
[in]  x  array of length at least (1 + (n - 1) * abs(incX)). On entry, x contains the n-element right-hand side vector b. On exit, it is overwritten with the solution vector x.
[in]  incX  specifies the storage spacing for elements of x; incX must not be zero
[out]  x  updated to contain the solution vector x that solves op(A) * x = b

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
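
The packed layout described for Ap can be illustrated with a small, self-contained sketch. The helper names below (packedUpperIndex, packedLowerIndex, packedUpperSolve) are hypothetical and are not part of LAMA; the code assumes zero-based indices, column-by-column packing, no transpose, a non-unit diagonal and incX == 1. With zero-based indices, column j of the lower packed layout starts at offset (j*(2*n-j+1))/2, so element (i,j) sits at i + ((2*n-j-1)*j)/2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a LAMA function): offset of A[i,j], i <= j,
// in an upper triangular matrix packed column by column.
inline std::size_t packedUpperIndex(std::size_t i, std::size_t j)
{
    assert(i <= j);
    return i + (j * (j + 1)) / 2;
}

// Hypothetical helper (not a LAMA function): offset of A[i,j], i >= j,
// in a lower triangular n x n matrix packed column by column.
inline std::size_t packedLowerIndex(std::size_t i, std::size_t j, std::size_t n)
{
    assert(j <= i && i < n);
    return i + ((2 * n - j - 1) * j) / 2;
}

// Reference back substitution for A * x = b with A upper triangular,
// non-unit diagonal, packed storage and unit stride. On entry x holds b,
// on exit it holds the solution. No singularity check is done here.
template<typename T>
void packedUpperSolve(std::size_t n, const std::vector<T>& ap, std::vector<T>& x)
{
    assert(ap.size() >= n * (n + 1) / 2 && x.size() >= n);
    for (std::size_t jj = n; jj > 0; --jj)
    {
        const std::size_t j = jj - 1;
        x[j] /= ap[packedUpperIndex(j, j)];
        // Eliminate x[j] from all rows above row j.
        for (std::size_t i = 0; i < j; ++i)
        {
            x[i] -= x[j] * ap[packedUpperIndex(i, j)];
        }
    }
}
```

For the 2×2 upper triangular matrix with rows (2, 1) and (0, 4), the packed array is {2, 1, 4}; calling packedUpperSolve with x = {4, 8} leaves the solution in x. A backend registered in this struct would do the equivalent work on its own device.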

template<typename T>
void(* lama::BLAS2Interface< T >::trmv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)

trmv performs one of the matrix-vector operations

x = op(A) * x, where op(A) = A or op(A) = A^T

x is an n‐element vector, A is an n×n, unit or non‐unit, upper or lower, triangular matrix with elements of type T.

Parameters:
[in] uplo specifies whether the matrix A is an upper or lower triangular matrix. If uplo == 'U' or 'u', A is an upper triangular matrix. If uplo == 'L' or 'l', A is a lower triangular matrix.
[in] trans specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in] diag specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in] n specifies the number of rows and columns of matrix A; n must be at least zero.
[in] A array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular matrix, and the strictly upper triangular part of A is not referenced. When diag == 'U' or 'u', the diagonal elements of A are not referenced either, but are assumed to be unity.
[in] lda leading dimension of A; lda must be at least max(1,n).
[in] x array of length at least (1 + (n-1) * abs(incX)). On entry, x contains the source vector. On exit, x is overwritten with the result vector.
[in] incX specifies the storage spacing for elements of x; incX must not be zero.
[out] x updated according to x = op(A) * x.

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
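
To make the roles of lda, incX and diag concrete, here is a minimal reference sketch of the upper triangular, non-transposed case. The function name upperTrmvReference is hypothetical and is not LAMA's implementation; it assumes column-major storage, zero-based indices and a positive incX.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference routine (not a LAMA function): x = A * x for an
// upper triangular, column-major matrix A without transposition.
template<typename T>
void upperTrmvReference(std::size_t n, const T* a, std::size_t lda,
                        T* x, std::size_t incX, bool unitDiag)
{
    assert(lda >= (n > 0 ? n : 1) && incX > 0);
    // Row i of the product only needs x[i..n-1], so sweeping i upward
    // allows the result to overwrite x in place.
    for (std::size_t i = 0; i < n; ++i)
    {
        // With a unit diagonal the stored diagonal entry is never read.
        T sum = unitDiag ? x[i * incX] : a[i + i * lda] * x[i * incX];
        for (std::size_t j = i + 1; j < n; ++j)
        {
            sum += a[i + j * lda] * x[j * incX];
        }
        x[i * incX] = sum;
    }
}
```

Only entries a[i + j * lda] with i <= j are read, so the strictly lower triangular part of the array may hold arbitrary values, which matches the description of parameter A.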

template<typename T>
void(* lama::BLAS2Interface< T >::trsv)(const enum CBLAS_ORDER order, const enum CBLAS_UPLO uplo, const enum CBLAS_TRANSPOSE trans, const enum CBLAS_DIAG diag, const IndexType n, const T *A, const IndexType lda, T *x, const IndexType incX, SyncToken *syncToken)

trsv solves a system of equations

op(A) * x = b, where op(A) = A or op(A) = A^T

b and x are n‐element vectors, A is an n×n, unit or non‐unit, upper or lower, triangular matrix with elements of type T. Matrix A is stored in column‐major format, lda is the leading dimension of the two‐dimensional array containing A.

No test for singularity or near‐singularity is included in this function. Such tests must be performed before calling this function.

Parameters:
[in] uplo specifies whether the matrix data is stored in the upper or the lower triangular part of array A. If uplo == 'U' or 'u', only the upper triangular part of A may be referenced. If uplo == 'L' or 'l', only the lower triangular part of A may be referenced.
[in] trans specifies op(A). If trans == 'N' or 'n', op(A) = A. If trans == 'T', 't', 'C' or 'c', op(A) = A^T.
[in] diag specifies whether A is unit triangular. If diag == 'U' or 'u', A is assumed to be unit triangular. If diag == 'N' or 'n', A is not assumed to be unit triangular.
[in] n specifies the number of rows and columns of matrix A; n must be at least zero.
[in] A array of dimensions (lda, n). If uplo == 'U' or 'u', the leading n×n upper triangular part of the array A must contain the upper triangular matrix, and the strictly lower triangular part of A is not referenced. If uplo == 'L' or 'l', the leading n×n lower triangular part of the array A must contain the lower triangular matrix, and the strictly upper triangular part of A is not referenced.
[in] lda leading dimension of A; lda must be at least max(1,n).
[in] x array of length at least (1 + (n-1) * abs(incX)). On entry, x contains the n-element right-hand side vector b. On exit, it is overwritten with the solution vector x.
[in] incX specifies the storage spacing for elements of x; incX must not be zero.
[out] x updated according to op(A) * x = b

Referenced by lama::CUDAInterface::CUDAInterface(), and lama::OpenMPInterface::OpenMPInterface().
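
Because trsv performs no singularity test, a caller may want to check the diagonal first. The following sketch shows one possible pre-check together with a reference forward substitution for the lower triangular, non-transposed case. Both function names (hasZeroDiagonal, lowerTrsvReference) are hypothetical and not part of LAMA; column-major storage, zero-based indices and a positive incX are assumed. An exact comparison with zero only detects singular matrices, not nearly singular ones.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical pre-check (not a LAMA function): true if any diagonal
// entry of the column-major matrix A is exactly zero.
template<typename T>
bool hasZeroDiagonal(std::size_t n, const T* a, std::size_t lda)
{
    for (std::size_t j = 0; j < n; ++j)
    {
        if (a[j + j * lda] == T(0))
        {
            return true;
        }
    }
    return false;
}

// Hypothetical reference routine (not a LAMA function): solves A * x = b
// by forward substitution for a lower triangular, column-major A.
// On entry x holds b, on exit it holds the solution.
template<typename T>
void lowerTrsvReference(std::size_t n, const T* a, std::size_t lda,
                        T* x, std::size_t incX, bool unitDiag)
{
    assert(lda >= (n > 0 ? n : 1) && incX > 0);
    for (std::size_t j = 0; j < n; ++j)
    {
        if (!unitDiag)
        {
            x[j * incX] /= a[j + j * lda]; // caller guarantees a non-zero diagonal
        }
        const T xj = x[j * incX];
        // Eliminate x[j] from all rows below row j.
        for (std::size_t i = j + 1; i < n; ++i)
        {
            x[i * incX] -= xj * a[i + j * lda];
        }
    }
}
```

A typical use would call hasZeroDiagonal on the host and only then hand the same A, lda, x and incX to the solver.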


The documentation for this struct was generated from the following files: