Provided by: scalapack-doc_1.5-11_all bug

NAME

       PSGESVD  - compute the singular value decomposition (SVD) of an M-by-N matrix A, optionally computing the
       left and/or right singular vectors

SYNOPSIS

       SUBROUTINE PSGESVD( JOBU, JOBVT, M, N, A, IA, JA, DESCA, S, U, IU, JU, DESCU, VT, IVT, JVT, DESCVT, WORK,
                           LWORK, INFO )

           CHARACTER       JOBU, JOBVT

           INTEGER         IA, INFO, IU, IVT, JA, JU, JVT, LWORK, M, N

           INTEGER         DESCA( * ), DESCU( * ), DESCVT( * )

           REAL            A( * ), S( * ), U( * ), VT( * ), WORK( * )

PURPOSE

       PSGESVD computes the singular value decomposition (SVD) of an M-by-N matrix A, optionally  computing  the
       left and/or right singular vectors. The SVD is written as

            A = U * SIGMA * transpose(V)

       where  SIGMA  is an M-by-N matrix which is zero except for its min(M,N) diagonal elements, U is an M-by-M
       orthogonal matrix, and V is an N-by-N orthogonal matrix. The diagonal elements of SIGMA are the  singular
       values  of  A  and  the  columns  of  U  and  V  are  the  corresponding right and left singular vectors,
       respectively. The singular values are returned in array S in decreasing order and only the first min(M,N)
       columns of U and rows of VT = V**T are computed.

       Notes
       =====
       Each global data object is described  by  an  associated  description  vector.  This  vector  stores  the
       information required to establish the mapping between an object element and its corresponding process and
       memory location.

       Let  A  be  a  generic  term  for  any  2D  block cyclicly distributed array.  Such a global array has an
       associated description vector DESCA.  In the following comments, the character _ should be  read  as  "of
       the global array".

       NOTATION        STORED IN      EXPLANATION
       ---------------  -------------- -------------------------------------- DTYPE_A(global) DESCA( DTYPE_ )The
       descriptor type.  In this case,
                                      DTYPE_A = 1.
       CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                      the BLACS process grid A is distribu-
                                      ted over. The context itself is glo-
                                      bal, but the handle (the integer
                                      value) may vary.
       M_A    (global) DESCA( M_ )    The number of rows in the global
                                      array A.
       N_A    (global) DESCA( N_ )    The number of columns in the global
                                      array A.
       MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                      the rows of the array.
       NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                      the columns of the array.
       RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                      row of the array A is distributed.  CSRC_A (global)  DESCA(  CSRC_  )  The
       process column over which the
                                      first column of the array A is
                                      distributed.
       LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                      array.  LLD_A >= MAX(1,LOCr(M_A)).

       Let  K  be  the  number  of rows or columns of a distributed matrix, and assume that its process grid has
       dimension p x q. LOCr( K ) denotes the number of elements of K that a process would  receive  if  K  were
       distributed  over  the  p  processes  of  its  process column. Similarly, LOCc( K ) denotes the number of
       elements of K that a process would receive if K were distributed over the q processes of its process row.
       The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC:
               LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
               LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).  An upper bound for these quantities may  be
       computed by:
               LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
               LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A

ARGUMENTS

       MP  = number of local rows in A and U NQ = number of local columns in A and VT SIZE = min( M, N ) SIZEQ =
       number of local columns in U SIZEP = number of local rows in VT

       JOBU    (global input) CHARACTER*1
               Specifies options for computing all or part of the matrix U:
               = 'V':  the first SIZE columns of U (the left singular vectors) are returned in the  array  U;  =
               'N':  no columns of U (no left singular vectors) are computed.

       JOBVT   (global input) CHARACTER*1
               Specifies options for computing all or part of the matrix V**T:
               =  'V':  the first SIZE rows of V**T (the right singular vectors) are returned in the array VT; =
               'N':  no rows of V**T (no right singular vectors) are computed.

       M       (global input) INTEGER
               The number of rows of the input matrix A.  M >= 0.

       N       (global input) INTEGER
               The number of columns of the input matrix A.  N >= 0.

       A       (local input/workspace) block cyclic REAL           array,
               global dimension (M, N), local dimension (MP, NQ) On exit, the contents of A are destroyed.

       IA      (global input) INTEGER
               The row index in the global array A indicating the first row of sub( A ).

       JA      (global input) INTEGER
               The column index in the global array A indicating the first column of sub( A ).

       DESCA   (global input) INTEGER array of dimension DLEN_
               The array descriptor for the distributed matrix A.

       S       (global output) REAL           array, dimension SIZE
               The singular values of A, sorted so that S(i) >= S(i+1).

       U       (local output) REAL           array, local dimension
               (MP, SIZEQ), global dimension (M, SIZE) if JOBU = 'V', U contains the first min(m,n) columns of U
               if JOBU = 'N', U is not referenced.

       IU      (global input) INTEGER
               The row index in the global array U indicating the first row of sub( U ).

       JU      (global input) INTEGER
               The column index in the global array U indicating the first column of sub( U ).

       DESCU   (global input) INTEGER array of dimension DLEN_
               The array descriptor for the distributed matrix U.

       VT      (local output) REAL           array, local dimension
               (SIZEP, NQ), global dimension (SIZE, N).  If JOBVT = 'V', VT contains  the  first  SIZE  rows  of
               V**T. If JOBVT = 'N', VT is not referenced.

       IVT     (global input) INTEGER
               The row index in the global array VT indicating the first row of sub( VT ).

       JVT     (global input) INTEGER
               The column index in the global array VT indicating the first column of sub( VT ).

       DESCVT   (global input) INTEGER array of dimension DLEN_
                The array descriptor for the distributed matrix VT.

       WORK    (local workspace/output) REAL           array, dimension
               (LWORK) On exit, if INFO = 0, WORK(1) returns the optimal LWORK;

       LWORK   (local input) INTEGER
               The dimension of the array WORK.

               LWORK > 2 + 6*SIZEB + MAX(WATOBD, WBDTOSVD),

               where SIZEB = MAX(M,N), and WATOBD and WBDTOSVD refer, respectively, to the workspace required to
               bidiagonalize  the  matrix  A  and  to  go  from  the  bidiagonal  matrix  to  the singular value
               decomposition U*S*VT.

               For WATOBD, the following holds:

               WATOBD = MAX(MAX(WPSLANGE,WPSGEBRD), MAX(WPSLARED2D,WPSLARED1D)),

               where WPSLANGE, WPSLARED1D, WPSLARED2D, WPSGEBRD are the workspaces required respectively for the
               subprograms PSLANGE, PSLARED1D, PSLARED2D, PSGEBRD. Using the standard notation

               MP = NUMROC( M, MB, MYROW, DESCA( CTXT_ ), NPROW), NQ = NUMROC( N,  NB,  MYCOL,  DESCA(  LLD_  ),
               NPCOL),

               the workspaces required for the above subprograms are

               WPSLANGE = MP, WPSLARED1D = NQ0, WPSLARED2D = MP0, WPSGEBRD = NB*(MP + NQ + 1) + NQ,

               where  NQ0  and  MP0  refer,  respectively, to the values obtained at MYCOL = 0 and MYROW = 0. In
               general, the upper limit for the workspace is given by a workspace required on processor (0,0):

               WATOBD <= NB*(MP0 + NQ0 + 1) + NQ0.

               In case of a homogeneous process grid this upper limit can be used as an estimate of the  minimum
               workspace for every processor.

               For WBDTOSVD, the following holds:

               WBDTOSVD    =    SIZE*(WANTU*NRU    +    WANTVT*NCVT)   +   MAX(WDBDSQR,   MAX(WANTU*WPSORMBRQLN,
               WANTVT*WPSORMBRPRT)),

       where

      1, if left(right) singular vectors are wanted WANTU(WANTVT) = 0, otherwise

      and WDBDSQR, WPSORMBRQLN and WPSORMBRPRT refer respectively to the workspace required for the  subprograms
      DBDSQR,  PSORMBR(QLN), and PSORMBR(PRT), where QLN and PRT are the values of the arguments VECT, SIDE, and
      TRANS in the call to PSORMBR. NRU is equal to the local number of rows of the matrix  U  when  distributed
      1-dimensional  "column"  of  processes.  Analogously,  NCVT is equal to the local number of columns of the
      matrix VT when distributed across 1-dimensional "row" of processes. Calling the  LAPACK  procedure  DBDSQR
      requires

      WDBDSQR = MAX(1, 2*SIZE + (2*SIZE - 4)*MAX(WANTU, WANTVT))

      on every processor. Finally,

      WPSORMBRQLN  =  MAX(  (NB*(NB-1))/2, (SIZEQ+MP)*NB)+NB*NB, WPSORMBRPRT = MAX( (MB*(MB-1))/2, (SIZEP+NQ)*MB
      )+MB*MB,

      If LIWORK = -1, then LIWORK is global input and a workspace query is assumed; the routine only  calculates
      the  minimum  size for the work array. The required workspace is returned as the first element of WORK and
      no error message is issued by PXERBLA.

       INFO    (output) INTEGER
               = 0:  successful exit.
               < 0:  if INFO = -i, the i-th argument had an illegal value.
               > 0:  if SBDSQR did not converge If INFO = MIN(M,N) + 1, then PSSYEV has  detected  heterogeneity
               by  finding  that  eigenvalues  were  not  identical  across the process grid.  In this case, the
               accuracy of the results from PSSYEV cannot be guaranteed.

LAPACK version 1.5                                 12 May 1997                                        PSGESVD(l)