LF95 v6.2 Delivers!
Major new features...
● Automatic Parallelization
● Pentium 4 and Xeon optimizations with SSE2 instructions
● OpenMP v2.0 Support
● Prefetch optimizations for Pentium III and Athlon
● Unsurpassed global compile-time and runtime diagnostics
● Allocate arrays up to 2 GB
● External file size 2**64 bytes
● New WiSK, Winteracter Starter Kit
● Thread-safe BLAS and LAPACK v3.0 (includes SSE2 version)
● Thread-safe SSL2 math library (includes SSE2 version)
● Automake, automatic make utility
● ALLOCATABLE attributes on array components
"Our application is based on the finite-difference time-domain
method. Using the auto-parallelization feature of LF95
PRO v6.2, we reduced the execution time by approximately
50%."
Masafumi Fujii, Dept. of Electrical, Electronic and System Engineering,
Toyama University, Toyama, Japan
● Character variable length limit: 2,147,483,647
● MPI compatible with MPICH source included
● FDB debugger
● Compatible with TotalView parallel debugger
LF95 v6.2 is sold in two editions, Express and PRO.
LF95 Express offers the powerful optimizing Lahey/Fujitsu Fortran 95
compiler, a command-line debugger, link compatibility with g77 and egcs,
online help, and free e-mail technical support, all at a very low price.
● LF95 Performance
● LF95 Optimizations
● Link Fujitsu C, g77, and egcs object files
● Legacy Fortran Support
● ANSI/ISO-Compliant Fortran 95
● Free Technical Support
In addition to the Express features, LF95 PRO provides
auto-parallelization, OpenMP compatibility, the Winteracter
Starter Kit (WiSK) for creating Windows GUIs and displaying graphics,
thread-safe BLAS and LAPACK, Polyhedron's Automake utility, and the
Fujitsu SSL2 math library (thread-safe for parallel applications).
It also includes technical support by telephone.
● Auto-Parallelization
● OpenMP compatibility
● Winteracter Starter Kit
● BLAS and LAPACK
● Fujitsu Scientific Subroutine Library 2
● Automake
● Hard copy User's Guide
● Free Telephone Support
LF95 Performance
LF95 6.2 features P4 optimizations with SSE2 instructions. We tested
v6.2 optimizations on a 1.8 GHz P4 with 512 MB of PC2100 RAM, running
SuSE 8.1, using Polyhedron's (www.polyhedron.com) Fortran benchmarks.
Specifying the new switches --tp4, --sse2, --zfm, --o2, and -x,
the Fortran 90 benchmarks ran an average of 12.4% faster and the
Fortran 77 benchmarks 7.9% faster than they did when built with
LF95 Linux v6.1. Try LF95 v6.2 on your code today!
LF95 Optimizations
Basic Optimization
● Constant folding
● Common subexpression elimination
● Copy propagation
● Strength reduction
● Algebraic simplifications
● Dead code elimination
● Peephole optimization
● Loop invariant code motion
● Transform array element to simple variable
● Local instruction scheduling
● Address calculation optimization
Program Reconstruction Optimizations
● Loop unrolling
● Loop interchange
Procedure Optimization
● Inlining mathematical functions
● Stack optimization
Others
● P4 with SSE2 instructions
● Prefetch for Pentium III and Athlon processors
● i486/Pentium/Pentium Pro instruction selection
● Using fast input/output libraries
Link Fujitsu C, g77, and egcs object files
LF95 supports static linking with Fujitsu C, g77, or egcs. Combine your
Fortran and C/C++ code into one executable. For the routines you
don't want to develop yourself, you can also link with C/C++ routines
from commercially available libraries.
Legacy Fortran Support
LF95 extends its language support in other directions by adding many
legacy Fortran features, including VAX structures and the various UNIX
service routines. These features further ease your move to the
cost/performance efficiency of the PC platform:
● Unlimited number of continuation lines in free or fixed source form
● DO UNTIL statement
● FIND statement
● STRUCTURE and END STRUCTURE statements
● UNION and END UNION statements
● MAP and END MAP statements
● RECORD statement
● Non-standard POINTER statement
● AUTOMATIC statement
● STATIC statement
● VALUE statement
● BYTE statement
● Hollerith constants
● Alternative forms of binary, octal, and hexadecimal constants
● Binary, octal, or hexadecimal constants in a DATA or declaration statement
● Period structure component separator
● IMPLICIT UNDEFINED statement
● Namelist input/output on internal file
● FORM = 'BINARY'
● TOTALREC specifier
● STATUS = 'SHR'
● Gw, $, \, and R edit descriptors
● LOC intrinsic function
● The following service subroutines: ABORT, BEEP, BIC, BIS, CLOCK,
CLOCKM, DATE, EXIT, ERRSAV, ERRSTR, ERRSET, ERRTRA, FDATE, FREE, GETARG,
GETDAT, GETLOG, GETPARM, GETTIM, GMTIME, IBTOD, IDATE, IETOM, ITIME,
IVALUE, LTIME, MTOIE, PERROR, PRNSET, QSORT, SETRCD, SETBIT, SIGNAL,
SLEEP
● The following service functions: ACCESS, ALARM, BIT, CHDIR, CHMOD,
CTIME, DRAND, DTIME, ETIME, FGETC, FPUTC, FSEEK, FSTAT, FTELL, GETC,
GETCWD, GETFD, GETPID, HOSTNM, IARGC, IERRNO, INMAX, IOINIT, IRAND,
JDATE, KILL, LNBLNK, LONG, LSTAT, MALLOC, NARGS, PUTC, RAN, RAND,
RENAME, RINDEX, RTC, SECOND, SECNDS, SETDAT, SETTIM, SHORT, STAT,
TIME, TIMEF, UNLINK
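To give a feel for a few of the legacy extensions listed above, here is a minimal sketch in fixed source form that uses STRUCTURE, UNION/MAP, RECORD, BYTE, and the period component separator. The names POINT, WORD, P, and W are invented for illustration. The syntax follows the widely used VAX Fortran conventions; it is not standard Fortran, so it is only expected to build with a compiler that accepts these extensions, and the exact dialect accepted by LF95 should be checked against the User's Guide.

```fortran
      PROGRAM LEGACY
      IMPLICIT NONE
! VAX-style structure declaration (an extension, not standard Fortran)
      STRUCTURE /POINT/
        REAL X, Y
      END STRUCTURE
! A structure whose storage can be viewed two ways through UNION/MAP
      STRUCTURE /WORD/
        UNION
          MAP
            INTEGER*4 I
          END MAP
          MAP
            BYTE B(4)
          END MAP
        END UNION
      END STRUCTURE
      RECORD /POINT/ P
      RECORD /WORD/ W
! The period acts as the structure component separator
      P.X = 3.0
      P.Y = 4.0
      W.I = 1
      PRINT *, 'LENGTH:', SQRT(P.X**2 + P.Y**2)
! Which byte holds the 1 depends on the byte order of the machine
      PRINT *, 'BYTES: ', W.B
      END
```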
ANSI/ISO-Compliant Fortran 95
LF95 is a complete implementation of the ANSI/ISO Fortran 95 standard.
Fortran 95 offers some small but important improvements over Fortran
90, including the ability to create your own elemental procedures,
default initialization for structure components, the NULL intrinsic
for initializing pointers, the FORALL construct, and a standard
CPU_TIME intrinsic procedure.
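The Fortran 95 additions named above can be seen together in one short program. The following is an illustrative sketch only: the module, type, and function names (f95_demo, particle, celsius_to_kelvin) are invented for the example, and it uses nothing beyond standard Fortran 95.

```fortran
module f95_demo
  implicit none

  ! Default initialization for structure components
  type :: particle
     real :: mass = 1.0
     real, pointer :: history(:) => null()   ! NULL intrinsic initializes the pointer
  end type particle

contains

  ! A user-defined elemental procedure: written for a scalar argument,
  ! but callable with an array of any shape
  elemental function celsius_to_kelvin(c) result(k)
    real, intent(in) :: c
    real :: k
    k = c + 273.15
  end function celsius_to_kelvin

end module f95_demo

program f95_features
  use f95_demo
  implicit none
  integer, parameter :: n = 4
  integer :: i, j
  real :: a(n, n), temps(n), t_start, t_end
  type(particle) :: p

  call cpu_time(t_start)

  ! FORALL assigns every element selected by the index set and the mask;
  ! here only the off-diagonal elements are set
  a = 0.0
  forall (i = 1:n, j = 1:n, i /= j) a(i, j) = 1.0 / real(i + j)

  ! The elemental function is applied to each element of the array
  temps = celsius_to_kelvin((/ 0.0, 25.0, 37.0, 100.0 /))

  call cpu_time(t_end)

  print *, 'default mass:       ', p%mass
  print *, 'pointer associated: ', associated(p%history)
  print *, 'temps in kelvin:    ', temps
  print *, 'a(1,2), a(1,1):     ', a(1, 2), a(1, 1)
  print *, 'cpu seconds:        ', t_end - t_start
end program f95_features
```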
Free Technical Support
LF95 Linux Express includes e-mail technical support at no extra charge.
Automatic Parallelization
The LF95 compiler automatically parallelizes DO loops and array operations
without you having to make modifications to the program. This makes
it easy to migrate source programs to other platforms (as long as
the program conforms with the Fortran Standard). The effect is to
save elapsed execution time by using two or more CPUs simultaneously.
For instance, if a DO loop can be executed in parallel by dividing
it in half, then, theoretically, the execution time of this DO loop
may be cut in half. In practice, improving performance requires
some care and some work on the part of the programmer. During compilation,
the auto-parallel function will return information regarding which
processes were (and which were not) parallelized and why. While
certain loops can be analyzed sufficiently to be parallelized by
the compiler without input from the programmer, many loops have
data dependencies that prevent automatic parallelization because
of the potential for incorrect results. For that reason, LF95 PRO
also includes optimization control lines (OCLs) that provide information
necessary for the compiler to parallelize these otherwise unparallelizable
loops. The OCLs are Fortran comments in a particular format, for
example:
!OCL PARALLEL
Note that programs with OCLs are standard-conforming and can be compiled
with other compilers that do not support OCLs.
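As a rough illustration of the kind of loop this describes, the sketch below writes through an index array, which a compiler generally cannot prove to be free of repeated indices. The only OCL text taken from this page is `!OCL PARALLEL`; placing it on the line directly before the DO loop, and its precise effect on this loop, are assumptions here, so consult the OCL syntax in the User's Guide before relying on it. The program and variable names are invented for the example.

```fortran
program ocl_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i, idx(n)
  real :: a(n), b(n)

  ! idx is a permutation of 1..n, so no two iterations write the same
  ! element of a. The programmer knows this; the compiler may not.
  do i = 1, n
     idx(i) = n + 1 - i
     b(i) = real(i)
  end do
  a = 0.0

  ! Assumed placement: the OCL comment sits immediately before the loop
  ! it refers to. To a compiler without OCL support it is just a comment.
!OCL PARALLEL
  do i = 1, n
     a(idx(i)) = 2.0 * b(i)
  end do

  print *, 'a(1) =', a(1), ' a(n) =', a(n)
end program ocl_demo
```

Because the OCL line is an ordinary Fortran comment, the same source builds and runs serially with any Fortran 90/95 compiler.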
Four compiler switches control automatic parallelization: --parallel,
--threads, --threadstack, and --ocl. Details of automatic parallelization
(loop slicing, interchange, distribution, fusion, and reduction,
as well as OCL syntax and specifiers) are documented in the LF95
User's Guide and at www.lahey.com/doc.htm.
OpenMP v2.0 Compatibility
OpenMP specifies a set of compiler directives, library routines, and environment
variables for shared-memory parallelism in Fortran and C/C++ programs.
LF95 Linux PRO v6.2 supports the OpenMP v2.0 specification for Fortran.
Like automatic parallelization, OpenMP directives are used to parallelize
a program that runs on a computer with more than one processor.
With OpenMP you have more control over how code is parallelized,
but also more coding to do.
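For comparison with automatic parallelization, here is a minimal sketch of an explicitly parallelized loop using standard OpenMP directives for Fortran. The program and variable names are invented for the example, and the compiler switch needed to enable OpenMP processing is not shown because it is not given on this page.

```fortran
program omp_demo
  implicit none
  integer, parameter :: n = 100000
  integer :: i
  double precision :: total, x(n)

  do i = 1, n
     x(i) = 1.0d0 / dble(i)
  end do

  total = 0.0d0

  ! The programmer states explicitly that the loop is parallel and that
  ! total is a sum reduction; each thread accumulates a private partial
  ! sum, and the partial sums are combined at the end of the loop.
!$OMP PARALLEL DO REDUCTION(+:total)
  do i = 1, n
     total = total + x(i) * x(i)
  end do
!$OMP END PARALLEL DO

  print *, 'sum of squares:', total
end program omp_demo
```

The directives are comments to a compiler that does not process OpenMP, so the same source also builds as a serial program.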
The LF95 Linux PRO v6.2 CD includes the OpenMP v2.0 Fortran specification
in PDF. You can also view the specification at www.lahey.com/doc.htm.
You can learn more about OpenMP at www.openmp.org.
BLAS and LAPACK
BLAS is a library for vector and matrix operations. The BLAS thread-safe version is based on BLAS provided on Netlib. Included in LF95 v6.2 is an optimized version for the Pentium 4 with SSE2 instructions. BLAS includes 57 functions. The total number of routines for all precision types amounts to approximately 170.
The thread-safe version of BLAS provides the following routines:
Level 1 BLAS : Vector operations
Level 2 BLAS : Matrix and vector operations
Level 3 BLAS : Matrix and matrix operations
Sparse-BLAS : Sparse vector operations
The thread-safe implementation of BLAS has exactly the same subroutine names and calling parameters as those of the Netlib baseline version.
Differences include:
● The thread-safe version can be used in an SMP (symmetric multiprocessing) environment
● Subroutines of the thread-safe version can be called from an OpenMP Fortran program
The purpose of the thread-safe version of BLAS is to let a subroutine operate concurrently on sets of data that are independent of each other, and thus reduce the time needed to finish all the operations.
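A minimal sketch of that usage pattern follows: the Level 1 routine DAXPY, with its standard Netlib argument list, is called from inside an OpenMP parallel loop, and each call works on a different column. The program name and array sizes are invented for the example, and it assumes a thread-safe BLAS is linked in; the link options are not shown because they are not given on this page.

```fortran
program blas_demo
  implicit none
  integer, parameter :: n = 1000, ncols = 8
  integer :: j
  double precision :: x(n, ncols), y(n, ncols)
  external :: daxpy

  x = 1.0d0
  y = 2.0d0

  ! Each iteration updates a different column of y, so the DAXPY calls
  ! touch independent data and may run at the same time on different threads.
!$OMP PARALLEL DO
  do j = 1, ncols
     ! y(:,j) = 3.0 * x(:,j) + y(:,j)
     call daxpy(n, 3.0d0, x(1, j), 1, y(1, j), 1)
  end do
!$OMP END PARALLEL DO

  print *, 'y(1,1) =', y(1, 1), ' y(n,ncols) =', y(n, ncols)
end program blas_demo
```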
LAPACK is a library of linear algebra routines. The LAPACK thread-safe version is based on LAPACK 3.0 provided on Netlib. Included in LF95 v6.2 is an optimized version for the Pentium 4 with SSE2 instructions. LAPACK includes approximately 300 functions. The total number of routines for all precision types amounts to approximately 1100.
LAPACK provides the following routines:
● Linear equations
● Linear least squares problems
● Eigenvalue problems
● Singular value decomposition
The LAPACK thread-safe version, like the BLAS version, can be called from an OpenMP program in the environment of SMP.
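As an example from the linear equations group, the sketch below solves a small system A x = b with the LAPACK driver DGESV, using its standard Netlib argument list. The matrix and right-hand side are made up for illustration, and the program assumes LAPACK and BLAS are linked in.

```fortran
program lapack_demo
  implicit none
  integer, parameter :: n = 3
  integer :: ipiv(n), info
  double precision :: a(n, n), b(n)
  external :: dgesv

  ! Coefficient matrix, filled column by column (it is symmetric, so the
  ! rows read the same way): 2 1 0 / 1 3 1 / 0 1 4
  a = reshape((/ 2.0d0, 1.0d0, 0.0d0,   &
                 1.0d0, 3.0d0, 1.0d0,   &
                 0.0d0, 1.0d0, 4.0d0 /), (/ n, n /))

  ! Right-hand side chosen as the row sums of a
  b = (/ 3.0d0, 5.0d0, 5.0d0 /)

  ! On exit, a holds the LU factors and b is overwritten with the solution
  call dgesv(n, 1, a, n, ipiv, b, n, info)

  if (info /= 0) then
     print *, 'DGESV failed, info =', info
  else
     print *, 'solution x =', b
  end if
end program lapack_demo
```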
Fujitsu Scientific Subroutine Library 2
The Fujitsu Scientific Subroutine Library 2 (SSL2) has been in use for years in Japan on Fujitsu mainframe and workstation hardware. Included in LF95 v6.2 is an optimized version for the Pentium 4 with SSE2 instructions. SSL2 offers over 250 optimized thread-safe routines in the following areas:
Linear Algebra
  Matrix Storage Mode Conversion
  Matrix Manipulation
  Linear Equations and Matrix Inversion (Direct Method)
  Least Squares Solution
Eigenvalues and Eigenvectors
  Eigenvalues and Eigenvectors of a Real Matrix
  Eigenvalues and Eigenvectors of a Complex Matrix
  Eigenvalues and Eigenvectors of a Real Symmetric Matrix
  Eigenvalues and Eigenvectors of a Hermitian Matrix
  Eigenvalues and Eigenvectors of a Real Symmetric Band Matrix
  Eigenvalues and Eigenvectors of a Real Symmetric Generalized Eigenproblem
  Eigenvalues and Eigenvectors of a Real Symmetric Band Generalized Eigenproblem
Nonlinear Equations
  Polynomial Equations
  Transcendental Equations
  Nonlinear Simultaneous Equations
Extrema
  Minimization of Function with a Variable
  Unconstrained Minimization of Multivariable Function
  Unconstrained Minimization of Sum of Squares of Functions (Nonlinear Least Squares Solution)
  Linear Programming
  Nonlinear Programming (Constrained Minimization of Multivariable Function)
Interpolation and Approximation
  Interpolation
  Approximation
  Smoothing
  Series
Transforms
  Discrete Real Fourier Transforms
  Discrete Cosine Transforms
  Discrete Sine Transforms
  Discrete Complex Fourier Transforms
  Laplace Transform
Numerical Differentiation and Quadrature
Differential Equations
Special Functions
  Elliptic Integrals
  Exponential Integral
  Sine and Cosine Integrals
  Fresnel Integrals
  Gamma Functions
  Error Functions
  Bessel Functions
  Normal Distribution Functions
Pseudo Random Numbers
  Pseudo Random Generation
  Pseudo Random Testing
Free Telephone Support
In addition to the free e-mail, fax, and postal technical support, LF95 PRO includes free telephone support via Lahey's 775-831-2500 number.