


The first goal of LFMat is to provide convenient matrix tools for finite element methods. There are many linear algebra libraries on the net, but it still seems hard to find flexible, high-performance free software that meets the requirements (genericity, speed, adapted storage, ...). LFMat is a generic, fully templated, open source C++ matrix library. Particular attention has been paid to providing storage suited to SIMD instructions such as 3DNow! and SSE2 on x86 processors and AltiVec on PowerPC. This means that there are specializations for several important types, such as float and double, in order to get the expected performance. Furthermore, important routines make careful use of the cache, leading, for example, to solvers up to 8 times faster than standard LAPACK ones in the same situation (see benchmarks). Matrices can contain any kind of data (double, float, symbolic expressions, ...), and the user can choose the orientation, storage style and structure (see tutorial). Matrices can also be of fixed size (known at compile time), allowing compilers to perform additional optimizations.
 
SIMD instructions work far better with data aligned in memory. If this is not the case, the CPU wastes time retrieving the data.

Table 1.1. Indexes in a symmetric, row-oriented dense float matrix, stored using the lower part, with no alignment and with an alignment of 4 values.
Thus, for a classical symmetric matrix, rows should be aligned on multiples of the size of the vectors used in SIMD instructions. Table 1.1 shows the indexes in a symmetric dense matrix, with no alignment and with an alignment of 4 values. With alignment, scalar products between two rows, or between a row and a vector, become considerably faster. This improves the performance of many procedures, because this kind of scalar product is at the core of many of them.
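The indexing described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not LFMat's actual code: it assumes that row i of the lower part holds columns 0..i, and that alignment is obtained by padding each row up to a multiple of `align` values. The function names (`row_start_packed`, `row_start_aligned`, `index_packed`, `index_aligned`) are hypothetical.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helpers, not part of LFMat.

// Offset of the first element of row i in packed lower-triangular storage
// (row i holds columns 0..i), with no padding between rows.
inline std::size_t row_start_packed(std::size_t i) {
    return i * (i + 1) / 2;
}

// Round n up to the next multiple of align (align > 0).
inline std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// Offset of the first element of row i when every row is padded so that
// its length is a multiple of align values (assumed padding scheme).
inline std::size_t row_start_aligned(std::size_t i, std::size_t align) {
    std::size_t offset = 0;
    for (std::size_t k = 0; k < i; ++k)
        offset += round_up(k + 1, align);  // row k holds k + 1 values
    return offset;
}

// Storage index of element (i, j) of the symmetric matrix, no alignment.
inline std::size_t index_packed(std::size_t i, std::size_t j) {
    if (j > i) std::swap(i, j);  // symmetry: (i, j) and (j, i) share storage
    return row_start_packed(i) + j;
}

// Storage index of element (i, j) with rows aligned on align values.
inline std::size_t index_aligned(std::size_t i, std::size_t j,
                                 std::size_t align) {
    if (j > i) std::swap(i, j);
    return row_start_aligned(i, align) + j;
}
```

Under these assumptions, the rows of a 5 x 5 matrix start at offsets 0, 1, 3, 6, 10 without alignment and at 0, 4, 8, 12, 16 with an alignment of 4 values. Storage grows from 15 to 24 values, but every row then starts on a SIMD vector boundary, provided the buffer itself is allocated on such a boundary.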
 
For now, storage styles can be:
Furthermore, matrices can be:
Some useful procedures have been coded for different kinds of matrices:
All these procedures have been designed to be fast, using the cache and SIMD instructions where possible.
Table 1.2. Supported solvers. "R" and "C" mean that row-oriented and column-oriented matrices are supported. "O" means that the procedure is optimized to make good use of cache memory.
 
 
