LAMA is a framework for developing hardware-independent, high-performance code for heterogeneous computing systems. It facilitates the development of fast and scalable software that can be deployed on nearly every type of system, from embedded devices to highly parallel supercomputers, with a single code base.


By using LAMA for their applications, software developers benefit from higher productivity in the implementation phase and stay up to date with the latest hardware innovations, both of which lead to a shorter time to market.


The framework supports multiple target platforms (including GPUs and Xeon Phi) within a distributed heterogeneous environment. It offers optimized device code on the back-end side and high scalability through latency hiding and asynchronous execution across multiple nodes. LAMA's modular and extensible software design supports the developer on several levels: whether writing portable code with the Heterogeneous Computing Development Kit or using prepared functionality from the Linear Algebra Package, the user benefits from both high productivity and high performance.


LAMA's design is intended to carry over to future hardware architectures with good performance: its data structure layout can be easily extended to support novel and even experimental hardware setups. LAMA also includes communication features that allow data transfers between compute components within a node, and between nodes, to be completely hidden.


Productivity is combined with performance in execution; the two are not mutually exclusive. LAMA's flexible software design introduces only minimal overhead, preserving the full performance of the underlying BLAS implementations from the hardware vendors and of the highly optimized kernel back-ends. Performance comparisons with competing linear algebra libraries show comparable results for single-node implementations. On distributed systems, the asynchronous execution model allows efficient overlapping of calculation, memory transfer, and communication, reaching linear scaling on GPUs.

Linear Algebra Package

The Linear Algebra Package facilitates the development of (sparse) numerical algorithms for various application domains. Code can be written in textbook syntax, for example

 y = A * x

(where x and y are vectors and A is a matrix). Thanks to the underlying layers, the problem formulation is independent of implementation details, target architecture, and distribution strategy, because memory management and communication are handled internally. Furthermore, load balancing between different components and asynchronous execution make it possible to use the full performance of the system.
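
To illustrate how textbook notation such as y = A * x can map onto a sparse kernel, here is a minimal, single-threaded sketch in plain C++. It is not LAMA's API: the names CsrMatrix, Vector, and makeTridiagonal3 are hypothetical, the CSR (compressed sparse row) storage format is an assumption chosen for the example, and the sketch ignores distribution, device back-ends, and asynchronous execution entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) matrix; NOT a LAMA type.
struct CsrMatrix {
    std::size_t numRows;
    std::size_t numCols;
    std::vector<std::size_t> rowPtr;  // numRows + 1 offsets into colIdx/values
    std::vector<std::size_t> colIdx;  // column index of each stored entry
    std::vector<double> values;       // value of each stored entry
};

using Vector = std::vector<double>;

// Overloading operator* lets the caller write y = A * x, while the
// storage format and loop structure stay hidden behind the operator.
Vector operator*(const CsrMatrix& A, const Vector& x) {
    assert(x.size() == A.numCols);
    Vector y(A.numRows, 0.0);
    for (std::size_t i = 0; i < A.numRows; ++i) {
        double sum = 0.0;
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            sum += A.values[k] * x[A.colIdx[k]];
        }
        y[i] = sum;
    }
    return y;
}

// Example data: the 3x3 tridiagonal matrix
//   [ 2 -1  0 ]
//   [-1  2 -1 ]
//   [ 0 -1  2 ]
CsrMatrix makeTridiagonal3() {
    return CsrMatrix{
        3, 3,
        {0, 2, 5, 7},
        {0, 1, 0, 1, 2, 1, 2},
        {2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0}};
}
```

With these definitions, the calling code reads like the formula: build A with makeTridiagonal3(), fill a Vector x, and write Vector y = A * x. Only the operator body would need to change to target a different storage format or back-end.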

In addition, LAMA offers various iterative solvers, such as Jacobi or CG methods, which can be used directly or preconditioned and combined with several user-definable stopping criteria. Furthermore, integrating a custom-built solver is straightforward.
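
As a rough illustration of an iterative solver driven by combined stopping criteria, the following plain C++ sketch implements the Jacobi method and stops when either an iteration limit or a residual threshold is reached. It does not use LAMA's solver classes: StoppingCriteria, SolveResult, jacobi, and the dense matrix type are hypothetical names introduced for this example, and a dense matrix is used only to keep the sketch short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using DenseMatrix = std::vector<Vector>;  // row-major stand-in, not a LAMA type

// Hypothetical combined criterion: stop when EITHER condition is met.
struct StoppingCriteria {
    std::size_t maxIterations;
    double residualTolerance;  // threshold on the 2-norm of b - A*x
};

struct SolveResult {
    Vector x;
    std::size_t iterations;
    double residualNorm;
};

// 2-norm of the residual b - A*x.
double residualNorm(const DenseMatrix& A, const Vector& x, const Vector& b) {
    double sumSq = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i) {
        double r = b[i];
        for (std::size_t j = 0; j < x.size(); ++j) {
            r -= A[i][j] * x[j];
        }
        sumSq += r * r;
    }
    return std::sqrt(sumSq);
}

// Jacobi iteration starting from x = 0. Assumes a square matrix with a
// nonzero diagonal; convergence is only assured for suitable matrices,
// for example strictly diagonally dominant ones.
SolveResult jacobi(const DenseMatrix& A, const Vector& b,
                   const StoppingCriteria& stop) {
    const std::size_t n = b.size();
    Vector x(n, 0.0);
    Vector next(n, 0.0);
    std::size_t iterations = 0;
    double res = residualNorm(A, x, b);
    while (iterations < stop.maxIterations && res > stop.residualTolerance) {
        for (std::size_t i = 0; i < n; ++i) {
            double sigma = 0.0;
            for (std::size_t j = 0; j < n; ++j) {
                if (j != i) sigma += A[i][j] * x[j];
            }
            next[i] = (b[i] - sigma) / A[i][i];
        }
        x.swap(next);
        ++iterations;
        res = residualNorm(A, x, b);
    }
    return {x, iterations, res};
}
```

A call such as jacobi(A, b, {200, 1e-10}) runs until the residual norm drops to 1e-10 or 200 iterations have passed, whichever comes first. Keeping the criteria in a separate struct is one way to let the same solver loop be reused with different termination rules.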

Areas of Application

The target applications for LAMA lie mainly in High Performance Computing and Embedded Computing, but LAMA can be used wherever hardware-independent applications are needed. The range of applications is wide: examples include reservoir simulation, seismic imaging, performance engineering, and computational fluid dynamics, as well as image and video processing.

Free Software

LAMA is free software licensed under the LGPL (GNU Lesser General Public License v3). Derivative works must also be redistributed under the LGPL, but applications that use the LAMA library do not have to be.

How to get it

You can find a tarball of LAMA release 3.0 here.