**Applied software widely used for supercomputer simulations and virtual experiments**

**Bioinformatics and Genomics**

1. ALLPATHS-LG

https://software.broadinstitute.org/allpaths-lg/blog/

National Human Genome Research Institute and National Institute of Allergy and Infectious Diseases

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (∼100-base) sequence reads at very low cost.

ALLPATHS-LG implements an algorithm for genome assembly applied to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing.
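The N50 statistic quoted above has a simple definition: it is the scaffold length L such that scaffolds of length ≥ L cover at least half of the total assembly. A minimal sketch, using toy lengths rather than real assembly data:

```python
def n50(lengths):
    """N50: the length L such that scaffolds of length >= L
    cover at least half of the total assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Toy example: total = 100; the cumulative sum (40, then 70)
# first reaches half the total at the 30-length scaffold.
print(n50([40, 30, 20, 10]))  # -> 30
```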

2. BLAST

https://blast.ncbi.nlm.nih.gov/Blast.cgi

NCBI

BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.

A BLAST search enables a researcher to compare a query sequence with a library or database of sequences, and identify library sequences that resemble the query sequence above a certain threshold.
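BLAST itself relies on fast heuristics, but the local-alignment score it approximates is the Smith-Waterman score. A minimal, non-heuristic sketch of that underlying dynamic programme, with illustrative scoring parameters:

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b
    (Smith-Waterman dynamic programming, score only)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("ACGT", "ACGT"))  # -> 8 (four matches at +2 each)
```

Real BLAST seeds alignments from short exact word matches and extends them, which is what makes searching a whole database feasible.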

3. ClustalW2

https://www.ebi.ac.uk/Tools/msa/clustalw2/

EMBL-EBI

ClustalW2 is a platform for multiple alignment of nucleic acid and protein sequences; it can also align a set of sequences to a genomic DNA sequence, allowing for introns and frame-shift errors. The parallel ClustalW2 package allows faster alignment of very large data sets and increases alignment accuracy. The software calculates the best matches and aligns the sequences according to the identified similarities.

Sequences of up to roughly 10^8 nucleotides can be compared against a genomic DNA sequence.

4. BEAST2

BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences. It estimates rooted, time-measured phylogenies using strict or relaxed molecular-clock models. It can be used as a method of reconstructing phylogenies, but it is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST 2 uses Markov chain Monte Carlo (MCMC) to average over tree space, so that each tree is weighted proportionally to its posterior probability. BEAST 2 includes a graphical user interface for setting up standard analyses and a suite of programs for analysing the results.
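The MCMC averaging idea can be illustrated on a one-parameter toy posterior (a standard normal stand-in, nothing like a real phylogenetic posterior over trees); a minimal random-walk Metropolis sketch:

```python
import math
import random

def metropolis(log_post, x0, steps, step_size=1.0, seed=1):
    """Random-walk Metropolis sampler for a 1-D log posterior."""
    random.seed(seed)
    x, samples = x0, []
    for _ in range(steps):
        proposal = x + random.uniform(-step_size, step_size)
        # Accept with probability min(1, post(proposal) / post(x)).
        if math.log(random.random()) < log_post(proposal) - log_post(x):
            x = proposal
        samples.append(x)
    return samples

# Toy target: standard normal log density (up to a constant).
samples = metropolis(lambda x: -0.5 * x * x, x0=0.0, steps=20000)
mean = sum(samples) / len(samples)
print(round(mean, 1))  # close to the true mean 0.0
```

BEAST 2 applies the same accept/reject logic, but its proposals act on tree topologies, node ages, and model parameters jointly.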

5. BWA

http://bio-bwa.sourceforge.net/

BWA is a software package for mapping low-divergence sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100 bp, while the other two are for longer sequences ranging from 70 bp to 1 Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, the latest of the three, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70–100 bp Illumina reads.
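BWA's name refers to the Burrows-Wheeler Transform, the index structure that makes matching reads against a large reference memory-efficient. A minimal (naive, quadratic) sketch of the transform itself; real indexes build this via suffix arrays in linear time:

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler Transform: the last column of the
    lexicographically sorted rotations of text + sentinel."""
    text += sentinel
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(row[-1] for row in rotations)

print(bwt("banana"))  # -> "annb$aa"
```

The transform groups identical characters together, and with a small amount of auxiliary data it supports backward search: finding all occurrences of a read in the reference without decompressing it.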

6. FALCON

https://github.com/PacificBiosciences/FALCON

Pacific Biosciences

FALCON is a set of tools for fast alignment of long reads for consensus and assembly. The FALCON tool kit is a collection of simple code used for studying efficient assembly algorithms for haploid and diploid genomes.

- FreeBayes 1.3.1

https://mybiosoftware.com/freebayes-0-9-4-bayesian-genetic-variant-detector.html

Erik Garrison and colleagues (Marth Lab)

FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

7. IQ-TREE

Large phylogenomics data sets require fast tree inference methods, especially for maximum-likelihood (ML) phylogenies. Fast programs exist, but due to inherent heuristics to find optimal trees, it is not clear whether the best tree is found. Thus, there is need for additional approaches that employ different search strategies to find ML trees and that are at the same time as fast as currently available ML programs.

IQ-TREE is an efficient phylogenomic inference software package designed for reconstructing trees by maximum likelihood and assessing branch supports. Its fast and effective stochastic search algorithm is widely used in molecular systematics, and IQ-TREE frequently obtains better likelihoods than RAxML and PhyML.

IQ-TREE found higher likelihoods for between 62.2% and 87.1% of the studied alignments, thus efficiently exploring the tree space. Using the IQ-TREE stopping rule, RAxML and PhyML are faster for 75.7% and 47.1% of the DNA alignments and 42.2% and 100% of the protein alignments, respectively. However, the proportion of alignments for which IQ-TREE obtains higher likelihoods then improves to 73.3–97.1%. IQ-TREE is freely available at http://www.cibiv.at/software/iqtree.

8. MEMSAT-SVM

http://bioinfadmin.cs.ucl.ac.uk/downloads/memsat-svm/

University College London

MEMSAT-SVM is a method capable of automatically identifying pore-lining regions in transmembrane proteins from sequence information alone, which can then be used to determine the pore stoichiometry. MEMSAT-SVM provides a way to characterise pores in transmembrane proteins and may even provide a starting point for discovering novel routes of therapeutic intervention in a number of important diseases.

9. Ray

https://sourceforge.net/projects/denovoassembler/

Ray is a parallel de novo genome assembler for next-generation sequencing data. It uses the Message Passing Interface (MPI) for communication while assembling reads obtained with new sequencing technologies (Illumina, 454, SOLiD).
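Like other short-read assemblers, Ray builds its assembly from k-mers. A minimal single-process sketch of constructing a de Bruijn graph and walking a non-branching path into a contig (Ray distributes the k-mer table across MPI ranks, which this toy omits):

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes
    that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk(graph, start):
    """Follow unambiguous edges from start to reconstruct a contig."""
    contig, node = start, start
    while len(graph.get(node, ())) == 1:
        (node,) = graph[node]
        contig += node[-1]
    return contig

# Overlapping reads from the toy genome "ACGTTGCA".
reads = ["ACGTT", "CGTTG", "GTTGC", "TTGCA"]
graph = de_bruijn(reads, k=4)
print(walk(graph, "ACG"))  # -> "ACGTTGCA"
```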

10. Samtools

https://hpc.nih.gov/apps/samtools.html

The NIH HPC group

Samtools is a suite of programs for interacting with and post-processing high-throughput sequencing data, in particular short DNA sequence read alignments in SAM format. The generic SAM format is used for storing large nucleotide sequence alignments. The tools support complex tasks such as variant calling and alignment viewing, as well as sorting, indexing, and data extraction.

Samtools is a suite of applications for processing high throughput sequencing data:

samtools is used for working with SAM, BAM, and CRAM files containing aligned sequences; bcftools is used for working with BCF2, VCF, and gVCF files containing variant calls; htslib is a library for reading, writing, and converting between the formats mentioned above. Both samtools and bcftools are built on htslib.
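A SAM alignment record is one tab-separated text line with 11 mandatory fields, which makes the format easy to inspect. A minimal parser sketch for those mandatory columns (production code should use htslib bindings such as pysam; this ignores optional tags):

```python
def parse_sam_line(line):
    """Parse the 11 mandatory SAM fields of one alignment record."""
    fields = line.rstrip("\n").split("\t")
    names = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ",
             "CIGAR", "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
    record = dict(zip(names, fields[:11]))
    # These mandatory fields are integers per the SAM specification.
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    return record

# A hypothetical single-end read aligned to chr1 at position 100.
line = "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tFFFFFFFF"
rec = parse_sam_line(line)
print(rec["RNAME"], rec["POS"])  # -> chr1 100
```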


**Computational Chemistry**

11. NWChem

Pacific Northwest National Laboratory, US Department of Energy

NWChem is the DOE flagship quantum chemistry / molecular mechanics code, designed from scratch to run on supercomputers.

NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters

The NWChem development strategy is focused on providing new and essential scientific capabilities to its users in the areas of kinetics and dynamics of chemical transformations, chemistry at interfaces and in the condensed phase, and enabling innovative and integrated research at EMSL.

NWChem software can handle:

- Biomolecules, nanostructures, and solid-state systems

- From quantum to classical, and all combinations

- Ground and excited states

- Gaussian basis functions or plane waves

- Ab-initio molecular dynamics (Car-Parrinello)

- In general: single-point calculations, geometry optimizations, vibrational analysis

- Extended (solid-state) systems: DFT

- Classical force fields (molecular mechanics: AMBER, CHARMM, etc.)

Classical molecular dynamics capabilities provide for the simulation of macromolecules and solutions, including the computation of free energies using a variety of force fields.

NWChem is scalable, both in its ability to treat large problems efficiently and in its utilization of available parallel computing resources. NWChem has been optimized to perform calculations on large molecules using large parallel computers, and it is unique in this regard. The documentation is intended as an aid to chemists using the code for their own applications; users are not expected to have a detailed understanding of the code internals, but some familiarity with the overall structure of the code is helpful.

12. Open Babel

http://openbabel.org/wiki/Main_Page

https://sourceforge.net/projects/openbabel/

Open Babel is a free, open-source version of the Babel chemistry file translation program. Open Babel is a project designed to pick up where Babel left off, as a cross-platform program and library designed to interconvert between many file formats used in molecular modeling, computational chemistry, and many related areas.

13. DIRAC

http://www.diracprogram.org/doku.php

DIRAC is a relativistic ab initio electronic structure program (Program for Atomic and Molecular Direct Iterative Relativistic All-electron Calculations).

DIRAC computes molecular properties using relativistic quantum chemical methods. It is named after P.A.M. Dirac, the father of relativistic electronic structure theory.


**Molecular Dynamics, Molecular Mechanics and Molecular Interactions**

14. GROMACS

https://bioexcel.eu/

Centre of Excellence for Computational Biomolecular Research (BioExcel)

GROMACS is an engine to perform molecular dynamics simulations and energy minimization, two of the many techniques that belong to the realm of computational chemistry and molecular modeling. Molecular modeling is the general process of describing complex chemical systems in terms of a realistic atomic model, with the aim of understanding and predicting macroscopic properties based on detailed knowledge at the atomic scale. Molecular modeling is often used to design new materials, drugs, nanostructures, and atomic clusters, for which accurate prediction of the physical properties of realistic systems is required.

Macroscopic physical properties can be distinguished in (a) static equilibrium properties, such as the binding constant of an inhibitor to an enzyme, the average potential energy of a system, or the radial distribution function in a liquid, and (b) dynamic or non-equilibrium properties, such as the viscosity of a liquid, diffusion processes in membranes, the dynamics of phase changes, reaction kinetics, or the dynamics of defects in crystals. The choice of technique depends on the question asked and on the feasibility of the method to yield reliable results at the present state of the art.
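Underneath, an MD engine estimates such properties by numerically integrating Newton's equations of motion. A minimal velocity-Verlet step for a 1-D harmonic oscillator, a toy stand-in for the 3N-dimensional force-field case handled by GROMACS and similar codes:

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Velocity-Verlet integration: advance x(t), v(t) by steps of dt."""
    f = force(x)
    for _ in range(steps):
        x += v * dt + 0.5 * (f / mass) * dt * dt   # position update
        f_new = force(x)
        v += 0.5 * (f + f_new) / mass * dt         # velocity update
        f = f_new
    return x, v

# Harmonic oscillator F = -k x with k = m = 1: the period is 2*pi,
# so after 628 steps of dt = 0.01 the particle is back near x = 1.
x, v = velocity_verlet(x=1.0, v=0.0, force=lambda x: -x,
                       mass=1.0, dt=0.01, steps=628)
print(round(x, 2))  # -> 1.0
```

The integrator is symplectic: the energy 0.5·(x² + v²) stays very close to its initial value of 0.5, which is why this family of schemes dominates MD.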

15. NAMD

https://www.ks.uiuc.edu/Research/namd/

The Theoretical and Computational Biophysics Group at the University of Illinois

NAMD is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems. Simulation preparation and analysis is integrated into the visualization package VMD.

NAMD pioneered the use of hybrid spatial and force decomposition, a technique used by most scalable programs for biomolecular simulations, including Blue Matter.

NAMD is developed using Charm++ and benefits from its adaptive communication-computation overlap and dynamic load balancing. Recent optimizations include pencil decomposition of the Particle Mesh Ewald method, reduction of the memory footprint, and topology-sensitive load balancing. Unlike most other MD programs, NAMD not only runs on a wide variety of platforms ranging from commodity clusters to supercomputers, but also scales to over one hundred thousand processors. NAMD has been tested on a 1.07-billion-atom benchmark on up to 64,000 processors.

16. LAMMPS

Sandia National Laboratories, US Department of Energy

LAMMPS is a classical molecular dynamics simulation code designed to run efficiently on parallel computers. It was developed at Sandia National Laboratories, a US Department of Energy facility, with funding from the DOE.

In the most general sense, LAMMPS integrates Newton's equations of motion for collections of atoms, molecules, or macroscopic particles that interact via short- or long-range forces, with a variety of initial and/or boundary conditions.

On parallel machines, LAMMPS uses spatial decomposition techniques to partition the simulation domain into small 3D subdomains, one of which is assigned to each processor. LAMMPS is most efficient (in a parallel sense) for systems whose particles fill a 3D rectangular box with roughly uniform density.
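The decomposition can be sketched as a mapping from a particle's position to the rank of the subdomain that owns it (a uniform rectangular box and a hypothetical row-major rank ordering are assumed; real codes also handle ghost regions and load balancing):

```python
def owner_rank(pos, box, grid):
    """Map a position inside a rectangular box to the rank of the
    px x py x pz subdomain that owns it (row-major rank layout)."""
    ix = [min(int(pos[d] / box[d] * grid[d]), grid[d] - 1)
          for d in range(3)]  # clamp particles on the upper face
    return ix[0] + grid[0] * (ix[1] + grid[1] * ix[2])

box = (10.0, 10.0, 10.0)
grid = (2, 2, 2)  # 8 processors
print(owner_rank((1.0, 1.0, 1.0), box, grid))  # -> 0 (lower corner)
print(owner_rank((9.0, 9.0, 9.0), box, grid))  # -> 7 (upper corner)
```

Because each processor only owns nearby particles, force computation needs communication with a handful of neighbouring subdomains instead of all ranks.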

*Kinds of systems LAMMPS can simulate:*

- bead-spring polymers

- united-atom polymers or organic molecules

- all-atom polymers, organic molecules, proteins, DNA

- metals

- granular materials

- coarse-grained mesoscale models

- ellipsoidal particles

- point dipolar particles

- hybrid systems

17. DL_POLY

https://www.scd.stfc.ac.uk/Pages/DL_POLY.aspx

DL_POLY was developed at Daresbury Laboratory (Science & Technology Facilities Council, UK) with support from the Engineering and Physical Sciences Research Council (EPSRC) and the Natural Environment Research Council (NERC).

DL_POLY is a general-purpose classical molecular dynamics (MD) simulation package. It is used to model the atomistic (as well as coarse-grained or DPD) evolution of the full spectrum of models commonly employed in the materials science, solid-state chemistry, biological simulation, and soft condensed-matter communities. It calculates molecular dynamics and solves molecular mechanics problems for very large systems containing up to 10^9 atoms.

DL_POLY is a fully data-distributed code, employing methodologies such as spatial domain decomposition (DD) and link-cell (LC) built Verlet neighbour lists. The particle density of the modelled systems is assumed to be close to uniform in space and time, ensuring load balancing.

18. GENESIS

https://www.r-ccs.riken.jp/labs/cbrt/

GENESIS (GENeralized-Ensemble SImulation System) is mainly developed by the Sugita groups in RIKEN, Japan (Computational Biophysics Research Team (R-CCS), Theoretical Molecular Science Laboratory (CPR), and Laboratory for Biomolecular Function Simulation (BDR)).

The GENESIS program package is composed of two MD programs (ATDYN, SPDYN) and trajectory analysis tools:

- CHARMM force field, AMBER force field, MARTINI model, and Go models;

- Energy minimization and molecular dynamics simulations;

- SHAKE/RATTLE, SETTLE, and LINCS algorithms for bond constraints;

- Bussi, Langevin, and Berendsen thermostats/barostats;

- Replica-exchange molecular dynamics (REMD) in temperature, pressure, and surface-tension space;

- Generalized replica-exchange with solute tempering (gREST) and replica-exchange umbrella sampling (REUS) with collective variables;

- Multi-dimensional REMD methods;

- Gaussian accelerated molecular dynamics;

- String method for reaction-pathway search;

- Hybrid quantum mechanics/molecular mechanics (QM/MM) calculation;

- Implicit solvent model (Generalized Born/Solvent Accessible Surface Area model);

- Free-energy perturbation (FEP);

- Anharmonic vibrational analysis using SINDO;

- Steered MD and targeted MD simulations;

- Restrained MD simulations (distance, angle, dihedral angle, position, etc.);

- Hybrid MPI+OpenMP, hybrid CPU+GPGPU, and mixed double+single precision calculations;

- Scalable MD simulations for huge systems (> 100,000,000 atoms);

- Spatial decomposition analysis (SPANA).
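The replica-exchange methods listed above all rest on the same Metropolis swap criterion: neighbouring replicas i and j exchange configurations with probability min(1, exp((β_i − β_j)(E_i − E_j))). A minimal sketch of that acceptance rule:

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging the
    configurations of two replicas at inverse temperatures beta."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return min(1.0, math.exp(delta))

def try_swap(beta_i, beta_j, energy_i, energy_j, rng=random.random):
    return rng() < swap_probability(beta_i, beta_j, energy_i, energy_j)

# A cold replica (beta = 1.0) holding a HIGH-energy configuration and a
# hot replica (beta = 0.5) holding a LOW-energy one: delta >= 0, so the
# swap is always accepted, letting the trapped configuration heat up.
print(swap_probability(1.0, 0.5, energy_i=-50.0, energy_j=-80.0))  # -> 1.0
```

Repeating this across a ladder of temperatures is what lets REMD escape local minima while preserving the correct ensemble at each temperature.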

19. ls1 mardyn

https://www.ls1-mardyn.de/home.html

The molecular dynamics (MD) simulation program ls1 mardyn is mainly developed by the High Performance Computing Center Stuttgart; the Leibniz Supercomputing Centre; Scientific Computing in Computer Science; and the Laboratory of Engineering Thermodynamics, University of Kaiserslautern.

*ls1 mardyn* was optimized for massively parallel execution on supercomputing architectures. With an efficient MD simulation engine, explicit particle-based force-field models of the intermolecular interactions can be applied to length and time scales which were previously out of scope for molecular methods. Employing a dynamic load balancing scheme for an adaptable volume decomposition, *ls1 mardyn* delivers a high performance even for challenging heterogeneous configurations.

The program is an interdisciplinary endeavor, whose contributors have backgrounds from engineering, computer science and physics, aiming at studying challenging scenarios with up to trillions of molecules. In the considered systems, the spatial distribution of the molecules may be heterogeneous and subject to rapid unpredictable change. This is reflected by the algorithms and data structures as well as a highly modular software engineering approach.

It is more specialized than most of the molecular simulation programs mentioned above. In particular, it is restricted to rigid molecules, and only constant-volume ensembles are supported, so that the pressure cannot be specified in advance. Electrostatic long-range interactions, beyond the cut-off radius, are considered by the reaction field method, which cannot be applied to systems containing ions.

However, ls1 mardyn is highly performant and scalable. Holding the present world record in simulated system size, it is furthermore characterized by a modular structure, facilitating a high degree of flexibility within a single code base. Thus, ls1 mardyn is not only a simulation engine, but also a framework for developing and evaluating simulation algorithms, e.g. different thermostats or parallelization schemes;

**Quantum Chemistry**

20. CPMD (Car-Parrinello Molecular Dynamics)

https://bioexcel.eu/software/cpmd/

Physical Chemistry Institute of the University of Zurich; Max Planck Institute, Stuttgart; IBM Research Laboratory, Zurich; and the Centre of Excellence for Computational Biomolecular Research (BioExcel).

The CPMD code is a parallelized plane-wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics. CPMD is currently the most HPC-efficient code for performing quantum molecular dynamics simulations using the Car-Parrinello scheme. CPMD simulations are usually restricted to systems of a few hundred atoms.

The main characteristics of the CPMD code include:

- works with norm-conserving or ultrasoft pseudopotentials
- free-energy density functional implementation
- isolated systems and systems with periodic boundary conditions; k-points
- molecular and crystal symmetry
- wavefunction optimization: direct minimization and diagonalization
- geometry optimization: local optimization and simulated annealing
- molecular dynamics: constant energy, constant temperature, and constant pressure (NVE, NVT, NPT ensembles)
- response functions and many electronic structure properties
- time-dependent Density Functional Theory (excitations, molecular dynamics in excited states)
- hybrid quantum mechanics/molecular mechanics calculations (QM/MM)


21. CP2K

The CP2K Foundation; University of Zurich, Computational Chemistry; Paul Scherrer Institute

CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. It is especially aimed at massively parallel and linear-scaling electronic structure methods and state-of-the-art ab initio molecular dynamics simulations. Excellent performance for electronic structure calculations is achieved using novel algorithms implemented for modern high-performance computing systems.

CP2K is a suite of modules collecting a variety of molecular simulation methods at different levels of accuracy, from ab-initio density functional theory (DFT) to classical Hamiltonians, passing through the semi-empirical neglect of diatomic differential overlap (NDDO) approximation. It is used routinely for predicting energies, molecular structures, vibrational frequencies, and reaction mechanisms of molecular systems, and it is ideally suited for performing molecular dynamics studies.

CP2K can do simulations of molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimization, and transition state optimization using NEB or dimer method.

22. Quantum ESPRESSO

https://www.quantum-espresso.org/

Quantum ESPRESSO Foundation, Centre of Excellence MaX (MAterials design at the eXascale)

Quantum ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials. The core plane-wave DFT functions of QE are provided by the Plane-Wave Self-Consistent Field (PWscf) component.

Quantum ESPRESSO can currently perform the following kinds of calculations:

- ground-state energy and one-electron (Kohn-Sham) orbitals
- atomic forces, stresses, and structural optimization
- molecular dynamics on the ground-state Born-Oppenheimer surface, also with variable cell
- Nudged Elastic Band (NEB) and Fourier String Method Dynamics (SMD) for energy barriers and reaction paths
- macroscopic polarization and finite electric fields via the modern theory of polarization (Berry phases)

- General Atomic and Molecular Electronic Structure System (GAMESS)

https://www.msg.chem.iastate.edu/gamess/

Mark Gordon's Quantum Theory Group, Ames Laboratory, Iowa State University

GAMESS is a program for ab-initio molecular quantum chemistry. Briefly, GAMESS can compute SCF wave functions ranging from RHF, ROHF, UHF, and GVB to MCSCF. Correlation corrections to these SCF wave functions include configuration interaction, second-order perturbation theory, and coupled-cluster approaches, as well as the density functional theory approximation. Excited states can be computed by CI, EOM, or TD-DFT procedures.

**Supercomputing atomistic simulation tools to understand structure-property relations of nanomaterials**

- FLEUR

http://www.max-centre.eu/codes-max/fleur

FLEUR is mainly developed at the Forschungszentrum Jülich at the Institute of Advanced Simulation and the Peter Grünberg Institut.

Full-potential Linearised augmented plane wave in EURope (FLEUR) is a code family for calculating ground-state as well as excited-state properties of solids within the framework of density functional theory (DFT).

The Fleur code implements the all-electron full-potential linearized augmented-plane-wave (FLAPW) approach to density functional theory (DFT). It allows the calculation of properties obtainable by DFT for crystals and thin films composed of arbitrary chemical elements. For this it treats all electrons on the basis of DFT and does not rely on the pseudopotential approximation. There are also no shape approximations to the potential required. However, this comes at the cost of complex parametrizations of the calculations. The Fleur approach to this complex parametrization is the usage of an input generator that itself only requires basic structural input. Using this it generates a completely parametrized Fleur input file with material adapted default parameters.

- BigDFT

http://www.max-centre.eu/codes-max/bigdft

European Centre of Excellence MAX

BigDFT is an electronic structure pseudopotential code that employs Daubechies wavelets as a computational basis, designed for usage on massively parallel architectures. It features high-precision cubic-scaling DFT functionalities enabling the treatment of molecular, slab-like, and extended systems, and has efficiently supported hardware accelerators such as GPUs since 2009. It also features a linear-scaling algorithm that employs adaptive support functions (generalized Wannier orbitals), enabling the treatment of systems of many thousands of atoms. The code is developed and released as a software suite made of independent, interoperable components, some of which have already been linked and distributed in other DFT codes.

BigDFT is a fast, precise, and flexible code for ab-initio atomistic simulation.


**Virtual Drug Discovery**

Force Field and Free Energy Calculations

25. AMBER

https://ambermd.org/AmberModels.php

The term "Amber" refers to two things.

First, it is a set of molecular mechanical force fields.

Amber is designed to work with several simple types of force fields, although it is most commonly used with parametrizations developed by Peter Kollman, his co-workers, and their “descendants”. The traditional parametrization uses fixed partial charges centered on atoms. Less commonly used modifications add polarizable dipoles to atoms, so that the charge description depends upon the environment; such potentials are called “polarizable” or “non-additive”. An alternative is to use force fields originally developed for the CHARMM or Tinker (AMOEBA) codes.

Since various choices make good sense, as of Amber 16 we have implemented a new scheme for users to specify the force fields they wish to use.

Depending on what components are in your system, you may need to specify for the simulation of biomolecules (these force fields are in the public domain, and are used in a variety of simulation programs):

- a protein force field
- a DNA force field
- an RNA force field
- a carbohydrate force field
- a lipid force field
- a water model with associated atomic ions (more variable, but the most common choice is still tip3p);
- a general force field, for organic molecules like ligands
- other components (such as modified amino acids or nucleotides, other ions)

Second, Amber refers to the AmberTools21 suite of molecular simulation programs.

https://ambermd.org/AmberTools.php

AmberTools21 consists of several independently developed packages that work well by themselves and with Amber20 itself. The suite can also be used to carry out complete molecular dynamics simulations, with either explicit water or generalized Born solvent models.

The AmberTools suite is free of charge, and its components are mostly released under the GNU General Public License (GPL). A few components are included that are in the public domain or which have other, open-source, licenses. The sander program has the LGPL license.

26. CHARMM

Chemistry at HARvard Macromolecular Mechanics (CHARMM)

CHARMM is actively maintained by a large group of developers led by Martin Karplus.

CHARMM is a molecular simulation program with broad application to many-particle systems, featuring a comprehensive set of energy functions, a variety of enhanced sampling methods, and support for multi-scale techniques including quantum mechanics/molecular mechanics (QM/MM), hybrid molecular mechanics/coarse-grained (MM/CG) simulation, and a range of implicit solvent models.

- CHARMM primarily targets biological systems including peptides, proteins, prosthetic groups, small-molecule ligands, nucleic acids, lipids, and carbohydrates, as they occur in solution, crystals, and membrane environments. CHARMM also finds broad application to inorganic materials, with applications in materials design.

- CHARMM contains a comprehensive set of analysis and model-building tools.
- CHARMM achieves high performance on a variety of platforms including parallel clusters and GPUs.

Modules:

- Coordinate manipulation and analysis | corman
- Energy commands | energy
- Non-bonded options | nbonds
- Minimization | minimiz
- Molecular dynamics | dynamc
- Constraints and restraints | cons
- Time series and correlation functions | correl
- Atom selections | select

Modules in alphabetical order

https://academiccharmm.org/documentation

**Massively parallel ligand screening**

27. VirtualFlow

Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Harvard University, Boston, USA; Department of Physics, Faculty of Arts and Sciences, Harvard University, Boston, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, USA; Department of Pharmacy, Pharmaceutical and Medicinal Chemistry, Saarland University, Saarbrücken, Germany; Enamine, National Taras Shevchenko University of Kyiv, Kyiv, Ukraine; Zuse Institute Berlin, Berlin, Germany; Institute of Mathematics, Technical University Berlin, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany

https://github.com/VirtualFlow/VFVS

On average, an approved drug currently costs US$2–3 billion and takes more than 10 years to develop. In part, this is due to expensive and time-consuming wet-laboratory experiments, poor initial hit compounds, and high attrition rates in the (pre-)clinical phases.

Structure-based virtual screening has the potential to mitigate these problems. With structure-based virtual screening, the quality of the hits improves with the number of compounds screened. However, despite the fact that large databases of compounds exist, the ability to carry out large-scale structure-based virtual screening on supercomputers in an accessible, efficient and flexible manner has remained difficult.

VirtualFlow (VF) is a highly automated and versatile open-source drug discovery platform with perfect scaling behavior that is able to prepare and efficiently screen ultra-large libraries of compounds.

VF is able to use a variety of the most powerful docking programs. Using VF, its developers prepared one of the largest freely available ready-to-dock ligand libraries, with more than 1.4 billion commercially available molecules.

Screening 1 billion compounds on a single processor core, with an average docking time of 15 s per ligand, would take approximately 475 years. VirtualFlow can dock 1 billion compounds in approximately 2 weeks when leveraging 100,000 CPU cores simultaneously.
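The single-core figure is straightforward arithmetic; dividing the total CPU time by the core count gives the ideal lower bound, with the quoted two weeks reflecting real-world scheduling and I/O overhead on top of it:

```python
compounds = 1_000_000_000
seconds_per_ligand = 15
cores = 100_000

total_cpu_seconds = compounds * seconds_per_ligand
years_single_core = total_cpu_seconds / (3600 * 24 * 365)
days_ideal = total_cpu_seconds / cores / (3600 * 24)

print(round(years_single_core))  # -> 476 (about 475 years on one core)
print(round(days_ideal, 1))      # -> 1.7 (ideal bound; ~2 weeks in practice)
```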

VFLP prepares ligand databases by converting them from the SMILES format to any desired target format (for example, the PDBQT format, which is required by many of the AutoDock-based docking programs).

VFLP uses the JChem package of ChemAxon as well as Open Babel to desalt ligands, neutralize them, generate (one or multiple) tautomeric states, compute protonation states at specific pH values, calculate three-dimensional coordinates, and convert the molecules into the desired target formats.

*Preparation of the Enamine REAL library*

One of the largest vendor libraries that is currently available is the REAL library of Enamine, which contains approximately 1.4 billion make-on-demand compounds.

https://enamine.net/library-synthesis/real-compounds/real-database

The ZINC 15 database contained 1.46 billion compounds, but only provided 630 million molecules in a ready-to-dock format.

The entire database has a six-dimensional lattice architecture, the general concept of which was modelled after the ZINC 15 database, in which each dimension corresponds to a physico-chemical property of the compounds (molecular mass, partition coefficient, number of hydrogen-bond donors and acceptors, number of rotatable bonds, and topological polar surface area).

28. EXSCALATE

E4C is a public-private consortium supported by the European Commission. The E4C consortium, coordinated by Dompé Farmaceutici, is composed of 18 institutions from seven European countries.

EXSCALATE is an ultra-high-performance virtual screening platform for computer-aided drug design (CADD), based on LiGen, an exascale-ready software package able to screen billions of compounds in a very short time, together with a library of trillions of compounds.

Since 2010, Dompé SpA has invested in proprietary software for computer-aided drug design (CADD) through its dedicated Drug Discovery Platform. The most relevant tool is the de novo structure-based virtual screening software LiGen (Ligand Generator), co-designed in collaboration with the Italian supercomputer centre, CINECA. The distinguishing feature of LiGen is that it has been designed and developed to run on High Performance Computing (HPC) architectures, with ongoing development aimed at maintaining this performance primacy beyond 2020.

LiGen provides tools for molecular de novo design, incorporating sets of chemical rules for fast and efficient identification of structurally new chemotypes endowed with a desired set of biological properties. In its standard application, LiGen modules are used to define input constraints, either structure-based, through active-site identification, or ligand-based, through pharmacophore definition, for docking and de novo generation. Alternatively, individual modules can be combined in a user-defined manner to generate project-centric workflows. Specific features of LiGen are the use of a pharmacophore-based docking procedure, which allows flexible docking without conformer enumeration, and accurate and flexible reactant mapping coupled with reactant tagging through substructure searching.

**Biological Target – Ligand Docking**

- High Ambiguity Driven protein-protein DOCKing (HADDOCK)

Centre of Excellence for Computational Biomolecular Research (Bioexcel).

https://bioexcel.eu/software/haddock/

HADDOCK is a versatile information-driven flexible docking approach for the modelling of biomolecular complexes. HADDOCK distinguishes itself from ab-initio docking methods in the fact that it can integrate information derived from biochemical, biophysical or bioinformatics methods to enhance sampling, scoring, or both. The information that can be integrated is quite diverse: interface restraints from NMR or MS, mutagenesis experiments, or bioinformatics predictions; various orientational restraints from NMR and, recently, cryo-electron maps.

30.DOCK 6.9

Department of Pharmaceutical Chemistry, University of California, San Francisco, USA

http://dock.compbio.ucsf.edu/Online_Licensing/index.htm

The DOCK 6.9 algorithm addresses rigid-body docking using a geometric matching algorithm to superimpose the ligand onto a negative image of the binding pocket. Important features improve the algorithm's ability to find the lowest-energy binding mode, including force-field-based scoring, on-the-fly optimization, an improved matching algorithm for rigid-body docking, and an algorithm for flexible ligand docking.

Features of DOCK 6 include DelPhi electrostatics, ligand conformational entropy corrections, ligand desolvation and receptor desolvation; Hawkins-Cramer-Truhlar GB/SA solvation scoring with optional salt screening; PB/SA solvation scoring; and AMBER scoring, including receptor flexibility, the full AMBER molecular mechanics scoring function with implicit solvent, conjugate gradient minimization, and molecular dynamics simulation capabilities.

- VinaLC

Lawrence Livermore National Laboratory, Department of Energy, USA

https://github.com/XiaohuaZhangLLNL/VinaLC

A mixed parallel scheme that combines the message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study: 64.4% of the top-scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates VinaLC has very good early recovery of actives.
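Re-docking accuracy is measured by the root-mean-square deviation (RMSD) between the docked pose and the crystallographic pose, with 2.0 Å as the usual success cutoff. A minimal sketch of that criterion (illustrative coordinates; no Kabsch superposition step, and not VinaLC code):

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses, each given as an
    equal-length list of (x, y, z) atom coordinates in Angstroms."""
    assert len(pose_a) == len(pose_b)
    sq = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
             for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b))
    return math.sqrt(sq / len(pose_a))

# A re-docked pose counts as a success if its RMSD from the
# crystallographic (X-ray) pose is below 2.0 Angstroms.
xray   = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.3, 1.1, 0.0)]
docked = [(0.1, -0.2, 0.0), (1.6, 0.1, 0.1), (2.2, 1.0, -0.1)]
print(rmsd(xray, docked) < 2.0)  # True -> successful re-docking
```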

- SwissDOCK

Swiss Institute of Bioinformatics

SwissDock is based on the docking software EADock DSS, whose algorithm consists of the following steps:

- many binding modes are generated either in a box (local docking) or in the vicinity of all target cavities (blind docking);
- simultaneously, their CHARMM energies are estimated on a grid;
- the binding modes with the most favorable energies are evaluated with FACTS, and clustered;
- the most favorable clusters can be visualized online and downloaded to your computer.
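The four steps above can be sketched as a toy generate-score-cluster pipeline. The quadratic "energy", the box size and the clustering cutoff below are invented stand-ins for EADock DSS's CHARMM grid energies, FACTS evaluation and actual clustering:

```python
import random

random.seed(1)

def energy(pose):
    """Stand-in for a grid-based CHARMM energy: a toy quadratic well
    centred on a hypothetical binding site at (1.0, 2.0, 0.5)."""
    site = (1.0, 2.0, 0.5)
    return sum((p - s) ** 2 for p, s in zip(pose, site))

# 1. Generate many candidate binding modes inside a docking box.
poses = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(500)]

# 2. Score each mode (a real engine interpolates precomputed grids).
scored = sorted(poses, key=energy)

# 3. Cluster the most favourable modes: greedily group poses that lie
#    within a distance cutoff of an already-kept representative.
def close(a, b, cutoff=2.0):
    return sum((x - y) ** 2 for x, y in zip(a, b)) < cutoff ** 2

clusters = []
for pose in scored[:50]:           # only the best-scoring modes
    for rep in clusters:
        if close(pose, rep):
            break
    else:
        clusters.append(pose)      # pose seeds a new cluster

best = clusters[0]                 # representative of the best cluster
print(len(clusters), energy(best))
```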

**SwissDock**, a web service to predict the molecular interactions that may occur between a target protein and a small molecule.

**S3DB**, a database of manually curated target and ligand structures, inspired by the Ligand-Protein Database.


**3D Protein Structure Prediction**

- High-Resolution Protein Structure Prediction Codes (ROSETTA 3)

Howard Hughes Medical Institute, University of Washington, USA

https://www.rosettacommons.org/

ROSETTA is a library-based, object-oriented software suite that provides a robust system for predicting and designing protein structures, protein folding mechanisms, and protein-protein interactions. The Rosetta3 codes have been successful in the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) competitions.

The Rosetta3 method uses a two-phase Monte Carlo algorithm to sample the extremely large space of possible structures in order to find the most favorable one. The first phase generates a low-resolution model of the protein backbone atoms while approximating the side chains with a single dummy atom. The high-resolution phase then uses a more realistic model of the full protein, along with the corresponding interactions, to find the best candidate for the native structure.
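The two-phase idea can be illustrated on a toy one-dimensional energy landscape: a coarse Metropolis Monte Carlo search on a smoothed energy, followed by a small-step refinement on the full, rugged energy. The functions, step sizes and temperatures below are invented for illustration; Rosetta's actual move sets and score functions are far richer:

```python
import math, random

random.seed(0)

def coarse_energy(x):
    """Low-resolution model: smooth envelope of the landscape."""
    return (x - 3.0) ** 2

def full_energy(x):
    """High-resolution model: the envelope plus fine-scale ruggedness."""
    return (x - 3.0) ** 2 + 0.3 * math.cos(8.0 * x)

def metropolis(x, energy, step, temp, n_iter):
    """Standard Metropolis Monte Carlo: always accept downhill moves,
    accept uphill moves with Boltzmann probability exp(-dE/T)."""
    e = energy(x)
    for _ in range(n_iter):
        cand = x + random.gauss(0.0, step)
        e_cand = energy(cand)
        if e_cand < e or random.random() < math.exp((e - e_cand) / temp):
            x, e = cand, e_cand
    return x

x0 = random.uniform(-10.0, 10.0)                       # random start
x_lowres  = metropolis(x0, coarse_energy, step=1.0, temp=1.0, n_iter=2000)
x_refined = metropolis(x_lowres, full_energy, step=0.1, temp=0.1, n_iter=2000)
print(round(x_refined, 2))        # ends near the global minimum at x = 3
```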

The library contains various components, such as Atom, ResidueType, Residue, Conformation, Pose, ScoreFunction, ScoreType, and so forth. These components provide the data and services Rosetta uses to carry out its computations.

*Rosetta Functionality Summary*

- Rosetta Abinitio: performs de novo protein structure prediction.

- Rosetta Design: identifies low-free-energy sequences for target protein backbones.

- Rosetta Design PyMOL plugin: a user-friendly interface for submitting protein design simulations using Rosetta Design.

- Rosetta Dock: predicts the structure of a protein-protein complex from the individual structures of the monomer components.

- Rosetta Antibody: predicts antibody Fv region structures and performs antibody-antigen docking.

- Rosetta Fragments: generates fragment libraries for use by Rosetta Abinitio in building protein structures.

- Rosetta NMR: incorporates NMR data into the basic Rosetta protocol to accelerate the process of NMR structure prediction.

- Rosetta DNA: for the design of proteins that interact with specified DNA sequences.

- Rosetta RNA: fragment assembly of RNA.

- Rosetta Ligand: for small-molecule-protein docking.

**Seismic Wave Impact Simulation**

34. SPECFEM3D - seismic wave propagation

https://geodynamics.org/cig/software/specfem3d/

California Institute of Technology, USA; University of Pau, France.

SPECFEM3D is distributed by the Computational Infrastructure for Geodynamics (CIG). Unstructured hexahedral mesh generation is a critical part of the modeling process in the Spectral-Element Method (SEM). Examples include seismic wave propagation in complex geological models, automatically meshed on a parallel machine using CUBIT (Sandia National Laboratories), an advanced 3D unstructured hexahedral mesh generator that offers new opportunities for seismologists to design, assess, and improve the quality of a mesh in terms of both geometrical and numerical accuracy. The main goal is to provide useful tools for understanding seismic phenomena due to surface topography and subsurface structures such as low-wave-speed sedimentary basins.

35. SeisSol

https://github.com/SeisSol/SeisSol

SeisSol is a software package for simulating wave propagation and dynamic rupture based on the arbitrary high-order accurate derivative discontinuous Galerkin method (ADER-DG).

Computational earthquake dynamics is emerging as a key component in physics-based approaches to strong motion prediction for seismic hazard assessment and in physically constrained inversion approaches to earthquake source imaging from seismological and geodetic observations. Typical applications in both areas require the ability to deal with rupture surfaces of complicated, realistic geometries with high computational efficiency. In the SeisSol implementation, tetrahedral elements are used, which allows for a better fit to the geometrical constraints of the problem, i.e., the fault shape, and for easy control of the variation of element sizes using smooth refining and coarsening strategies.

Characteristics of the SeisSol simulation software are:

- use of tetrahedral meshes to approximate complex 3D model geometries and for rapid model generation;

- use of elastic, viscoelastic and viscoplastic materials to approximate realistic geological subsurface properties;

- use of arbitrarily high approximation order in time and space to produce reliable and sufficiently accurate synthetic seismograms or other seismological data sets.

**Computational Fluid Dynamics**

- Code_Saturne

Électricité de France

Code_Saturne is the free, open-source software developed and released by EDF to solve computational fluid dynamics (CFD) applications.

https://www.code-saturne.org/cms/

It solves the Navier-Stokes equations for 2D, 2D-axisymmetric and 3D flows, steady or unsteady, laminar or turbulent, incompressible or weakly dilatable, isothermal or not, with scalar transport if required.

Several turbulence models are available, from Reynolds-averaged models to large-eddy simulation models.

Physical modelling:

- Laminar and turbulent flows;
- Compressible flows;
- Radiative heat transfer;
- Conjugate heat transfer;
- Combustion of coal, fuel and gas;
- Electric arc and Joule effect;
- Lagrangian module for dispersed particle tracking;
- ALE method for deformable meshes;
- Specific engineering modules for nuclear waste surface storage and cooling towers;
- Derived version for atmospheric flows;
- Derived version for Eulerian multiphase flows;
- Lagrangian method: stochastic modelling with two-way coupling (momentum, heat, mass), covering the transport and deposition of droplets, ash, coal, corrosion products, radioactive particles and chemical forces;
- Gas combustion;
- Coal combustion.

37.OpenFOAM

The OpenFOAM Foundation

https://www.openfoam.com/download/

https://www.openfoam.com/documentation/tutorial-guide

OpenFOAM (Open-source Field Operation And Manipulation) is a C++ toolbox for the development of customized numerical solvers and pre-/post-processing utilities for the solution of continuum mechanics problems, including computational fluid dynamics (CFD). It has a large user base across most areas of engineering and science, from both commercial and academic organizations. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.

- Delft3D

Delft3D Open Source Community

https://oss.deltares.nl/web/delft3d/download

The Delft3D Flexible Mesh Suite (Delft3D FM) allows you to simulate the interaction of water, sediment, ecology, and water quality in time and space. The suite is mostly used for the modelling of natural environments like coastal, estuarine, lakes and river areas, but it is equally suitable for more artificial environments like harbours, locks, urban areas, etc. Delft3D FM consists of a number of well-tested and validated modules, which are linked to and integrated with each other.

Delft3D is an integrated modelling suite which simulates two-dimensional (in either the horizontal or a vertical plane) and three-dimensional flow, sediment transport and morphology, waves, water quality and ecology, and is capable of handling the interactions between these processes. The suite is designed for use by domain experts and non-experts alike, who may range from consultants, engineers and contractors to regulators and government officials, all of whom are active in one or more stages of the design, implementation and management cycle.

A communication-hiding conjugate gradient method, PETSc's linear solver KSPPIPECG, has also been tried for the linear system arising from the spatial discretisation, but this did not yield any performance gain. Currently the full source code of the Delft3D-FLOW (including morphology), Delft3D-WAVE, DELWAQ (D-Water Quality and D-Ecology) and PART (D-Particle Tracking) engines is available under GPLv3 conditions.

**Massively parallel Large Eddy Simulation (LES) and Direct Numerical Simulation (DNS) for the study of complex flows**

**https://coec-project.eu/references-codes/**

39.CIAO

https://www.fz-juelich.de/ias/jsc/EN/Expertise/High-Q-Club/CIAO/_node.html

Institute for Advanced Simulation (IAS)

Jülich Supercomputing Centre (JSC)

Compressible/Incompressible Advanced reactive turbulent simulations with Overset.

CIAO performs Direct Numerical Simulations (DNS) as well as Large-Eddy Simulations (LES) of the Navier-Stokes equations along with multiphysics effects (multiphase, combustion, soot, spark). It is a structured, finite difference code, which enables the coupling of multiple domains and their simultaneous computation. Moving meshes are supported and overset meshes can be used for local mesh refinement. A fully compressible as well as an incompressible/low-Mach solver are available within the code framework. Spatial and temporal staggering of flow variables are used in order to increase the accuracy of stencils. The sub-filter model for the momentum equations is an eddy viscosity concept in the form of the dynamic Smagorinsky model with Lagrangian averaging along fluid particle trajectories. While the fully compressible solver uses equations of state or tabulated fluid properties, a transport equation for internal/total energy, and a low-storage five-stage, explicit Runge-Kutta method for time integration, the incompressible/low-Mach solver uses Crank-Nicolson time advancement and an iterative predictor-corrector scheme. The resulting Poisson equation for pressure is solved by HYPRE's multigrid solver.

- Alya

Barcelona Supercomputing Centre

Alya is a high-performance computational mechanics code for solving complex coupled multi-physics / multi-scale / multi-domain problems, most of which come from the engineering realm.

Among the different physics solved by Alya we can mention: incompressible/compressible flows, non-linear solid mechanics, chemistry, particle transport, heat transfer, turbulence modeling, electrical propagation, etc.

From scratch, Alya was specially designed for massively parallel supercomputers, and the parallelization embraces four levels of the computer hierarchy:

- A substructuring technique with MPI as the message-passing library is used for distributed-memory supercomputers.

- At the node level, both loop and task parallelism are considered, using OpenMP as an alternative to MPI. Dynamic load-balance techniques have been introduced as well to better exploit computational resources at the node level.

- At the CPU level, some kernels are also designed to enable vectorization.

- Finally, accelerators like GPUs are also exploited, through OpenACC pragmas or with CUDA, to further enhance the performance of the code on heterogeneous computers.

Multiphysics coupling is achieved following a multi-code strategy, relating different instances of Alya. MPI is used to communicate between the different instances, where each instance solves a particular physics. This powerful technique enables asynchronous execution of the different physics.

- AVBP

https://cerfacs.fr/en/computational-fluid-dynamics-softwares/

CERFACS (a centre for fundamental and applied research specializing in numerical modelling and simulation, which is also an advanced training centre).

AVBP is an LES (Large Eddy Simulation) code dedicated to unsteady compressible flows in complex geometries, with or without combustion. It is applied to combustion chambers, turbomachinery, safety analysis, optimization of combustors, pollutant formation (CO, NO, soot) and UQ analysis. AVBP uses a high-order Taylor-Galerkin scheme on hybrid meshes for multi-species perfect or real gases. Its spatial order of accuracy is 3 on unstructured hybrid meshes (4 on regular meshes). The AVBP formulation is fully compressible and allows the investigation of compressible combustion problems such as thermoacoustic instabilities (where acoustics are important) or detonation engines (where combustion and shocks must be computed simultaneously).

AVBP is a world standard for LES of combustion in engines and gas turbines, owned by CERFACS and IFP Energies Nouvelles. It is used by multiple laboratories (IMFT in Toulouse, EM2C at CentraleSupélec, TU Munich, the Von Karman Institute, ETH Zurich, etc.) and companies (SAFRAN AIRCRAFT ENGINES, SAFRAN HELICOPTER ENGINES, ARIANEGROUP, HERAKLES, etc.).

AVBP is also used today to compute turbomachinery (compressors and turbines) and full engine configurations. Computing the compressor and the chamber, the chamber and the turbine, or all three simultaneously is now possible with AVBP. This is critical for multiple problems such as new propulsion concepts (for example Rotating Detonation Engines) or the study of coupled phenomena such as the noise emitted by a gas turbine.

AVBP has always been at the forefront of HPC research at CERFACS: its efficiency has been verified up to 250 000 cores with grids of 2 to 4 billion cells.

- YALES2

https://www.coria-cfd.fr/index.php/YALES2#Solvers

CORIA lab: joint lab of CNRS, INSA and the University of Rouen

YALES2 aims at solving two-phase combustion problems from primary atomization to pollutant prediction on massive complex meshes. It can efficiently handle unstructured meshes with several billions of elements, enabling the Direct Numerical Simulation of laboratory and semi-industrial configurations.

YALES2 is based on a large numerical library to handle partitioned meshes, various differential operators or linear solvers, and on a series of simple or more complex solvers:

- Scalar solver (SCS)

- Level-set solver (LSS)

- Incompressible solver (ICS)

- Variable-density solver (VDS)

- Spray solver (SPS = ICS + LSS + Ghost-Fluid Method)

- Lagrangian solver (LGS)

- Compressible solver (ECS)

- Magneto-hydrodynamic solver (MHD)

- Mesh-movement solver (MMS)

- Radiative solver (RDS)

- Linear acoustics solver (ACS)

- Heat-transfer solver (HTS)

- Immersed-boundary solver (IBS)

- Granular-flow solver (GFS)

- Nek5000

Argonne National Laboratory and Swiss Federal Institute of Technology, Zurich.

The simulation code Nek5000 sheds light on the turbulent flow fields of internal combustion engines, nuclear reactors, airplane wings, and more. The open-source software, which has evolved for over 30 years, features scalable algorithms that are fast and efficient on platforms ranging from laptops to the world's fastest computers.

Highlights

- Incompressible and low Mach-number Navier-Stokes
- Spectral element discretization
- Runs on all POSIX compliant operating systems
- Proven scalability to over a million ranks using pure MPI for parallelization
- Easy-to-build with minimal dependencies
- High-order conformal curved quadrilateral/hexahedral meshes
- Semi-implicit 2nd/3rd order adaptive timestepping
- Conjugate fluid-solid heat transfer
- Efficient preconditioners
- Parallel I/O
- Lagrangian phase model
- Moving and deforming meshes
- Overlapping overset grids
- Basic meshing tools including converters
- LES and RANS (Reynolds-Averaged Navier-Stokes) turbulence models
- VisIt and ParaView support for data analysis and visualization

- PRECISE_UNS

Rolls-Royce and the Institute of Energy and Power Plant Technology (EKT) of Darmstadt University of Technology

Numerical Modeling Methods for Prediction of Ignition Processes in Aero-Engines

The code PRECISE-UNS (Predictive System for Real Engine Combustors - Unstructured) is a finite-volume-based unstructured CFD solver for turbulent multi-phase and reacting flows. It is a pressure-based code, which uses a pressure-correction (PISO) scheme to achieve pressure-velocity coupling. It is applicable to both low-Mach-number and fully compressible flows. Discretisation in time and space is up to second order. The linearized equations are solved using various well-known libraries such as PETSc, HYPRE and AGMG. Several turbulence models are available: k-epsilon, k-ω-SST, RSM, SAS, LES. Different combustion models are available, ranging from classical conserved-scalar (flamelet) models and global reaction mechanisms to FGM and detailed chemistry.

PRECISE-UNS is built on Dolfyn, an open-source code written in Fortran.

**Finite Element Computer Simulation**

45.SALOME Library

Électricité de France ; Le Commissariat à l’énergie atomique et aux énergies alternatives (CEA)

SALOME platform is an open software framework for integration of numerical solvers in various physical domains. The CEA and EDF use SALOME to realize a wide range of simulations, which typically concern industrial equipment in nuclear production plants. Among primary concerns are the design of new-generation reactor types, nuclear fuel management and transport, material ageing for equipment life-cycle management, and the reliability and safety of nuclear installations. To satisfy these challenges, SALOME integrates a CAD/CAE modeling tool, industrial meshing algorithms, and advanced 3D visualization functionalities. SALOME is a generic platform for numerical simulation with the following aims:

- Facilitate interoperation between CAD modelling and computing codes
- Facilitate implementation of coupling between computing codes in a distributed environment
- Provide a generic user interface
- Pool production of developments (pre and post processors, calculation distribution and supervision) in the field of numerical simulation.

**Weather Research and Forecasting Model**

https://www.mmm.ucar.edu/weather-research-and-forecasting-model

National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (represented by the National Centers for Environmental Prediction (NCEP) and the Earth System Research Laboratory), the U.S. Air Force, the Naval Research Laboratory, the University of Oklahoma, and the Federal Aviation Administration (FAA).

The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers.

For researchers, WRF can produce simulations based on actual atmospheric conditions (i.e., from observations and analyses) or idealized conditions. WRF offers operational forecasting a flexible and computationally-efficient platform, while reflecting recent advances in physics, numerics, and data assimilation contributed by developers from the expansive research community.

The WRF system contains two dynamical solvers, referred to as the ARW (Advanced Research WRF) core and the NMM (Nonhydrostatic Mesoscale Model) core. The ARW users' page is: https://www2.mmm.ucar.edu/wrf/users/

The NMM core was developed by the National Centers for Environmental Prediction (NCEP), and is currently used in their HWRF (Hurricane WRF) system.

**Libraries**

- PETSc - Portable, Extensible Toolkit for Scientific Computing

Mathematics and Computer Science Division, Argonne National Laboratory, USA.

https://www.mcs.anl.gov/petsc/

PETSc has been used for modeling in all of these areas: Acoustics, Aerodynamics, Air Pollution, Arterial Flow, Bone Fractures, Brain Surgery, Cancer Surgery, Cancer Treatment, Carbon Sequestration, Cardiology, Cells, CFD, Combustion, Concrete, Corrosion, Data Mining, Dentistry, Earthquakes, Economics, Esophagus, Fission, Fusion, Glaciers, Ground Water Flow, Linguistics, Mantle Convection, Magnetic Films, Materials Science, Medical Imaging, Ocean Dynamics, Oil Recovery, PageRank, Polymer Injection Molding, Polymeric Membranes, Quantum Computing, Seismology, Semiconductors, Rockets, Relativity, Surface Water Flow.

48.ParMETIS - Parallel Graph Partitioning and Fill-reducing Matrix Ordering

Department of Computer Science and Engineering, University of Minnesota.

http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview

The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes. The fill-reducing orderings produced by METIS are significantly better than those produced by other widely used algorithms, including multiple minimum degree. For many classes of problems arising in scientific computations and linear programming, METIS is able to reduce the storage and computational requirements of sparse matrix factorization by up to an order of magnitude. Moreover, unlike multiple minimum degree, the elimination trees produced by METIS are suitable for parallel direct factorization.
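One level of the bisection idea can be sketched in a few lines: grow one part from a seed vertex until it holds half the graph, then count the edges crossing the boundary (the "edge cut" that partitioners minimize). This naive BFS region-growing split is only an illustration; real METIS adds multilevel coarsening, k-way partitioning and Kernighan-Lin-style boundary refinement, all of which this sketch omits:

```python
from collections import deque

def bisect(adj):
    """Grow one part from vertex 0 by BFS until it holds half the
    vertices; the remaining vertices form the other part.  Returns the
    grown part and the number of edges cut by the bisection."""
    n = len(adj)
    part = {0}
    queue = deque([0])
    while queue and len(part) < n // 2:
        for v in adj[queue.popleft()]:
            if v not in part and len(part) < n // 2:
                part.add(v)
                queue.append(v)
    # Each cut edge appears twice in the directed adjacency lists.
    cut = sum(1 for u in adj for v in adj[u] if (u in part) != (v in part))
    return part, cut // 2

# A 4x2 grid graph: vertices 0..7, edges between grid neighbours.
adj = {0: [1, 4], 1: [0, 2, 5], 2: [1, 3, 6], 3: [2, 7],
       4: [0, 5], 5: [1, 4, 6], 6: [2, 5, 7], 7: [3, 6]}
part, cut = bisect(adj)
print(len(part), cut)  # balanced halves of 4 vertices each
```

Note that BFS growing gives a balanced but not necessarily minimal cut; on this grid the optimal vertical split cuts only 2 edges, which is exactly the kind of improvement METIS's refinement phase finds.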

- Linear Algebra PACKage (LAPACK)

LAPACK provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The associated matrix factorizations (LU, Cholesky, QR, SVD, Schur, generalized Schur) are also provided, as are related computations such as reordering of the Schur factorizations and estimating condition numbers. Dense and banded matrices are handled, but not general sparse matrices. In all areas, similar functionality is provided for real and complex matrices, in both single and double precision.

The original goal of the LAPACK project was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors. LAPACK requires that highly optimized block matrix operations be already implemented on each machine.
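As an illustration of the kind of dense computation LAPACK provides (its least-squares drivers work via a QR factorization), here is a plain-Python Gram-Schmidt sketch for a full-column-rank matrix; in practice one calls the optimized, blocked library routines instead:

```python
import math

def qr_least_squares(A, b):
    """Solve min ||Ax - b|| for a full-column-rank m x n matrix A via
    classical Gram-Schmidt QR, then back substitution on R x = Q^T b."""
    m, n = len(A), len(A[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]
        for k in range(j):
            R[k][j] = sum(Q[i][k] * A[i][j] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    qtb = [sum(Q[i][j] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        x[j] = (qtb[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))) / R[j][j]
    return x

# Fit y = c0 + c1 * t to four points lying exactly on y = 1 + 2 t.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
print(qr_least_squares(A, b))  # ~[1.0, 2.0]
```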

50.SuperLU

National Energy Research Scientific Computing Center (NERSC), Department of Energy, USA

https://portal.nersc.gov/project/sparse/superlu/

SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines. The library is written in C and is callable from either C or Fortran. The library routines will perform an LU decomposition with partial pivoting and triangular system solves through forward and back substitution. The LU factorization routines can handle non-square matrices but the triangular solves are performed only for square matrices.
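The factor-then-solve workflow can be sketched with a dense toy version of the same steps: LU factorization with partial pivoting, then a forward solve (L y = P b) and a back solve (U x = y). SuperLU itself works on sparse matrices with supernodal techniques, which this plain-Python sketch omits:

```python
def lu_solve(A, b):
    """LU factorization with partial pivoting (multipliers stored in
    place), followed by forward and back substitution."""
    n = len(A)
    A = [row[:] for row in A]        # work on a copy
    piv = list(range(n))
    for k in range(n):
        # Partial pivoting: bring the largest remaining pivot to row k.
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]       # multiplier, i.e. entry of L
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    # Forward substitution with the permuted right-hand side: L y = P b.
    y = [b[piv[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= A[i][j] * y[j]
    # Back substitution: U x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [3.0, 5.0, 3.0]
print(lu_solve(A, b))  # ~[1.0, 1.0, 1.0]
```

In practice, SuperLU is usually reached through a wrapper; for example, SciPy's `scipy.sparse.linalg.splu` exposes SuperLU for sparse systems.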

- Multifrontal massively parallel sparse direct solver (MUMPS)

CERFACS, CNRS, ENS Lyon, INP Toulouse, Inria, Mumps Technologies, University of Bordeaux.

MUMPS (multifrontal massively parallel sparse direct solver) solves large linear systems with symmetric positive definite matrices, general symmetric matrices and general unsymmetric matrices. Several reorderings are interfaced: AMD, QAMD, AMF, PORD, METIS, ParMETIS, SCOTCH, PT-SCOTCH.

52.Trilinos

Trilinos Community

https://trilinos.github.io/index.html

Trilinos is a collection of open-source software libraries, called packages, intended to be used as building blocks for the development of scientific applications. Trilinos facilitates the design, development, integration and ongoing support of mathematical software libraries within an object-oriented framework for the solution of large-scale, complex multi-physics engineering and scientific problems. Trilinos addresses two fundamental issues of developing software for these problems: providing a streamlined process and set of tools for the development of new algorithmic implementations, and promoting interoperability of independently developed software.

- Scalable Linear Solvers and Multigrid Methods (HYPRE)

Lawrence Livermore National Laboratory, Department of Energy, USA

https://computing.llnl.gov/projects/hypre-scalable-linear-solvers-multigrid-methods

The HYPRE library of linear solvers makes larger, more detailed simulations possible by solving problems faster than traditional methods at large scales. It offers a comprehensive suite of scalable solvers for large-scale scientific simulation, featuring parallel multigrid methods for both structured and unstructured grid problems. The HYPRE library is highly portable and supports a number of languages.

The HYPRE team was one of the first to develop algebraic multigrid algorithms and software for extreme-scale parallel supercomputers.

**Monte Carlo Simulations**

54.Geant4

CERN

https://geant4.web.cern.ch/node/1

GEANT4 is a software toolkit for the simulation of the passage of particles through matter. It is used by a large number of experiments and projects in a variety of application domains, including high energy physics, astrophysics and space science, medical physics and radiation protection. As a Monte Carlo simulation toolkit, Geant4 profits from improved throughput via parallelism derived from the independence of modeled events and their computation. Until Geant4 version 10.0, parallelization was obtained with a simple distribution of inputs: each computation unit (e.g. a core of a node in a cluster) ran a separate instance of Geant4 that was given a separate set of input events and associated random number seeds.

Given a computer with k cores, the design goal of multithreaded Geant4 was to replace k independent instances of a Geant4 process with a single, equivalent process with k threads using the many-core machine in a memory-efficient, scalable manner. The corresponding methodology involved transforming the code for thread safety and memory footprint reduction.
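The event-level parallelism that makes this possible can be sketched as follows: because modeled events are independent, giving each event its own random seed makes a k-thread run reproduce a serial run exactly. This is a toy Python illustration of the scheme, not Geant4 code:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_event(args):
    """Toy stand-in for tracking one event: with a private seed, each
    event is reproducible no matter which worker thread runs it."""
    event_id, seed = args
    rng = random.Random(seed)
    n_steps = rng.randint(10, 100)           # pretend particle steps
    deposited = sum(rng.random() for _ in range(n_steps))
    return event_id, deposited

events = [(i, 1234 + i) for i in range(100)]  # (event id, private seed)

# Serial reference run vs. a 4-thread run, as in multithreaded Geant4
# where k workers draw independent events from a shared input set.
serial = dict(map(simulate_event, events))
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = dict(pool.map(simulate_event, events))

print(parallel == serial)  # True: event independence gives reproducibility
```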

**Libraries for Artificial Intelligence and Data Analysis**

- The R Project for Statistical Computing

https://www.r-project.org/

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. The basic package (including the HPC package) of R is installed on the HPC. Users can install their own packages in their home directories. The packages available include: Chemometrics and Computational Physics, Clinical Trial Design, Monitoring, and Analysis, Econometrics, Analysis of Ecological and Environmental Data, Empirical Finance, Statistical Genetics, Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization, Hydrological Data and Modeling, Machine Learning & Statistical Learning, Medical Image Analysis, Multivariate Statistics, Natural Language Processing, Psychometric Models and Methods, Analysis of Spatial Data, among others.

- Python - A modern, interpreted, object-oriented, fully featured high-level programming language. Available versions include 2.7.x and 3.x. Packages for computational science include: numpy and pandas for data operations and analysis, scipy for higher-level computational routines, and matplotlib for plotting. Additional packages can be installed in the user's home directory.

57.TensorFlow

https://www.tensorflow.org/learn

TensorFlow is an open source software library for machine learning developed by Google. Its mission is to train and build neural networks. It can be used on CPU and GPU architectures. It is furthermore an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
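The dataflow-graph idea can be sketched in a few lines of plain Python (a toy stand-in, not the TensorFlow API): nodes hold operations, edges carry array values between them, and the graph is built first and evaluated afterwards, in the "define, then run" style of classic TensorFlow graphs:

```python
class Node:
    """A node in a toy dataflow graph: an operation plus its input edges."""
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def eval(self):
        # Edges carry the (multidimensional) values between operations;
        # plain Python lists stand in for tensors here.
        return self.op(*(n.eval() for n in self.inputs))

const = lambda v: Node(lambda: v)
add   = lambda a, b: Node(lambda x, y: [xi + yi for xi, yi in zip(x, y)], a, b)
scale = lambda a, k: Node(lambda x: [k * xi for xi in x], a)

# Graph for y = 2 * (a + b): constructing it performs no arithmetic;
# computation happens only when eval() walks the graph.
a, b = const([1.0, 2.0]), const([3.0, 4.0])
y = scale(add(a, b), 2.0)
print(y.eval())  # [8.0, 12.0]
```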

- Machine Learning Libraries – Keras, Caffe, Pytorch, Theano.

Torch is a computational framework with an API written in Lua that supports machine-learning algorithms; versions of it are used by large tech companies such as Facebook and Twitter, which devote in-house teams to customizing their deep-learning platforms. (Lua is a multi-paradigm scripting language developed in Brazil in the early 1990s.) A Python version of Torch, known as PyTorch, was open-sourced by Facebook in January 2017. PyTorch offers dynamic computation graphs, which let you process variable-length inputs and outputs. Caffe is a well-known and widely used machine-vision library that ported Matlab's implementation of fast convolutional nets to C and C++ (see Steve Yegge's essay on porting C++ from chip to chip for the trade-offs between speed and this particular form of technical debt). Caffe is not intended for other deep-learning applications such as text, sound, or time-series data. Like the other frameworks mentioned here, Caffe uses Python for its API. Caffe2 is the second deep-learning framework to be backed by Facebook, after Torch/PyTorch; the main difference is the claim that Caffe2 is more scalable and lightweight, aiming at deep learning for production environments. Like Caffe and PyTorch, Caffe2 offers a Python API running on a C++ engine.
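
A "dynamic computation graph" means the graph is rebuilt on every forward pass, so control flow and input length can vary between runs. The following pure-Python sketch of scalar reverse-mode autodiff illustrates the idea (it is not the PyTorch API):

```python
# Minimal dynamic-graph autodiff sketch: each forward pass records the
# operations it actually executed, then backward() walks that recording.
class Scalar:
    def __init__(self, value, parents=(), grad_fn=None):
        self.value = value
        self.parents = parents  # nodes this value was computed from
        self.grad_fn = grad_fn  # returns local gradients w.r.t. parents
        self.grad = 0.0

    def __add__(self, other):
        return Scalar(self.value + other.value, (self, other),
                      lambda g: (g, g))

    def __mul__(self, other):
        return Scalar(self.value * other.value, (self, other),
                      lambda g: (g * other.value, g * self.value))

    def backward(self, g=1.0):
        self.grad += g
        if self.grad_fn:
            for parent, pg in zip(self.parents, self.grad_fn(g)):
                parent.backward(pg)

def sum_of_squares(xs):
    # The loop length depends on the input, so each call traces a
    # different graph -- the "dynamic" part.
    total = Scalar(0.0)
    for x in xs:
        total = total + x * x
    return total

xs = [Scalar(1.0), Scalar(2.0), Scalar(3.0)]
loss = sum_of_squares(xs)      # value 14.0
loss.backward()
print([x.grad for x in xs])    # d(sum x^2)/dx = 2x -> [2.0, 4.0, 6.0]
```

Calling `sum_of_squares` on a list of a different length simply traces a different graph, which is how variable-length inputs are handled without any graph recompilation step.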

**Libraries for Neuroscience**

- The Neural Simulation Tool NEST

Prof. Dr. Markus Diesmann, Institute of Neuroscience and Medicine (INM-6), Computational and Systems Neuroscience, Jülich Research Center, Jülich, Germany

Prof. Dr. Marc-Oliver Gewaltig, École Polytechnique Fédérale de Lausanne, Switzerland

https://www.nest-simulator.org/

NEST is a simulator for spiking neural network models, from small-scale microcircuits to brain-scale networks on the order of 10^8 neurons and 10^12 synapses. Main features include:

- Integrate-and-fire neuron models with current- and conductance-based synapses;

- Adaptive-threshold integrate-and-fire neuron models (AdEx, MAT2);

- Hodgkin-Huxley-type neuron models with one compartment;

- Simple multi-compartment neuron models;

- Static and plastic synapse models (STDP, short-term plasticity, neuromodulation);

- Grid-based spike interaction and interaction in continuous time;

- Exact integration for linear neuron models and appropriate solvers for others;

- A Topology Module and support for CSA for creating complex networks.
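
The simplest model family above, the leaky integrate-and-fire (LIF) neuron, can be sketched in a few lines of pure Python (this is an illustration of the model, not NEST's implementation; all parameter values are made up for the example, not NEST defaults):

```python
# Forward-Euler integration of a leaky integrate-and-fire neuron.
# Parameter values are illustrative only.
def simulate_lif(i_ext, t_sim=100.0, dt=0.1,
                 tau_m=10.0, v_rest=-70.0, v_thresh=-55.0, v_reset=-70.0):
    """Return spike times (ms) for a constant input current (arbitrary units)."""
    v = v_rest
    spikes = []
    for step in range(round(t_sim / dt)):
        # Membrane equation: tau_m * dV/dt = -(V - V_rest) + I_ext
        v += dt / tau_m * (-(v - v_rest) + i_ext)
        if v >= v_thresh:          # threshold crossing -> emit a spike
            spikes.append(step * dt)
            v = v_reset            # reset after the spike
    return spikes

spikes = simulate_lif(i_ext=20.0)
print(len(spikes), "spikes in 100 ms")
```

With this drive the membrane potential relaxes toward v_rest + i_ext = -50 mV, which is above threshold, so the neuron fires periodically; NEST scales exactly this kind of per-neuron update, plus spike exchange, to billions of synapses.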

60. NEURON - a flexible and powerful simulator of neurons and networks

Yale University, USA

https://neuron.yale.edu/neuron/

It was primarily developed by Michael Hines, John W. Moore, and Ted Carnevale at Yale and Duke.

NEURON is a simulation environment for modeling individual neurons and networks of neurons. It provides tools for conveniently building, managing, and using models in a way that is numerically sound and computationally efficient. It is particularly well suited to problems closely linked to experimental data, especially those involving cells with complex anatomical and biophysical properties. NEURON's computational engine employs special algorithms that achieve high efficiency by exploiting the structure of the equations that describe neuronal properties. It also provides functions tailored for conveniently controlling simulations and for presenting the results of real neurophysiological problems graphically in ways that are quickly and intuitively grasped.

- Extreme Parallel Tools for Brain Neural Network Simulations

Prof. Stoyan Markov, Dr. Kristina Kapanova, Mag. Jasmine Brune

A new tool developed by NCSA, Bulgaria, providing the capability to simulate Hodgkin-Huxley-type neuron models while working in a pipelined algorithmic manner. Unlike other simulation tools, the simulation is cell-driven, with a user-specified number of cycles that describe the simulation time of the system. Each cycle consists of a 300 ms window, during which the HH membrane potential is computed and the reaction of the cell is recorded. The algorithm works on a model with dense internal interconnection and sparse outside connections, significantly reducing the volume of communication. Correspondingly, during the simulation of the neural network each neuron is treated as an independent entity by means of the cell data structure, which records all the required communication.
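
The per-cycle membrane-potential computation can be illustrated with a forward-Euler integration of the classic Hodgkin-Huxley equations (squid-axon parameters from Hodgkin & Huxley, 1952). This is a generic sketch of the model over one 300 ms window, not the tool's actual implementation:

```python
# Forward-Euler integration of the Hodgkin-Huxley membrane equations
# over one 300 ms window. Classic squid-axon parameters.
import math

def alpha_n(v): return 0.01 * (v + 55) / (1 - math.exp(-(v + 55) / 10))
def beta_n(v):  return 0.125 * math.exp(-(v + 65) / 80)
def alpha_m(v): return 0.1 * (v + 40) / (1 - math.exp(-(v + 40) / 10))
def beta_m(v):  return 4.0 * math.exp(-(v + 65) / 18)
def alpha_h(v): return 0.07 * math.exp(-(v + 65) / 20)
def beta_h(v):  return 1.0 / (1 + math.exp(-(v + 35) / 10))

def simulate_hh(i_ext, t_sim=300.0, dt=0.01):
    """Integrate one window; return the membrane-potential trace (mV)."""
    g_na, g_k, g_l = 120.0, 36.0, 0.3   # maximal conductances (mS/cm^2)
    e_na, e_k, e_l = 50.0, -77.0, -54.4 # reversal potentials (mV)
    c_m = 1.0                           # membrane capacitance (uF/cm^2)
    v = -65.0                           # resting potential
    n, m, h = 0.317, 0.052, 0.596       # gating variables at rest
    trace = []
    for _ in range(round(t_sim / dt)):
        i_na = g_na * m**3 * h * (v - e_na)
        i_k = g_k * n**4 * (v - e_k)
        i_l = g_l * (v - e_l)
        v += dt / c_m * (i_ext - i_na - i_k - i_l)
        n += dt * (alpha_n(v) * (1 - n) - beta_n(v) * n)
        m += dt * (alpha_m(v) * (1 - m) - beta_m(v) * m)
        h += dt * (alpha_h(v) * (1 - h) - beta_h(v) * h)
        trace.append(v)
    return trace

trace = simulate_hh(i_ext=10.0)  # constant 10 uA/cm^2 drive
print(max(trace))                # peaks above 0 mV during action potentials
```

In the cell-driven design described above, each neuron would run such a window independently and only exchange spike events with its sparse outside connections afterwards, which is what keeps communication volume low.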


**The Performance Optimisation and Productivity Centre of Excellence in Computing Applications.**

62. Extrae

Barcelona Supercomputing Centre, Barcelona, Spain

https://tools.bsc.es/

Extrae is a package that generates Paraver trace files for post-mortem analysis. It uses different interposition mechanisms to inject probes into the target application in order to gather information about the application's performance.

- Scalasca

Forschungszentrum Jülich, Jülich, Germany

Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronization – and offers guidance in exploring their causes. Scalasca targets mainly scientific and engineering applications based on the programming interfaces MPI and OpenMP, including hybrid applications based on a combination of the two. The tool has been specifically designed for use on large-scale systems.

- Cube

https://www.scalasca.org/scalasca/software/cube-4.x/download.html

Cube, which is used as performance report explorer for Scalasca and Score-P, is a generic tool for displaying a multi-dimensional performance space consisting of the dimensions (i) performance metric, (ii) call path, and (iii) system resource. Each dimension can be represented as a tree, where non-leaf nodes of the tree can be collapsed or expanded to achieve the desired level of granularity. In addition, Cube can display multi-dimensional Cartesian process topologies.
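
The three-dimensional space can be pictured as measurements indexed by (metric, call path, process), where collapsing a call-tree node sums its subtree over all processes. A minimal pure-Python sketch (data values and region names are made up for illustration):

```python
# Sketch of Cube's performance space: a value per (metric, call path,
# process); collapsing a tree node aggregates over its subtree and over
# the system dimension. All values below are invented for illustration.
measurements = {
    # (metric, call path, process rank) -> value
    ("time",   "main/solve/mpi_allreduce", 0): 1.25,
    ("time",   "main/solve/mpi_allreduce", 1): 1.5,
    ("time",   "main/solve/compute",       0): 8.0,
    ("time",   "main/solve/compute",       1): 7.75,
    ("visits", "main/solve/mpi_allreduce", 0): 100,
    ("visits", "main/solve/mpi_allreduce", 1): 100,
}

def aggregate(metric, call_path_prefix):
    """Collapse a call-tree node: sum one metric over the whole subtree
    and over all processes (the 'system' dimension)."""
    return sum(v for (m, cp, _), v in measurements.items()
               if m == metric and cp.startswith(call_path_prefix))

print(aggregate("time", "main/solve"))                # 18.5 (whole subtree)
print(aggregate("time", "main/solve/mpi_allreduce"))  # 2.75 (one leaf, both ranks)
```

Expanding a node in Cube corresponds to splitting such a sum back into its children, down to the desired level of granularity.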

65. Score-P

Forschungszentrum Jülich, Jülich, Germany

http://scorepci.pages.jsc.fz-juelich.de/scorep-pipelines/doc.r14401/quickstart.html

The Scalable Performance Measurement Infrastructure for Parallel Codes (Score-P) is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications. Score-P offers the user maximum convenience by supporting a number of analysis tools: it currently works with Periscope, Scalasca, Vampir, and TAU, and is open to other tools.

66. Vampir

Technische Universität Dresden, Dresden, Germany

Vampir provides an easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.

Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, OpenCL, OpenACC).

- Extra-P

Forschungszentrum Jülich, Jülich, Germany

https://www.scalasca.org/software/extra-p/

https://github.com/extra-p/extrap

Extra-P is an automatic performance-modeling tool that supports the user in the identification of scalability bugs. A scalability bug is a part of the program whose scaling behavior is unintentionally poor, that is, much worse than expected.

Extra-P uses measurements of various performance metrics at different processor configurations as input to represent the performance of code regions (including their calling context) as a function of the number of processes. All it takes to search for scalability issues even in full-blown codes is to run a manageable number of small-scale performance experiments, launch Extra-P, and compare the asymptotic or extrapolated performance of the worst instances to the expectations.
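
The idea of modeling a region's runtime as a function of the process count can be sketched by fitting a simple power law t(p) = c * p^a to a few small-scale measurements and inspecting the exponent. This is only an illustration of the concept with synthetic data, not Extra-P's actual model-generation algorithm:

```python
# Fit t(p) = c * p^a by least squares in log-log space, then use the
# exponent to flag a region whose scaling is worse than expected.
import math

def fit_power_law(procs, times):
    """Least-squares fit of log t = log c + a * log p; returns (c, a)."""
    xs = [math.log(p) for p in procs]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    c = math.exp(mean_y - a * mean_x)
    return c, a

# Synthetic measurements for one code region: runtime grows ~ p^1.5,
# much worse than the constant or logarithmic behavior one might expect.
procs = [2, 4, 8, 16, 32]
times = [3.0 * p ** 1.5 for p in procs]

c, a = fit_power_law(procs, times)
print(f"t(p) ~ {c:.2f} * p^{a:.2f}")         # exponent ~1.5 flags the bug
print("extrapolated t(1024):", c * 1024 ** a)
```

This mirrors the workflow described above: run a manageable number of small-scale experiments, build a model per region, and compare the extrapolated behavior of the worst instances against expectations.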

- Paraver: a flexible performance analysis tool

Barcelona Supercomputing Centre, Barcelona, Spain

https://tools.bsc.es/paraver

Paraver was developed to respond to the need for a qualitative global perception of application behavior by visual inspection, followed by detailed quantitative analysis of the problems. Expressive power, flexibility, and the capability of efficiently handling large traces are key features addressed in the design of Paraver, and the clear and modular structure of the software plays a significant role in achieving these targets. Major features of the Paraver philosophy and functionality include:

- Detailed quantitative analysis of program performance;

- Concurrent comparative analysis of several traces;

- Customizable semantics of the visualized information;

- Cooperative work, sharing views of the tracefile;

- Building of derived metrics.