New HPC-GPU Cluster

The GPU Cluster is an integral part of the ReCaS-Bari HPC cluster and is best suited to applications that use GPUs. It provides 1755 cores, 13.7 TB of RAM, 55 TB of disk space and 38 high-performance GPUs (18 Nvidia A100 and 20 Nvidia V100). Every node has access to the ReCaS-Bari distributed file system, which offers about 3800 TB in single replica and a further 180 TB in double replica for greater data safety. The node-to-storage bandwidth is 10 Gbps.
Applications run exclusively in Docker containers, a technology that provides ease of configuration and execution, reliability, flexibility and security.
Users can request the instantiation of interactive services, such as remotely accessible IDEs (Jupyter Notebook and RStudio), and submit workflows represented as Directed Acyclic Graphs (DAGs).
Where possible, services will be instantiated with a private IP address, so that they are not reachable from outside and are therefore less exposed to cyberattacks; in this case users can access their resources through a VPN. To use the services offered by the GPU Cluster, users must submit a specific request.

Presentations:
2nd Congress of the INFN Section and the Physics Department of Bari, 03-04 Feb 2022

UPS extraordinary maintenance November 4, 2019

Today, November 4, 2019, extraordinary maintenance will be performed on the UPSs to replace some worn-out fans.
It should not affect the operation of the ReCaS-Bari center.

Short blackouts today in the electricity supply on the Bari campus

Due to work on the electricity distribution lines, there will be two short interruptions of the electricity supply on the Bari campus today.

Given the expected short duration, the UPSs should be sufficient to keep ReCaS in operation, even if the auxiliary generator does not start.

Interruption of the ReCaS-Bari services

On Monday 08/04 there was an interruption in the electricity supply of ReCaS-Bari and, consequently, of the services provided.
Some services were restored as early as Monday 08/04, while others took longer.
After intensive work, all services are now (11/04) active again.
Users are asked to verify that the services they use (virtual machines, HPC/HTC clusters, personal storage, SaaS services) are working and to report any problems.
We apologize for the inconvenience.

UPS periodic maintenance intervention February 2, 2018

We would like to inform users that, during the periodic maintenance of the UPS system (which guarantees the continuity of the power supply), carried out by Vertiv and scheduled for tomorrow, 02/02/2018, one UPS will be operated in bypass mode for a few hours.
As in previous cases, we do not expect any impact on the normal operation of the data center.

Periodic inspection of the supervision system of ReCaS-Bari

On Wednesday, November 22, 2017, the regular periodic inspection of the ReCaS-Bari supervision system will be carried out.
During the visit, the two transformers serving the data center will also be exchanged.
Unlike last time, we expect this visit to have no impact on normal operations.

Doubled the bandwidth of the connection between the ReCaS-Bari data center and the GARR network

The technical intervention on the ReCaS-Bari data center border router, scheduled for September 12, 2017 and carried out in collaboration with GARR, the research network provider, was completed successfully.
During the intervention, the Border Gateway Protocol (BGP), a routing protocol that connects routers belonging to different autonomous systems, was enabled.
By adopting BGP, ReCaS-Bari can implement dynamic routing autonomously, without relying entirely on GARR, and can choose the routing on the basis of the use case: it will be possible to separate data transfers related to CERN experiments from other types of transfers, with a net gain in the flexibility and elasticity of the system.
At the same time, the speed of the connection to the GARR network was doubled, with data now traveling at 20 Gbit/s.
The intervention was completely transparent to users, who did not notice any effect on their resources.
Users are nevertheless advised to monitor their data transfers in the coming days and to report any suspected malfunction.
The doubling of the bandwidth during the September 12 intervention is the first step towards a 100 Gbit/s connection for ReCaS-Bari.