UOC Technology

Technology Infrastructure

The availability of services


The UOC operates 24 hours a day, 365 days a year. For this reason, the University must ensure that all of its critical services work properly at all times.

To achieve maximum availability, we combine several strategies:

A high-availability infrastructure with redundant systems, contingency plans, quality controls, and 24x7 monitoring and surveillance systems.

 

  • High-availability infrastructure

A significant share of the possible points of failure are duplicated, so that if one element fails, a reserve element automatically comes into operation and the service is not affected. (In many cases there is no distinction between the principal and the reserve element: both work at the same time, with load balancing.)

Where it is not possible to provide a system with automatic redundancy, we establish a contingency plan: a procedure that ensures the recovery of a specific service in the shortest possible time.
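
The arrangement in which both elements work at the same time, and either can absorb the other's load, can be illustrated with a minimal Python sketch. It is not the UOC's actual configuration: the node names and the `healthy` flag are hypothetical, and the code only shows round-robin load balancing with automatic failover.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    name: str             # hypothetical node name
    healthy: bool = True  # in practice this would come from a health check


class Balancer:
    """Round-robin over redundant nodes, skipping any that are down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._ring = cycle(self.nodes)

    def pick(self):
        # Try each node at most once per call.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node.healthy:
                return node
        # No redundant element is left: a contingency plan would apply here.
        raise RuntimeError("no healthy node available")


a, b = Node("node-a"), Node("node-b")
lb = Balancer([a, b])
first = [lb.pick().name for _ in range(4)]  # both nodes share the load
a.healthy = False                           # simulate a failure of node-a
after = [lb.pick().name for _ in range(3)]  # node-b serves every request
```

When every node is down, `pick` raises an error rather than returning a failed node, which corresponds to the case where automatic redundancy is exhausted and a recovery procedure has to be applied.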

 

  • Quality control (the various environments)

Before a service is made available to users, it undergoes a control process to ensure that it works properly, in other words, that it is stable and performs well. To this end, the UOC's infrastructure includes a series of working environments in addition to the production environment.

Development environment: where developers build the applications.

Test environment: where the functional tests are conducted.

Pre-production environment: where the integration and load tests are conducted.
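
The sequence of environments can be read as a series of gates that a release must pass before it reaches production. The sketch below is illustrative only: the environments and test types mirror the list above, while the `Release` type, the `stops_at` function and the boolean results are hypothetical.

```python
from dataclasses import dataclass, field

# Environments after development, in promotion order, with the tests run in each.
STAGES = [
    ("test", ["functional"]),
    ("pre-production", ["integration", "load"]),
]


@dataclass
class Release:
    name: str
    results: dict = field(default_factory=dict)  # test type -> passed?


def stops_at(release):
    """Return the environment in which the release stops."""
    for env, checks in STAGES:
        if not all(release.results.get(c, False) for c in checks):
            return env  # a test failed or was not run: the release stays here
    return "production"


blocked = stops_at(Release("r1", {"functional": True, "integration": True, "load": False}))
approved = stops_at(Release("r2", {"functional": True, "integration": True, "load": True}))
```

A test that has not been run is treated the same as a failed one, so a release cannot skip an environment.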

 


 
  • Data center contingency

Besides the partial contingency plans for each service, the UOC has a global contingency plan for a serious incident in the main data center. In that case, the contingency environments located in the Tibidabo data center would come into service and take over the UOC's main services within a short time. To make this possible, all relevant data are replicated daily between the main and secondary data centers, and the contingency environments and the procedures required to transfer the service are kept up to date.

24x7 Service

All of the UOC's critical systems and services are monitored around the clock. The monitoring system comprises a series of tests that run continuously and trigger an alarm when an anomaly is detected.

The 24x7 service consists of:

 

  • Proactive Part:

Measures are taken to minimise the number of incidents. Incidents that do occur are analysed, and corrective measures are taken so that they do not recur.

 

  • Reactive Part:

This part is triggered when an incident occurs.

It consists of four levels:

Level 0: Monitoring of physical infrastructure, communications, servers, operating systems, base software, databases, applications and services using the open-source tool Nagios.

Level 1: Resolution of incidents that have a documented procedure or an automated fix.

Level 2: Solving the remaining incidents.

Level 3: Support for level 2.

Each level comes into action only when the level immediately before it has been unable to solve the incident.
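
The escalation rule can be sketched in a few lines of Python. This is a simplified illustration, not the UOC's actual procedure: the handlers and the incident fields (`has_procedure`, `needs_specialist`) are hypothetical, and level 0 is omitted because it detects incidents rather than solving them.

```python
def escalate(incident, handlers):
    """Try each level in order and return the first level that solves the incident."""
    for level, handler in handlers:
        if handler(incident):
            return level
    return None  # unresolved at every level


# Hypothetical handlers: each returns True if it solved the incident.
handlers = [
    (1, lambda inc: inc.get("has_procedure", False)),         # procedure or automated fix exists
    (2, lambda inc: not inc.get("needs_specialist", False)),  # remaining incidents
    (3, lambda inc: True),                                    # support for level 2
]

routine = escalate({"has_procedure": True}, handlers)
ordinary = escalate({}, handlers)
complex_case = escalate({"needs_specialist": True}, handlers)
```

Because the handlers are tried strictly in order, a higher level is never involved unless every level before it has failed.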

The tests are varied: a service is monitored by combining simulated user transactions with system measurements at the hardware, operating system and application software levels.
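
As a rough illustration of how a single test might combine both kinds of evidence, the sketch below merges the result of a simulated user transaction with one system measurement and maps it to the standard Nagios plugin return codes (0 for OK, 1 for WARNING, 2 for CRITICAL). The function name, the choice of CPU load as the measurement and the thresholds are assumptions made for the example, not the UOC's actual checks.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin return codes


def check_service(transaction_ok, cpu_load, warn=0.80, crit=0.95):
    """Combine a simulated user transaction with one system measurement.

    cpu_load is a fraction between 0 and 1; warn and crit are
    hypothetical thresholds chosen for illustration.
    """
    if not transaction_ok:
        return CRITICAL, "CRITICAL - simulated user transaction failed"
    if cpu_load >= crit:
        return CRITICAL, f"CRITICAL - CPU load {cpu_load:.0%}"
    if cpu_load >= warn:
        return WARNING, f"WARNING - CPU load {cpu_load:.0%}"
    return OK, f"OK - transaction succeeded, CPU load {cpu_load:.0%}"


status, message = check_service(transaction_ok=True, cpu_load=0.5)
```

A real Nagios plugin would print the message and exit with the status code; returning both values here keeps the example easy to test.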

 

