Site Reliability Engineering Services

SRE Consulting – Site Reliability Engineering Services

Protect your company's reputation: our site reliability engineering (SRE) services ensure the smooth performance of your applications. Frequent or extended downtime not only means lost revenue, it can also damage your reputation and customer confidence.

Application Monitoring and Application Risk Management

Application architectures are becoming increasingly complex, which often makes an end-to-end view a challenge for operators. This is where our customer reliability engineers (CREs) come in. Drawing on their many years of experience, they know precisely which indicators to monitor to ensure the reliability of your applications. In an emergency, specialists with extensive knowledge across a wide range of fields ensure that the cause of the incident is swiftly identified.

Benefit from the expertise of our CREs and their best practice methods:

  • Proactive downtime prevention
  • Greater customer satisfaction 
  • Easily contactable even in an emergency 
  • Platform-independent services 

Your benefits

  • Many years of expertise in the operation of applications and cloud platforms
  • Early identification of problems thanks to end-to-end observability
  • Platform-independent, hybrid SRE/CRE services (public and private cloud)

When is it the right solution?

Are your applications business-critical, making application reliability vital? Then our site reliability engineering (SRE) services are precisely what you are looking for. Drawing on their considerable expertise and best practice methods, our CREs will help you strike a balance between innovation and reliability.

Take a second to imagine the impact of prolonged downtime on your business: alongside the financial consequences, your reputation is also on the line. Once you have lost customer confidence, regaining it is no easy task. One thing is certain, however: no matter how proactive and competent your incident management is, failures can never be completely ruled out.

Our CREs guarantee the smoothest possible performance of your applications. SRE services such as application risk monitoring and reliability reporting are an important part of this. Equally important is the troubleshooting support provided by our CREs, from detection and root cause analysis through to the fix. Drawing on their considerable technical expertise and application knowledge, they lose no time. The benefit to you: thanks to the rapid response, users often do not even notice that a problem occurred. We also work with you on a post-mortem, which helps us learn for the future and prevents the same errors from being repeated.

Why Swisscom?

  • Best practice: Our CREs draw on their many years of experience to support you
  • Risk minimisation: Our CREs know when and where problems can arise
  • Extensive expertise: Our CREs have experienced teams of specialists on hand

SRE for continuous business processes

Our site reliability engineering (SRE) services guarantee the smooth functioning of digitised business processes. Continuous application optimisation ensures stability and availability even in the face of rapidly increasing user numbers.

Due to the weekly addition of new features, a shipping company's business-critical application is becoming increasingly unstable. The frequent changes eventually take their toll and, one morning, customers and partners are unable to access the service for many hours. The company faces a significant loss in revenue and confidence and therefore puts extreme pressure on the IT team to resolve the issue.

The IT manager calls in a Swisscom CRE to resolve the issue, which is clearly caused by the excessively rapid pace of change in the application. To strike the right balance between innovation and reliability, and to eliminate the technical debt, the CRE introduces ‘error budgets’. This restores application reliability and resolves a problem that had been brewing for some time. At the same time, the IT manager meets the expectations of the CEO, who wanted a solution yesterday.
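
For readers unfamiliar with the term, an error budget is the amount of unreliability that a service level objective (SLO) still permits. The following Python sketch illustrates the general idea only and is not a description of how Swisscom CREs implement it: the class name, the 99.9 % target, the request counts and the release policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for one reporting window (illustrative, hypothetical model)."""

    slo_target: float     # assumed availability SLO, e.g. 0.999 for 99.9 %
    total_requests: int   # requests served in the window
    failed_requests: int  # requests that did not meet the SLO

    def __post_init__(self) -> None:
        # A target of exactly 1.0 would leave no budget at all to spend.
        if not 0.0 < self.slo_target < 1.0:
            raise ValueError("slo_target must be strictly between 0 and 1")
        if self.total_requests <= 0:
            raise ValueError("total_requests must be positive")
        if self.failed_requests < 0:
            raise ValueError("failed_requests must not be negative")

    @property
    def allowed_failures(self) -> float:
        # The budget: the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 means untouched, 0.0 means exhausted, negative means overspent.
        return 1.0 - self.failed_requests / self.allowed_failures

    def releases_allowed(self) -> bool:
        # Assumed policy: ship new features only while budget remains.
        return self.remaining_fraction > 0.0


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"Remaining budget: {budget.remaining_fraction:.0%}")
    print("Feature releases allowed:", budget.releases_allowed())
```

Under a policy like this, weekly feature releases continue while budget remains; once it is exhausted, the team puts stability work first until reliability recovers.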

An insurance company takes the brave step of moving to the cloud. However, many of the expensive mainframes still run in the company’s own data centre. The question is how to identify and resolve reliability issues at an early stage if the cloud applications for the core insurance activities (e.g. insurance policy calculations) continue to depend on these mainframes.

Reliable application risk management combined with regular reliability reporting enables early risk detection and, as a result, the resolution of any reliability issues. At the same time, previous experience can be used to draw inferences about potential future situations to be avoided (prevention by continuous learning). Our CREs also help you with application monitoring design: a review of your monitoring and event management design by Swisscom can provide valuable input, particularly for a migration to the cloud.

A Swiss SME has developed an app that allows users to buy food from local farms and have it delivered directly by e-bike. As a result of word-of-mouth recommendations, user numbers continue to grow. One day, an influencer presents the app in a TikTok video, resulting in an explosion of new users. As a consequence, the company struggles to guarantee the app’s reliability.

With the aid of site reliability engineering (SRE) services from Swisscom, the reliability of the application is increased significantly, enabling it to keep up with rapidly growing user numbers in the future. In such cases, an application reliability review defines a clear approach for identifying the root cause of problems and for proposing subsequent optimisations. With their many years of experience, our CREs are also fully conversant with the best practice methods for resolving stability issues.

Our experts will be happy to answer your questions. Contact us.