Recovery and Stabilization of Business-Critical Applications
Mar 03 2023 | 6 min read
Problem Statement
The client needed support for monitoring and automating applications built by its in-house development teams. The goal was to reduce the manual effort involved in recovering and stabilizing the essential components that supply the data required for successful trade execution.
Project Objectives
- Provide 24x7 support to ensure stability of the onboarded applications and services.
- Develop scripts and automation for the applications.
- Gradually expand support from L1 to a complete L2 support model.
- Gain expertise in the monitoring tools to help resolve user queries.
- Perform problem management and RCA of the issues encountered to reduce the MTTR going forward.
- Iteratively conduct WAF reviews and address the gaps in monitoring.
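As an illustration of the MTTR objective above, the following is a minimal sketch of how the metric could be tracked from incident records. It assumes MTTR is taken as the mean time from detection to resolution; the function name `mean_time_to_recovery` and the sample incidents are hypothetical and are not taken from the client's tooling.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return the average (resolved - detected) duration across incidents.

    `incidents` is a list of (detected, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual recovery durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident records: (detected, resolved)
incidents = [
    (datetime(2023, 1, 5, 9, 0), datetime(2023, 1, 5, 9, 30)),
    (datetime(2023, 1, 12, 14, 0), datetime(2023, 1, 12, 15, 30)),
]

mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr}")
```

Comparing this figure before and after the RCA action items are implemented gives a simple way to check whether recovery is actually getting faster.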
Scope of Work
- Ensuring 24x7 support throughout the year and acting as a backbone for the Dev teams by being the first line of defense for the applications, which are based on a service-oriented architecture.
- Proactive monitoring of applications (including trade-floor-critical applications) using tools such as Splunk, PagerDuty, and Prometheus/Grafana.
- Incident and problem management of the production issues encountered by the services, and implementation of the action items agreed as part of RCA to avoid application issues during trading hours.
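Proactive monitoring of this kind usually comes down to evaluating a metric against a threshold and paging only when the breach is sustained. The sketch below shows that idea in plain Python. It is an assumption-laden illustration, not the client's actual alert rules: the function `should_alert`, the threshold of 90, and the sample values are all hypothetical, and in practice the equivalent logic would live in the alerting configuration of a tool such as Prometheus or Splunk.

```python
def should_alert(samples, threshold, min_consecutive=3):
    """Return True if `samples` exceeds `threshold` for at least
    `min_consecutive` readings in a row.

    Requiring a sustained breach avoids paging on a single spike.
    """
    streak = 0
    for value in samples:
        # Extend the streak on a breach, reset it otherwise.
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return True
    return False


# Hypothetical CPU utilisation readings (percent), one per scrape interval.
sustained = [40, 95, 96, 97]
spiky = [95, 40, 96, 40, 97]

print(should_alert(sustained, threshold=90))
print(should_alert(spiky, threshold=90))
```

The first series breaches the threshold three readings in a row and would raise an alert; the second only spikes intermittently and would not.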
Approach Followed
- A single team was subdivided into verticals, each owned by a different person, to focus on and support the applications in a specific domain.
- Each domain vertical worked on projects to transition into the new structure and develop a similar support model according to the requirements.
Technology Stack
- Technologies: Python, Windows PowerShell, Unix/Linux
- Databases: Oracle, PL/SQL
- Tools: Splunk, AWS Cloud implementation, PagerDuty, SolarWinds, Prometheus, Grafana, AutoSys
- Configuration Management: SVN/Git, QuickBuild and GitLab (CI/CD), Jira, ServiceNow
Sunpreet Kaur