
Platform
Engagement Scope
- Use AIOps to identify anomalies, classify problems, suggest recovery solutions, and provide a single-click recovery action
- Create defects for code/data issues based on application logs
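
The classify-and-suggest step can be illustrated with a minimal rule-based sketch. This is an assumption for illustration only, not the engagement's actual logic: the rule patterns, problem labels, and recovery action names below are all hypothetical, and a real AIOps service would likely use learned models rather than fixed regexes.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    pattern: str        # regex matched against a log line
    problem: str        # hypothetical problem class
    recovery: str       # hypothetical single-click recovery action
    raise_defect: bool  # True when the issue needs a code/data defect


# Hypothetical rules; none of these come from the original engagement.
RULES = [
    Rule(r"timeout|timed out", "downstream_timeout", "retry_order", False),
    Rule(r"NullPointerException", "code_issue", "none", True),
    Rule(r"duplicate key|constraint violation", "data_issue", "repair_record", True),
]


def classify(log_line: str) -> Optional[Rule]:
    """Return the first rule whose pattern matches the log line, else None."""
    for rule in RULES:
        if re.search(rule.pattern, log_line, re.IGNORECASE):
            return rule
    return None
```

A matched rule carries both the suggested recovery action and a flag saying whether a defect should be created, which mirrors the two scope items above.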


Customer Environment
- Too many orders running into problems (3-10%) and needing manual fixes
- Too many incoming tickets: 2500-3000 per month
Solution
- BigPanda-based architecture for monitoring, event correlation, deduplication, enrichment, and escalation
- 200+ Microservices/Apps integrated
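
As a rough illustration of what deduplication and correlation mean here, the sketch below collapses repeated events and groups nearby events on the same host into one incident. It is a generic example and does not use or represent BigPanda's API; the `Event` fields, the (host, check) dedup key, and the 300-second window are assumptions. Enrichment and escalation are not shown.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str     # monitoring tool that emitted the event, e.g. "zabbix"
    host: str
    check: str      # name of the failing check
    timestamp: int  # seconds since some epoch


def deduplicate(events):
    """Keep only the earliest event for each (host, check) pair."""
    seen = {}
    for e in sorted(events, key=lambda e: e.timestamp):
        seen.setdefault((e.host, e.check), e)
    return list(seen.values())


def correlate(events, window=300):
    """Group events on one host into incidents.

    An event joins the current incident when it arrives within `window`
    seconds of the previous event on that host; otherwise it starts a new one.
    """
    by_host = defaultdict(list)
    for e in sorted(events, key=lambda e: e.timestamp):
        by_host[e.host].append(e)

    incidents = []
    for host_events in by_host.values():
        current = [host_events[0]]
        for e in host_events[1:]:
            if e.timestamp - current[-1].timestamp <= window:
                current.append(e)
            else:
                incidents.append(current)
                current = [e]
        incidents.append(current)
    return incidents
```

Calling `correlate(deduplicate(events))` turns a raw event stream into a shorter list of incidents, which is the kind of reduction that keeps ticket volume manageable.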
Solution Approach

Technology
Highlights
- Dashboard with single-pane-of-glass health information on systems, apps, and incidents
- Incoming tickets reduced to 500-600 per month through code fixes based on the defects created and problems notified, with 30% of fallout handled automatically
- 30% OPEX savings by triggering the auto-recovery process
- Average time taken reduced from 30 mins to 3 mins
- Transition from manual incident resolution to AIOps-driven resolution, resulting in 90% savings
- 5% revenue increase
Systems we used
- Splunk for Logs
- Zabbix for Systems Monitoring
- SolarWinds for network logs
- Private Cloud
- Java-based microservices/apps
- .NET Apps
- MongoDB (NoSQL)
- Oracle and MySQL (RDBMS)
Challenges
- Getting all tools to feed data, and maintaining proper thresholds to avoid false-positive alerts
- Integration challenge: our AIOps service needed receiver and feeder APIs to a cloud-based solution, which required obtaining security clearance
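
One common way to keep thresholds from producing false-positive alerts is to require several consecutive breaches before alerting. The sketch below shows that idea only; it is not taken from the engagement, and the function name, threshold value, and `min_consecutive` default are assumptions.

```python
def should_alert(samples, threshold, min_consecutive=3):
    """Return True only if the last `min_consecutive` samples all exceed `threshold`.

    A single spike is ignored; a sustained breach raises an alert.
    """
    if len(samples) < min_consecutive:
        return False
    return all(s > threshold for s in samples[-min_consecutive:])
```

With a threshold of 90, a series such as 50, 95, 60, 70 does not alert, while 50, 91, 95, 99 does.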
