13+ years of experience designing and developing systems with SLA requirements.
I calculate and audit SLAs for business-critical applications.
I will help turn availability requirements into a working architecture.
Development of an AI data preparation component for an advertising budget optimization system used by a Fortune 100 company. Creating static checks for uploaded data and optimizing processing workflows.
Development and maintenance of a Real-Time Bidding system for advertising auctions. Partner integration components, latency and CPU optimization. The system handles millions of requests daily with strict SLA requirements.
Creation of a Change Data Capture system with low change-delivery latency. Architectural and code optimizations for real-time streaming data processing, improving throughput and reducing latency.
Development of an electronic document management and automatic order forecasting system for the largest retail chain. Integration with the existing ERP system and process optimization.
I help businesses formalize availability and performance requirements and translate them into the language of architecture and metrics. I calculate realistic SLAs and build systems that meet them.
Expertise in 8+ programming languages. Selecting the optimal stack for business tasks considering SLA requirements.
Designing systems for fault tolerance, performance, and scalability. Ensuring SLA compliance at all levels.
Teaching industrial programming at HSE University. Mentoring students from leading technical universities.
Identifying business-critical scenarios, formalizing target metrics (Availability, Latency, Throughput, RTO/RPO). Defining realistic SLA goals based on business needs.
Identifying single points of failure and bottlenecks, and reviewing redundancy and monitoring. Analyzing current metrics and compliance with stated SLAs.
Calculating a realistic SLA and developing a roadmap for improving system reliability. Assessing the cost of providing different SLA levels.
Training your team on SLA principles, metrics, and monitoring tools so that it can maintain and improve system availability independently.
If needed, developing and implementing a solution that meets the SLA, and setting up monitoring and alerting. Load testing and Chaos Engineering to verify fault tolerance.
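As a rough illustration of what "calculating a realistic SLA" involves, the sketch below shows three standard availability formulas: the downtime budget implied by a target, the combined availability of components that must all be up, and the availability of a redundant group. It assumes component failures are independent, which real systems only approximate. The function names and the sample figures are hypothetical and are not taken from any project listed here.

```python
from math import prod

# Assumption: a 30-day reporting period (43,200 minutes).
MINUTES_PER_30_DAYS = 30 * 24 * 60


def downtime_budget_minutes(availability, period_minutes=MINUTES_PER_30_DAYS):
    """Allowed downtime in minutes for an availability target, e.g. 0.999."""
    return (1.0 - availability) * period_minutes


def serial_availability(components):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the availabilities multiply.
    """
    return prod(components)


def redundant_availability(replica_availability, replicas):
    """Availability of a group that is up while at least one replica is up.

    Assumes independent failures and instant failover (both optimistic).
    """
    return 1.0 - (1.0 - replica_availability) ** replicas


if __name__ == "__main__":
    # Hypothetical request path: API -> queue -> database.
    api, queue, db = 0.999, 0.9995, 0.999

    single_db = serial_availability([api, queue, db])
    mirrored_db = serial_availability([api, queue, redundant_availability(db, 2)])

    for label, value in [("single DB", single_db), ("mirrored DB", mirrored_db)]:
        budget = downtime_budget_minutes(value)
        print(f"{label}: {value:.5f} availability, {budget:.1f} min/30 days")
```

The serial formula shows why a system usually cannot promise more than its weakest dependency, and the redundant formula shows where replication buys availability back. Correlated failures, failover time, and planned maintenance all reduce the real figure, so numbers like these are best treated as an upper bound.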