Rohan Nayar

Senior Site Reliability Engineer
Gurugram, India

About

AWS Certified Solutions Architect - Associate with 6 years of experience in Site Reliability Engineering, specializing in architecting and optimizing scalable, reliable, and cost-efficient cloud infrastructure. Proven record of automating zero-downtime deployments, implementing self-healing systems, and modernizing CI/CD pipelines across large-scale distributed environments. Applies deep proficiency in AWS, Terraform, Kubernetes, and Python to improve microservice reliability and developer productivity.

Work

Cvent

Senior Site Reliability Engineer

Gurugram, Haryana, India

Summary

Led the architecture, migration, and enhancement of critical cloud infrastructure, focusing on reliability, security, and efficiency across large-scale distributed environments.

Highlights

Architected a blue-green Couchbase patching framework using Jenkins, Python, Chef, and AWS CDK v2, achieving zero downtime across 8 clusters and cutting manual effort by 85%.

Spearheaded migration of RabbitMQ clusters from Classic to Quorum queues, enhancing message persistence and reducing failover recovery time by ~40%.

Designed self-healing infrastructure with AWS Lambda and Step Functions to detect and replace unhealthy EC2 nodes, reducing recovery time from hours to under 10 minutes.

Enhanced microservices reliability by standardizing health checks, scaling policies, and Datadog dashboards, reducing MTTR by 30% and alert noise by 50%.

Reduced AWS costs by 22% through compute and storage right-sizing while maintaining 99.9% availability.

Credex Technology

Senior Software Engineer (DevOps)

Noida, Uttar Pradesh, India

Summary

Drove significant improvements in CI/CD, containerization, and infrastructure provisioning, enhancing deployment efficiency and system stability.

Highlights

Built CI/CD pipelines with Jenkins, Octopus Deploy, and GitHub Actions, reducing deployment lead time by 60%.

Containerized legacy Java/PHP applications using Docker and Kubernetes, improving environment parity and reducing environment drift by 75%.

Automated performance testing using Python and PowerShell, with results visualized via Elasticsearch and Kibana dashboards.

Provisioned AWS infrastructure using Terraform and CloudFormation, reducing disaster recovery time by 90%.

Managed and optimized RabbitMQ clusters on Linux, implementing HA queues and monitoring with Prometheus exporters.

TensorIoT

Software Engineer

Bangalore, Karnataka, India

Summary

Developed and optimized serverless IoT data pipelines and migrated critical infrastructure, significantly reducing operational costs and improving system reliability.

Highlights

Developed serverless IoT data pipelines using AWS IoT Core, Lambda, API Gateway, and Chalice, reducing operational costs by 40%.

Migrated the entire Skybell Doorbell infrastructure from CloudFormation to Terraform, eliminating drift and improving maintainability across multiple AWS accounts.

Implemented secure OTA firmware delivery with AWS Greengrass, achieving a 99.8% update success rate.

Delivered over 200 QuickSight dashboards for real-time IoT analytics, enhancing data visibility.

Education

Amity University
Noida, Uttar Pradesh, India

B.Tech

Mechanical Engineering

Certificates

AWS Certified Solutions Architect - Associate

Issued By

AWS

Skills

Databases & Messaging

Couchbase, PostgreSQL, RabbitMQ.

Operating Systems

Linux Administration.

Reliability Engineering

Incident Management, Self-healing Systems, Zero-downtime Deployments, Microservice Reliability.

Security

AWS Secrets Manager, RBAC, Automated Policy Enforcement.

Cloud Platforms

AWS (EC2, ECS, Lambda, CDK v2, CloudFormation, IoT Core, API Gateway, Greengrass, DynamoDB).

Infrastructure as Code (IaC)

Terraform, CloudFormation, Ansible, Chef, AWS CDK v2, Atlantis, ExternalSecrets.

Containerization & Orchestration

Kubernetes, Docker.

CI/CD & DevOps Tools

Jenkins, GitHub Actions, GitLab CI, Octopus Deploy, GitOps.

Programming & Scripting

Python, Bash, TypeScript, PowerShell.

Monitoring & Alerting

Datadog, Prometheus, SLO/SLI, Elasticsearch, Kibana.