Armstrong Uzoagwa

Site Reliability Engineer  ·  Cloud Platform Engineer  ·  Senior DevOps

Profile

Site Reliability and Cloud Platform Engineer with 5 years of hands-on experience building, automating, and operating production infrastructure on AWS and GCP. Comfortable across the full stack of platform work: from writing Kubernetes cluster configurations and Terraform modules to debugging overnight incidents and documenting the runbooks that prevent the next one. Has worked with global teams across multiple time zones and supported over 200 customers on cloud infrastructure decisions. Focused on building systems that stay up, recover fast, and cost less to run.

Experience

Senior Platform Engineer / Site Reliability Engineer  ·  Jan 2024 to Jun 2025
Silvershell Integrated Consulting
  • Designed and deployed a full observability stack using Prometheus, Grafana, and Loki across production environments, cutting mean time to recovery by 40% and enabling the team to catch issues before they became incidents.
  • Wrote Python automation scripts integrated with REST APIs that reduced environment provisioning time by 50%, replacing a manual process that had been a consistent bottleneck across teams.
  • Managed Kubernetes clusters running microservice workloads at 99.9% uptime, implementing automated health checks, rollback strategies, and capacity planning to keep services stable under load.
  • Built enterprise CI/CD pipelines in Harness and Jenkins with SAST and DAST security scanning baked in, improving release velocity by 35% while keeping production deployments safe through canary rollouts.
  • Led the organisation-wide migration from GitLab to GitHub, creating reusable workflow templates that cut pipeline maintenance effort across multiple teams.
  • Worked alongside CloudOps, Security, and Compliance teams across time zones to define DevSecOps governance standards and translate those requirements into practical infrastructure decisions.
  • Designed a Kubernetes migration strategy for hybrid workloads, improving deployment consistency between cloud and on-premises clusters using Terraform and Ansible automation.
DevOps Engineer (SRE Focus)  ·  Mar 2021 to Jul 2023
Darey.io / Remote
  • Built and maintained over 30 CI/CD pipelines using Jenkins and GitLab, integrating SonarQube for code quality gates and Artifactory for artifact management across multiple environments.
  • Set up Prometheus and Grafana monitoring infrastructure with actionable alerting dashboards, giving the operations team visibility they did not previously have across the platform.
  • Reduced average incident response time by 20% by automating diagnostic steps, writing structured runbooks, and standardising how the team approached troubleshooting.
  • Deployed and managed Kubernetes microservices using Helm charts, configuring resource limits, autoscaling policies, and monitoring across development, staging, and production clusters.
  • Provided technical guidance and cloud infrastructure support to over 200 customers on AWS and GCP, handling requirements gathering and translating business needs into working infrastructure.
  • Developed Python automation scripts for deployment workflows, infrastructure provisioning, and routine operational tasks to reduce manual effort across the team.
Junior DevOps Engineer  ·  Feb 2020 to Jan 2021
Cloud DXA / Contract
  • Automated AWS EC2 provisioning with Terraform and Python, taking setup time from one hour to under ten minutes and removing a category of manual configuration errors entirely.
  • Reduced AWS infrastructure costs by 20% by identifying idle resources and implementing automated monitoring to flag waste before it accumulated.
  • Built GitLab CI/CD pipelines with automated testing and deployment workflows, improving how quickly the development team could ship and validate code.
  • Deployed Prometheus and Grafana monitoring with dashboards covering CPU, memory, disk, and network across the infrastructure.
Linux Systems Administrator  ·  May 2018 to Jan 2021
Asendia UK / Part-Time
  • Administered Linux (Ubuntu), VMware, and Windows Server environments in enterprise production, keeping systems reliable and compliant with security requirements.
  • Automated routine administration tasks using Ansible playbooks and roles, reducing the manual workload and making deployments consistent and repeatable.
  • Monitored system performance and capacity metrics to stay ahead of resource constraints, resolving issues before they affected uptime.
  • Coordinated patch management and troubleshooting across production to maintain security compliance and minimise unplanned downtime.

Personal Projects

LiwoxDotNet  ·  Digital services platform: web development, cloud infrastructure, DevOps and automation. Lighthouse 95+, full observability stack, automated CI/CD.
Astro  ·  TypeScript  ·  Tailwind  ·  Cloudflare  ·  Terraform
Forex Delight  ·  Live algorithmic trading platform with a custom MQL5 Expert Advisor running a multi-timeframe stochastic strategy. Profit-sharing model, user dashboard. Author of Scaling Micro Accounts in Forex.
MQL5  ·  Expert Advisor  ·  Astro  ·  Cloudflare
Smart Living  ·  Lifestyle and home content platform, independently built and deployed to Cloudflare's edge network.
Astro  ·  TypeScript  ·  Tailwind  ·  Cloudflare
Construction  ·  Content platform serving the building and property trades sector.
Astro  ·  TypeScript  ·  Tailwind  ·  Cloudflare
Music  ·  Original music productions and curated inspirational playlists.
Astro  ·  TypeScript  ·  Tailwind  ·  Cloudflare

Key Results

40%
MTTR Reduction
Via Prometheus, Grafana, and Loki observability stack across production.
99.9%
Production Uptime
Sustained on Kubernetes workloads through proactive monitoring and automated failover.
85%
Faster Provisioning
EC2 setup reduced from 60 minutes to under 10 using Terraform and Python.
35%
Release Velocity
Faster deployments through automated CI/CD with safety controls built in.
20%
Cost Reduction
AWS spend reduced through automated resource monitoring and budget controls.
200+
Customers Supported
Cloud infrastructure guidance on AWS and GCP across multiple industries.