Nicholas Kilby MCP
Shropshire
About Me
Modern, forward-thinking Platform & Infrastructure Engineer with over a decade of experience building and leading high-scale, high-availability systems across multi-site datacentres and hybrid cloud environments. I specialise in automation-first engineering, distributed platforms, and operational leadership, pairing a hands-on mindset with strategic oversight.
I’ve led infrastructure for a large self-hosted SaaS platform powering universities, police forces, and public bodies across Europe. My work combines deep technical capability (VMware, Kubernetes, Ansible, ZFS, distributed storage) with team leadership, vendor strategy, datacentre optimisation, and reliability practices.
I’m now seeking Platform Engineering Manager, Infrastructure Manager, SRE Manager, or General IT Manager roles where I can lead teams, own platform direction, and drive engineering excellence in a fast-moving environment.
What I’m Good At
Career Highlights
Built and led large-scale platform infrastructure
Expanded and managed a 3k+ VM / 50-node vSphere estate across multiple datacentres and AWS, supporting public-sector and enterprise workloads.
Engineered distributed, self-healing systems
Delivered production Kubernetes, CockroachDB, MinIO, ClickHouse and other clustered components powering mission-critical services.
Agentic development projects
- Bespoke ZFS backup automation using snapshot intelligence and replication logic.
- Adaptive storage orchestration tools built using Python and custom Ansible modules.
- Automated datacentre decision-making workflows (load-aware and temperature-aware operations).
Cut datacentre cooling energy usage by up to 75%
Designed and built an innovative free-air cooling system to significantly reduce energy consumption during cooler periods — saving thousands annually and reducing environmental impact.
Introduced full observability
Defined and deployed a Prometheus, Grafana and Loki stack, establishing alerting, metrics, and real-time diagnostics across the platform.
Experience
Zengenti — Platform Engineer / Hosting Operations Engineer
Lead platform engineer within a high-scale self-hosted SaaS environment, delivering infrastructure, reliability, and automation across multi-site datacentres.
Leadership & management
- Technical lead for day-to-day infrastructure operations, mentoring engineers and aligning priorities with product and leadership teams.
- ISO 27001 audit lead for the hosting team, implementing processes and controls.
- Vendor-side lead for hardware strategy, negotiation, and lifecycle planning.
Engineering & delivery
- Managed a multi-site 3k+ VM estate running VMware vSphere, Windows and Linux.
- Architected and deployed clusters: Kubernetes, CockroachDB, MinIO, ClickHouse.
- Designed ZFS storage with replication, snapshot workflows, and agentic backup tooling.
- Delivered Infrastructure as Code using Ansible, Packer and Python (including custom modules).
- Owned end-to-end datacentre operations including power, cooling, capacity planning, and hardware optimisation.
- Introduced full observability using Prometheus, Grafana and Loki, improving MTTR and transparency.
- Reduced cooling energy consumption by up to 75% by designing and building a free-air cooling system.
Kingspan Insulation — Technical Services Engineer
Provided global infrastructure support including oversight of Northern UK sites and remote support for US facilities. Gained international project experience and strengthened cross-cultural operations.
SimplyWISP Ltd — Director & Co-Founder
Founded and operated a rural broadband ISP providing wireless connectivity to underserved areas. Built the network, managed operations, handled budgets, and led customer delivery, gaining commercially focused leadership experience.
S&A Produce (UK) Ltd — IT Technician
Delivered infrastructure, network, AD/Exchange, VPN, Wi-Fi, and ERP/MRP support to a £50m business across multiple UK sites.
Earlier Roles (2001–2010)
Technical Specialist, Planner, Administrator and Events Management roles providing broad operational and organisational experience.
Community & Leadership
Trustee — Ashford Carbonell Village Hall & Recreation Ground
Supporting governance, modernisation efforts and community digital strategy.
Technical Skills
Infrastructure: VMware vSphere, Windows Server, Ubuntu, Storage, Hardware
Distributed systems: CockroachDB, MinIO, Kubernetes, ClickHouse
Automation: Ansible, Packer, Python, CI/CD
Networking: Juniper, Dell, Firewalls, PtP wireless
Observability: Prometheus, Grafana, Loki, Zabbix
Cloud: AWS, hybrid environments
Other: SQL, Git, datacentre design, capacity modelling, budgeting, team management