Cloud Engineer Interview Questions UK

Cloud Engineer interviews in the UK in 2026 are weighted toward AWS depth (Azure and GCP come up too, but AWS dominates UK enterprise and fintech), infrastructure-as-code fluency (overwhelmingly Terraform, with some Pulumi), production-scale operations and FinOps instinct. Senior Cloud Engineers in London earn £95-130k base, more at fintech and US tech London offices. The 12 questions below are the ones I see in real UK cloud-engineering loops, and the answers reveal whether the candidate has shipped real production cloud or only personal-project cloud. I have written each answer from the recruiter's side: what the panel is testing for, what a strong response looks like, and what mistake immediately ends the conversation.

By Alex · 12-year UK recruiter · 12 questions + recruiter answers

Question 1

Walk me through how you would design a multi-account AWS setup for a UK fintech.

Architecture instinct, scored on regulatory awareness. Strong answers cover: AWS Organizations with the management account isolated, separate accounts for prod, pre-prod, dev, security and logging (centralised audit), SCPs at OU level for guardrails, IAM Identity Center for human access (no long-lived IAM users), Control Tower or an equivalent landing zone, cross-account roles with permission boundaries, VPC peering or Transit Gateway for cross-account networking, and a centralised logging account with object-locked S3 for tamper-evidence. Mention region selection (eu-west-2 London, with eu-west-1 Ireland for DR) and data residency for UK regulated data. Weak candidates describe a single-account setup. The kill-shot is recommending IAM users for human access in 2026.
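If the panel asks what an OU-level guardrail actually looks like, a region-restriction SCP is the easiest one to sketch. The Python below only builds the policy document; it does not call AWS. The exempted service list is an illustrative assumption and is deliberately incomplete, and the function and constant names are hypothetical.

```python
import json

# Assumption: London as primary and Ireland for DR, as in the answer above.
ALLOWED_REGIONS = ["eu-west-2", "eu-west-1"]

# Assumption: an illustrative, incomplete list of global services that need
# exempting because their endpoints sit outside the allowed regions.
GLOBAL_SERVICE_ACTIONS = [
    "iam:*",
    "organizations:*",
    "route53:*",
    "cloudfront:*",
    "sts:*",
    "support:*",
]


def build_region_guardrail(allowed_regions, exempt_actions):
    """Return an SCP document denying requests outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # Everything except the exempted global services is in scope.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


policy = build_region_guardrail(ALLOWED_REGIONS, GLOBAL_SERVICE_ACTIONS)
print(json.dumps(policy, indent=2))
```

A candidate who can explain why the global services need the exemption is showing exactly the production judgement this question is scored on.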
Question 2

Tell me how you would migrate a 200-VM on-premises workload to AWS.

Migration question, scored on pragmatism. Strong answers walk through the 7Rs (rehost, replatform, repurchase, refactor, retire, retain, relocate), discovery first (Application Discovery Service), pilot with a non-critical workload, then waves. For the lift-and-shift workloads, AWS MGN. For databases, DMS with CDC. For network, Direct Connect for production, Site-to-Site VPN for dev. Mention identity migration (AD Connector or AD on EC2), DNS strategy (Route 53 private hosted zones), and parallel-running for cutover. Weak candidates jump to 'we would use AWS MGN for everything'. The kill-shot is not mentioning testing strategy. UK enterprise panels at financial services and government test this scenario specifically.
Question 3

How do you write Terraform that is maintainable at 50-engineer scale?

IaC quality, scored on production realism. Strong answers cover: module structure (separate root modules per environment, shared modules for common patterns, semantic versioning), state management (per-environment remote state in S3 with state locking, via DynamoDB or the S3 backend's native lockfile on recent Terraform versions, and no shared state), pipeline-only apply (no local apply in production), policy-as-code (OPA or Sentinel for guardrails), automated drift detection, and a code review pattern that scales. Mention Atlantis or HCP Terraform (formerly Terraform Cloud) for a PR-based workflow. Weak candidates describe a single state file or local apply. The kill-shot is admitting you have run terraform apply locally against production. UK senior cloud engineering hires are expected to know IaC as a discipline, not a tool.
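To make the policy-as-code point concrete, here is a minimal sketch of a pipeline guardrail written in plain Python rather than OPA or Sentinel. It reads a plan in the shape produced by `terraform show -json` (the `resource_changes` list with `change.actions`) and flags deletions of protected resource types. The protected list, the function name and the sample plan are all hypothetical.

```python
import json

# Assumption: resource types this team refuses to destroy through a normal PR.
PROTECTED_TYPES = {"aws_db_instance", "aws_s3_bucket", "aws_kms_key"}


def find_blocked_changes(plan, protected_types=PROTECTED_TYPES):
    """Return addresses of protected resources that the plan would delete."""
    blocked = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement shows up as both "delete" and "create".
        if "delete" in actions and rc.get("type") in protected_types:
            blocked.append(rc["address"])
    return blocked


# Hand-written sample in the shape of `terraform show -json` output.
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.ledger", "type": "aws_db_instance",
     "change": {"actions": ["delete", "create"]}},
    {"address": "aws_instance.worker", "type": "aws_instance",
     "change": {"actions": ["delete"]}},
    {"address": "aws_s3_bucket.audit", "type": "aws_s3_bucket",
     "change": {"actions": ["update"]}}
  ]
}
""")

blocked = find_blocked_changes(sample_plan)
print(blocked)  # ['aws_db_instance.ledger']
```

In a real pipeline the job would fail the PR when the list is non-empty. The same rule is usually expressed in Rego or Sentinel, but the logic a panel wants to hear is the same.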
Question 4

Walk me through your FinOps approach for a £2m/month AWS spend.

FinOps is increasingly a senior-level UK requirement in 2026. Strong answers cover: tagging strategy (cost allocation by team/product/environment, enforced by SCPs that block untagged resources), Cost Explorer + Athena for analysis, Compute Savings Plans + Reserved Instances for baseline, Spot for stateless workloads, regular rightsizing using Compute Optimizer, idle-resource cleanup automation, and chargeback or showback to product teams. Mention that engineering teams cannot optimise what they do not see. Weak candidates suggest 'turning things off'. The kill-shot is not knowing the difference between Savings Plans and Reserved Instances. UK enterprise and fintech panels test FinOps depth at senior level.
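The tagging strategy is the part candidates most often describe without being able to show. A minimal sketch of a tag-compliance report follows, assuming the three allocation keys named above and a hypothetical in-memory inventory. In practice the data would come from a tagging API or a cost and usage export.

```python
# Assumption: the three cost-allocation keys named in the answer above.
REQUIRED_TAGS = ("team", "product", "environment")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank."""
    return [key for key in required if not resource_tags.get(key, "").strip()]


def untagged_report(resources):
    """Map each non-compliant resource ID to its missing tag keys."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


# Hypothetical inventory for illustration only.
inventory = {
    "i-0aaa": {"team": "payments", "product": "ledger", "environment": "prod"},
    "i-0bbb": {"team": "payments", "environment": "dev"},
    "vol-0ccc": {},
}

report = untagged_report(inventory)
for resource_id, gaps in report.items():
    print(resource_id, "is missing", ", ".join(gaps))
```

A report like this is what feeds showback: spend on resources in the report cannot be attributed to a team, which is the "cannot optimise what they do not see" problem in practice.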
Question 5

How do you approach disaster recovery for a critical UK production workload?

DR question, scored on RPO/RTO realism. Strong answers cover: classify the workload (Tier 0 zero-downtime, Tier 1 minutes, Tier 2 hours), pick a DR pattern that matches (active-active, pilot light, warm standby, backup-and-restore), define RPO and RTO with the business (not the engineering team alone), test DR quarterly with real failover (not paper), and document the runbook. Mention regulatory requirements (UK financial services often require <4h RTO for critical workloads, with documented annual DR test). Weak candidates describe DR as 'we have backups'. The kill-shot is admitting you have never run a real failover test. UK fintech and regulated panels disqualify on this.
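One way to show the "pick a pattern that matches" reasoning is to choose the cheapest pattern whose recovery time fits the agreed RTO. The recovery times and cost ranks below are illustrative assumptions, not benchmarks. Real figures come from your own failover tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrPattern:
    name: str
    typical_rto_minutes: float  # assumption: illustrative only
    relative_cost: int          # 1 = cheapest to run


# Ordered cheapest first, so the first match is the cheapest fit.
PATTERNS = [
    DrPattern("backup-and-restore", 24 * 60, 1),
    DrPattern("pilot light", 60, 2),
    DrPattern("warm standby", 15, 3),
    DrPattern("active-active", 1, 4),
]


def cheapest_pattern_for(rto_target_minutes):
    """Return the cheapest pattern whose typical RTO meets the target."""
    for pattern in PATTERNS:
        if pattern.typical_rto_minutes <= rto_target_minutes:
            return pattern
    # Nothing meets the target: fall back to the fastest pattern available.
    return PATTERNS[-1]


# A 4-hour RTO, the figure quoted above for critical workloads.
choice = cheapest_pattern_for(4 * 60)
print(choice.name)  # pilot light
```

The useful interview point is the trade-off itself: tightening the RTO moves you down the list and up the cost curve, which is why the business has to be in the room when the target is set.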
Question 6

Tell me about a time you cut cloud spend without breaking workloads.

Behavioural with FinOps focus. Strong answers are specific: 'AWS spend was £180k/month and growing 8 percent month-on-month. I tagged every resource, identified that 32 percent of EC2 was idle dev environments running 24/7, automated start-stop with EventBridge for non-prod, switched 60 percent of stateless workloads to Spot, purchased a Compute Savings Plan for the steady-state workload, and rightsized 14 oversized RDS instances. Spend dropped to £108k/month, no production impact, no engineer pushback.' Weak answers describe generic cost work without numbers. The kill-shot is having no FinOps story. Senior UK cloud engineers in 2026 are expected to have at least one real cost-optimisation war story.
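It helps to have the arithmetic behind a story like this at your fingertips, because panels do ask. A quick check of the example figures above follows; the 12-month projection assumes the 8 percent monthly growth would have continued unchanged, which is my assumption rather than part of the story.

```python
def spend_reduction(before, after):
    """Return (absolute monthly saving, saving as a fraction of the old spend)."""
    saved = before - after
    return saved, saved / before


def projected_spend(start, monthly_growth, months):
    """Compound the starting spend forward at a fixed monthly growth rate."""
    return start * (1 + monthly_growth) ** months


monthly_saved, fraction = spend_reduction(180_000, 108_000)
annual_saved = monthly_saved * 12
do_nothing = projected_spend(180_000, 0.08, 12)

print(f"Monthly saving: £{monthly_saved:,}")      # £72,000
print(f"Reduction: {fraction:.0%}")               # 40%
print(f"Annualised saving: £{annual_saved:,}")    # £864,000
print(f"Spend after 12 months of 8% growth: £{do_nothing:,.0f}")
```

Quoting the 40 percent reduction and the annualised figure alongside the monthly numbers makes the story land harder than the raw before-and-after.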
Question 7

How do you handle Kubernetes cost optimisation specifically?

K8s FinOps is a 2026-current topic. Strong answers cover: rightsizing requests/limits based on real usage (Vertical Pod Autoscaler, Goldilocks), node-level optimisation (Karpenter or Cluster Autoscaler with spot pools), horizontal scaling with HPA based on CPU/memory and custom metrics, namespace resource quotas to prevent runaway, and bin-packing efficiency. Mention KubeCost or OpenCost for visibility. Weak candidates describe pod sizing as 'set requests equal to limits'. The kill-shot is not knowing what bin-packing means. UK panels at scale-ups and AI-infra companies test this specifically because Kubernetes cost overruns are common.
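Since not knowing bin-packing is the kill-shot, here is a minimal sketch of what it means. It computes how much of the cluster's allocatable CPU is actually requested, then uses a first-fit-decreasing heuristic to estimate how few nodes the same pods could fit on. The cluster data is hypothetical, only CPU is considered, and real schedulers also weigh memory, affinity and disruption budgets.

```python
def packing_efficiency(nodes):
    """Fraction of allocatable CPU that is covered by pod CPU requests."""
    total_alloc = sum(n["allocatable_cpu"] for n in nodes)
    total_req = sum(sum(n["pod_cpu_requests"]) for n in nodes)
    return total_req / total_alloc if total_alloc else 0.0


def first_fit_decreasing(requests, capacity):
    """Estimate the nodes needed, assuming every request fits on one node."""
    free = []  # remaining CPU on each node opened so far
    for request in sorted(requests, reverse=True):
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request
                break
        else:
            # No existing node has room, so open a new one.
            free.append(capacity - request)
    return len(free)


# Hypothetical three-node cluster, CPU in cores.
cluster = [
    {"name": "node-a", "allocatable_cpu": 8.0, "pod_cpu_requests": [2.0, 2.0, 1.0]},
    {"name": "node-b", "allocatable_cpu": 8.0, "pod_cpu_requests": [1.0]},
    {"name": "node-c", "allocatable_cpu": 8.0, "pod_cpu_requests": [0.5, 0.5]},
]

efficiency = packing_efficiency(cluster)
all_requests = [r for n in cluster for r in n["pod_cpu_requests"]]
nodes_needed = first_fit_decreasing(all_requests, capacity=8.0)

print(f"Packing efficiency: {efficiency:.0%}")
print(f"Nodes needed if repacked: {nodes_needed} of {len(cluster)}")
```

Here 7 cores of requests are spread across 24 cores of capacity, and the same pods would fit on a single node. That gap is what node consolidation in Karpenter or Cluster Autoscaler is meant to close.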
Question 8

Walk me through how you would secure an AWS account that handles UK customer PII.

Cloud security with regulatory awareness. Strong answers cover: data residency (eu-west-2 London, no replication outside the UK without legal review), encryption at rest (KMS customer managed keys, S3 default encryption), encryption in transit (TLS 1.3, no plaintext), IAM least privilege with permission boundaries, GuardDuty + Macie for detection, CloudTrail delivered to the logging account with S3 Object Lock, VPC flow logs, secrets in Secrets Manager not env vars, and quarterly, audit-logged access reviews. Mention UK GDPR alignment (Article 32 technical and organisational measures). Weak candidates describe encryption alone. The kill-shot is recommending us-east-1 for UK customer PII. UK regulated panels test this scenario constantly.
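For the "no plaintext" point, a common control is a bucket policy that denies any request not made over TLS, using the `aws:SecureTransport` condition key. The sketch below only builds the policy document; the bucket name and function name are hypothetical. Enforcing a minimum TLS version would need a further condition that is not shown here.

```python
import json


def deny_plaintext_policy(bucket_name):
    """Return an S3 bucket policy that denies requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# Hypothetical bucket name for illustration only.
tls_policy = deny_plaintext_policy("example-uk-pii-bucket")
print(json.dumps(tls_policy, indent=2))
```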
Question 9

How do you approach observability for cloud-native workloads?

Observability question, scored on production realism. Strong answers cover: the three pillars (logs, metrics, distributed traces) with OpenTelemetry as the 2026 standard, structured logging with request IDs end-to-end, RED method metrics (rate, errors, duration) per service, USE method (utilisation, saturation, errors) for infrastructure, SLOs and error budgets with alerting on burn rate (not raw threshold), and a dashboard culture that engineers actually look at. Mention real tools (Datadog, Grafana Cloud, Honeycomb, Prometheus). Weak candidates describe CloudWatch alone. The kill-shot is alerting on every metric. UK senior cloud engineers know signal-to-noise is the discipline that matters.
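Burn-rate alerting is the item here that candidates most often name without being able to explain. A minimal sketch: burn rate is the observed error ratio divided by the error budget (1 minus the SLO), and a page fires only when both a long and a short window exceed the threshold. The 14.4 threshold is a commonly cited example value for a 1-hour window against a 30-day SLO; treat it and the function names as assumptions.

```python
def burn_rate(error_ratio, slo_target):
    """How many times faster than allowed the error budget is being spent."""
    budget = 1.0 - slo_target
    return error_ratio / budget


def should_page(long_window_ratio, short_window_ratio, slo_target, threshold=14.4):
    """Page only if both windows burn at or above the threshold.

    The short window stops the alert firing long after the problem is over.
    """
    return (
        burn_rate(long_window_ratio, slo_target) >= threshold
        and burn_rate(short_window_ratio, slo_target) >= threshold
    )


SLO = 0.999  # 99.9 percent availability, so a 0.1 percent error budget

# 2 percent errors in both windows: burning roughly 20x budget, so page.
print(should_page(0.02, 0.02, SLO))

# The long window is still elevated but the short window has recovered.
print(should_page(0.02, 0.0005, SLO))
```

Compare this with a raw threshold such as "alert if errors exceed 1 percent": the burn-rate version ties the alert to how quickly the SLO is at risk, which is the signal-to-noise discipline the panel is probing.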
Question 10

Tell me about a production incident you led and what you changed afterwards.

Operations question. Strong answers describe a specific incident: 'Production Postgres failover did not complete cleanly during a maintenance window. RTO was 90 minutes. I led the response, restored from PITR, identified that the read replica was not configured with adequate WAL retention. Afterwards, I introduced quarterly DR drills, automated WAL retention monitoring, and ran a blameless post-mortem that surfaced three other latent failure modes. We have not had a similar incident in 18 months.' Weak answers describe incidents without follow-up. The kill-shot is describing an incident with no learning. UK panels at senior level test for the operations-as-discipline mindset.
Question 11

How do you keep up with AWS releases without being overwhelmed?

Process question, reveals judgement. Strong answers describe a filter: subscribe to the AWS What's New RSS feed, follow the official AWS blog and Werner Vogels' blog, watch the re:Invent keynotes, follow specific AWS Heroes, and evaluate new services by the problem they solve, not their novelty. Mention that you wait 6-12 months before adopting new AWS services in production unless they fill a real gap. Weak candidates name-drop services or claim to use everything. The kill-shot is admitting you do not look at AWS releases at all. UK senior cloud engineers in 2026 are expected to filter the AWS release firehose intelligently; following too many signals or too few are both red flags.
Question 12

Why are you leaving your current role?

Standard closer. Strong answers are forward-looking: you want bigger scope (multi-region, multi-account at greater scale), you want to lead the cloud function, you want a more 2026-current stack (container-first or serverless-first if your current is VM-heavy), you want a regulated environment if your current is unregulated, or vice versa. Weak answers attack your current employer or focus on salary alone. The kill-shot is bad-mouthing your current architecture team. UK cloud engineering is a small community in London and Manchester especially; everyone interviewing you knows the architect you are complaining about. Stay forward-looking. The panel wants reassurance you will not be making the same complaint about them in 18 months.

How to use these answers

Cloud Engineer interviews in the UK in 2026 reward depth on AWS specifically (Azure and GCP at major UK enterprises too), Terraform fluency at scale, FinOps instinct, and operations stories with real numbers. The single biggest mistake I see is candidates listing every cloud service they have touched without depth in any one area; UK panels test for production-scale fluency, not breadth. Prep with three real shipped systems you can talk through end-to-end (the architecture decisions, the cost trade-offs, the operational learnings). Practise the multi-account architecture round using real organisations you have built. And make sure your operations stories include real numbers (RTO/RPO, MTTR, cost impact). UK senior cloud hires get the salary premium because they earn it on production judgement, regulatory awareness and FinOps as much as on architecture skill.
