Offshoring DevOps/SRE roles to Poland for Canadian Manufacturing

manufacturing

Company name: Non-disclosed (marketing approval pending)

Company size: 1000+

Services provided: SRE adoption

Duration of project: ongoing

Engineers (FTE): 150+

Technology: Azure, AKS, REST API + Events, including Azure Functions on K8s

The Problem

The client, a US-based fleet safety and manufacturing company with over 1,000 employees, struggled to scale its cloud operations. For several months, they attempted to hire a Site Reliability Engineering (SRE) Lead in Vancouver, Canada, but could not find the right talent locally.

The company also faced a vendor monopoly, with 80% of its R&D outsourced to a single provider in Colombia. This limited their innovation and created communication bottlenecks across their portfolio of 70+ digital products. They needed a strategic partner to diversify their talent pool and establish modern SRE standards.

The Strategy: Talent as a Service (TaaS)

Relout used its Talent as a Service model to provide a solution that went beyond traditional recruitment.

  • Engineer-to-Engineer Vetting: The recruitment process was led by Relout’s founder, Gerard Stańczak, an SRE expert, who personally verified the technical skills and seniority of candidates. This technical oversight was a key factor in winning the client’s trust over standard HR agencies.
  • Night Shift Sourcing: Because the client operates in Pacific Standard Time (PST), the Polish lead had to work until 1:00 AM CET. Relout specifically recruited senior engineers whose personal lifestyles allowed them to work these hours long-term.
  • Budgeting Realities (Poland vs. LATAM): Relout used market data to show that while Polish rates (approx. $56/h) were higher than in Colombia, the technical maturity and efficiency of Polish talent provided better annual value, even when accounting for more vacation days and holidays in Poland.

Technical Focus: Managing 20 Gb/s Data and 70+ Products

The technical environment was large: Azure Government and Commercial clouds, production AKS clusters of 60 to 70 nodes, and a Kafka instance handling 20 Gb/s of peak ingestion.

Key SRE implementations included:

  • Incident Management: Relout built a new incident response system from scratch, using War Rooms and dedicated Slack channels to centralize communication.
  • Blameless Culture: They introduced Blameless Post-Mortems to focus on root cause analysis and actionable items rather than finger-pointing.
  • Capacity Planning: A custom framework was created for the Data Ingestion System (DIS) to manage resource allocation for microservices using KEDA for autoscaling.
  • Business Observability: Relout moved beyond basic technical metrics to create Grafana dashboards that allowed stakeholders to track business health, such as daily IoT file processing volumes.

To ensure the transition was not just reactive, Relout developed a structured SRE Roadmap for 2026.

[Figures: Strategy SRE pyramid; Postmortems and blameless culture; Incident Management High Level]
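To make the capacity-planning point more concrete, here is a minimal sketch of how lag-based autoscaling of a Kafka consumer tends to behave. It is not the client's framework. It only approximates the behaviour of KEDA's Kafka scaler, which drives the replica count toward roughly the total consumer lag divided by a lag threshold and, by default, does not scale past the number of topic partitions. The function name, parameter names, and sample numbers are all hypothetical.

```python
import math


def desired_replicas(total_lag, lag_threshold, min_replicas, max_replicas, partitions):
    """Approximate the replica count a lag-based autoscaler would aim for.

    All names and values are illustrative assumptions, not taken from the
    client's Data Ingestion System.
    """
    if lag_threshold <= 0:
        raise ValueError("lag_threshold must be positive")
    # One replica per `lag_threshold` messages of backlog, rounded up.
    wanted = math.ceil(total_lag / lag_threshold)
    # Extra consumers beyond the partition count would sit idle.
    wanted = min(wanted, partitions)
    # Clamp to the configured floor and ceiling.
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    # Hypothetical backlog of 12,500 messages with a threshold of 1,000 per replica.
    print(desired_replicas(12_500, 1_000, min_replicas=2, max_replicas=30, partitions=24))
```

A calculation like this is useful in capacity planning because it shows when the ceiling is set by the partition count or the replica limit rather than by the backlog itself.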

Operational Success: Team Integration and 8-Hour Overlap

The success of the first SRE Lead role allowed Relout to build a “Team of Five” within six months, including Cloud Leads and DevOps engineers.

  • Managing the Time Difference: Relout optimized the work schedule to maintain a 50% to 75% overlap with the US teams. This ensured critical communication happened during the 3–4 hours when all global teams were online.
  • Relout-Funded Integration: As part of the TaaS model, Relout paid for team integration events – such as team dinners and social syncs – to build a cohesive unit. This helped reduce turnover and improved the team’s internal working relationship at no cost to the client.
  • Frequent Communication: The team participated in many technical and consulting calls, often after 9:00 PM CET, to overcome resistance from the previous vendor and ensure smooth knowledge transfers.
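The overlap figures above depend on which working hours are assumed on each side. The sketch below shows one way to compute the overlap between a Polish evening shift and a Pacific-time working day. The fixed UTC offsets for CET and PST, the 17:00 to 01:00 shift, the 09:00 to 17:00 client day, and the calendar date are assumptions chosen for illustration. Only the 1:00 AM CET end time comes from the case study.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the two zones named in the text (no daylight saving handling).
CET = timezone(timedelta(hours=1), "CET")
PST = timezone(timedelta(hours=-8), "PST")


def overlap_hours(a_start, a_end, b_start, b_end):
    """Return the number of hours two timezone-aware intervals share."""
    latest_start = max(a_start, b_start)
    earliest_end = min(a_end, b_end)
    return max((earliest_end - latest_start).total_seconds() / 3600, 0.0)


# Hypothetical schedule: Polish shift 17:00-01:00 CET, client day 09:00-17:00 PST.
poland_start = datetime(2025, 1, 15, 17, 0, tzinfo=CET)
poland_end = datetime(2025, 1, 16, 1, 0, tzinfo=CET)
client_start = datetime(2025, 1, 15, 9, 0, tzinfo=PST)
client_end = datetime(2025, 1, 15, 17, 0, tzinfo=PST)

shared = overlap_hours(poland_start, poland_end, client_start, client_end)
client_day = (client_end - client_start).total_seconds() / 3600

if __name__ == "__main__":
    print(f"Shared hours: {shared:.1f} of {client_day:.1f} ({shared / client_day:.0%})")
```

Shifting either schedule changes the result, so the same function can be used to compare candidate shift patterns before committing a team to them.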

Result and Outcome

In eight months, the transition to an SRE-driven culture delivered major improvements:

  • Service Uptime: All key services reached an uptime of 99.5% or higher.
  • Stability: The monthly incident count was reduced by 50% within four months.
  • Efficiency: Alert noise was cut by 25%, and the team reached a state of zero active unresolved incidents, clearing a long-standing backlog.
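For context, an uptime target translates directly into an allowed-downtime budget. The short sketch below does that arithmetic for the 99.5% figure. The function name and the 30-day and 365-day windows are illustrative assumptions, and the case study does not state which measurement window the client uses.

```python
def allowed_downtime_minutes(slo_percent, window_days):
    """Minutes of downtime permitted by an uptime target over a window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - slo_percent) / 100.0


if __name__ == "__main__":
    monthly = allowed_downtime_minutes(99.5, 30)
    yearly = allowed_downtime_minutes(99.5, 365)
    print(f"30 days:  {monthly:.0f} min ({monthly / 60:.1f} h)")
    print(f"365 days: {yearly:.0f} min ({yearly / 60:.1f} h)")
```

At 99.5%, a service may be unavailable for about 3.6 hours in a 30-day window, or about 43.8 hours over a year.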



Our mission is to connect best-in-class, passionate engineers with fast-growing digital & technology companies.

Gerard Stańczak

Founder
