Company name: Non-disclosed (marketing approval pending)
Company size: 1000+
Services provided: SRE adoption
Duration of project: ongoing
Engineers (FTE): 150+
Technology: Azure, AKS, REST API + Events, including Azure Functions on K8s
The Problem
The client, a US-based fleet safety and manufacturing company with over 1,000 employees, struggled to scale its cloud operations. For several months, they attempted to hire a Site Reliability Engineering (SRE) Lead in Vancouver, Canada, but could not find the right talent locally.
The company also faced a vendor monopoly, with 80% of its R&D outsourced to a single provider in Colombia. This limited their innovation and created communication bottlenecks across their portfolio of 70+ digital products. They needed a strategic partner to diversify their talent pool and establish modern SRE standards.
The Strategy: Talent as a Service (TaaS)
Relout used its Talent as a Service model to provide a solution that went beyond traditional recruitment.
- Engineer-to-Engineer Vetting: The recruitment process was led by Relout’s founder, Gerard Stańczak, an SRE expert, who personally verified the technical skills and seniority of candidates. This technical oversight was a key factor in winning the client’s trust over standard HR agencies.
- Night Shift Sourcing: Because the client operates in Pacific Standard Time (PST), the Polish lead had to work until 1:00 AM CET. Relout specifically recruited senior engineers whose personal lifestyles allowed them to work these hours long-term.
- Budgeting Realities (Poland vs. LATAM): Relout used market data to show that while Polish rates (approx. $56/h) were higher than in Colombia, the technical maturity and efficiency of Polish talent provided better annual value, even when accounting for more vacation days and holidays in Poland.
Technical Focus: Managing 20 Gb/s Data and 70+ Products
The technical environment was large: Azure Government and Commercial clouds, production AKS clusters of 60–70 nodes, and a Kafka instance handling 20 Gb/s of peak ingestion.
Key SRE implementations included:
- Incident Management: Relout built a new incident response system from scratch, using War Rooms and dedicated Slack channels to centralize communication.
- Blameless Culture: They introduced Blameless Post-Mortems to focus on root cause analysis and actionable items rather than finger-pointing.
- Capacity Planning: A custom framework was created for the Data Ingestion System (DIS) to manage resource allocation for microservices using KEDA for autoscaling.
- Business Observability: Relout moved beyond basic technical metrics to create Grafana dashboards that allowed stakeholders to track business health, such as daily IoT file processing volumes.
To ensure the transition was not just reactive, Relout developed a structured SRE Roadmap for 2026.
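The capacity-planning item above can be illustrated with a rough sketch of how a lag-driven autoscaler sizes a consumer deployment. It applies the Kubernetes HPA rule for average-value targets (desired replicas = ceil(total metric / per-replica target)), which is the mechanism KEDA feeds with external metrics. The function name, the lag threshold, the replica bounds, and the optional partition cap are illustrative assumptions, not values from the client's DIS framework.

```python
import math


def desired_replicas(total_lag, lag_per_replica,
                     min_replicas=1, max_replicas=50, partitions=None):
    """Estimate the replica count for a lag-driven consumer deployment.

    All parameters are hypothetical planning inputs:
    total_lag       -- messages currently waiting across the topic
    lag_per_replica -- lag one replica is expected to absorb (the target)
    partitions      -- optional cap, since extra Kafka consumers beyond the
                       partition count would normally sit idle
    """
    if lag_per_replica <= 0:
        raise ValueError("lag_per_replica must be positive")

    # HPA-style rule: round up so the target is never exceeded per replica.
    wanted = math.ceil(total_lag / lag_per_replica)

    if partitions is not None:
        wanted = min(wanted, partitions)

    # Clamp to the configured floor and ceiling.
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    print(desired_replicas(total_lag=12_000, lag_per_replica=1_000, partitions=32))
```

Running the same function over expected peak and off-peak lag figures gives a quick view of how many nodes a microservice may need at each end of the range.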
Operational Success: Team Integration and 8-Hour Overlap
The success of the first SRE Lead role allowed Relout to build a “Team of Five” within six months, including Cloud Leads and DevOps engineers.
- Managing the Time Difference: Relout optimized the work schedule to maintain a 50% to 75% overlap with the US teams. This ensured critical communication happened during the 3–4 hours when all global teams were online.
- Relout-Funded Integration: As part of the TaaS model, Relout paid for team integration events – such as team dinners and social syncs – to build a cohesive unit. This helped reduce turnover and strengthened working relationships within the team, at no cost to the client.
- Frequent Communication: The team participated in many technical and consulting calls, often after 9:00 PM CET, to overcome resistance from the previous vendor and ensure smooth knowledge transfers.
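The schedule overlap described above is simple interval arithmetic. The sketch below computes it for one example day, using the fixed PST (UTC-8) and CET (UTC+1) offsets named in this case study. The specific working hours are hypothetical, chosen only for illustration, and daylight saving time is ignored.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as named in the text; DST is deliberately ignored here.
PST = timezone(timedelta(hours=-8), "PST")
CET = timezone(timedelta(hours=1), "CET")


def overlap_hours(start_a, end_a, start_b, end_b):
    """Return the hours shared by two timezone-aware intervals (0 if none)."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max((earliest_end - latest_start).total_seconds() / 3600, 0.0)


# Hypothetical schedules: a 10:00-18:00 US day and a 17:00-01:00 Polish shift.
us_start = datetime(2025, 1, 15, 10, 0, tzinfo=PST)
us_end = datetime(2025, 1, 15, 18, 0, tzinfo=PST)
pl_start = datetime(2025, 1, 15, 17, 0, tzinfo=CET)
pl_end = datetime(2025, 1, 16, 1, 0, tzinfo=CET)

shared = overlap_hours(us_start, us_end, pl_start, pl_end)
us_day = (us_end - us_start).total_seconds() / 3600

if __name__ == "__main__":
    print(f"{shared:.1f} shared hours, {shared / us_day:.0%} of the US day")
```

Shifting either window by an hour changes the shared time by the same amount, which makes it easy to compare candidate schedules before committing a team to one.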
Result and Outcome
In eight months, the transition to an SRE-driven culture delivered major improvements:
- Service Uptime: All key services reached an uptime of 99.5% or higher.
- Stability: The monthly incident count was reduced by 50% within four months.
- Efficiency: Alert noise was cut by 25%, and the team reached a state of zero active unresolved incidents, clearing a long-standing backlog.
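To put the uptime figure in context, an availability target translates directly into a downtime allowance. The sketch below does that conversion; the 30-day window and the function name are assumptions made for illustration, since the case study does not state the measurement period.

```python
def allowed_downtime_minutes(uptime_percent, window_days=30):
    """Minutes of downtime permitted by an uptime target over a window.

    window_days is an assumed measurement period, not a figure from the
    case study.
    """
    if not 0 < uptime_percent <= 100:
        raise ValueError("uptime_percent must be in (0, 100]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.5)
    print(f"99.5% over 30 days allows about {budget:.0f} minutes of downtime")
```

Under these assumptions, 99.5% uptime leaves roughly 216 minutes (3.6 hours) of downtime per 30-day window.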