Azure Functions maintains pools of pre-warmed containers to avoid high container-allocation latency. Pool size matters: a pool that is too small leads to high allocation latency, whereas one that is too large wastes resources and increases cost. Service providers typically oversize pools to meet service-level objectives (SLOs). Our findings indicate that the cost of maintaining pre-warmed container pools dominates the overall platform cost, motivating the need for effective pool-management strategies.
We characterize container-allocation traces from Azure Functions and make two key findings: the demand for container allocations is highly bursty, while the supply of new containers is virtually unlimited but subject to long delays. Traditional resource-management approaches, which rely on prediction or reactive techniques, fail to reduce cost while meeting the target SLO because of the workload's burstiness. These insights lead us to develop a new statistical, data-driven pool-optimization method that uses historical traces to compute the size of each pool. Our evaluation shows that the proposed method meets the target SLO while reducing operational cost by 41% compared to the static approach previously employed in the production platform.
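To illustrate the flavor of a statistical, trace-driven pool-sizing step, the sketch below picks each pool's size as a high quantile of historical concurrent-demand samples, so that the chosen fraction of allocation requests can be served from the warm pool. The function name, the quantile-based rule, and the trace format are illustrative assumptions, not the paper's actual DROPS algorithm.

```python
import math

def pool_size_from_trace(demand_samples, slo_target=0.99):
    """Return the smallest pool size that covers `slo_target`
    of the observed concurrent-demand samples.

    Hypothetical sketch: a real system would also account for
    container-supply delays and per-pool SKU differences.
    """
    ordered = sorted(demand_samples)
    # Index of the slo_target quantile; ceiling keeps the choice
    # conservative (never undershoots the target coverage).
    idx = math.ceil(slo_target * len(ordered)) - 1
    return ordered[idx]

# Bursty demand trace: mostly small, with one spike of 30.
demand = [3, 5, 2, 8, 4, 30, 6, 5, 7, 4]
print(pool_size_from_trace(demand, slo_target=0.9))   # covers 90% of samples
print(pool_size_from_trace(demand))                   # covers 99% of samples
```

Note how burstiness drives the trade-off: covering 90% of samples needs a pool of 8, but covering 99% forces the pool up to the spike value of 30, which is why naive oversizing is expensive.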
People
- Ahmed Alquraan
- Abdelrahman Baba
- Rafael Mendes (Microsoft Research)
- Sameh Elnikety (Microsoft Research)
- Paul Batum (Microsoft)
- Yan Chen (Microsoft)
- Hamid Henry Safi (Microsoft)
- Seth Fine (Microsoft)
- Samer Al-Kiswany
Publications
[1] DROPS: Managing Serverless Resource Pools in Microsoft Azure Functions
Ahmed Alquraan, Abdelrahman Baba, Rafael Mendes, Sameh Elnikety, Paul Batum, Yan Chen, Hamid Henry Safi, Seth Fine, Samer Al-Kiswany
The European Conference on Computer Systems (EuroSys), Apr. 2026 [pdf]