March 13, 2017

Analysis and Diagnosis of SLA Violations in a Production SaaS Cloud

  • Ganesan R.
  • Iyer R.
  • Kalbarczyk Z.
  • Martino C.
  • Sarkar S.

A software-as-a-service (SaaS) platform must deliver its intended service in accordance with its stated service-level agreements (SLAs). While SLA violations in SaaS platforms have been reported, little work has been done to empirically characterize SaaS failures. In this paper, we study SLA violations of a production SaaS platform, diagnose their causes, uncover several critical failure modes, and suggest solution approaches to increase the availability of the platform as perceived by the end user. Our approach combines field failure data analysis (FFDA) and fault injection. Our study is based on 283 days of operational logs of the platform. During this time, the platform received business workload from 42 customers spread over 22 countries. We first developed a set of home-grown FFDA tools to analyze the logs, and then implemented a fault injector that automatically injects several runtime errors into the application code, written in .NET/C#, and collates the injection results. We summarize our findings as follows: first, system failures caused 93% of all SLA violations; second, our fault injector was able to recreate a few cases of bursts of SLA violations that could not be diagnosed from the logs; and third, the fault injection mechanism could recreate several error propagation paths leading to data corruptions that the failure data analysis could not reveal. Finally, the paper presents some system-level implications of this study and discusses how the joint use of fault injection and log analysis may help improve the reliability of the measured platform.

Recent Publications

March 23, 2017

Performance Comparison of Capacity-Achieving Modulation Formats for Transoceanic Optical Systems

  • Fernandez De Jauregui Ruiz I.
  • Ghazisaeidi A.
  • Tran P.

We experimentally compare the performance of geometrically-shaped (64APSK) and probabilistically-shaped 64QAM (PS64QAM) formats considering linear and nonlinear penalties.

March 20, 2017

LWIP and Wi-Fi Boost Link Management

  • K S.
  • Kim B.
  • Ling J.
  • Lopez-Perez D.
  • Ding M.
  • Vasudevan S.

3GPP LWIP Release 13 technology and its prestandard version Wi-Fi Boost have recently emerged as an efficient LTE and Wi-Fi integration at the IP layer, allowing uplink on LTE and downlink on Wi-Fi. This solves all the contention problems of Wi-Fi and allows an optimum usage of the unlicensed band ...

March 13, 2017

Advanced C+L-Band Transoceanic Transmission Systems Based on Probabilistically-Shaped PDM-64QAM

  • Brindel P.
  • Buchali F.
  • Carbo Meseguer A.
  • Charlet G.
  • Fernandez I.
  • Ghazisaeidi A.
  • Hu Q.
  • Renaudier J.
  • Rios-Muller R.
  • Schmalen L.
  • Tran P.

We review the most recent, advanced concepts and methods employed in the cutting-edge spectrally-efficient coherent fiber-optic transoceanic transmission systems, such as probabilistic shaping, adaptive digital nonlinear compensation, rate-adaptive spatially-coupled low density parity check codes, and dual-band C+L-band transmission. Building upon all these concepts and methods, we demonstrate transmission of 179 ...