Skip to content
#

SRE

Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.

Here are 690 public repositories matching this topic...

howtheysre

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html

  • Updated Aug 15, 2024
  • Python

This solution, offered by the Open-Source community, will no longer receive contributions from Microsoft. Customers are encouraged to transition to Microsoft Azure Verified Modules for continued support and updates from Microsoft. Please note, this repository is scheduled for decommissioning and will be removed on July 1, 2025.

  • Updated Aug 1, 2024
  • HCL
Followers
119 followers
Wikipedia
Wikipedia