Site Reliability Engineer Job at EdHike LLC, Texas


Job Description

Job Title: Site Reliability Engineer (SRE)

Location: Austin, TX

Job Summary

We are seeking a Site Reliability Engineer (SRE) to join our team and ensure the reliability, availability, and performance of our production systems. You will bridge the gap between development and operations, applying software engineering principles to system administration and infrastructure management.

Responsibilities

  • Design, build, and maintain scalable and reliable infrastructure.
  • Develop and maintain automation tools for deployment, monitoring, and site reliability.
  • Monitor system performance and troubleshoot issues to ensure high availability.
  • Collaborate with development and DevOps teams to improve system reliability and scalability.
  • Conduct root cause analysis of production errors and implement sustainable solutions.
  • Define and measure Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
  • Participate in on-call rotations to support system uptime and respond to incidents.
  • Continuously improve CI/CD pipelines and operational processes.
  • Document systems, processes, and playbooks to facilitate knowledge sharing.

Requirements

Required:

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 3+ years of experience in SRE, DevOps, or related fields.
  • Proficiency with cloud platforms (e.g., AWS, GCP, Azure).
  • Strong skills in scripting or programming (e.g., Python, Go, Bash).
  • Experience with infrastructure as code tools (e.g., Terraform, Ansible).
  • Proficiency with containerization and orchestration (e.g., Docker, Kubernetes).
  • Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK, Datadog).
  • Strong understanding of networking, system internals, and distributed systems.

Preferred:

  • Experience with incident response and postmortem culture.
  • Knowledge of security best practices in cloud and infrastructure.
  • Certification in cloud technologies (e.g., AWS Certified DevOps Engineer).
