Senior Site Reliability Engineer
In this role you will join the Cloud Infrastructure Team and focus on automation, tooling, deployment, and monitoring, as well as managing and optimizing the systems that run Sight Machine software. You must love learning new technology, have excellent problem-solving skills, and embrace the Infrastructure as Code paradigm.
Success will take a blend of technical expertise, experience with deployment technology frameworks, customer-centric focus, and a team-spirited approach to solving architectural challenges in support of your peers in Development Engineering.
- Employing DevOps principles, provide technical operational expertise for comprehensive cloud infrastructure operations for all customers, internal and external
- Troubleshoot and resolve complex systems problems across multiple layers of the systems stack, including CI/CD, container-based systems, networking, operating systems, cloud resources, and databases
- Instrument Monitoring and Alerting infrastructure for critical services
- Create, revise, and test operational runbooks and automation for maintaining Sight Machine infrastructure
- Design and code appropriate tools to support our internal platforms and systems
- Participate in our on-call schedule
- Proactively pursue opportunities for operational innovation to improve the stability, reliability, and availability of Sight Machine's services
- Embody a Quality-first & Security-first culture in all that you do
- 5+ years of experience with Kubernetes / Docker in at least one of the top-tier cloud providers (Azure, GCP, AWS, etc.)
- 5+ years of experience coding with languages and tools such as Python, Java, Go, and Terraform
- 5+ years of experience using IaC and CI/CD tools such as FluxCD (or similar), Jenkins, Terraform, and GitHub
- Strong experience with the Linux OS
- Strong working knowledge of Networking (TCP/IP and Application)
- A willingness to author technical documentation for design, workflows, processes, best practices, etc.
- Willing to mentor other team members and engineers
- Strong bias for action over endless planning: you’re hands-on, have made mistakes and learned from them, and can balance risk against impact to customers
- You value clear communication and you're empathetic and respectful of others
- Operational experience with monitoring/alerting systems such as Sentry, Opsgenie, Prometheus
- Deep understanding of cloud performance, including how to diagnose and resolve bottlenecks and keep performance at optimal levels
Nice to Haves
- Experience with elements of our current tech stack is a plus: Kubernetes, FluxCD, Terraform, Helm Charts, Prometheus, Elasticsearch, Python, Java, Kafka, Postgres, and Jenkins
- Previous experience with or a keen interest in industrial IoT, analytics, or manufacturing is a plus
Location: San Francisco, CA (Hybrid)
About Sight Machine
Sight Machine is built on the shoulders of a unique, robust and highly scalable Infrastructure as Code model. This enables the creation and operation of customer instances in our ecosystem in a standardized and simplified manner. We are looking for team members to help us build, maintain, and improve the infrastructure that makes Sight Machine the leading provider of Manufacturing Data Pipelines and Analytics.
Great things happen when people can bring their authentic selves to work. We empower all of our team members to share their perspectives, passions and experiences, because collectively we make a better, stronger team by always keeping an “open communications” mindset.
Our team collaborates closely with peers and cross-functional stakeholders throughout the business, and with clients at the forefront of digital transformation and the cutting edge of digital manufacturing thought leadership.
Sight Machine has offices in San Francisco, CA and Ann Arbor, MI. We have a remote-friendly culture, with people based all around the US and the rest of the world. For this role in particular, the ideal candidate is located near either of our offices and willing to work in a hybrid capacity. We would still consider 100% remote for exceptional candidates if they aren’t located near an office.
Sight Machine strengthens manufacturers by providing the industry’s only standard data model and system-level visualization capabilities. By integrating all crucial data into a single innovative platform, everyone involved in the fabrication process can visualize, contextualize and examine data in one intuitive interface.
Sight Machine is a mission-driven company committed to improving lives, strengthening communities and making the world cleaner by continuously re-envisioning manufacturing processes, making them more efficient, sustainable and absolute.
Founded in Michigan in 2011 and expanded to San Francisco in 2012, Sight Machine blends the spirit of technology innovation and the down-to-earth style of Detroit manufacturing. Our team includes early leadership from Yahoo, Tesla Motors and Oracle. Together, we share wide industry knowledge and a commitment to advance manufacturing to a more sustainable future.
We take pride in our self-starter culture where employees are enabled and encouraged to achieve their professional goals through leadership guidance, learning and development. Our philosophy is that careers are continuous journeys, and we dedicate time and offer resources so that employees can reach their full potential.
Sight Machine is proud to be an equal opportunity employer and considers candidates legally authorized to work in the US regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. Sight Machine also considers qualified applicants regardless of criminal histories, consistent with legal requirements.