Infrastructure Engineer



Other Engineering
Posted on Sunday, June 9, 2024

First and most importantly: our mission is to bring transparency and clarity to the world's data.

Our platform, FiftyOne, is where AI work happens. Our enterprise platform is the mission-critical linchpin for managing unstructured data, model development, and AI systems at the world's largest companies.

We believe that open source is the way to lead the data-centric AI revolution. Our open source version has 2 million downloads to date.

Our software has a major impact on AI work across almost every vertical, from self-driving cars to medical imaging to agriculture. We are at the center of the next wave of real-world AI advancement.

And we’re built on three key tenets:

  • We are all human beings: we strive to be a “human-first” organization and treat everyone with the respect, care, and flexibility that all people deserve.
  • We are distributed: we believe in getting autonomy and power into the hands of the people actually doing the work.
  • We believe in the power of community.

Our fully-remote team is based across North America today.

About your role

As a Senior Infrastructure Engineer at Voxel51, you’ll collaborate with a team that delivers features to support dataset curation, model analysis, and integrations that span the entire machine learning lifecycle. You’ll design and develop containerized systems and CI/CD pipelines, following industry best practices, that deploy software in our cloud and into our customers’ environments. You’ll also solve unique challenges that arise when storing, managing, and serving unstructured data (images and video) at scale.

Every member of our fully-remote team is empowered to own their work and play an active role in advancing our mission to democratize data-centric ML.

What you will do

  • Design, develop, and manage systems that leverage best-of-breed tools like Docker and Kubernetes to deploy and scale from individual users to Fortune 500 companies
  • Build and maintain robust CI/CD pipelines
    • We use GitHub Actions and Google Cloud Build right now, but we'd definitely be open to other answers!
  • Work with enterprise customers to deploy into, troubleshoot, and manage their cloud or on-premises installations
  • Collaborate with Customer Success machine learning engineers during the customer acquisition lifecycle
  • Deploy and manage Voxel51-hosted customer environments using GKE and MongoDB
  • Create and maintain deployment solutions for customers' on-prem deployments
    • We use Helm and Docker Compose, but other answers would be AWESOME!
  • Design, develop, and manage solutions for improving efficiency in peer Engineering organizations
    • Local deploys (using minikube and skaffold now, other tools would be awesome)
    • Automated deploys to cloud (GKE) environments
    • Automated deploys to internal Docker Compose environments
  • Troubleshoot deployment, build, and runtime failures
  • Design, develop, and manage solutions to predict failures in internal and customer installations

What you should bring

  • 7+ years of professional experience managing and deploying software systems
  • Expertise with containerization and container orchestration (Docker, Kubernetes)
  • Experience with infrastructure automation
    • We use Terraform and Ansible, but bringing other technologies to the table is GREAT!
  • Scripting (Bash) and programming languages (Python, Go)
  • Test Driven Development (TDD) with pytest and Terratest
  • Management of NoSQL databases (MongoDB, DocumentDB, Elasticsearch)
  • Observability (metrics, monitoring, alerting, dashboarding, logging)
  • Software packaging (Python wheels, PyPI), container image building, and release management
  • Security with experience applying the Principle of Least Privilege
  • Continuous Integration (CI) and Continuous Delivery (CD) systems
    • We use Google Cloud Build and GitHub Actions, but experience with other technologies is great!
  • Networking, Access Control Lists (ACLs), Load Balancers, DNS, TLS, Webservers (Nginx)
  • Linux (Debian-based distributions)
  • Documentation generation and curation (internal and customer-facing)
  • Experience working with enterprise customers to deploy containerized systems into their environments (AWS, GCP, Azure, on-premises)
  • Ability to work in a remote-first, cooperative environment using collaborative development tools (GitHub, Slack)

The cash compensation for this role is in the $195K-$220K range. In addition to base compensation, we offer equity in the form of options, a variety of benefits, and the opportunity to grow in an exciting and collaborative environment.