About the Role:

As an early member of the team, you’ll play a key role in shaping and building our infrastructure, developer experience, observability, and security. This position is ideal for someone with a strong foundation in Google Cloud Platform (GCP) infrastructure who enjoys working closely with engineering and business stakeholders to address a wide range of challenges.

What You’ll Do:

  • Lead optimization efforts for our Google Cloud Platform (GCP) usage, focusing on performance, development tooling, and cost efficiency.
  • Allocate a significant portion of your time to IT, infrastructure, and cybersecurity tasks to ensure the reliability and scalability of our cloud-based services.
  • Develop and maintain CI/CD pipelines and workflows using CircleCI, GitHub Actions, and ArgoCD to enhance the development team’s efficiency.
  • Deploy and manage Kubernetes workloads on GKE, using Helm charts and Terraform to maintain the underlying infrastructure.
  • Contribute to database management and optimization with Cloud SQL for PostgreSQL.
  • Enhance our monitoring and observability framework with tools like Datadog to support high availability and performance.
  • Implement SRE principles to improve reliability, incident response, and automation.
  • Collaborate closely with development teams to architect and implement scalable platforms, facilitating continuous delivery and improving the developer experience.

What You’ll Need:

  • 4+ years of industry experience in cloud infrastructure, platform engineering, SRE, or related fields, ideally in a fintech startup environment.
  • 3+ years of expertise with cloud infrastructure, especially GCP and Kubernetes.
  • 2+ years of experience with Infrastructure as Code (IaC) tools, such as Helm and Terraform.
  • 1+ years of experience with CI/CD tools like CircleCI, GitHub Actions, and ArgoCD.
  • 1+ years of database management experience, ideally with Cloud SQL for PostgreSQL.
  • Strong understanding of monitoring and observability principles and tools, such as Datadog and Prometheus.
  • A knack for streamlining development workflows and a keen interest in Site Reliability Engineering (SRE) practices and platform engineering.
  • Demonstrated ability to be resourceful and thrive in dynamic, rapidly evolving environments.
  • Enthusiasm for taking on wide-ranging ownership and tackling challenges spanning infrastructure, DevOps, security, IT, and more.
  • Bachelor’s or Master’s degree in Computer Science, Information Systems, Electrical Engineering, or a similar field.

Nice to Have:

  • CI/CD experience (e.g., GitHub Actions or CircleCI).
  • Observability experience (e.g., Datadog).
  • SRE and incident response experience and insights.
  • Compliance/Security experience (e.g., SOC 2, PCI, vendor security questionnaires).
  • Coding abilities.

Compensation:

This role offers an annual starting salary range of $180,000 - $200,000 plus equity and benefits. Compensation may exceed this range for exceptional, experienced candidates joining above a senior level.

Benefits & Perks:

  • Unlimited PTO
  • Flexible and remote work culture
  • Competitive compensation and equity
  • Health, dental, and vision coverage for you and your dependents
  • Lunch & commute reimbursement when working in the NYC office located near Union Square
  • $500 home office setup reimbursement
  • $2,500 annual growth and development stipend
  • $50 monthly food delivery gift card
  • 401(k)
  • Brex company card
  • HSA/FSA
  • WeWork membership
  • Employee Resource Groups (e.g., LGBTQIA+, Women)
  • Employee Referral Program
  • Team building events