Private Cloud Infrastructure Architect

Location:

Chandler, AZ

Salary Range:

Competitive

Introduction

The individual in this role will lead the end-to-end architecture and operational governance of the private cloud infrastructure that supports LLM and agentic AI implementations, including GPU and accelerated compute platforms. The position is responsible for ensuring that security, reliability, auditability, and FinOps requirements are met.

Required Skills & Qualifications
  • 10 years of experience in infrastructure architecture, platform engineering, or private cloud engineering in large-scale enterprise environments.
  • Demonstrated experience designing/operating hybrid data center infrastructure (private cloud, on-prem virtualization/container platforms, storage, network).
  • Hands-on experience with GPU platforms and accelerated compute operations (cluster design, scheduling, capacity planning, monitoring, lifecycle management).
  • Proven ownership of observability/telemetry programs: API inventory, metrics/logs/traces strategy, collection interval tuning, and data quality controls.
  • Direct FinOps experience in a large organization: showback/chargeback, cost allocation, unit economics, and cost governance for infrastructure platforms.
Preferred Skills & Qualifications
  • FinOps Certified Practitioner/Professional
  • Strong understanding of resilience and recovery engineering, including data retention and operational readiness (warm-up) dependencies.
  • Excellent stakeholder management and ability to influence across engineering, finance, and risk organizations.
  • Prior work experience at the client or in the client's industry.
  • Applicants must be able to work directly for Artech on a W-2 basis.
Day-to-Day Responsibilities
  • Own the Platform API Inventory and Collection Interval Validation Matrix across the AI ecosystem.
  • Manage the per-platform retention audit (metrics, logs, traces, billing/cost, capacity, and model-serving telemetry).
  • Unblock warm-up recovery design for expedited restoration of operational readiness after incidents, patching, upgrades, or DR events.

For immediate consideration please click APPLY to begin the screening process.

Job Type: Permanent

Job ID: 253247222