Forward Deployed Reliability Engineer

Palantir

New York, NY
Product Development, Full-time, via Lever

Job Description

A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role
 
As a Forward Deployed Reliability Engineer (FDRE), you ensure stability and reliability of mission-critical workflows built on Palantir software. You gather signal by going on call — resolving problems before the customer is impacted — and use those learnings to drive product change, shape our internal tooling, and refine our operational processes such that we provide an increasing quality of service to more and more customers.
 
Your approach is hands-on and pragmatic: you’ll address issues as they arise with quick, effective solutions, then advocate for workflow or product improvements once the immediate issue is resolved. You are energized by engaging directly with problems, whether that means writing a script to automate a manual task, finding creative workarounds, or building a case for a product enhancement. You don’t just fix issues; you look for opportunities to simplify, automate, and make the entire system more resilient.
 
An FDRE synthesizes learnings from support into best practices for others to follow. These are captured in documentation and shared with the team and the broader organization. In this way, you raise the bar for reliability and efficiency across Palantir.
  • Develop a deep understanding of Palantir's products and operational processes
  • Go on-call, responding quickly and effectively to mission-critical incidents
  • Diagnose, resolve, and proactively prevent issues encountered in the field
  • Collaborate with internal stakeholders to increase the scalability and reliability of Foundry workflows for our customers
  • Identify recurring pain points and inefficiencies, and take initiative to automate or streamline workflows
  • Advocate for and implement product enhancements based on insights gleaned from the field
  • Create clear, actionable documentation and share best practices to elevate team and company-wide reliability

Note: While active work is not required on weekends or outside business hours, you must be available to respond to critical outages during assigned on-call weeks.

  • Ability to work independently and collaboratively to solve ambiguous technical and operational challenges
  • Excellent written and verbal communication skills, capable of interacting effectively with both technical and non-technical stakeholders
  • Proficiency in Python, Java, and SQL
  • Familiarity with parallel data processing and Spark job optimization
  • Strong organizational skills and attention to detail, with the ability to prioritize effectively
  • Resourcefulness and creativity in fast-paced dynamic environments
  • Experience with root cause analysis and documenting solutions for broader impact
  • Enthusiasm for hands-on problem solving, continuous improvement, and knowledge sharing
  • Background in Computer Science, Engineering, Information Systems, or another technical field
  • Must be a US citizen or green card holder

Listing Details

Posted: February 10, 2026
First seen: March 20, 2026
Last seen: March 20, 2026
