Reliability & Quality Engineer

Oracle

United States | Full time | Hardware Engineering

Oracle Cloud Infrastructure is designing, standardizing, and operating mission-critical data center electrical infrastructure at extraordinary scale. We are hiring a Reliability & Quality Engineering leader to strengthen the design quality, system reliability, and availability of large-scale data center electrical distribution architectures.

Early in the role, the priority will be evaluating the full data center electrical distribution system, identifying potential failure points, assessing how those risks could impact availability, and developing system-level resiliency concepts such as failure isolation, fault containment, and blast radius reduction. This role requires someone who can look beyond component-level reliability and understand how the overall electrical architecture behaves under credible failure scenarios.

The ideal candidate will bring deep reliability and quality engineering capability across mission-critical power infrastructure, electrical equipment, or standardized data center design products. This person will partner closely with design engineering, construction, commissioning, operations, equipment suppliers, and executive stakeholders to convert reliability analysis into durable design standards, product improvements, risk mitigations, and measurable availability outcomes.

Responsibilities

What you’ll do

• Evaluate end-to-end data center electrical distribution architectures, including utility or behind-the-meter interfaces, substations, medium-voltage distribution, switchgear, UPS systems, generators, BESS where applicable, protection systems, controls, and downstream power delivery.

• Identify design-level failure points, single points of failure, common-mode risks, hidden dependencies, protection coordination concerns, and failure modes that could materially impact availability.

• Assess how electrical systems perform under credible failure scenarios, including equipment faults, transfer events, protection operations, control-system failures, degraded-mode operation, maintenance conditions, and abnormal grid or generation events.

• Develop system-level resiliency concepts and design recommendations that improve fault isolation, recoverability, maintainability, failure containment, and blast radius reduction.

• Partner with electrical design engineering, commissioning, operations, construction, supply chain, and equipment vendors to translate reliability findings into design standards, product requirements, test expectations, acceptance criteria, and corrective action plans.

• Apply quality and reliability engineering methods such as FMEA, fault-tree analysis, reliability block diagrams, root-cause analysis, design-for-reliability reviews, lessons-learned integration, and field-performance trend analysis.

• Evaluate product quality and reliability across power infrastructure, electrical equipment, standardized electrical products, and repeatable data center design platforms.

• Recommend improvements to specifications, supplier qualification, equipment testing, design validation, manufacturing quality controls, inspection processes, and supplier feedback loops.

• Create executive-ready reliability risk assessments that clearly communicate availability impact, technical trade-offs, mitigation options, residual risk, and prioritization across large-scale infrastructure programs.

• Establish reliability KPIs, defect taxonomies, quality gates, issue review cadence, and design-readiness criteria that improve reliability before construction, commissioning, and operational handoff.

• Champion safety, compliance, disciplined change management, and operational excellence while improving reliability at the architecture, product, and system levels.

What you’ll bring

• 10+ years of experience in reliability engineering, quality engineering, electrical design assurance, product quality, mission-critical power systems, data center infrastructure, industrial power, utilities, generation, transmission and distribution, or equivalent high-availability environments.

• Strong systems-thinking capability, with demonstrated ability to evaluate electrical architectures under failure scenarios rather than only assessing individual components or equipment ratings.

• Deep understanding of data center electrical distribution concepts, including medium-voltage and low-voltage distribution, substations, switchgear, UPS systems, generators, protection schemes, controls, grounding, power quality, redundancy, maintainability, and operational recovery.

• Experience identifying failure modes, common-mode vulnerabilities, cascading risks, hidden dependencies, and blast-radius concerns across complex electrical systems.

• Proven ability to assess availability impact and reliability trade-offs using structured engineering methods such as FMEA, fault-tree analysis, reliability block diagrams, event analysis, root-cause analysis, or similar methodologies.

• Relevant product quality and reliability experience across power infrastructure, electrical equipment, standardized electrical products, or repeatable data center design platforms.

• Ability to influence design standards, supplier requirements, equipment qualification expectations, commissioning acceptance criteria, and operational readiness deliverables.

• Strong cross-functional leadership skills, with experience partnering across design engineering, construction, commissioning, operations, vendors, and executive stakeholders.

• Excellent communication and executive reporting skills, with the ability to translate complex technical reliability risk into clear decisions, priorities, and implementation plans.

• Commitment to safety, compliance, disciplined engineering governance, and operational excellence in high-uptime environments.

Qualifications

Disclaimer:

Certain U.S. based or U.S. customer or client-facing roles may be required to comply with applicable requirements, such as immunization/occupational health mandates, and/or drug testing requirements.

Range and benefit information provided in this posting are specific to the stated locations only.

US: Hiring Range in USD from: $161,700 to $338,500 per annum. May be eligible for bonus, equity, and compensation deferral.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level - IC6

Skills

Reliability Engineering, Quality Engineering, Electrical Distribution Systems, Failure Mode and Effects Analysis (FMEA), Root Cause Analysis, Design for Reliability, Communication, Collaboration, Risk Assessment, Problem Solving