Oracle Senior Site Reliability Engineer (Join OCI SRE) in Denver, Colorado
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Experience running large-scale customer-facing web services. Understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale customer-facing web services.
Oracle is an Affirmative Action-Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veterans status, age, or any other characteristic protected by law.
Oracle’s Cloud Infrastructure team is building new Infrastructure-as-a-Service technologies that operate at high scale in a broadly distributed multi-tenant cloud environment. Our customers run their businesses on our cloud, and our mission is to provide them with best-in-class compute, storage, networking, database, security, and an ever-expanding set of foundational cloud-based services.
We seek an experienced site reliability engineer who can rapidly deploy, monitor, and operate software solutions in a distributed environment. As a member of the technical content infrastructure team, you will work with internal stakeholders as well as customer-facing solutions that support our documentation efforts.
The ideal candidate will be a strong self-starter, able to identify problems and quickly implement solutions with minimal input. They should have familiarity with supporting front- and back-end tools in production environments. A keen interest in quality at every stage of development, from request to delivery, is a must.
These are exciting times in our space: we are growing fast, still at an early stage, and working on ambitious new initiatives. At Oracle, you can have significant strategic and technical impact by helping to build innovative technical content from the ground up.
Job Responsibilities
• Work with internal stakeholders to operate and maintain software tools
• Ensure continuous availability through creation of robust deployment methods and automation
• Understand, embrace, and improve Oracle’s cloud documentation publishing process
• Provide timely, constructive feedback to team members regarding design and implementation decisions
• Contribute to the Technical Content roadmap by identifying areas of need and engaging with stakeholders to scope work
• Document manual and automated processes
Qualifications
• BS or MS degree or equivalent experience
• At least four years of operations, sysadmin, or SRE experience
• Proficiency with Docker, Terraform, Python, nginx, and bash
• Comfort with Agile development processes and tools (JIRA, Confluence, Bitbucket)
• Previous experience operating distributed applications
• Strong customer-first mentality
• Passion for documentation
• Experience working within a geographically distributed team
Extra Credit
• Experience contributing to backend cloud infrastructure products
• Experience working with technical documentation (authoring, editing, and/or publishing)
Job: *Product Development
Title: Senior Site Reliability Engineer (Join OCI SRE)
Location: United States
Requisition ID: 20000WZX