Oracle Monitoring Software Developer - Prometheus in Frisco, Texas

Monitoring Software Developer - Prometheus

Preferred Qualifications

Monitoring and Observability Services

Organization : GBU Cloud Development Services, Cloud Reliability Services

Cloud Reliability Services (CRS) Description:

Cloud Reliability Services (CRS) is a strategic component that will transform Oracle’s Global Business Units’ (GBU) cloud operations. As cloud service operations evolve from a predominantly reactive model (i.e., responding to emergencies with high levels of human involvement) to a proactive model (i.e., preventing emergencies and outages with intelligent tools, services, and automation), the mission of CRS is to ensure that all GBUs can efficiently operate ultra-scalable and highly reliable SaaS/Foundation services across multiple operating models as they iterate to become Cloud Native.

Monitoring & Observability Team Description:

The Cloud Reliability Services (CRS) - Service Monitoring & Observability (M&O) team provides an integrated suite of tools to allow partners to monitor core Business KPIs as well as user experience and Service SLA reporting. In addition, the service monitors Compute, Dependencies, and Network Infrastructure. It provides greater visibility into core components while also overlaying Customer Experience, which helps focus engineering resources when incidents occur and helps prevent incidents from occurring. The M&O team is a fast-paced, highly motivated team that embodies “We” rather than “I”. CRS – M&O will be a geo-diverse team, which will allow it to respond quickly to customer engagements and challenges and to adapt to incoming incidents while still delivering on committed features and enhancements.

Roles & Responsibilities:

The candidate will work with highly skilled, highly motivated engineers using agile methodologies based on Scrum or Kanban, incorporating enterprise agile practices from the Scaled Agile Framework (SAFe).

The team embraces a DevOps environment – the Developers are the Operators. The team treats everything as code (code, configuration, infrastructure, pipelines, everything) to deliver the highest-quality product as efficiently as possible.

You will work alongside a software development team within the greater Oracle Cloud Reliability Engineering team where you will develop new features as well as expand and support existing features.

One week you may be writing automated tests for an existing feature. The next week you may be developing a new feature (design, code, test, and deploy) for a customer in our environment. The next week you may be providing support to a customer on your new feature.

You will learn new technologies based on what we already deploy and use. You will also learn about and research new technologies that you bring to the team to better our offerings.

You will play a key role in building more intelligence into the CRS services that we deliver so that SaaS services function more and more autonomously over time.

Per team roles and responsibilities:

  • Work with the Product Owner and team members to build new features and enhancements, while supporting existing M&O services that are heavily used across all Oracle Global Business Units

  • Plan, design, code, document, and test new Monitoring & Observability services used by multiple Oracle Software as a Service products

  • Develop software using Agile methodologies and participate as a member of scrum development teams

  • Use Everything-As-Code methodologies to ensure traceability, configurability, immutability, repeatability, and governability

  • Participate, as a designated engineer on a rotating basis, in a follow-the-sun model for 24x7 support of CRS – M&O services

  • Manage and continuously improve existing CRS – M&O capabilities

  • Review and approve the work products of other team members

  • Support the operation of services using DevOps methodologies for the rapid introduction to production of new services and operational enhancements

  • Provide technical thought leadership and mentor junior colleagues

  • Attend training as required to meet Oracle and CRS compliance and regulatory standards. Perform daily tasks in accordance with compliance and regulatory standards

  • Other duties as assigned

General Qualifications:

  • Ability to explore and learn multiple cutting-edge technologies in the Cloud industry

  • Skills to solve complex technical problems and communicate effectively in a team environment

  • Good understanding of CI/CD best practices

  • Ability to advance automation of standard/recurring tasks

  • Experience with development/test in an open source environment including operation of SSH and shell functions

  • Good networking knowledge

  • Ability to assimilate and apply new technologies

  • Experience with Software Configuration Management (SCM) tools and software engineering best practices

  • Willingness to work with remote, global teams as well as individually

  • Ability to produce documentation for application engineers in support of developed work

  • Agile methodology knowledge

  • Self-motivation and fast learning skills

Preferred Qualifications:

An ideal candidate will have expertise in as many of the following as possible:

  • Programming and scripting languages (Python, Bash, JavaScript; additional experience with Java, Ansible, and/or Go is a plus)

  • Containers and orchestration (Docker, Kubernetes, and docker-compose)

  • Experience with Prometheus and/or Grafana is a must

  • Jaeger tracing technology

  • Linux/Unix development (Oracle Linux preferred)


  • Oracle database experience

  • CI/CD (Jenkins and GitLab CI)

  • Cloud computing platform (Oracle Cloud Infrastructure Services)

  • Git version-control and collaboration (GitLab)

  • Issue tracking and collaboration (Jira and Confluence)

  • Experience with market-leading Monitoring solutions is expected

  • Product/Service ownership or Project Management experience is a plus

  • Experience with ITIL V3 or V4; Foundation Level certification is preferred

  • 5 years of experience in Agile methodology and Scrum framework is expected


Oracle GBUs provide services to many critical systems globally requiring 24x7 support. DevOps engineers will rotate with other team members in a designated, on-call status following in-country requirements. CRS’s primary support model is Follow the Sun, utilizing geographically diverse team members during normal working hours to provide support. CRS will strive to have subject matter experts distributed globally. With geographic diversity, countries and regions have a broader array of holidays requiring a flexible support schedule across multiple geographies. Additionally, coverage is required throughout the weekend.

To provide the required support to Oracle customers, CRS will use additional compensation to cover extended business hours and/or on-call pay based on in-country laws and Oracle policy. In general, there will be a Primary and Secondary engineer designated in advance to provide coverage for select services. If numerous services are supported in a specific geographic region, there may be more than one set of Primary/Secondary engineers selected. Software engineers and database administrators will engage in activities to restore services that are down or degraded. This may be as simple as running an existing script to restart a service or executing a standard operating procedure, or it may require code changes with review steps, integration, testing, and software deployments to restore service to normal operation. In addition, engagement with other Oracle development and support teams may be needed.

Detailed Description and Job Requirements

Design, develop, troubleshoot and debug software programs for databases, applications, tools, networks, etc.

As a member of the software engineering division, you will take an active role in the definition and evolution of standard practices and procedures. You will be responsible for defining and developing software for tasks associated with the developing, designing and debugging of software applications or operating systems.

Work is non-routine and very complex, involving the application of advanced technical/business skills in the area of specialization. Leading contributor individually and as a team member, providing direction and mentoring to others. BS or MS degree or equivalent experience relevant to functional area. 7 years of software engineering or related experience.

As part of Oracle's employment process, candidates will be required to successfully complete a pre-employment screening process. This will involve identity and employment verification, professional references, education verification and professional qualifications and memberships (if applicable).

Oracle is an Affirmative Action-Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veteran status, age, or any other characteristic protected by law.

Job: Product Development

Location: HU-HU,Hungary-Budapest

Other Locations: US-TX,Texas-Frisco, IN-IN,India-Bengaluru, IN-IN,India-Pune

Job Type: Regular Employee Hire

Organization: Oracle