Description
Role & Responsibilities
Maintain and improve instrumentation for monitoring and logging the health and availability of services.
Proactively monitor systems, networks, and applications to provide input in improving the stability, security, efficiency, and scalability of systems.
Maintain monitoring and logging frameworks for all of OA.
Take personal responsibility for the quality, reliability and availability of global OA observability infrastructure.
Own operations documentation of monitoring and logging for global OA observability production infrastructure.
Participate in a rotating on-call incident response schedule covering weekdays and weekends.
Improve operational efficiencies via scripting, bots and integrations.
Collaborate cross-functionally with vendors and other software and engineering teams to ensure smooth service delivery.
Apply knowledge of observability systems troubleshooting, fault analysis, and resolution.
Requirements
In-depth experience designing monitoring and logging at scale for corporate infrastructure services.
Expert-level experience in monitoring and logging technologies, both open source and closed source (e.g. LogicMonitor, Sumo Logic, ELK).
Experience in RBAC and user-based security services such as ISE, RADIUS, LDAP, and AD.
Must have strong automation/scripting skills; proficiency in Python or Ruby is a plus.
Must understand and be able to maintain APM, RUM, and tracing capabilities for an observability stack.
Proficient in developing and maintaining technical documentation, runbooks, and procedures.
A working knowledge of networking is needed, including fundamental knowledge of the TCP/IP stack, application protocols (DHCP/DNS/HTTPS), and networking concepts (HSRP/NAT/VPN/VLANs/802.1x/Wireless/Clustering/High Availability/Load Balancing).
Understanding of enterprise networks using Cisco IOS/NX-OS, with a working knowledge of IP protocols (TCP/UDP/ICMP) and routing protocols (BGP/OSPF/IS-IS).
Technical understanding of Palo Alto firewalls, including firewall policy rules, URL filtering, App-, User-, etc.
Experience interacting with telcos and global ISPs (WAN/DIA) and monitoring those services.
A working knowledge of systems is needed, including fundamental knowledge of configuration management and automation tools, with experience in:
Terraform, Ansible, Chef, Puppet, Jenkins
Designing and implementing CI/CD pipelines
Infrastructure provisioning and management
Strong skills in troubleshooting incidents in production environments.
A strong ownership attitude and a track record of taking responsibility for problems and pushing through to resolution.
Bachelor's degree in Computer Science or EE, or relevant industry experience, is required.
Ability to communicate and coordinate with cross-functional engineering teams across multiple geographic regions.
Nice To Have:
Ability to take the lead in an operations environment.
Contributions to open source: your public Git repos/contributions show good examples of giving back to the community.
Architected a monitoring and logging infrastructure that was technology agnostic for a production infrastructure environment.
Knowledge of revision control software such as Git.
Familiarity with REST API scripting, e.g. with the PAN-OS API or Infoblox WAPI.
Experience with integrations in Artifactory, Prometheus, Grafana, Kibana, OpenSearch, Splunk, Dynatrace, New Relic, Duo, OneLogin, Slack, and OCEAN.
As a member of the software engineering division, you will take an active role in the definition and evolution of standard practices and procedures.
You will be responsible for defining and developing software for tasks associated with developing, designing, and debugging software applications or operating systems.
An Oracle career can span industries, roles, countries, and cultures, giving you the opportunity to flourish in new roles and innovate while balancing work and life. Oracle has thrived through 40+ years of change by innovating and operating with integrity while delivering for the top companies in almost every industry.
In order to nurture the talent that makes this happen, we are committed to an inclusive culture that celebrates and values diverse insights and perspectives, a workforce that inspires thought leadership and innovation.
Oracle offers a highly competitive suite of Employee Benefits designed on the principles of parity, consistency, and affordability. The overall package includes certain core elements such as Medical, Life Insurance, access to Retirement Planning, and much more. We also encourage our employees to engage in the culture of giving back to the communities where we live and do business.
At Oracle, we believe that innovation starts with diversity and inclusion, and that to create the future we need talent from various backgrounds, perspectives, and abilities. We ensure that individuals with disabilities are provided reasonable accommodation to successfully participate in the job application and interview process, and in potential roles to perform crucial job functions.
That’s why we’re committed to…