Push Us. Amaze Us. Inspire Us.

Careers in Engineering

We are intellectual mavericks, pioneers, game-changers, and doers of the extraordinary, on a mission to create cloud, mobility, security, and virtualization solutions that will reach millions of users around the world. At VMware, our people are empowered to succeed and are valued for their innovative contributions as we revolutionize the IT industry.

Are you ready to join us?

dare to explore

Staff Engineer - SRE / Site Reliability Engineer - NSX - VMC on AWS Cloud

Palo Alto, California

Job ID R1900384

Oversees and drives Root Cause Analyses and Corrective Actions to improve site availability and integrity.

  • Sr. SRE identifies “areas of interest” for additional investigation to improve long-term availability and integrity.
  • The SRE is continuously improving his/her specialist knowledge, and takes on new areas to support the team. The SRE also educates his/her co-workers, and serves as a subject matter expert in his/her field of specialty for the organization.
  • The SRE standardizes and maintains internal technical documentation (e.g. Wiki, run-books) and improves technical situational awareness (i.e. implementing and monitoring key metrics).
  • The SRE provides hands-on expertise during service-impacting events and technical escalations (e.g. analysis and troubleshooting of systems and servers). This also includes configuration changes to production systems, as well as standardization of troubleshooting procedures.
  • Mentors and trains junior administrators and Operations Control Center personnel.
  • Works with design engineering to propose architectural changes, and fosters communication between different organizational units.
  • Directs RCA and gauges system status against the established baseline.
  • The SRE will drive sustainable solutions across functions with minimum or no supervision.
  • Strong experience in Unix / Linux / ESX systems administration and/or networking:
  • Bachelor’s degree in Computer Science preferred; an equivalent degree and work experience are necessary.
  • 7+ years of experience independently managing Unix/Linux systems / VMware ESX
  • Knowledge in Unix systems analytics and performance management
  • Cloud Experience with AWS and VMware ESX / ESX Networking
  • Experience in creating scripts and runbooks to remediate issues
  • Proven troubleshooting skills, including the ability to execute Root Cause Analysis
  • Exposure to management of complex network equipment (e.g. load balancers, firewalls, routing)
  • Operational experience in support of production networks is required
  • Identifying code and design pattern errors
  • Experience in software development and some cloud operations; operational knowledge of software revision tracking and release management is desired.
  • This role requires the right candidate to demonstrate an analytical mind-set, natural curiosity, initiative, and the willingness to go “beyond” in determining trigger events and root cause.
  • Excellent written and verbal communication skills are required.
  • Ability to work collaboratively to convince others in their field of expertise is expected.
  • Applied practical experience with Systems Thinking (the ability to analyze systems and their components), as well as analytical methods
  • The Staff SRE is expected to be available for on-call duty as well as for application support during off-hours, as needed.
