Linux Systems Administrator - HW1474 (BBBH340) East Palo Alto, California
We are looking for a talented, motivated, and passionate Systems Administrator who will be responsible for the technical ownership, design, planning, and implementation of mission-critical enterprise systems, including the highest level of performance tuning and recovery procedures. The role sits within the IT Operations team and primarily supports our engineering team and engineering infrastructure.
- Manages the day-to-day operations of engineering's various compute clusters, including the supporting infrastructure (PXE, DHCP, OpenStack, etc.)
- Develops new system and application implementation plans, custom scripts and testing procedures to ensure operational reliability. Cross-trains technical staff in how to use new software and hardware developed and/or acquired
- Performs troubleshooting as required and leads problem-solving efforts, often involving outside vendors and other support personnel and/or organizations
- Establishes, maintains and manages users' Unix accounts. Installs, modifies and maintains systems and utility software on server computer systems. Provides server support related to other software
- Ensures high availability and acceptable levels of performance of mission-critical host computer resources
- Develops procedures, programs and documentation for backup and restoration of host operating systems and host-based applications
- Stays current with developments in systems administration technology
Bachelor's degree in Computer Science or related discipline. Relevant experience may substitute for the degree requirement on a year-for-year basis. Four years of work experience in complex systems design, programming and systems software and support.
Knowledge of: Various *nix-based operating systems, shell scripting (Shell, Bash, Korn), DHCP (including scopes and reserved DHCP), DNS, PXE (creating kickstart scripts and modifying PXE menus), virtualization (KVM and VMware ESXi), monitoring.
Ability to: Plan, organize and document complex system design activities and to configure systems to be consistent with institutional policies/procedures; communicate technical/complex information both verbally and in writing; establish and maintain cooperation, understanding, trust and credibility; perform multiple tasks concurrently and respond to emergency situations effectively.
The Pluses: Understanding of VLANs, OpenStack/CloudStack, Windows Server including Active Directory; active knowledge of distributed file systems (NFS, GlusterFS, ZFS); knowledge of Hadoop; experience using IPMI for monitoring and server management.