As the Lead DevOps Engineer you will be a part of a team that’s responsible for the overall architecture, building, design, monitoring and support of our cloud infrastructure. Our environment spans multiple datacenters and public cloud infrastructure. In this role, you’ll have the chance to apply your comprehensive, in-depth knowledge of technical concepts, practices and procedures. You will also have the opportunity to work very closely with the leadership team to shape our product and architectural strategy.
Duties
Communication: Communicates efficiently and effectively, and with a sense of urgency
Problem Solving and Managing Ambiguity: Operates effectively, even when things are not certain or the way forward is not clear
Accountability: Takes full ownership over decisions, actions and failures
Act as the main point of contact for support infrastructure, and lead and mentor department team members
Architect, implement and support infrastructure as code solutions that host the suite of cloud-hosted products
Ensure stability of all services by monitoring issue trends, performance metrics and capacity
Provide in-depth diagnosis for system/software failures and develop/implement preventative solutions as needed
Create processes that provide clarity around daily operations and cross-department communications
Document and implement configuration standards for utilization and manageability of data-center systems
Collaborate with the Technology team on infrastructure availability and on technical considerations for planning and managing upgrades and enhancements
Qualifications
5+ years of experience deploying or managing mid- to large-scale, distributed, customer-facing, OLTP Linux environments spanning hundreds of servers
5+ years of experience designing, configuring, scaling, and supporting a 24x7x365 hosted SaaS environment
Deep knowledge of modern monitoring and alerting tools and practices, including OpenTelemetry standards and open-source tools such as Prometheus, InfluxDB, Grafana, ELK, Telegraf and Fluentd
Experience building and maintaining hybrid infrastructures and on-premises data centers
5+ years with infrastructure automation / configuration management / IaC (Infrastructure as Code) tools such as Ansible, Chef, Puppet, Bicep, Terraform
3+ years implementing and supporting modern infrastructure services such as Consul, Vault, Kubernetes and application load balancers, and integrating these with both monolithic and service-based applications
5+ years administering Solaris, Linux & Windows platforms
3+ years supporting complex routing/switching environments including VPN and dynamic routing protocols
Experience with Oracle VM on SPARC hardware is a plus
3+ years administering enterprise block and file storage platforms
3+ years administering MySQL platforms (replication experience highly desirable) and Java applications
Experience with Windows, Active Directory and infrastructure as code
Strong working knowledge of datacenter network topologies, components (routing, switching, Fibre Channel, and next-generation firewalls) and various networking and application protocols, including TCP/IP
Working knowledge of relational database solutions such as MySQL as well as NoSQL and in-memory database solutions
Hands-on leadership: ability to jump in and put “hands on the keyboard” in high-pressure situations
Root-cause mentality: always looking to understand the “why”, with a bias toward action to solve problems
Precision and discipline in adhering to process and standard operating procedures, combined with the ability to navigate ambiguity
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Staffing and Recruiting