The Technical Operations Engineer will be an active member of the Technical Operations team and will be responsible for the day-to-day operational support of all company products and services. The role requires a “can-do, results-driven” attitude and a passion for problem solving, motivated by an interest in identifying and resolving operational issues.
Primary responsibilities: Provide daily Operations and Engineering support for company networks, products, and services.
Ensure that systems (hardware, software and service) are operating as designed and configured in a manner that ensures scalability and reliability:
- Participate in the planning and deployment of all new software and hardware required to support company products and services.
- Develop operational solutions by defining requirements, screening potential solutions, and determining their impact on the total system while balancing technical benefits against cost and budget requirements.
- Document operational procedures and create operations test plans.
- Review SLAs, participate in the ITSM and DevOps workflow, drive and review RCAs, manage 3rd party licenses and subscriptions, and review data center capacity reports.
- Anticipate operational problems by studying operating trends and targets, modes of operations, and monitoring system performance.
- Assist with triaging issues with in-house DevOps team and a diverse outsourced vendor community.
Collaborate with other groups by providing operational perspectives and performing operational responsibilities, playing a key role in shaping the portfolio with a focus on delivering innovation through automation and developing a robust operational support model:
- Engage with the Engineering team to help guide, define, and configure current and future technologies.
- Partner with the Engineering team to develop the necessary processes, tools and procedures to ensure application uptime and SLAs for products developed in-house.
- Actively participate in software releases and deployments to ensure supportability, system and application monitoring, and adherence to SLAs.
- Correlate classic system metrics (CPU, memory, network, I/O, etc.) with application metrics to report on service utilization and health and to resolve issues in production.
- Generate baselines of application metrics and use them to provide visibility into product behavior, identifying and solving problems before customers are affected, preventing unnecessary support tickets, and eliminating false alerts.
- Operate as the primary interface to company vendors and service providers to ensure all operations plans align with company goals and objectives and are executed in a manner that ensures service continuity and performance stability.
- Act as a strong advocate of customer service: consider the needs of the customer when performing work or developing solutions, and make sure solutions proposed by other departments are in the best interests of the customer.
- Provide remote and hands-on operations support in an outsourced environment, including deploying, supporting, and testing applications on infrastructure ranging from bare metal to virtualized and cloud deployments.
Core Responsibilities & Accountabilities
- Coordinate incident and trouble ticket management activities including triaging production issues and conducting root cause analysis.
- Ensure all approved testing is properly executed against all products and services prior to deploying any new hardware or software into the production network.
- Ensure security best practices are followed so that the company network, products, and services are not susceptible to threats.
- Document, track and report compliance against all defined Service Level Agreements (SLAs).
- Utilize Operations Maintenance metrics data as needed to identify and execute corrective action as warranted and approved.
- Coordinate and execute approved enhancements and application changes.
- Resolve issues with application-level service performance.
- Embrace and/or adopt industry best practices in IT operations and support processes (e.g., DevOps, ITIL).
- Ensure business continuity and disaster recovery plans are current, in-place and operational.
- Assist with environment and network designs.
- Directly participate in the DevOps CI/CD process.
- Assist with, create, and support deployment and infrastructure automation for Production systems.
- Perform DevOps deployments and break-fix triage for Production systems.
- Study, interpret, and respond to monitoring and system alerts.
- Participate in Toll Free Number Registry (TFN-R) business requirements and technical requirements discussions.
- Prepare High Level Architecture and High Level Design for the proposed applications for TFN-R and the WBA and MGI legacy interfaces.
- Prepare TFN-R Wave 2 API specifications that are specific to the client-side interfaces.
- Provide architectural inputs on identified frameworks and components for TFN-R and legacy interfaces.
- Evaluate new off-the-shelf software that may improve TFN-R application and legacy interface performance.
- Participate in design sessions for proposed TFN-R features.
- Implement Proof of Concept (PoC) components to demonstrate the workability of the proposed solutions specific to TFN-R and legacy interfaces.
- Review SCM, CI/CD and Release management tools and procedures.
- Review code and document deliverables as part of the TFN-R project implementation.
- TFN-R Site development.
- TFN-R UAT support.
- Release and deployment support for TFN-R.
- Support bug fixes and issues reported.
- Provide troubleshooting within the TFN-R environment and associated network components.