Site Reliability Engineer (Remote Anywhere Canada or U.S. EST timezone)

Candidate should be located in the EST time zone.

The SRE/Cloud Developer is a member of the Cloud Operations team and is responsible for the reliability, security, and efficiency of Change Healthcare’s cloud environments and the products that comprise its Enterprise Imaging solutions.

Participate in Cloud Operations team activities, including continuous delivery, configuration changes, and performance monitoring.
Manage and monitor cloud resource utilization and cost.
Define and drive automation of cloud operations and deployments.
Define and drive implementation of Cloud Operations procedures.
Define and implement effective and reliable cloud infrastructure.
Define and implement the methodology and toolset for fully automated infrastructure management as code.
Serve as the company’s subject matter expert, supporting other Change Healthcare teams on cloud technologies, operations, and DevOps methodology.
Plan, implement, monitor, and test systems and procedures for best-practice Business Continuity and Disaster Recovery (BCDR).
Operate, scale, and troubleshoot applications and infrastructure for the cloud-based SaaS Enterprise Imaging platform and all of its components.
Provide 24x7x365 shift-based support with rotating on-call.
Follow all standard operating procedures related to cloud operations.
Work with other teams to make sure that the infrastructure and the applications that depend on it work together seamlessly.
Support other teams’ infrastructure needs on an as-needed basis.
Use and develop tools for continuous delivery automation of systems.
Perform system administration on Linux (CentOS, etc.) and Windows Server operating systems, including network configurations, access and permissions, and cloud services.
Oversee, participate in, and manage production application deployments.
Interface with globally dispersed vendor and internal teams.
Ensure cloud environment reliability, performance, and safety of information.
Ensure high availability and performance of enterprise imaging applications.
Define and implement effective monitoring and emergency alerting for cloud infrastructure, services, applications, and customer connectivity.
Ensure quick recovery from incidents using well-defined operational procedures, tools, and efficient communication with internal and external stakeholders.
Define metrics of success and provide operational reports and dashboards using company analytics tools.
Ensure secure and managed access to production and staging environments.
Actively use and suggest improvements to the continuous delivery toolset.
Proactively collaborate with operational teams on defining SLAs and aligning on processes, tools, and procedures.
Work with engineering, security operations, software architecture, support, platform, and other cross-functional teams on company priorities and roadmap planning for a rapidly growing customer base.
Ensure compliance with medical device, privacy, and security regulations.
Actively support compliance auditing activities.
Monitor cloud resource utilization and associated cost.
Interact with vendors, consultants, partners, and customers to ensure that cloud operations meet the needs of all users of the platform.

Minimum Requirements (Required)
3+ years and a proven record of success administering cloud infrastructure and deployed applications for enterprise SaaS or PaaS companies in public clouds such as GCP (preferred), AWS, or Azure.
3+ years of experience administering IT systems, including compute, network, storage, and general access control.
3+ years and a proven record of success using or creating automated delivery and configuration tools.
1 year of programming experience in any of the platforms.

Critical Skills (Required)
Energetic, motivated, and customer focused.
Proficient with at least one scripting language, such as Python, PowerShell, or Bash.
Programming knowledge in Node.js, Python, or any other language.
Proficient with DevOps tools and environments such as Terraform, Jenkins, Git, and Ansible.
Exceptional critical and highly analytical thinking skills; ability to decompose complex problems, prioritize issues, and implement sensible solutions.
Experience managing outages, customer escalations, crises, and similar circumstances.
Able and willing to work in a fast-paced, quickly changing environment.
Strong knowledge of cloud infrastructure, including compute, networking, storage, and other cloud services.
Solid foundation in Linux/Windows operating systems and tools.
Strong knowledge of IT infrastructure, including switches, routers, firewalls, VPNs, IDS, IPS, proxies, etc.
Experience with centralized logging services such as StackDriver, ELK, DataDog, and Splunk.
Experience with monitoring tools such as StackDriver, NewRelic, Graphite, Nagios, and Zabbix.
Understanding of cybersecurity methodology, such as security controls, access control, and auditing.
Adherence to standard operating procedures.
Excellent communication and presentation skills.

Additional Knowledge and Skills (Preferred)
Experience working across multiple boundaries, including managing internal and external relationships.
Advanced experience with DevOps methodology and Continuous Delivery.
Experience migrating products from on-premises to cloud.
Ability to negotiate effectively with peers without direct authority.
Experience with HIPAA compliance and the security of PHI data.
Familiarity with Healthcare IT standards and healthcare workflows.

Education (Required)
Bachelor's degree in Information Systems, Computer Science, Engineering, or a related field.

Physical Requirements
General Office Demands

