Care Engineer
Nokia
Portugal
Posted on Feb 26, 2026
As a Care Engineer, you will play a key role in maintaining the reliability and quality of our platforms and products. You will be responsible for incident resolution, root cause analysis, performance optimization, and continuous improvement of systems running in production environments.
You will collaborate closely with engineering, development, and support teams to deliver high‑quality customer experiences and contribute to the robustness of mission‑critical systems. Your work will have a direct impact on service continuity, operational efficiency, and customer satisfaction.
We are looking for a highly motivated Care Engineer to join our team and ensure the stability, performance, and continuous operation of our solutions. If you have strong experience with Linux, Kubernetes, and OpenShift, and enjoy solving complex production issues, this role is an excellent fit for you.
Must-have
- Strong hands‑on experience with Linux administration, troubleshooting, and performance optimization.
- Practical experience with Kubernetes (cluster management, deployments, and debugging).
- Advanced knowledge of Red Hat OpenShift or other enterprise Kubernetes platforms.
- Solid problem‑solving and analytical skills, especially in distributed environments.
- Previous experience providing technical support in production environments.
- Strong written and verbal communication skills in English.
- Ability to work independently, proactively, and collaboratively.
Nice-to-have
- Experience with CI/CD pipelines (e.g., Jenkins, GitLab CI).
- Scripting knowledge (Bash, Python, or similar).
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, etc.).
- Experience working with high‑availability and large‑scale systems.
#LI-Hybrid
Responsibilities
- Provide second/third‑level technical support and act as a subject‑matter expert in complex incident resolution.
- Monitor, diagnose, and troubleshoot issues in Linux, Kubernetes, and OpenShift environments.
- Work with engineering teams to perform root cause analysis and implement long‑term corrective actions.
- Ensure platform stability and contribute to performance, security, and resilience improvements.
- Participate in preventive and corrective maintenance activities.
- Support the automation of operational tasks to improve efficiency and reduce manual error.
- Document technical procedures, incident resolutions, and operational best practices.
- Collaborate with internal and external stakeholders, providing clear and timely communication.