Principal Data Engineer
Job Description
As a Principal Data Engineer, you will lead a team in building and maintaining scalable, reliable data pipelines that bridge on-premises and cloud environments. You'll apply your expertise in data engineering, streaming technologies, and leadership to drive the team toward business objectives.
Key Responsibilities:
- Team Leadership: Lead, mentor, and manage a team of data engineers specializing in streaming technologies.
- Data Pipeline Development: Design and implement high-throughput, low-latency streaming data pipelines using Apache Kafka, ensuring integration with cloud services (e.g., BigQuery, Looker).
- Data Analytics: Oversee the development of stream processing applications using Apache Spark or Apache Flink, and implement real-time data transformations and analytics using KSQL.
- Data Storage: Design and maintain scalable data storage solutions with ClickHouse for fast analytics on streaming data.
- ETL Processes: Lead the design and implementation of ETL processes that extract, transform, and load data into a data warehouse.
- Data Quality: Ensure data integrity, consistency, and accuracy through robust data quality assurance.
- Optimization: Optimize pipeline performance and implement data engineering best practices covering data quality, security, and efficiency.
- Collaboration: Work with stakeholders to gather requirements and align data strategies with business objectives.
- Technology Updates: Stay current with emerging streaming and cloud technologies, evaluating their potential application.
Qualifications:
- 5+ years of hands-on data engineering experience with Python, Scala, or Java
- 3+ years of experience with cloud providers (AWS, Azure, GCP), data warehouse services (e.g., Redshift, Databricks), and cloud storage
- Expertise in KSQL, ClickHouse, and ETL/ELT and orchestration tools (e.g., Airflow, ADF, Glue, NiFi)
- Proficiency with code versioning (Git) and CI/CD pipelines
- Experience with stream processing tools (Apache Kafka, Apache Flink, Apache Spark Structured Streaming)
- Strong understanding of data modeling, optimization techniques, and stream processing patterns
- Excellent leadership and mentorship skills, with at least 2 years in a leadership role
Nice to Have:
- Experience with NoSQL databases, Kafka Connect, Kafka Streams, and Superset for data visualization
- Knowledge of integrating real-time machine learning models into streaming environments
- Expertise with monitoring and observability tools for streaming systems
Why Join Overwatch Agriculture?
At Overwatch Agriculture, we prioritize people over processes, fostering a supportive and tech-savvy environment. Our tailored benefits support your well-being and professional growth, making this an ideal opportunity for technology enthusiasts.