Informica
Informica - Lead Data Engineer - ETL/Data Lake
Job Location
India
Job Description
Job Title: Lead Data Engineer
Location: Remote
Experience Required: 10 years

Job Description:
We are seeking an experienced Lead Data Engineer for one of our clients, specializing in the development and optimization of ETL processes. The successful candidate will have a deep understanding of Spark-based data engineering, particularly with PySpark Notebooks and Microsoft Fabric, and a strong command of SQL. In this role, you will lead data engineering initiatives, work closely with cross-functional teams, and support the modernization of legacy SQL Server environments.

Key Responsibilities:
- ETL Pipeline Development: Design, develop, and maintain robust ETL pipelines using PySpark Notebooks and Microsoft Fabric.
- Stakeholder Collaboration: Work with data scientists, analysts, and other stakeholders to understand data requirements and deliver effective data solutions.
- Data Migration: Migrate and integrate data from legacy SQL Server environments to modern data platforms.
- Pipeline Optimization: Ensure data pipelines are optimized for scalability, efficiency, and reliability.
- Technical Leadership: Provide technical guidance and mentorship to junior developers and team members.
- Troubleshooting: Identify and resolve complex data engineering issues related to performance, data quality, and scalability.
- Best Practices: Establish and maintain data engineering best practices, coding standards, and documentation.
- Code Review: Conduct code reviews and provide constructive feedback to improve team productivity and code quality.
- Data Integrity: Support data-driven decision-making by ensuring data integrity, availability, and consistency across platforms.

Requirements:
- Education: Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.
- Experience: 10 years in data engineering, with a strong focus on ETL development using PySpark or other Spark-based tools.
- SQL Expertise: Proficiency in SQL, including complex queries, performance tuning, and data modeling.
- Microsoft Fabric: Hands-on experience with Microsoft Fabric or similar cloud-based data integration platforms.
- Data Lakes & Warehouses: Strong knowledge of Data Lake, Data Warehouse, and Delta Lake technologies.
- Azure Data Services: Experience with Azure Data Factory, Azure Synapse, or similar data services.
- Scripting Proficiency: Skilled in scripting languages such as Python or Scala for data manipulation and automation.
- Big Data & ETL Frameworks: Solid understanding of data warehousing concepts, ETL frameworks, and big data processing.
- Bonus Skills: Familiarity with Hadoop, Hive, Kafka, and DevOps practices, including CI/CD pipelines and containerization tools such as Docker and Kubernetes.
- Structured & Unstructured Data: Experience handling both structured and unstructured data sources.

Key Qualities:
- Strong problem-solving skills with the ability to troubleshoot complex data engineering issues.
- Proven ability to work independently, as part of a team, and in leadership roles.
- Excellent communication skills, with the ability to translate complex technical concepts for business stakeholders.

Remote position; 10-12 years of experience. (ref:hirist.tech)
Location: India
Posted Date: 11/25/2024
Contact Information
Human Resources, Informica