Expian Technologies Pvt Ltd
Senior Data Engineer - PySpark/SQL
Job Location
Bangalore, India
Job Description
Salary: 30 LPA - 40 LPA

PEI Group is a subscriber-focused business intelligence company covering private investment markets in real estate, infrastructure, private equity, private debt, and specialist sector-specific activities within private asset classes. We provide industry-leading journalism, data, and market insight to subscribing clients via a wide portfolio of specialist brands, supported by our robust and scalable digital publishing, analytics, and database platform. Since our inception in 2001, we have grown into a business with a multi-talented global team of over 400 people spread across EMEA, the USA, and Asia. Our purpose is to inform and connect investment professionals across global, specialised markets.

As a Senior Data Engineer, you will be responsible for designing, implementing, and maintaining data processing pipelines and workflows using Databricks on the Azure platform. Your expertise in PySpark, SQL, Databricks, test-driven development, and Docker will be essential to the success of our data engineering initiatives.

Roles and Responsibilities:
- Collaborate with cross-functional teams to understand data requirements and design scalable, efficient data processing solutions.
- Develop and maintain data pipelines using PySpark and SQL on the Databricks platform.
- Optimize and tune data processing jobs for performance and reliability.
- Implement automated testing and monitoring processes to ensure data quality and reliability.
- Work closely with data scientists, data analysts, and other stakeholders to understand their data needs and provide effective solutions.
- Troubleshoot and resolve data-related issues, including performance bottlenecks and data quality problems.
- Stay up to date with industry trends and best practices in data engineering and Databricks.
Key Requirements:
- 5 years of experience as a Data Engineer, with a focus on Databricks and cloud-based data platforms, including at least 2 years of recent experience with TDD on Databricks.
- Hands-on experience in PySpark programming for data manipulation, transformation, and analysis.
- Strong experience in SQL, including writing complex queries for data retrieval and manipulation.
- Experience in developing and implementing test cases for data processing pipelines using a test-driven development approach.
- Experience with Docker for containerising and deploying data engineering applications.
- Experience with Python scripting is mandatory.
- Strong knowledge of the Databricks platform and its components, including Databricks notebooks, clusters, and jobs.
- Experience in designing and implementing data models to support analytical and reporting needs will be an added advantage.
- Strong knowledge of Azure Data Factory for data orchestration, ETL workflows, and data integration is good to have.
- Knowledge of cloud-based storage such as Amazon S3 and Azure Blob Storage is good to have.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Strong analytical and problem-solving skills.
- Strong English communication skills, both written and spoken, are crucial.
- Ability to solve complex technical issues and to anticipate risks before they arise.
(ref:hirist.tech)
Location: Bangalore, IN
Posted Date: 11/21/2024
Contact Information
Contact: Human Resources, Expian Technologies Pvt Ltd