TQuanta Technologies Pvt. Ltd.
TQuanta Technologies - PySpark Developer - Python/SQL
Job Location
in, India
Job Description
Job Description : PySpark Developer

Position Overview :
We are seeking a highly skilled PySpark Developer with experience to join our dynamic team. The ideal candidate will have a strong background in big data technologies, with expertise in PySpark and related tools to design, develop, and maintain scalable data processing solutions. This role involves collaborating with cross-functional teams to drive data-driven decision-making and improve the efficiency of large-scale data pipelines.

Key Responsibilities :
- Develop and maintain scalable ETL pipelines using PySpark to process large volumes of structured and unstructured data.
- Optimize data pipelines for performance, reliability, and scalability.
- Implement data quality and data validation frameworks to ensure the integrity of processed data.
- Integrate data from diverse sources, including APIs, databases, and flat files, into big data platforms.
- Transform raw data into usable formats for downstream analysis and machine learning applications.
- Collaborate with data engineers, data scientists, and business stakeholders to gather requirements and deliver data solutions tailored to business needs.
- Design and implement solutions aligned with architectural best practices and organizational goals.
- Analyze and optimize PySpark jobs for performance improvements.
- Tune cluster configurations and resource utilization in cloud or on-premises environments.
- Stay updated with emerging big data technologies and provide recommendations for new tools and frameworks.
- Mentor junior team members and conduct code reviews to ensure quality standards.
- Prepare detailed technical documentation for developed solutions.
- Create dashboards and reports for monitoring data pipelines and job execution metrics.

Required Skills and Qualifications :
- Proficiency in PySpark and Spark core concepts (RDDs, DataFrames, Datasets).
- Strong programming skills in Python.
- Hands-on experience with big data platforms like Hadoop, Hive, or HDFS.
- Familiarity with databases (SQL and NoSQL).
- Experience with cloud platforms such as AWS, Azure, or Google Cloud for data processing.
- Knowledge of CI/CD pipelines, version control systems (e.g., Git), and containerization tools like Docker.
- Ability to debug and resolve issues in distributed data processing systems.
- Strong analytical and troubleshooting skills for complex systems.
- Strong interpersonal skills for effective communication with technical and non-technical teams.
- Proven ability to work in an Agile development environment.
- Experience with machine learning frameworks and libraries like TensorFlow or Scikit-Learn.
- Knowledge of streaming data frameworks like Kafka or Spark Streaming.
- Exposure to tools like Airflow for workflow orchestration.
- Certification in big data or cloud technologies is a plus.

Educational Requirements :
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

- Collaborative and innovative work environment.
- Attractive compensation and growth opportunities.

(ref:hirist.tech)
Location: in, IN
Posted Date: 1/19/2025
Contact Information
Contact: Human Resources, TQuanta Technologies Pvt. Ltd.