HeyCoach

HeyCoach - Artificial Intelligence Engineer - Machine Learning/NLP

Job Location

Bangalore, India

Job Description

Position: AI Engineer
Experience: 3 Years
Location: HSR Sector-6, Bangalore

About HeyCoach:

We are an exceptional group of highly skilled individuals, passionate about addressing a fundamental challenge within the education industry. Our team consists of talented geeks who understand the issues at hand and are dedicated to finding innovative solutions. We are constantly seeking remarkable individuals who can contribute to our growth and success. Whether we are developing cutting-edge technologies, designing immersive learning experiences, or implementing new teaching methodologies, we consistently strive for excellence.

About the role:

We are seeking an AI Engineer to develop a state-of-the-art system that automatically dubs educational videos into multiple vernacular languages of India. The project covers the entire pipeline, from video preprocessing to final video reassembly, with a focus on synchronization, context-aware translation, and natural-sounding speech synthesis. This role requires expertise in current AI/ML techniques, a deep understanding of audio-video processing, and familiarity with vernacular linguistic challenges.

Key Responsibilities:

1. Build and Own the AI Pipeline: Design, implement, and optimize the following stages:
- Video Preprocessing: Extract audio, detect scenes, and analyze lip movements using tools like MediaPipe and OpenCV.
- Speech-to-Text (STT): Integrate models like OpenAI Whisper, Google Speech-to-Text, or AWS Transcribe for accurate transcription with timestamps.
- Translation: Use context-aware translation models (e.g., Hugging Face Transformers, MarianMT, or mBART) with domain-specific adjustments for mathematical and technical terms.
- Text-to-Speech (TTS): Generate natural-sounding voices in vernacular languages using Tacotron 2, WaveGlow, or Variational Inference TTS for voice cloning.
- Lip Sync Alignment: Implement advanced models like Wav2Lip or FaceFX for seamless lip synchronization.
- Video Reassembly: Ensure smooth video stitching after processing.

2. Collaboration:
- Work closely with linguists and subject-matter experts to fine-tune translation models for technical accuracy.
- Collaborate with designers and video editors for final product delivery.

3. Optimization and Scalability:
- Enhance system performance for real-time or near-real-time processing.
- Scale the pipeline to process thousands of videos.

4. Research and Innovation:
- Stay updated with advancements in AI, especially in video processing, speech synthesis, and translation.
- Experiment with novel techniques to improve lip-sync accuracy and audio-visual coherence.

5. Documentation and Reporting:
- Maintain detailed documentation of the AI pipeline and processes.
- Regularly report progress and challenges to stakeholders.

Education Qualifications:
- Bachelor's/Master's/Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or related fields.

Core Technical Skills:
- Machine Learning and AI: Experience with deep learning frameworks (TensorFlow, PyTorch).
- Audio-Video Processing: Familiarity with tools like MediaPipe, OpenCV, and FFmpeg.
- Speech Processing: Expertise in STT and TTS systems, with hands-on experience in OpenAI Whisper, Tacotron, WaveGlow, or similar models.
- Natural Language Processing (NLP): Proficiency in context-aware translation models, sentence restructuring, and linguistic nuances.
- Lip Sync Alignment: Knowledge of Wav2Lip, FaceFX, or similar technologies.
- Programming: Strong proficiency in Python. Familiarity with C++ for performance-critical tasks is a plus.

Additional Skills:
- Knowledge of cloud platforms (AWS, GCP) for deploying and scaling AI models.
- Familiarity with audio-linguistic nuances of Indian vernacular languages.
- Strong problem-solving and debugging skills.

Experience:
- 3 years in AI/ML roles with a focus on audio, video, or NLP applications.
- Proven experience in building end-to-end pipelines for multimedia processing (ref:hirist.tech)

Posted Date: January 12, 2025
Contact Information

Contact Human Resources
HeyCoach

Posted

January 12, 2025
UID: 4955034812
