Bristol Myers Squibb
Senior Software Engineer Research Data (RAG Engineer) [Python & AWS]
Job Location
Hyderabad, India
Job Description
Overview:
Join our Semantic Data Products team within the Research Data group and become a pivotal force in revolutionizing our understanding and utilization of research data to drive drug discovery. As a RAG Engineer, you'll be at the intersection of advanced AI and domain expertise, employing retrieval-augmented generation and semantic modeling principles to create meaningful connections within our research data platform. This role is instrumental in transforming raw data into actionable insights, accelerating scientific breakthroughs through intelligent data integration and querying.

Your Impact in Drug Discovery:
In this role, you will directly contribute to the enhancement of our research data products, enriching the platform with data that empowers machine learning models and supports evidence-based decision-making in drug discovery. Your work will enable the development of RAG-powered data products, providing deep insights and actionable knowledge to support and speed up our journey to new scientific advancements.

Key Responsibilities:
Develop RAG workflows, including prompt engineering and retrieval models, to answer R&D High Value Questions (HVQ) through the creation of advanced semantic data models.
Design and support ETL processes on AWS to align data from multiple sources with unified data models, leveraging tools like Glue, Athena, and DataZone.
Innovate and implement methods for validating data products against smart data contracts, ensuring quality and consistency.
Build techniques to enable self-service querying and derive insights from our data products, making complex information accessible to researchers.
Contribute methods that facilitate and automate normalization, linking, and augmentation of R&D data, ensuring a smooth data flow for research use cases.
Expand expertise in collaborating with life sciences and early discovery stakeholders to align data products with research goals.
Prototype and leverage AI agents to automate tasks such as biomarker extraction from research papers, information gathering from internal and external documents, and summarizing key research concepts.

Qualifications:
Strong expertise in RAG workflows and data consumption/data exposure using AWS tools like Glue, Athena, and DataZone.
Solid understanding of findable, accessible, interoperable, and reusable (FAIR) data principles.
Knowledge of semantic data theory and practical applications, with experience in or willingness to learn RDF/SPARQL/SKOS/Knowledge Graphs.
Proficiency in Python for automation, data processing, and natural language processing (NLP).
Excellent problem-solving and analytical skills, with an ability to work independently and collaboratively.
Strong communication skills, both written and verbal, with a passion for continuous learning.

Preferred Experience:
Previous experience in a life sciences environment or drug discovery setting is advantageous.
Familiarity with semantic data modeling tools (e.g., TopBraid EDG, SciBite) and visualization tools (e.g., GraphDB, relevant Python libraries).

Join our team and make a transformative impact on the future of scientific research and drug discovery. Your expertise in RAG and data engineering will unlock new frontiers in data analysis and semantic data understanding.
Location: Hyderabad, IN
Posted Date: 1/29/2025
Contact Information
Contact: Human Resources, Bristol Myers Squibb