Applied Bioinformatics Engineer, Pipelines & AI
Boston
Friday, 08 May 2026
Pipeline Development and Engineering.
Support computational biology workflows, including single-cell, spatial, and other multi-omics analyses for clinical and preclinical applications. Use modern workflow managers (e.g., Nextflow, Snakemake, or similar) and containerization (Docker, Singularity) to make pipelines portable, testable, and reusable across projects and teams. Help build and maintain reproducible analytical pipelines for statistical genetics and bioinformatics workflows. Wrap and harden ad-hoc analytical scripts written by scientists into production-quality tools that others can re-run reliably. Write tests, documentation, and clear examples so that the pipelines you build are usable by colleagues with a range of technical backgrounds.

AI-Enabled Tooling and Workflows.
Prototype agentic workflows that automate established, routine analytical tasks: for example, pulling target evidence across data sources, generating standardized due-diligence reports, or letting scientists interrogate complex datasets in natural language. Build and maintain MCP connectors that expose internal data, public resources, and analytical pipelines to LLM-based agents and tools such as Claude. Identify and develop use cases where LLMs and agentic AI workflows can improve the speed, quality, consistency, or accessibility of work across therapeutic areas, focusing on end-to-end capabilities rather than isolated task completion. Contribute to a shared library of reusable AI tooling, prompt patterns, and integration code that the team can build on. Define technical standards for evaluation, documentation, guardrails, and workflow quality so that AI-based solutions are trusted, reproducible, and suitable for repeated use across teams and projects. Stay current with the AI tooling landscape and bring back ideas the team can put to work. Help improve AI fluency among collaborators by demonstrating practical workflows.

Collaboration Across Lilly Research Labs.
Partner closely with statistical geneticists, computational biologists, and software engineers within the Cardiometabolic Data Science group and across other Lilly Research Labs teams. Work with therapeutic area partners to understand their analytical needs and translate them into pipeline requirements. Coordinate with platform and engineering groups to ensure your pipelines integrate cleanly with broader Lilly infrastructure. Contribute to internal knowledge sharing: code reviews, demos, documentation, and helping colleagues get unblocked.

Basic Requirements.
B.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 10 years of relevant work experience; OR M.S. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 7 years of relevant work experience; OR Ph.D. in computer science, computational biology, bioinformatics, biological sciences, statistics, or a related field, with 1 year of relevant work experience.

Additional Skills/Preferences.
Strong programming skills in Python and/or R, including comfort with version control (Git), code review, testing, and writing maintainable code. Demonstrated experience building data analysis pipelines, ideally using a workflow manager such as Nextflow, Snakemake, or WDL. Working familiarity with bioinformatics file formats (VCF, BED, GTF, BAM, etc.) and standard tools (PLINK, samtools, bcftools, or similar). Familiarity with typical data types in high-throughput biology, including NGS data. Hands-on experience or strong demonstrated interest in modern AI tooling: using LLMs through APIs, building MCP servers/connectors, prompt engineering, or wiring up agentic workflows. Demonstrated ability to build stable, practical, reusable workflows rather than code for one-off analyses, with strong implementation skills in Python and modern AI/ML tooling.
A collaborative, low-ego mentality; you enjoy building tools that other people use, and you take feedback well. Comfort with cloud computing environments (AWS, GCP, or Azure) and Linux/command-line work. Ability to work successfully in a matrixed environment. Prior experience with statistical workflows or biomedical statistics. Prior exposure to statistical genetics methods (GWAS, fine-mapping, MR, colocalization, burden testing) or large-scale genomic datasets (UK Biobank, gnomAD, GTEx, Open Targets). Prior experience with complex high-throughput biological data or experiments such as spatial transcriptomics, large-scale screens, or multi-omics studies. Familiarity with R in addition to Python, particularly for statistical genetics packages. Experience with relational and/or graph databases, and with biomedical ontologies. Contributions to open-source projects or a public portfolio (GitHub, blog posts, demos). Prior experience in pharma, biotech, or academic genomics research.

Resources Managed.
This is an individual contributor role with no direct reports. The Applied Bioinformatics Engineer, Pipelines & AI will work closely with scientists, engineers, and external partners across Lilly Research Labs.

Lilly is dedicated to helping individuals with disabilities actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form for further assistance. Please note this form is for individuals to request an accommodation as part of the application process; any other correspondence will not receive a response. Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.