NIH Cloud Lab is a resource developed by NIH’s CIT Cloud Services Team to support STRIDES’ mission of enabling and modernizing biomedical research through the cloud. Through this resource, NIH-funded researchers can become more efficient and comfortable in leveraging the cloud for their research purposes.
We provide a Cloud Lab for each of the three major cloud service providers (AWS, Google Cloud, and Azure), each with its own dedicated GitHub repository. Visit these repositories for more information and collections of tutorials.
We have compiled tutorials from a variety of sources to help you apply common research methods on cloud platforms. They cover a wide range of topics, including biomedical workflows, artificial intelligence, and medical imaging. Each tutorial guides you through specific tasks using services from one of the three major cloud service providers (CSPs): Azure, AWS, and GCP. Whether you are working with a virtual machine, using a familiar environment like Jupyter Notebooks, or using other cloud-managed services, these tutorials provide the steps and insights you need to accomplish your research goals efficiently. A few of the tutorials available in our repositories are highlighted below. Please navigate to our CSP-specific repositories to find the full list of available tutorials. Let's dive into the exciting world of cloud computing!
NIH employees and affiliates may request a free Cloud Lab account. Accounts can be provisioned for AWS, GCP, or Azure and come with $500 of cloud credits, which are valid for up to 90 days. For instructions on how to request an account, please visit the NIH Cloud Lab information page. Our terms and conditions can be viewed in the docs folder of each CSP repository.
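As a rough planning aid, the credit figures above ($500, valid for up to 90 days) can be turned into an average daily budget. The following is a minimal sketch, not an official Cloud Lab tool: the function names and the example spend figures are hypothetical, and actual spend should always be checked in your provider's billing console.

```python
# Hypothetical helper for planning Cloud Lab credit usage.
# The only values taken from this README are the $500 credit and the 90-day limit.

CREDIT_USD = 500.0
VALID_DAYS = 90


def average_daily_budget(credit=CREDIT_USD, days=VALID_DAYS):
    """Average amount that can be spent per day if credits are used evenly."""
    return credit / days


def remaining_daily_budget(spent, days_elapsed, credit=CREDIT_USD, days=VALID_DAYS):
    """Daily budget for the remaining days, given what has been spent so far."""
    if not 0 <= days_elapsed < days:
        raise ValueError("days_elapsed must be between 0 and days - 1")
    remaining = max(credit - spent, 0.0)  # never report a negative budget
    return remaining / (days - days_elapsed)


if __name__ == "__main__":
    print(f"Average daily budget: ${average_daily_budget():.2f}")
    # Made-up example: $200 spent after 30 days.
    print(f"Remaining daily budget: ${remaining_daily_budget(200.0, 30):.2f}")
```

Spending evenly is only one possible strategy; a short GPU-heavy tutorial may use a large share of the credits in a single day, so treat these numbers as a guide rather than a limit.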
If you are new to the cloud, don't forget to take a look at our how-to docs (Azure, Google Cloud, AWS) to learn how to perform common cloud tasks such as analyzing billing, applying auto-shutdown to VMs and notebooks, and using Jupyter Notebooks. If you have any questions, refer to our FAQ page.
- Machine Learning & Artificial Intelligence
- Accelerated Biomedical Workflows
- Clinical Informatics
- GWAS
- SARS-CoV-2 Lineage Analyses
- RNA-Seq
- Long Read Sequencing
- Utilizing SRA Data
- BLAST
Machine Learning and Artificial Intelligence (ML/AI) are revolutionizing the way we interact with technology, offering unprecedented opportunities for innovation and automation. Whether you're a beginner or an advanced user, these tutorials will guide you through the latest advancements in ML/AI, helping you harness its potential. Check our repos to learn about how ML/AI can help with drug discovery (Google Cloud and AWS), proteomics utilizing tools like AlphaFold (Google Cloud and AWS), and medical imaging.
Learn about AI medical imaging techniques and tools in this section. It includes resources for running a custom spleen segmentation model using pre-trained NVIDIA models and MONAI.
Learn how to deploy models using Vertex AI on GCP, create a PubMed chatbot on Azure, and use an AI playground like Bedrock on AWS. Our tutorials guide you through the new and emerging field of AI on the most popular cloud platforms. To explore more tutorials on this topic, please visit the cloud platform repositories.
This section provides tutorials and resources for executing a variety of biomedical workflows in the cloud, leveraging popular workflow languages and cloud-based tools to streamline and accelerate processing.
Discover how to use workflow languages like Nextflow and Snakemake in cloud-based Batch and HPC environments. Learn how to run workflows efficiently on AWS, GCP, and Azure using tools like AWS ParallelCluster, Google Batch, and more.
Explore single-cell RNA sequencing (scRNA-Seq) by running an accelerated scRNA-Seq pipeline with NVIDIA's RAPIDS tool on Google Cloud, AWS, and Azure, enabling detailed analysis of single-cell data.
Clinical Informatics bridges the gap between healthcare, technology, and data science by designing, implementing, and optimizing systems that manage clinical information, with the aim of improving healthcare delivery, patient outcomes, and clinical decision-making. The tutorials below provide practical insights and tools to help you effectively leverage technology and data in healthcare.
Discover resources for conducting Genome-Wide Association Studies (GWAS) on various cloud platforms. This section includes tutorials on running GWAS workflows using deep learning techniques and cloud-based tools like:
- GCP's Vertex AI Workbench or Kubernetes on GCP to deploy a machine learning pipeline using Kubeflow
- AWS's EC2
- Azure's Machine Learning Studio
Learn how to run a standard COVID bioinformatics pipeline using the Pangolin workflow, all within a cloud Jupyter environment on GCP, Azure, or AWS.
This section provides resources for a step-by-step breakdown of RNA-Seq analysis on different cloud platforms. It includes tutorials for running RNA-Seq pipelines on AWS and Azure, helping you run a familiar pipeline using cloud technology.
Discover Oxford Nanopore's comprehensive collection of notebook tutorials for working with long-read data, enabling tasks such as variant calling, RNA sequencing (RNA-seq), SARS-CoV-2 analysis, and more.
Learn how to download data from the NCBI Sequence Read Archive (SRA) and analyze it in the cloud on Google Cloud and AWS, supporting comprehensive genomic studies.
Learn how to run BLAST in the cloud. These tutorials explain how to set up and execute ElasticBLAST workflows on Google Cloud and AWS, and BLAST+ on Azure, facilitating large-scale sequence alignment tasks.