foldingdiff-pytorch

An unofficial re-implementation of FoldingDiff, a diffusion-based generative model for protein backbone structure generation. The official implementation of FoldingDiff can be found here.

Gallery

Noising

Denoising

Installation

Install through pip.

$ pip install foldingdiff-pytorch

Quickstart

Training

$ python -m foldingdiff_pytorch.train --meta data/meta.csv \
  --data-dir data/npy --batch-size 64

Sampling

$ python -m foldingdiff_pytorch.sample --ckpt [CHECKPOINT_PATH] \
  --timepoints 1000 --out [OUTPUT_PATH]

Sampling pipeline

The snakemake command below runs the unconditional protein backbone generation pipeline. It produces .pt files containing backbone coordinates and .gif files showing the whole denoising process.

$ snakemake -s sample.smk -j1

Downloading and preprocessing training data

Download non-redundant protein backbone structure data (40% similarity cutoff) from CATH.

$ wget ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/non-redundant-data-sets/cath-dataset-nonredundant-S40.pdb.tgz

Extract the downloaded archive and append the .pdb extension to each extracted file.

$ tar xvf cath-dataset-nonredundant-S40.pdb.tgz && cd dompdb
$ for f in *; do mv "$f" "$f.pdb"; done
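Note that the shell loop above renames every file it sees, so running it a second time would produce names ending in .pdb.pdb. As an alternative, here is a minimal Python sketch of the same step that skips files already carrying the extension. The function name `add_pdb_extension` is hypothetical and not part of this package.

```python
from pathlib import Path


def add_pdb_extension(directory: str) -> int:
    """Append ".pdb" to every regular file in `directory` that lacks it.

    Returns the number of files renamed. Safe to run more than once.
    """
    renamed = 0
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        # Skip subdirectories and files that already have the extension.
        if not path.is_file() or path.name.endswith(".pdb"):
            continue
        path.rename(path.with_name(path.name + ".pdb"))
        renamed += 1
    return renamed
```

Calling `add_pdb_extension("dompdb")` after extraction should have the same effect as the loop on a fresh directory.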

Run the snakemake pipeline to convert the PDB files to .npy files. Each .npy file holds the backbone angles of one structure as an array of shape (n, 6), where n is the number of residues.

$ snakemake -s preprocess.smk -prq -j [CORES] --keep-going
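Before training, it can be useful to spot-check a few of the generated files. The sketch below validates the (n, 6) shape described above and reports basic statistics. It assumes the angles are stored in radians and that undefined angles, if any, are stored as NaN; neither is confirmed by this README, and `summarize_angles` is a hypothetical helper, not part of the package.

```python
import numpy as np


def summarize_angles(angles: np.ndarray) -> dict:
    """Check an (n, 6) angle array and report basic statistics."""
    if angles.ndim != 2 or angles.shape[1] != 6:
        raise ValueError(f"expected shape (n, 6), got {angles.shape}")
    finite = np.isfinite(angles)
    return {
        "n_residues": int(angles.shape[0]),
        # Count NaN/inf entries (assumption: undefined angles are non-finite).
        "n_missing": int((~finite).sum()),
        # Assumption: angles are in radians, so |angle| <= pi.
        "within_pi": bool(np.all(np.abs(angles[finite]) <= np.pi + 1e-6)),
    }
```

For example, `summarize_angles(np.load(path))` for some `path` under data/npy returns a small dictionary; a `within_pi` value of `False` would suggest the file stores degrees rather than radians.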

Reproduction status

Model training for reproduction is currently in progress. The live training log is available here.

Ramachandran plot

Ramachandran plot of 10 generated samples of length 64, produced as a sanity check during training. The model appears to be learning to produce reasonable secondary structures.
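A Ramachandran plot is a 2D density of the backbone dihedrals phi (x-axis) against psi (y-axis). The sketch below shows one way to compute that density with NumPy; it is not the plotting code used in this repository. It assumes angles in radians, and it takes phi and psi as separate arrays because the column order of the (n, 6) arrays is not documented here.

```python
import numpy as np


def wrap_angle(x):
    """Map angles in radians onto the interval [-pi, pi)."""
    return (np.asarray(x, dtype=float) + np.pi) % (2 * np.pi) - np.pi


def ramachandran_histogram(phi, psi, bins=36):
    """Return (counts, edges) for a phi/psi 2D histogram.

    counts[i, j] is the number of residues whose phi falls in bin i
    and whose psi falls in bin j. Non-finite pairs are dropped.
    """
    phi = wrap_angle(phi)
    psi = wrap_angle(psi)
    keep = np.isfinite(phi) & np.isfinite(psi)
    edges = np.linspace(-np.pi, np.pi, bins + 1)
    counts, _, _ = np.histogram2d(phi[keep], psi[keep], bins=[edges, edges])
    return counts, edges
```

The returned `counts` array can be passed to any heatmap routine. For reference, right-handed alpha helices cluster near phi of about -60 degrees and psi of about -45 degrees, so a dense region there is one sign that samples contain helix-like segments.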

Citation

@misc{wu2022protein,
      title={Protein structure generation via folding diffusion}, 
      author={Kevin E. Wu and Kevin K. Yang and Rianne van den Berg and James Y. Zou and Alex X. Lu and Ava P. Amini},
      year={2022},
      eprint={2209.15611},
      archivePrefix={arXiv},
      primaryClass={q-bio.BM}
}
