PaCS-Toolkit User Guide

Welcome to PaCS-Toolkit User Guide. Here, we provide instructions on how to use PaCS-Toolkit. If you encounter any issues and cannot be solved here, please feel free to submit a Github Issue. We hope PaCS-MD will be helpful for your research.

Content

What is PaCS-MD ?

PaCS-MD is an enhancement sampling method that exhaustedly samples the conformational space of given systems. PaCS-MD comprises multiple short MD simulations being cascaded by the selection of the starting structures for the next simulation cycles based on given evaluation criteria. By repeating this process, a large conformational space can be sampled without any mechanistic driven bias.

%%{ init: { 'flowchart': { 'curve': 'stepAfter'}, 'theme':'neutral', 'themeVariables': {'fontSize':'20px'} } }%%
flowchart TD;

cycle0["Preliminary MD simulations"];
ranking0["Structural Ranking"];
cycle0 --> ranking0;
ranking0 --> input_structures;

subgraph cycle["cycle"];
  ranking1["Structural Ranking"];
  judge{{"Fulfill requirements?"}};
  export1["Export top N snapshots to next cycle"];
  input_structures --> trajectories --> ranking1 --> judge;
  judge -- No --> export1 --> input_structures;
end

subgraph trajectories["Short MD simulations"];
  direction TB;
  traj1(trj1);
  traj2(trj2);
  etc_trj(...);
  traj(trj N);
end

subgraph input_structures[" "];
  direction TB;
  rank1(rank1);
  rank2(rank2);
  etc_sim(...);
  rank(rank N);
end

End((("finish")));
judge -- Yes --> End;

Citation

@article{Ikizawa2024,
  title = {PaCS-Toolkit: Optimized Software Utilities for Parallel Cascade Selection Molecular Dynamics (PaCS-MD) Simulations and Subsequent Analyses},
  volume = {128},
  ISSN = {1520-5207},
  url = {http://dx.doi.org/10.1021/acs.jpcb.4c01271},
  DOI = {10.1021/acs.jpcb.4c01271},
  number = {15},
  journal = {The Journal of Physical Chemistry B},
  publisher = {American Chemical Society (ACS)},
  author = {Ikizawa,  Shinji and Hori,  Tatsuki and Wijaya,  Tegar Nurwahyu and Kono,  Hiroshi and Bai,  Zhen and Kimizono,  Tatsuhiro and Lu,  Wenbo and Tran,  Duy Phuoc and Kitao,  Akio},
  year = {2024},
  month = apr,
  pages = {3631–3642}
}

Access

# Head
- Kitao

# Team leader
- Ikizawa

# Coding team including documentation
- Ikizawa (lead)
- Hori
- Kono

# Testing team
- Tegar (lead)
- Duy
- Bai
- Kimizono
- Lu

Quick Start

  • on this page, we will introduce a quick guide on running dissociation PaCS-MD using gromacs.
  • If you want to try other PaCS-MD, please refer to Analyzer.

Content

Step1: Installing by pip or conda

  1. Running git clone for getting source code
git clone https://github.com/Kitaolab/PaCS-Toolkit.git
  1. Installing with your favorite method
# Install by pip
pip install -e ".[mdtraj]"

# Or install by conda and pip
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
pip install -e ".[mdtraj]"
  • The above methods are better, but sometimes they are not suitable for a special situation.
  • For more information about installation, Please refer to Install page.

Step2: Preparing initial files

  • On this page, we will use gromacs as simulator.
  • In gromacs, gro, top, mdp and ndx files are necessary. Please get these files ready.
  • Sample jobscripts and input files are here

Step3: Preparing input file for PaCS-MD

  • PaCS-MD requires a toml format input file. Please make input.toml
    • Please adjust the above settings as needed.
    • For more information about this file, please refer to input file.
vim input.toml
input.toml
# Input file for PaCS-MD
## basic
# pacsmd settings
## basic
max_cycle = 2                           # Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
n_replica = 3                           # Number of replica. (ex. 1, ..., 123, ..., 999)
n_parallel = 3                          # Number of replica which are calculated at a time
skip_frame = 1                          # Frequency of frames used for ranking among trajectories
centering = true                        # Whether to move the molecule to the center
centering_selection = "protein"         # Name of molecule to move in the center
working_dir = "./."                     # Directory where pacsmd will run

## simulator
simulator = "gromacs"                   # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4"                # Commands for MPI such as mpirun, blank is OK
cmd_serial = "gmx_mpi mdrun -ntomp 6"   # Commands to run the simulator serially
cmd_parllel = "gmx_mpi mdrun -ntomp 6"  # Commands to run the simulator parallelly
structure = "input.gro"                 # Structural file such as gro, pdb, rst7, etc.
topology = "topol.top"                  # Topology file such as top, parm7, psf, etc.
mdconf = "parameter.mdp"                # Parameter file such as mdp, mdin, namd, etc.
index_file = ".index.ndx"               # Gromacs index file
trajectory_extension = ".xtc"           # Trajectory file extension. ("." is necessary)

## analyzer
type = "dissociation"                   # Evaluation type
threshold = 100                         # CV threshold for determining to terminate a trial
skip_frame = 1                          #  How many frames to skip when ranking CVs
analyzer = "mdtraj"                     # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5"             # Selection string for specified group in trajectories
selection2 = "resid 6 to 10"            # Selection string for specified group in trajectories

## postprocess
genrepresent = true                     #  Whether genrepresent is executed after trial
rmmol = true                            #  Whether rmmol is executed after each cycle
keep_selection = "not water"            #  Molecular name or index group to be left in the trajectory when rmmol
rmfile = true                           #  Whether rmfile is executed after trial

Step4: Running PaCS-MD

  • Finally, we are ready to perform PaCS-MD
  • Specifying the trial id in argument t and the input file in argument f.
pacs mdrun -t 1 -f input.toml
CAUTION In this case, the total core will be 24.

So, 8 cores will be used in each 3 replica at once. (24 / 3 = 8 cores)

  • If you want to continue the simulation from the middle of a cycle or trial, simply run pacsmd again. Completed cycles will be skipped, and the simulation will resume from the the point of interruption.

  • You will get the results

$ ls
trial001/

Supplement: Runing other trial

  • Usually, a single trial is not sufficient for sampling, so multiple trials are required.
  • We can run trial2 PaCS-MD
pacs mdrun -t 2 -f input.toml
  • Output will be the following
$ ls
trial002/

Step5: Making new fitted trajectories from pacsmd results

  • For some analyses, fitting trajectories can make subsequent calculations smoother.
  • For fitting, we prepare pacs fit mdtraj command. But you can use existing software such as gmx trjconv or cpptraj as usual.
$ pacs fit trial mdtraj -t 1 -tf prd_rmmol.xtc -top rmmol_top.pdb -r ref.gro -ts protein -rs protein

Step6: Extracting collective-variables from fitted trajectories

  • After making fitted trajectories, you need to extract CV to build MSM, where CV denotes collective variables such as "distance" and "inter COM vector" and "PCA" and etc.
  • For extracting CVs, we got pacs genfeature command. But this command provides you only frequently used CV such as com-distance, com-vector and rmsd.
  • So if you want to use other specific CVs, you need to write a code by yourself.
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ ls
comdist-CV/

Step7: Building MSM and predicting free energy

  • After extracting CVs, various analyses can be performed on them.
  • PaCS-MD is especially compatible with analyses using MSM.

Install

Content

Requirements

  • Python >= 3.7 (but python >= 3.8 is recommended because of deeptime)
  • PaCS-Toolkit currently supports 3 simulator

1. Install by pip

1.1 Install by conda and pip

conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
  • if using whole pacstk function
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif using "pacs mdrun" and analyzer == "mdtraj"
pip install "pacs[mdtraj] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif using "pacs mdrun" and analyzer == "gromacs"
pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif performing MSM
    • python >= 3.8 is recommended because of deeptime
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"

1.2. Install by pip

  • if using whole pacstk function
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif using "pacs mdrun" and analyzer == "mdtraj"
pip install "pacs[mdtraj] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif using "pacs mdrun" and analyzer == "gromacs"
pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
  • elif performing MSM
    • python >= 3.8 is recommended because of deeptime
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"

2. Get source code from github

  • git clone
git clone https://github.com/Kitaolab/PaCS-Toolkit.git
cd pacsmd
  • get latest release
verion="x.x.x"
wget https://github.com/Kitaolab/PaCS-Toolkit/archive/refs/tags/${version}.tar.gz
tar -xvzf ${version}.tar.gz
cd pacsmd-${version}

2.1. Install by conda and pip locally

conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
  • if using whole pacstk function
    • python >= 3.8 is recommended because of deeptime
pip install -e ".[all]"
  • elif using "pacs mdrun" and analyzer == "mdtraj"
pip install -e ".[mdtraj]"
  • elif using "pacs mdrun" and analyzer == "gromacs"
pip install -e "."
  • elif using "pacs mdrun" and analyzer == "cpptraj"
pip install -e "."
  • elif performing MSM
pip install -e ".[msm]"

2.2. Install by pip locally

  • if using whole pacstk function
    • python >= 3.8 is recommended because of deeptime
pip install -e ".[all]"
  • elif using "pacs mdrun" and analyzer == "mdtraj"
pip install -e ".[mdtraj]"
  • elif using "pacs mdrun" and analyzer == "gromacs"
pip install -e "."
  • elif using "pacs mdrun" and analyzer == "cpptraj"
pip install -e "."
  • elif performing MSM
    • python >= 3.8 is recommended because of deeptime
pip install -e ".[msm]"

mdrun

  • This command is used to execute PaCS-MD.
  • Unlike other commands in pacskit, many variables are set within pacsmd's input file.
  • It is assumed that PaCS-MD will first be executed by this command and then analyzed using other commands.
  • See Input file for details on input files.

Example

pacs mdrun -t 1 -f input.toml

Arguments

usage: pacs mdrun [-h] [-t] [-f]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -f, --file (str):
    • input file path for PaCS-MD (e.g. -f input.toml)

Overview

generated files

  • mdrun reads the file specified by -f and performs MD calculations according to its contents.
  • After the command is executed, the following directory structure is created.
Directory structure of pacsmd:
  • tiral001 /
    • cycle000 /
      • replica001 /
        • trajectory file
      • summary /
        • cv.log
        • cv_ranked.log
        • progress.log
    • cycle001 /
      • replica001 /
      • replica002 / ...
      • replica n_replica /
      • summary /
    • cycle002 / ...
    • cycle max_cycle /
  • The directory structure is created as shown above, and the following files are created.
    • trajectory file
      • The trajectory file is a file that contains the coordinates of the atoms in the system at each time step.
      • file name without extension is prd and the extension is determined by the input file.
    • cv.log
      • The cv.log file contains the calculated collective variable (CV) values for each frame in the trajectory file.
      • first column: replica number (1-indexed)
      • second column: frame number (1-indexed)
      • third column: CV value
    • cv_ranked.log
      • The cv_ranked.log file contains the sorted CV values.
    • progress.log
      • The progress.log file contains the progress of the simulation in each cycle.

Overall diagram

  • PaCS-MD runs multiple short simulations, from which it selects a snapshot that meets its objectives and serves as the initial structure for the next cycle. Therefore, the diagram is as follows.
sequenceDiagram

participant cycle as cycle
participant simulator as simulator
participant analyzer as analyzer
loop 0, max_cycle
    Note over cycle: prepare initial structures
    par run short MD simulations
    cycle ->>+ simulator: run short MD simulations
    simulator ->>- cycle: trajectory files
    end
    par calculate CV 
    cycle ->>+ analyzer: calculate CV
    Note over analyzer: calculate CV based on the input file.
    Note over analyzer: sort the snapshots based on the CV.
    analyzer ->>- cycle: cv.log, cv_ranked.log
    end
    break the top CV reaches threshold
    analyzer -->> cycle: finish PaCS-MD
    end
    Note over cycle: export top N snapshots to next cycle.
end
  • simulator

    • The simulator performs one short MD on the starting structure in cycle 0.
    • In the other cycles, it executes n_replica short MDs, starting from the top-ranking structure of each cycle.
    • Currently, only GROMACS, AMBER and NAMD are supported.
    • See Simulator for more information
  • analyzer

    • The analyzer calculates the collective variable (CV) based on the type specified in the input file.
    • The analyzer sorts the structures in each trajectory based on the CV.
    • User-defined types are also available.
    • See Analyzer for more information

Basic

There are common settings for any PaCS-MD run. These include the maximum number of cycles to run, how many MD simulations to run in each cycle, and so on. These settings are described in the input file under the #basic keyword.

Following are the details of each keyword to be set. For simplicity, we chose "replica" as the MD simulation to be executed in each cycle.

keywords

  • max_cycle: int, default=1

    • Maximum number of cycles to run. (e.g. 100)
    • Minimum value is 1.
    • Maximum value is 999.
  • n_replica: int, default=1

    • Number of replicas to be excuted in one cycle. (e.g. 50)
    • Minimum value is 1.
    • Maximum value is 999.
  • n_parallel: int, default=1

    • Number of replicas to be simulated and analyzed in parallel at a time. (e.g. 10)
    • It is recommended that n_parallel set to be a divisor of n_replica.
  • centering: bool, default=True

    • This flag controls whether the initial structure of the next cycle is moved to the center of the box at the end of each cycle.
    • if centering is true, the initial structure of the next cycle is moved to the center of the box.
    • if centering is false, the initial structure of the next cycle will not be moved to the center of the box.
    • For longer MD, the proteins may move away from the center of the box and the features may behave non-intuitively. Therefore, it is recommended to set centering to true.
  • centering_selection: str, default="protein" or "Protein" or "@CA,C,O,N,H"

    • If centering is true, the group specified by this keyword moves to the center of the box.
    • If centering is false, there is no need to specify this keyword.
    • default value will be changed to match analyzer.
      • if analyzer == "mdtraj", default="protein"
      • if analyzer == "gromacs", default="Protein"
      • if analyzer == "cpptraj", default="@CA,C,O,N,H"
  • working_dir: str, default="./."

    • Directory where pacsmd will run.
  • rmmol: bool, default=false

    • This flag controls whether the unnecessary molecules will be removed from the trajectory after each cycle.
    • if rmmol is true, atoms not specified in the keep_selection are removed from the trajectory file.
    • The operation of removing atoms is irreversible.
  • keep_selection: str, (required if rmmol=true)

    • if rmmol is true, atoms specified in the keep_selection are kept in the trajectory file. (e.g. "protein")
  • rmfile: bool, default=false

    • This flag controls whether the unnecessary files are removed from the working directory after trial.
    • Backup files and other files are deleted.

Example

An example of a basic option in an input file is shown below. This can also be found on this page.

max_cycle = 20                     
n_replica = 8                    
n_parallel = 4                    
centering = true                 
centering_selection = "protein"   
working_dir = "/work/"            
rmmol = true                      
keep_selection = "not water"      
rmfile = true                    

Simulator

  • In PaCS-MD, simulations corresponding to the simulator are executed. Gromacs, Amber, and NAMD are supported.
  • In each cycle, n_replica simulations are executed. n_parallel simulations are will run in parallel. If n_replica is not a multiple of n_parallel, or in cycle 0, the remainder of the simulations will run in series.
  • If you want to run multiple replica simulations in parallel using MPI, set cmd_mpi. If cmd_mpi is not set, parallel execution will be performed using the multiprocessing module of python.

Content

When are cmd_serial & cmd_parallel used ?

  • See the table below for cmd_serial and cmd_parallel in the input file. cmd_serial is used if cycle0 or n_parallel=1. If n_replica is not divisible by n_parallel, then cmd_serial is used for the remaining replicas. Otherwise, cmd_parallel is used.
GPUMPIn_parallelcommand
xx1cmd_serial
xxncmd_serial
xo1cmd_serial
xoncmd_parallel
ox1cmd_serial
oxncmd_serial
oo1cmd_seiral
ooncmd_parallel

Gromacs

To run the simulation using gromacs, write in the inputfile as in this example. The details of each keyword are as follows.

keywords

  • simulator: str, required
    • Software used inside PaCS-MD. e.g. "gromacs"
  • cmd_mpi: str, default=""
    • Command for MPI parallelizataion. e.g. "mpirun -np 4"
  • cmd_serial: str, required
    • Command to run simulation in serial. e.g. "gmx_mpi mdrun -ntomp 6"
  • cmd_parallel: str, default=cmd_serial
    • Command to run simulation in parallel. e.g. "gmx_mpi mdrun -ntomp 6"
  • structure: str, required
    • Structure file path for MD simulation. e.g. "./input.gro"
    • This is also used as the initial structure of PaCS-MD
  • topology: str, required
    • Topology file path for MD simulation. e.g. "./topol.top"
  • mdconf: str, required
    • Parameter file path for MD simulation. e.g. "./parameter.mdp"
  • index_file: str, required
    • Gromacs index file path. e.g. "./index.ndx"
  • trajectory_extension: str, required
    • Trajectory file extension. (The "." is necessary.) e.g. ".trr"

Amber

To run the simulation using amber, write in the inputfile as in this example. The details of each keyword are as follows.

keywords

  • simulator: str, required
    • Software used inside PaCS-MD. e.g. "amber"
  • cmd_mpi: str, default=""
    • Command for MPI such as mpirun. e.g. "mpirun -np 4"
  • cmd_serial: str, required
    • Command to run simulation in serial. e.g. "pmemd.cuda"
  • cmd_parallel: str, default=cmd_serial
    • Command to run simulation in parallel. e.g. "pmemd.cuda"
  • structure: str, required
    • Structure file path for MD simulation. e.g. "./input.rst7"
    • This is also used as the initial structure of PaCS-MD.
  • topology: str, required
    • Topology file path for MD simulation. e.g. "./topology.parm7"
  • mdconf: str, required
    • Parameter file path for MD simulation. e.g. "./parameter.mdin"
  • trajectory_extension: str, required
    • Trajectory file extension. (The "." is necessary.) e.g. ".nc"

NAMD

To run the simulation using NAMD, write in the inputfile as in this example. The details of each keyword are as follows.

keywords

  • simulator: str, required
    • Software used inside PaCS-MD. e.g. "namd"
  • cmd_mpi: str, default=""
    • Command for MPI parallelizataion. e.g. "mpirun -np 4"
  • cmd_serial: str, required
    • Command to run simulation in serial. e.g. "namd2 +p4"
  • cmd_parallel: str, default=cmd_serial
    • Command to run simulation in parallel. e.g. "namd2 +p4"
  • structure: str, required
    • Structure file path for MD simulation. e.g. "./input.pdb"
    • This is also used as the initial structure of PaCS-MD.
  • topology: str, required
    • Topology file path for MD simulation. e.g. "./topology.psf"
  • mdconf: str, required
    • Parameter file path for MD simulation. e.g. "./parameter.conf"
  • trajectory_extension: str, required
    • Trajectory file extension. (The "." is necessary.) e.g. ".dcd"

Analyzer

  • In PaCS-MD, the analyzer is a component that evaluates the snapshot of each trajectory in each cycle and ranks them based on the evaluation type specified in the input file.
  • Some of the top ranked frames become the initial structure of each replica in the next cycle.
  • You can use the following evaluation functions by setting the #analyzer keyword in your input file. Depending on the phenomenon of interest, the appropriate type must be selected.
  • If you do not find a type that meets your purpose, you can implement it yourself by setting type = template.
  • The following are some commonly used evaluation types.

Content

Dissociation

  • Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are far apart.
  • Usage Example
    • Ligand dissociating from a protein
    • Separate the distance between the two domains of a protein
  • For more information, see here.
  • See here for an example input file.
Papers
[1] Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics, https://doi.org/10.1021/acs.jctc.7b00504
[2] Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model, https://doi.org/10.1021/acs.jpcb.8b10309
[3] Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model, https://doi.org/10.2142/biophysico.bppb-v18.037
[4] High pressure inhibits signaling protein binding to the flagellar motor and bacterial chemotaxis through enhanced hydration, https://doi.org/10.1038/s41598-020-59172-3
[5] Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical Roles of Key Residues Elucidated by dPaCS-MD/MSM, https://doi.org/10.1021/acs.jcim.1c01508

Association

  • Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are close together.
  • Usage Example
    • Bringing a ligand closer to a protein
    • Bringing two domains of a protein closer together
  • For more information, see here.
  • See here for an example input file.
Papers

Target

  • Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes smaller.
  • Usage Example
    • Finding transitions between two known structures.
  • Evaluate to follow the reference structure using RMSD.
  • For more information, see here.
  • See here for an example input file.
Papers
[1] Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway, https://doi.org/10.1063/1.4813023

RMSD

  • Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes larger.
  • Usage Example
    • Sampling a wide range of structures
    • Sampling a wide range of ligand binding modes
  • For more information, see here.
  • See here for an example input file.
Papers
[1] Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer, https://doi.org/10.1038/s41598-022-07792-2

EdgeExpansion

  • Evaluate snapshots of each trajectory so that the frame forms the convex hull of the 4-dimentional principal component space(PCs).
  • Usage Example
    • Sampling a wide range of phase space without knowing the reference structure
  • For more information, see here.
  • See here for an example input file.
Papers
[1] Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins, https://doi.org/10.1063/5.0004654

A_D

  • Evaluate snapshots of each trajectory so that the distance between the centers of mass of the two selected groups fluctuates significantly within a certain range.
  • Usage Example
    • Observing large movements of proteins, such as opening and closing.
    • Observing the binding and unbinding of ligands.
  • For more information, see here.
  • See here for an example input file.
Papers
[1] Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding, https://doi.org/10.1021/acs.jctc.9b01203

Template

  • Evaluate snapshots of each trajectory using a user-defined evaluation function.
  • Additional variables can be specified in input.toml
  • Usage Example
    • If you want to use a different evaluation function than the ones provided.
  • For more information, click here

RMSD

  • Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes larger.
  • The structure of the groups specified by selection1 and selection3 are superimposed by the least squares method and the RMSD is calculated with the structure of the groups specified by selection2 and selection4.
  • Usage Example
    • Sampling a wide range of structures
    • Sampling a wide range of ligand binding modes
  • See here for an example input file.

keywords

  • type: str, required
    • evaluation type, "rmsd"
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value exceeds this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • reference: str, required
    • Structure file path for the reference structure of RMSD calculation.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (RMSD calculation)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indice are automatically used for your trajectories and the reference.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (RMSD calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.

Papers

[1] Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer, https://doi.org/10.1038/s41598-022-07792-2

RMSD

  • Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes smaller.
  • The structure of the groups specified by selection1 and selection3 are superimposed by the least squares method and the RMSD is calculated with the structure of the groups specified by selection2 and selection4.
  • Usage Example
    • Finding transitions between two known structures.
  • See here for an example input file.

keywords

  • type: str, required
    • evaluation type, "target".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • reference: str, required
    • Structure file path for the reference structure of RMSD calculation.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (RMSD calculation)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indice are automatically used for your trajectories and the reference.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (RMSD calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.

Papers

[1] Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway, https://doi.org/10.1063/1.4813023

Dissociation

  • Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are far apart.

  • The two selected groups are selection1 and selection2 in the input file.

  • Usage Example

    • Ligand dissociating from a protein
    • Separate the distance between the two domains of a protein
  • See here for an example input file.

keywords

  • type: str, required
    • evaluation type, "dissociation".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value exceeds this threshold.
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)

Papers

[1] Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics, https://doi.org/10.1021/acs.jctc.7b00504
[2] Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model, https://doi.org/10.1021/acs.jpcb.8b10309
[3] Binding free energy of protein/ligand complexes calculated using dissociat
ion Parallel Cascade Selection Molecular Dynamics and Markov state model, http
s://doi.org/10.2142/biophysico.bppb-v18.037
[4] High pressure inhibits signaling protein binding to the flagellar motor an
d bacterial chemotaxis through enhanced hydration, https://doi.org/10.1038/s41
598-020-59172-3
[5] Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical
Roles of Key Residues Elucidated by dPaCS-MD/MSM, https://doi.org/10.1021/acs.
jcim.1c01508

Association

  • Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are close together.

  • The two selected groups are selection1 and selection2 in the input file.

  • Usage Example

    • Bringing a ligand closer to a protein
    • Bringing two domains of a protein closer together
  • See here for an example input file.

keywords

  • type: str, required
    • evaluation type, "association".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)

Papers

A_D

  • Evaluate snapshots of each trajectory so that the distance between the centers of mass of the two selected groups fluctuates significantly within a certain range.
  • Usage Example
    • Observing large movements of proteins, such as opening and closing.
    • Observing the binding and unbinding of ligands.
  • See here for an example input file.

keywords

  • type: str, required
    • evaluation type, "a_d".
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • d_threshold: float, required
    • Threshold for switching from dissociation to association mode. If the maximum value of cv calculated while in dissociation mode exceeds this value, it switches to association mode.
  • frame_sel: int, required
    • One of the conditions used to switch from association to dissociation.
    • If the highest ranking frame in association mode is less than this value, it will be assumed that there has been no change from the previous cycle (i.e., it has come as close as possible). Hence, we consider the molecule to be within bound.
  • bound_threshold: int, required
    • One of the conditions used to switch from association to dissociation.
    • When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0.

Papers

[1] Kinetic Selection and Relaxation of the Intrinsically Disordered Region of
 a Protein upon Binding, https://doi.org/10.1021/acs.jctc.9b01203

EdgeExpansion

  • Evaluate snapshots of each trajectory so that the frame forms the convex hull of the 4-dimentional principal component space(PCs).
  • Snapshots of each cycle are projected onto a 4-dimensional principal component spaces (PCs). Its convex hull is computed and the initial structure of the next cycle is selected from the polygon's vertices.
  • Usage Example
    • Sampling a wide range of phase space without knowing the reference structure
  • See here for an example input file.

keywords

  • type: str, required
    • Evaluation type, "ee".
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • only "mdtraj" is supported.
  • reference: str, required
    • Trajectory file used to compute the covariance matrix in Principal Component Analysis (PCA).
    • If this value is not set, the covariance matrix is computed based on the trajectory of cycle 0.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection.
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (PCA calculation)
    • No correction by atomic weights or number of atoms is made when calculating the covariance matrix or projecting.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection.
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories)
    • Otherwise, you don't need to specify this option.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (PCA calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories)
    • Otherwise, you don't need to specify this option.

Papers

[1] Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins, https://doi.org/10.1063/5.0004654

Template

  • Template type is a type that can be defined by the user yourself. It is possible to include user-specific variables in input.toml, and these variables can be used in template type in which they are defined.
  • This feature embodies the flexibility of PaCS-MD. This is also a feature that recognizes that the development team alone is limited in its ability to support such flexibility.
  • This is for advanced users and requires familiarity with PaCS-MD and the simulator used internally (Gromacs, Amber, Namd). If you are not familiar with it, please practice with other types.

Content

Step1: Get original source code

Step2: Code template type

  • You will see template.py is in pacsmd/md/analyzer/template.py
  • Here is template type
class Template(SuperAnalyzer):
    def calculate_cv(self, settings: MDsettings, cycle: int, replica: int, send_rev) -> None:
        """
        TODO
        1. Read trajectory based on self.cycle
        2. Calculate the value that suited the evaluation type.
        3. send the values by send_rev.send(ret_arr)
        """
        pass

    def ranking(self, settings: MDsettings, CVs: List[Snapshot]) -> List[Snapshot]:
        """
        CVs: List[Snapshot]
            The CVs is a list of Snapshot objects
            that contain the CV for each frame in the trajectory.
        Snapshot
        TODO:
            Arrange them in ascending, descending, etc.
            order to match the PaCSMD evaluation type.
        Example:
        sorted_cv = sorted(CVs, key=lambda x: x.cv)
        return sorted_cv
        """
        pass

    def is_threshold(self, settings: MDsettings, CVs: List[Snapshot] = None) -> bool:
        if CVs is None:
            CVs = self.CVs
        return CVs[0].cv < settings.threshold
  • You need to code 3 types mainly (calculate_cv, ranking and is_threshold)

calculate_cv

  • calcurate_cv is the core of each evaluation type. Here collective variables(CVs) (distance, RMSD, energy, etc.) are calculated from the MD simulation results in each replica and returned in a list. The evaluation type should be well defined so that the frames are chosen in the direction you wish to sample.
  • We recommend you to refer to other evaluation types (e.g. dissociation.py, rmsd.py, etc.).

ranking

  • In ranking, the list of CVs calculated by calcurate_cv is sorted in descending or ascending order.
  • If you want to separate two residues, sort in descending order as the distance increases. Conversely, if you want to bring them closer together, sort in ascending order so that the distance becomes smaller.

is_threshold

  • is_threshold determines if the top of the list of CVs (the frame with the most desired CV of all frames) is above or below the threshold set by the user. The inequality should be changed according to the situation.
  • If you want to simulate a larger distance, the process would be such that if it goes above a certain distance, it will terminate.

Step3: Prepare input.toml

  • In input.toml, you can define variables you want to use.
## analyzer
type = "template"
threshold = "???"

### user defined variable
user-defined-variable1 = 123
user-defined-variable2 = "hoge"
...
  • If you set your variables like the above example, you can use the variables in template.py like the following example
class Template(SuperAnalyzer):
    def calculate_cv(self, settings: MDsettings, cycle: int) -> List[float]:
        ...
            if traj.distance > setttings.user-defined-variable1:
                break
        ...

        return list_float

Step4: Run PaCS-MD

  • Now you can run your original PaCS-MD
pacs mdrun -t 1 -f input.toml
  • Please make a pull_request in Github for new possible template scripts.

Input file

  • PaCS-MD requires an input file. Here are the variables in the input file
  • input file must be in toml format.

Contents

sample input file

basic option

click here
  • max_cycle: int, default=1
    • Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
  • n_replica: int, default=1
    • Number of replica. (ex. 1, ..., 123, ..., 999)
  • n_parallel: int, default=1
    • Number of replica calculated at a time
  • centering: bool, default=True
    • Whether to bring the molecule selected by centering_selection to the center of the simulation cell when creating the initial structure for the next cycle
  • centering_selection: str, default="protein" or "Protein" or "@CA,C,O,N,H"
    • Molecules to be moved to the center
    • default value will be changed to match analyzer
      • if analyzer == "mdtraj", default="protein"
      • if analyzer == "gromacs", default="Protein"
      • if analyzer == "cpptraj", default="@CA,C,O,N,H"
  • working_dir: str, default="./."
    • Directory where pacsmd will run
  • rmmol: bool, default=false
    • Whether rmmol is executed after each cycle
  • keep_selection: str, (required if rmmol=true)
    • Molecular name or index group to be left in the trajectory when rmmol
  • rmfile: bool, default=false
    • Whether rmfile is executed after trial
max_cycle = 2                     # Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
n_replica = 3                     # Number of replica. (ex. 1, ..., 123, ..., 999)
n_parallel = 3                    # Number of replica calculated at a time
centering = true                  # Whether to move the molecule to the center
centering_selection = "protein"   # Name of molecule to move in the center
working_dir = "/work/"            # Directory where pacsmd will run
rmmol = true                      # Whether rmmol is executed after each cycle
keep_selection = "not water"      # Molecular name or index group to be kept in the trajectory when rmmol
rmfile = true                     # Whether rmfile is executed after trial

simulator option

analyzer \ simulatorgromacsambernamd
mdtrajooo
gromacsoxx
cpptrajxox

Gromacs

click here
  • simulator: str, required
    • Software used inside PaCS-MD
  • cmd_mpi: str, default=""
    • Commands for MPI such as mpirun, blank is OK
  • cmd_serial: str, required
    • Commands to run the simulator serially
  • cmd_parallel: str, default=cmd_serial
    • Commands to run the simulator parallelly
  • structure: str, required
    • Structural file such as gro, pdb, rst7, etc.
  • topology: str, required
    • Topology file such as top, parm7, psf, etc.
  • mdconf: str, required
    • Parameter file such as mdp, mdin, namd, etc.
  • index_file: str, (required if Gromacs)
    • Gromacs index file
  • trajectory_extension: str, required
    • Trajectory file extension. ("." is necessary)
  • nojump: bool, default=false
    • whether to execute -pbc nojump treatment for the selection feature calculation in analayzer, snapshot extraction in exporter and performing rmmol
    • valid only when analyzer is also gromacs
    • If true, molecules are allowed to get out of the simulation box in order to avoid the error in MSM due to the jumping of break of the molecule over pbc box.
    • If false, molecules are just made whole by -pbc mol and can warp across the pbc box.
    • Be noted that the output prd.xtc files are not processed with these -pbc options. (only prd_rmmol.xtc files are processed)
    • This option is recommended to use when a/dissociation and a_d pacsmd is performed using gromacs as simulator and analyzer
    • nojump=true can lead too large coordinate value to cause overflow or loss-of-significane problem. It will not happpen in most cases, but be carefull if your ligand is very small and simulation box is very large.
    • When this options is applied, analyzer can consider the distance even if ligand exceeds simulation box
    • This option is not present in example input in the sample input repository since this option was added in version 1.1.0
simulator = "gromacs"                   # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4"                # Commands for MPI such as mpirun, blank is OK
cmd_serial = "gmx_mpi mdrun -ntomp 6"   # Commands to run the simulator serially
cmd_parllel = "gmx_mpi mdrun -ntomp 6"  # Commands to run the simulator parallelly
structure = "/work/input.gro"           # Structural file such as gro, pdb, rst7, etc.
topology = "/work/topol.top"            # Topology file such as top, parm7, psf, etc.
mdconf = "/work/parameter.mdp"          # Parameter file such as mdp, mdin, namd, etc.
index_file = "/work/index.ndx"          # Gromacs index file
trajectory_extension = ".xtc"           # Trajectory file extension. ("." is necessary)
nojump = true                           # whether to execute nojump treatment only for gmx

Amber

click here
  • simulator: str, required
    • Software used inside PaCS-MD
  • cmd_mpi: str, default=""
    • Commands for MPI such as mpirun, blank is OK
  • cmd_serial: str, required
    • Commands to run the simulator serially
  • cmd_parallel: str, default=cmd_serial
    • Commands to run the simulator parallelly
  • structure: str, required
    • Structural file such as gro, pdb, rst7, etc.
  • topology: str, required
    • Topology file such as top, parm7, psf, etc.
  • mdconf: str, required
    • Parameter file such as mdp, mdin, namd, etc.
  • trajectory_extension: str, required
    • Trajectory file extension. ("." is necessary)
simulator = "amber"                     # Software used inside PaCS-MD
cmd_mpi = ""                            # Commands for MPI such as mpirun, blank is OK
cmd_serial = "pmemd.cuda"               # Commands to run the simulator serially
cmd_parllel = "pmemd.cuda"              # Commands to run the simulator parallelly
structure = "/work/structure.rst7"      # Structural file such as gro, pdb, rst7, etc.
topology = "/work/topology.parm7"       # Topology file such as top, parm7, psf, etc.
mdconf = "/work/parameter.mdin"         # Parameter file such as mdp, mdin, namd, etc.
trajectory_extension = ".nc"            # Trajectory file extension. ("." is necessary)

NAMD

click here
  • simulator: str, required
    • Software used inside PaCS-MD
  • cmd_mpi: str, default=""
    • Commands for MPI such as mpirun, blank is OK
  • cmd_serial: str, required
    • Commands to run the simulator serially
  • cmd_parallel: str, default=cmd_serial
    • Commands to run the simulator parallelly
  • structure: str, required
    • Structural file such as gro, pdb, rst7, etc.
  • topology: str, required
    • Topology file such as top, parm7, psf, etc.
  • mdconf: str, required
    • Parameter file such as mdp, mdin, namd, etc.
  • trajectory_extension: str, required
    • Trajectory file extension. ("." is necessary)
simulator = "namd"                      # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4"                # Commands for MPI such as mpirun, blank is OK
cmd_serial = "namd2 +p6"                # Commands to run the simulator serially
cmd_parllel = "namd2 +p6"               # Commands to run the simulator parallelly
structure = "/work/input.pdb"           # Structural file such as gro, pdb, rst7, etc.
topology = "/work/ionized.psf"          # Topology file such as top, parm7, psf, etc.
mdconf = "/work/production.namd"        # Parameter file such as mdp, mdin, namd, etc.
trajectory_extension = ".dcd"           # Trajectory file extension. ("." is necessary)

analyzer option

analyzer \ typedissociationassociationrmsdtargeteea_dtemplate
mdtrajoooooo-
gromacsooooxo-
cpptrajooooxo-

Target

click here
  • type: str, required
    • evaluation type, "target".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • reference: str, required
    • Structure file path for the reference structure of RMSD calculation.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (RMSD calculation)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (RMSD calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.
type = "target"                 # Evaluation type
threshold = 0.01                # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
reference = "/work/ref.pdb"     # Structure for comparison
selection1 = "backbone"         # Selection string for specified group in trajectroies (least squares fit)
selection2 = "backbone"         # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone and (not resid 1 to 10)"         # Selection string for specified group in reference (least squares fit)
selection4 = "backbone and (not resid 1 to 10)"         # Selection string for specified group in reference (RMSD calculation)

# else if analyzer == "gromacs"
# analyzer = "gromacs"          # Trajectory tool used to calculate the evaluation type
# selection1 = "Backbone"       # Name of index group for specified group in trajectories (least squares fit)
# selection2 = "Backbone"       # Name of index group for specified group in trajectories (RMSD calculation)

# else if analyzer == "cpptraj"
# analyzer = "cpptraj"          # Trajectory tool used to calculate the evaluation type
# selection1 = "@CA,N,O,C"      # Selection string for specified group in trajectroies (least squares fit)
# selection2 = "@CA,N,O,C"      # Selection string for specified group in trajectories (RMSD calculation)

RMSD

click here
  • type: str, required
    • evaluation type, "rmsd"
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value exceeds this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • reference: str, required
    • Structure file path for the reference structure of RMSD calculation.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (RMSD calculation)
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (RMSD calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option.
    • This option is valid when analyzer="mdtraj" or analyzer="cpptraj".
    • If you use gromacs (analyzer="gromacs"), this option is ignored and the same selection indices are automatically used for your trajectories and the reference.
type = "rmsd"                   # Evaluation type
threshold = 2                   # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
reference = "/work/ref.pdb"     # Structure for comparison
selection1 = "backbone"         # Selection string for specified group in trajectories (least squares fit)
selection2 = "backbone"         # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone and (not resid 1 to 10)"         # Selection string for specified group in reference (least squares fit)
selection4 = "backbone and (not resid 1 to 10)"         # Selection string for specified group in reference (RMSD calculation)

# else if analyzer == "gromacs"
# analyzer = "gromacs"          # Trajectory tool used to calculate the evaluation type
# selection1 = "Backbone"       # Name of index group for specified group in trajectories (least squares fit)
# selection2 = "Backbone"       # Name of index group for specified group in trajectories (RMSD calculation)

# else if analyzer == "cpptraj"
# analyzer = "gromacs"          # Trajectory tool used to calculate the evaluation type
# selection1 = "@CA,N,O,C"      # Selection string for specified group in trajectories (least squares fit)
# selection2 = "@CA,N,O,C"      # Selection string for specified group in trajectories (RMSD calculation)

Association

click here
  • type: str, required
    • evaluation type, "association".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
type = "association"            # Evaluation type
threshold = 0.3                 # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5"     # Selection string for specified group in trajectories
selection2 = "resid 6 to 10"    # Selection string for specified group in trajectories

# else if analyzer == "gromacs"
# analyzer = "gromacs"          # Trajectory tool used to calculate the evaluation type
# selection1 = "resid_1_to_5"   # Name of index group for specified group in trajectories
# selection2 = "resid_6_to_10"  # Name of index group for specified group in trajectories

# else if analyzer == "cpptraj"
# analyzer = "cpptraj"          # Trajectory tool used to calculate the evaluation type
# selection1 = ":1-5"           # Name of index group for specified group in trajectories
# selection2 = ":6-10"          # Name of index group for specified group in trajectories

Dissociation

click here
  • type: str, required
    • evaluation type, "dissociation".
  • threshold: float, required
    • PaCS-MD terminates the calculation when the evaluation value exceeds this threshold.
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs supported only)
type = "dissociation"           # Evaluation type
threshold = 10                  # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5"     # Selection string for specified group in trajectories
selection2 = "resid 6 to 10"    # Selection string for specified group in trajectories

# else if analyzer == "gromacs"
# analyzer = "gromacs"      # Trajectory tool used to calculate the evaluation type
# selection1 = "resid_1_to_5"   # Name of index group for specified group in trajectories
# selection2 = "resid_6_to_10"  # Name of index group for specified group in trajectories

# else if analyzer == "cpptraj"
# analyzer = "cpptraj"          # Trajectory tool used to calculate the evaluation type
# selection1 = ":1-5"           # Name of index group for specified group in trajectories
# selection2 = ":6-10"          # Name of index group for specified group in trajectories

EdgeExpansion

click here
  • type: str, required
    • Evaluation type, "ee".
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • only "mdtraj" is supported.
  • reference: str, required
    • Trajectory file used to compute the covariance matrix in Principal Component Analysis (PCA).
    • If this value is not set, the covariance matrix is computed based on the trajectory of cycle 0.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories (least squares fit)
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection.
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories (PCA calculation)
    • No correction by atomic weights or number of atoms is made when calculating the covariance matrix or projecting.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection.
  • selection3: str, default=selection1
    • Selection string or name of index group for specified group in reference (least squares fit)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories)
    • Otherwise, you don't need to specify this option.
  • selection4: str, default=selection2
    • Selection string or name of index group for specified group in reference (PCA calculation)
    • If your reference structure has different topology from your trajectories, you can utilize this option. (e.g. reference has different configuration about mutation, missing residues or modified residues from your the trajectories)
    • Otherwise, you don't need to specify this option.
type = "ee"                     # Evaluation type
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
selection1 = "backbone"         # Selection string for specified group in trajectories (least squares fit)
selection2 = "backbone"         # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone"         # Selection string for specified group in reference (least squares fit)
selection4 = "backbone"         # Selection string for specified group in reference (RMSD calculation)

A_D

click here
  • type: str, required
    • evaluation type, "a_d".
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs.
    • If you set skip_frame=2, PaCS-MD will use every other frame.
  • analyzer: str, default="mdtraj"
    • Trajectory tool used to calculate the evaluation value.
    • "mdtraj", "gromacs" and "cpptraj" are supported.
  • selection1: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs is only supported)
  • selection2: str, required
    • Selection string or name of index group for specified group in trajectories
    • Depending on the analyzer, there are different ways to specify selection.
      • If you use mdtraj (analyzer="mdtraj"), the selection should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA"
      • If you use gromacs (analyzer="gromacs"), the selection should follow Gromacs index group in index.ndx. (Gromacs is only supported)
  • d_threshold: float, required
    • Threshold for switching from dissociation to association mode. If the maximum value of cv calculated while in dissociation mode exceeds this value, it switches to associtaion mode.
  • frame_sel: int, required
    • One of the conditions used to switch from association to dissociation.
    • If the highest ranking frame in association mode is less than this value, it is judged that there has been no change from the previous cycle (i.e., it has come as close as it can). Also, at this time, we consider it to be within bound.
  • bound_threshold: int, required
    • One of the conditions used to switch from association to dissociation.
    • When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0. When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0.
type = "a_d"                    # Evaluation type
skip_frame = 1                  # How many frames to skip when ranking CVs

# if analyzer == "mdtraj"
analyzer = "mdtraj"             # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1"          # Selection string for specified group in trajectories
selection2 = "resid 9"          # Selection string for specified group in trajectories
d_threshold = 3.0               # threshold for dissociation to association
frame_sel = 5                   # number of frames to be used for judging whether the system has converged
bound_threshold = 3             # threshold for association to dissociation

Template

click here
  • type: str, required
    • Evaluation type
  • skip_frame: int, default=1
    • Number of frames to skip when ranking CVs
  • threshold: float, required
    • CV threshold for determining to terminate a trial

Template type is a type that can be defined by the user. It is possible to include user-specific variables in input.toml, and these variables can be used in template type in which they are defined. For more information, click here.

type = "template"
threshold = "???"

# user defined variable
user-defined-variable1 = 123
user-defined-variable2 = "hoge"
...

hidden option (No need to specify)

click here
  • cmd_gmx: str
    • Gromacs command (ex. gmx, gmx_mpi)
    • will be created from cmd_serial
  • top_mdtraj: str
    • topology file for mdtraj
    • will be created from input.toml
  • structure_extension: str
    • Structure file extension
    • will be created from structure

Reference

- PaCS-Toolkit
The paper is under consideration. The information will be updated upon publication.
[1] Ikizawa, S, Hori, T., Wijana, T.N., Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. Submitted.

- Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD)
[2] Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. J. Chem. Phys. 139, 035103 (2013). https://doi.org/10.1063/1.4813023

- Dissociation PaCS-MD (dPaCS-MD)
[3] Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 14, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504
[4] Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. J. Phys. Chem. B , 123, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309
[5] Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. Biophysics and Physicobiology, 18, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037

- Application to protein domain motion
[6] Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. Structure, 27 , 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017

- Association and dissociation PaCS-MD (a/dPaCS-MD)
[7] Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. J. Chem. Theory Comput. 16, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203

- Edge expansion PaCS-MD (eePaCS-MD)
[8] Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. J. Chem. Phys. 152, 225101 (2020). https://doi.org/10.1063/5.0004654
[9] Takaba, K., Tran, D. P., & Kitao, A.  Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. . J. Chem. Phys. 153, 179902 (2020). https://doi.org/10.1063/5.0032465

- rmsdPaCS-MD
[10] Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. Sci Rep 12, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2

- Recent applications
[12] Hata, H., Nishihara, Y., Nishiyama, M., Sowa, Y., Kawagishi, I., & Kitao, A.  High pressure inhibits signaling protein binding to the flagellar motor and bacterial chemotaxis through enhanced hydration. Sci Rep 10, 2351 (2020). https://doi.org/10.1038/s41598-020-59172-3
[12] Sobeh, M. M., & Kitao, A. Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical Roles of Key Residues Elucidated by dPaCS-MD/MSM. J. Chem. Inf. Model. 62, 1294–1307 (2022). https://doi.org/10.1021/acs.jcim.1c01508
[13] Wijaya, T.N., & Kitao, A. Energetic and Kinetic Origins of CALB Interfacial Activation Revealed by PaCS-MD/MSM. J. Phys. Chem. B 2023, 127, 34, 7431–7441 (2023). https://doi.org/10.1021/acs.jpcb.3c02041

rmmol

  • This command is used after executing pacs mdrun.
  • This command removes unnecessary atoms from the trajectory created by running PaCS-MD.
  • This command reduces the file size and the burden on the computer capacity.

Caution

  • This command is not reversible.
  • This command operates on all trajectories of one trial at a time.
  • If you want to apply it to more than one trial, you could easily achieve it by writing an iteration shell script.

Example

  • The following example removes water molecules from the trajectory.

mdtraj

pacs rmmol mdtraj -t 1 -k "not water" -e .xtc -m trial001/cycle000/replica001/prd.gro

gromacs

pacs rmmol gmx -t 1 -k "not_water" -e .xtc -n index.ndx -g gmx --nojump

cpptraj

pacs rmmol cpptraj -t 1 -k "not water" -e .nc -p trial001/cycle000/replica001/prd.parm7

Arguments

mdtraj

usage: pacs rmmol mdtraj [-h] [-t] [-k] [-e] [-m]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -k, --keep_selection (str):
    • atom selection to be retained in trajectory (e.g. -k "not water")
  • -e, --trajectory_extension (str):
    • trajectory extension (e.g. -e .xtc)
  • -m, --top_mdtraj (str):
    • topology file path for mdtraj analysis (e.g. -m trial001/cycle000/replica001/prd.pdb)

gromacs

usage: pacs rmmol gmx [-h] [-t] [-k] [-n] [-g] [-e]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -k, --keep_selection (str):
    • index group to be retained in trajectory (e.g. -k "not_water")
  • -e, --trajectory_extension (str):
    • gromacs trajectory extension (e.g. -e .xtc)
  • -n, --index_file (str):
    • index file for gromacs (e.g. -n index.ndx)
  • -g, --cmd_gmx (str):
    • gromacs command prefix (e.g. -g gmx)
  • --nojump (bool):
    • whether to execute -pbc nojump treatment for the trajectory (e.g. --nojump)
    • If your PaCS-MD trajectory was generated with nojump=true, it is strongly strongly recommended to use this options as well
    • Also refer to the mdrun/inputfile page for more details.

cpptraj

usage: pacs rmmol cpptraj [-h] [-t] [-k] [-e] [-p]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -k, --keep_selection (str):
    • atom selection to be retained in trajectory (e.g. -k :!WAT)
  • -e, --trajectory_extension (str):
    • trajectory extension (e.g. -e .nc)
  • -p, --topology (str):
    • topology file path for cpptraj analysis (e.g. -p trial001/cycle000/replica001/prd.parm7)

rmfile

  • This command is used after executing pacs mdrun.
  • This command deletes backup files and other unnecessary files.
  • This command reduces the number of files and the burden on the computer capacity.

Caution

  • This command is not reversible.

Example

  • The following example removes unnecessary files from the directory.
pacs rmfile -t 1 -s "gromacs"

Arguments

usage: pacs rmfile [-h] [-t] [-s]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -s, --simulator (str):
    • simulator name (e.g. -s "gromacs")
    • Only "gromacs", "amber", and "namd" are accepted.

fit

  • This command is used after executing pacs mdrun
  • This command makes new trajectories that fit to the reference structure.
  • The newly created trajectories are placed in the same hierarchy as the original trajectories.

Caution

  • Fitting the molecules by using the script could take a long time (around 1 hour for 1 trial)

Example

  • In the following example, trajectories for all cycles and all replicas of trial1 are fitted with the protein, and then the output files are saved.

for single trajectory

pacs fit traj mdtraj -tf ./trial001/cycle001/replica001/prd.xtc -top ./inputs/input.gro -r ./inputs/input.gro -ts "backbone" -rs "backbone"

for single trial

pacs fit trial mdtraj -t 1 -top ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10

Arguments

for single trajectory

usage: pacs fit traj mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
  • -tf, --trj_file (str):
    • file name of the trajectory to be fitted (e.g. -tf prd.xtc)
  • -top, --topology (str):
    • topology file path for loading trajectory (e.g. -top trial001/cycle000/replica001/prd.pdb)
  • -r, --ref_structure (str):
    • reference structure file path for fitting reference (e.g. -r trial001/cycle000/replica001/prd.pdb)
  • -ts, --trj_selection (str):
    • atom selection for fitting trajectory (e.g. -ts "protein")
  • -rs, --ref_selection (str):
    • atom selection for fitting reference (e.g. -rs "protein")
  • -p, --parallel (int):
    • number of parallel processes (e.g. -p 10)
    • default: 1
  • -o, --out (str):
    • output file name (e.g. -o prd_fit.xtc)
    • default: {trj_file}_fit.{ext}

for single trial

usage: pacs fit trial mdtraj [-h] [-t] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
  • -t, --trial (int):
    • trial number (e.g. -t 1)
  • -tf, --trj_file (str):
    • file name of the trajectory to be fitted (e.g. -tf prd.xtc)
  • -top, --topology (str):
    • topology file path for loading trajectory (e.g. -top trial001/cycle000/replica001/prd.pdb)
  • -r, --ref_structure (str):
    • reference structure file path for fitting reference (e.g. -r trial001/cycle000/replica001/prd.pdb)
  • -ts, --trj_selection (str):
    • atom selection for fitting trajectory (e.g. -ts "protein")
  • -rs, --ref_selection (str):
    • atom selection for fitting reference (e.g. -rs "protein")
  • -p, --parallel (int):
    • number of parallel processes (e.g. -p 10)
    • default: 1
  • -o, --out (str):
    • output file name (e.g. -o prd_fit.xtc)
    • default: {trj_file}_fit.{ext}

genrepresent

  • This command is used after executing PaCS-MD.
  • This command traces each cycle in reverse order, starting with the frame with the best CV in the last cycle, and generates one continuous trajectory file.
  • The generated file is named repr_complete and located in the genrepresent directory.
  • File extension is determined by the user.

Caution

  • This command operates on all of the trajectories in one trial at a time.
  • If you want to apply this for the multiplie trials, you could simply write an iteration shell script to achieve it.

Example

  • The following example generates a trajectory file.

mdtraj

pacs genrepresent mdtraj -t 1 -trj prd.xtc -top ./inputs/input.gro

gromacs

pacs genrepresent gmx -t 1 -trj prd.xtc -top ./inputs/input.gro -g gmx_mpi

cpptraj

pacs genrepresent cpptraj -t 1 -trj prd.nc -top inputs/input.prmtop 

Arguments

mdtraj

usage: pacs genrepresent mdtraj [-h] [-t] [-trj] [-top]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -trj, --trajectory (str):
    • trajectory file name (e.g. -trj prd.xtc)
  • -top, --topology (str):
    • topology file path (e.g. -m trial001/cycle000/replica001/prd.pdb)

gromacs

usage: pacs genrepresent gmx [-h] [-t] [-trj] [-top] [-g]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -trj, --trajectory (str):
    • trajectory file name (e.g. -trj prd.xtc)
  • -top, --topology (str):
    • topology file path (e.g. -m trial001/cycle000/replica001/input.gro)
  • -g, --cmd_gmx (str):
    • gromacs command prefix (e.g. -g gmx_mpi)

cpptraj

usage: pacs genrepresent cpptraj [-h] [-t] [-trj] [-top]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -trj, --trajectory (str):
    • trajectory file name (e.g. -trj prd.nc)
  • -top, --topology (str):
    • topology file path (e.g. -m trial001/cycle000/replica001/input.parm7)

gencom

  • This command is employed after executing PaCS-MD.
  • The COMs of a selected atom group from a trajectory (generated by "gencom traj") or multiple trajectories in one trial of PaCS-MD (generated by "gencom trial") can be computed and outputted in pdb format.
  • The COMs are stored as a VIRTUAL SITE in the pdb file.
  • This command visualizes the migration paths of atoms and supports intuitive interpretation.

gencom traj

  • This command is used after executing PaCS-MD.
  • The COMs of a selected atom group from a trajectory file can be calculated and outputted in pdb format.
  • This command visualizes the migration paths of atoms and supports intuitive interpretation.

Example

  • The following example generates a pathway file pathway.pdb that contains the COMs of ligand from the prd.xtc file.
pacs gencom traj mdtraj -trj ./trial001/cycle001/replica001/prd.xtc -top ./inputs/input.gro -s "resname ligand"

Arguments

usage: pacs gencom traj mdtraj [-h] [-trj] [-top] [-s] [-r] [-o]
  • -trj, --trajectory (str):
    • trajectory file path (e.g. -trj best_complete_fit.xtc)
  • -top, --top_mdtraj (str):
    • topology file path for mdtraj (e.g. -s ./trial001/cycle000/replica001/rmmol_top.pdb)
  • -s, --selection (str):
    • atom selection for visualization (e.g. -s "resname ligand")
  • -r, --resid (int):
    • Index of the visualized center of mass on the output pdb (e.g. -r 1)
    • default: 1
  • -o, --output (str):
    • output file path (e.g. -o pathway.pdb)
    • default: pathway.pdb

gencom trial

  • This command is used after executing PaCS-MD.
  • The COMs of a selected atom group from all of the trajectories within one trial of PaCS-MD can be calculated and outputted in pdb format.
  • This command visualizes the migration paths of atoms and supports intuitive interpretation.

Example

  • The following example generates a pathway file pathway.pdb that saves the COMs of ligand from prd_fit.xtc file in the trial 1.
pacsmd gencom trial mdtraj -t 1 -trj prd_fit.xtc -top ./inputs/input.gro -s "resname ligand"

Arguments

usage: pacs gencom trial mdtraj [-h] [-t] [-trj] [-top] [-s] [-r] [-o] [-sf]
  • -t, --trial (int):
    • trial number without 0-fill when pacsmd was conducted (e.g. -t 1)
  • -trj, --trajectory (str):
    • trajectory file name (e.g. -trj prd_rmmol_fit.xtc)
  • -top, --top_mdtraj (str):
    • topology file path for mdtraj (e.g. -s ./trial001/cycle000/replica001/rmmol_top.pdb)
  • -s, --selection (str):
    • atom selection for ligand (e.g. -s "resname ligand")
  • -r, --resid (int):
    • Index of the visualized center of mass on the output pdb (e.g. -r 1)
    • default: 1
  • -o, --output (str):
    • output file path (e.g. -o pathway.pdb)
    • default: pathway.pdb
  • -sf, --stride_frame (int):
    • stride to skip when extracting the com for visualization (e.g. -sf 10)
    • default: 10

genfeature

  • This command should be executed after running pacs mdrun.
  • It generates feature data in .npy format, which is cconvenient for MSM analysis in Python.
    • Feature data files (e.g., t001c002r010.npy) are stored in the directory specified with the -od option.
    • Each .npy file has the np.arry in the shape as described in the table below.
  • This command supports parallel processing.

Currently implemented analysis tools and the shape of the output data in .npy files.

featuremdtrajgmxcpptrajshape of .npy
comdistoxx(n_frames,)
comvecoxx(n_frames, 3)
rmsdoxx(n_frames,)
xyzoxx(n_frames, n_atoms, 3)

comdist

  • Center of mass distance
  • Calculate the distance between the centers of mass of s1 and s2.

Example

  • The following example generates features about COM distance for MSM analysis

mdtraj

pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9" 

Arguments

mdtraj

usage: pacs genfeature comdist mdtraj [-h] [-t] [-tf] [-top] [-od] [-p] [-s1] [-s2] 
  • -t, --trial (int):
    • trial number without 0-fill
  • -tf, --trj-file (str):
    • trajectory file name (e.g. prd.xtc prd_rmmol.trr)
  • -top, --topology (str):
    • topology file path (e.g. .inputs/input.gro)
  • -od, --outdir (str):
    • output directory path
  • -p, --n_parallel (int):
    • number of parallel
  • -s1, --selection1 (str):
    • mdtraj selection1 for calculating distance
  • -s2, --selection1 (str):
    • mdtraj selection2 for calculating distance

comvec

  • Center of mass vector
  • Calculate the vector between the centers of mass of s1 and s2
    • The vector is calculated as s1 - s2

Example

  • The following example generates features about COM vector for MSM analysis

mdtraj

pacs genfeature comvec mdtraj -t 1 -tf prd.xtc -top inputs/input.gro -ref inputs/input.gro -ft "name CA" -fr "name CA" -s1 "resid 1" -s2 "resid 9"

Arguments

mdtraj

usage: pacs genfeature comvec mdtraj [-h] [-t] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-s1] [-s2]
  • -t, --trial (int):
    • trial number without 0-fill
  • -tf, --trj-file (str):
    • trajectory file name (e.g. prd.xtc prd_rmmol.trr)
  • -top, --topology (str):
    • topology file path (e.g. .inputs/input.gro)
  • -od, --outdir (str):
    • output directory path
  • -p, --n_parallel (int):
    • number of parallel
  • -ref, --reference (str):
    • reference file path for fitting
  • -ft, --fit-trj (str):
    • fitting selection for mdtraj in trajectory
  • -fr, --fit-ref (str):
    • fitting selection for mdtraj in reference
  • -s1, --selection1 (str):
    • mdtraj selection1 for calculating distance
  • -s2, --selection1 (str):
    • mdtraj selection2 for calculating distance

RMSD

  • Root Mean Square Deviation
  • Calculate the RMSD relative to the structure specified in ref

Example

  • The following example generates features about RMSD for MSM analysis

mdtraj

pacs genfeature rmsd mdtraj -t 1 -tf prd.xtc -top ./inputs/input.gro -ref ./inputs/input.gro -ft "protein" -fr "protein" -ct "name CA" -cr "name CA"

Arguments

mdtraj

usage: pacs genfeature rmsd mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
  • -t, --trial (int):
    • trial number without 0-fill
  • -tf, --trj-file (str):
    • trajectory file name (ex. prd.xtc prd_rmmol.trr)
  • -top, --topology (str):
    • topology file path (ex. .inputs/input.gro)
  • -od, --outdir (str):
    • output directory path
  • -p, --n_parallel (int):
    • number of parallel
  • -ref, --reference (str):
    • reference file path for fitting/RMSD calculation
  • -ft, --fit-trj (str):
    • mdtraj selection for fitting in trajectory
  • -fr, --fit-ref (str):
    • mdtraj selection for fitting in reference
  • -ct, --cal-trj (str):
    • mdtraj selection for calculating RMSD in trajectory
  • -cr, --cal-ref (str):
    • mdtraj selection for calculating RMSD in reference

XYZ

  • (x, y, z), 3 dimension coordinates

Example

  • The following example generates features about (x, y, z) for MSM analysis

mdtraj

pacs genfeature xyz mdtraj -t 1 -tf prd.xtc -top ./trial001/cycle000/replica001/input.gro -s "residue 1 to 5"

Arguments

mdtraj

usage: pacs genfeature xyz mdtraj [-h] [-t] [-tf] [-top] [-od OUTDIR] [-p N_PARALLEL] [-s] 
  • -t, --trial (int):
    • trial number without 0-fill
  • -tf, --trj-file (str):
    • trajectory file name (ex. prd.xtc prd_rmmol.trr)
  • top, --topology (str):
    • topology file path (ex. .inputs/input.gro)
  • -od, --outdir (str):
    • output directory path
  • -p, --n_parallel (int):
    • number of parallel
  • -s, --selection (str):
    • mdtraj selection extracted in trajectory

FAQ

Content

Q: How can I check pacsmd progress ?

A. You check it by seeing pacsmd standard output or error.

INFO       2023-12-16 23:22:04  [    parser.py     -       parse      ] input_ims.toml was read successfully
INFO       2023-12-16 23:22:04  [   __main__.py    -      <module>    ] pacsmd version 0.1.1
INFO       2023-12-16 23:22:04  [   __main__.py    -        main      ] PaCS-MD starts
INFO       2023-12-16 23:22:04  [   __main__.py    -     prepare_md   ] MDsettings(..)
INFO       2023-12-16 23:22:04  [   __main__.py    -      pacs_md     ] cycle000 starts
INFO       2023-12-16 23:23:15  [superSimulator.py -  record_finished ] replica001 done
INFO       2023-12-16 23:23:18  [ superAnalyzer.py -      analyze     ] The top ranking CV is replica 1 frame 32 cv 1.864649296784787
INFO       2023-12-16 23:23:19  [     rmmol.py     -    make_top_gmx  ] topology file rmmol_top.pdb has been created in ./trial001/cycle000/replica001
INFO       2023-12-16 23:23:19  [   __main__.py    -      pacs_md     ] cycle000 done
INFO       2023-12-16 23:23:44  [     Cycle.py     -       export     ] export to cycle001 is completed
INFO       2023-12-16 23:23:46  [     rmmol.py     -     rmmol_gmx    ] trajectory file in cycle000 have been reduced
...
  • But sometimes you cannot check it because some supercomputers do not generate log files. If so, you can check it by seeing trialXXX/cycleXXX/replicaXXX/summary/progress.log. Here is a sample command for you to check quickly.
 head -n 1 trial001/cycle*/summary/cv_ranked.log | grep frame | awk '{print NR "," $6}'

And then you can get the following results. First column denotes the cycle number and second column represents CV(for example, COM distance if type == "dissociation").

1,1.864649296784787
2,1.9111852343506632
3,1.9179890510636395
4,1.9223202126596912
5,1.915256901828055
6,1.9316632729334582
7,1.9657408272709807
8,1.9565950015268871
9,1.951968749749852
10,1.9565426138983022
11,1.9765695029520212
12,2.015255318811986
13,2.047009037596073
14,2.0903023704717936
15,2.126198720722031
16,2.1582448887927432
17,2.2197207031516375
18,2.232262081387398
19,2.2752254833312677
20,2.3406157309562796
21,2.372434614483611
22,2.4393667210979166
23,2.480949213506798
24,2.575250861566694
25,2.579432883406738

Q: What should I do if there are not evaluation types that I want to use.

A: Some evaluation types are still in progress. If you need you could also develop by yourself.

  • While PaCS-MD offers many evaluation types, we could not incorporate all of them. You're welcome to utilize our template as a reference and develop your own set of evaluation types. template page.

Q: dissociation PaCS-MD takes a lot of time. Some simulations could not completed.

A: Please check COM of selected molecules.

  • Sometimes wrong selections might lead to wrong dissociation pathway and free energy.

  • But there are some examples where the selections are actually correct but not dissociated based on the initial structure.

  • You can check COM quickly by employing the following command.

$ pacs gencom traj mdtraj -s init.gro -t init.gro -ls "resid 168 || resid 253" -o com_of_r168_r253.pdb
$ pacs gencom traj mdtraj -s init.gro -t init.gro -ls "resname Lig" -o com_of_ligand.pdb
$ vmd -m init.gro com_of_r168_r253.pdb com_of_ligand.pdb

Q: Minor dissociation pathway was obtained. Is is OK ?

A: It really depends on cases but you need to check carefully by yourself.

  • Minor pathway may actually emerge, but it could stem from incorrect selections or initial structures, etc. Therefore it's crucial to assess their validity before proceeding with your simulation.

Q: I got errors during running PaCS-MD

A: Please read the error messages carefully.

  • Since PaCS-MD relies on various other libraries. many issues may arise due to incorrect settings. Before reporting your concerns to Github, please ensure whether the problem genuinely lies with PaCS-MD or not.

  • Additionally, since PaCS-MD is often employed on supercomputers, certain issues might be attributed to the supercomputers themselves. For example, PaCS-MD may encounter difficulties running on supercomputers, or errors might arise that are unrelated to the associated libraries.