PaCS-Toolkit User Guide
Welcome to PaCS-Toolkit User Guide. Here, we provide instructions on how to use PaCS-Toolkit. If you encounter any issues and cannot be solved here, please feel free to submit a Github Issue. We hope PaCS-MD will be helpful for your research.
Content
What is PaCS-MD ?
PaCS-MD is an enhancement sampling method that exhaustedly samples the conformational space of given systems. PaCS-MD comprises multiple short MD simulations being cascaded by the selection of the starting structures for the next simulation cycles based on given evaluation criteria. By repeating this process, a large conformational space can be sampled without any mechanistic driven bias.
%%{ init: { 'flowchart': { 'curve': 'stepAfter'}, 'theme':'neutral', 'themeVariables': {'fontSize':'20px'} } }%% flowchart TD; cycle0["Preliminary MD simulations"]; ranking0["Structural Ranking"]; cycle0 --> ranking0; ranking0 --> input_structures; subgraph cycle["cycle"]; ranking1["Structural Ranking"]; judge{{"Fulfill requirements?"}}; export1["Export top N snapshots to next cycle"]; input_structures --> trajectories --> ranking1 --> judge; judge -- No --> export1 --> input_structures; end subgraph trajectories["Short MD simulations"]; direction TB; traj1(trj1); traj2(trj2); etc_trj(...); traj(trj N); end subgraph input_structures[" "]; direction TB; rank1(rank1); rank2(rank2); etc_sim(...); rank(rank N); end End((("finish"))); judge -- Yes --> End;
Citation
@article{Ikizawa2024,
title = {PaCS-Toolkit: Optimized Software Utilities for Parallel Cascade Selection Molecular Dynamics (PaCS-MD) Simulations and Subsequent Analyses},
volume = {128},
ISSN = {1520-5207},
url = {http://dx.doi.org/10.1021/acs.jpcb.4c01271},
DOI = {10.1021/acs.jpcb.4c01271},
number = {15},
journal = {The Journal of Physical Chemistry B},
publisher = {American Chemical Society (ACS)},
author = {Ikizawa, Shinji and Hori, Tatsuki and Wijaya, Tegar Nurwahyu and Kono, Hiroshi and Bai, Zhen and Kimizono, Tatsuhiro and Lu, Wenbo and Tran, Duy Phuoc and Kitao, Akio},
year = {2024},
month = apr,
pages = {3631–3642}
}
Access
- Author
# Head
- Kitao
# Team leader
- Ikizawa
# Coding team including documentation
- Ikizawa (lead)
- Hori
- Kono
# Testing team
- Tegar (lead)
- Duy
- Bai
- Kimizono
- Lu
- Source code
Quick Start
- on this page, we will introduce a quick guide on running dissociation PaCS-MD using gromacs.
- If you want to try other PaCS-MD, please refer to Analyzer.
Content
- Step1: Installing by pip or conda
- Step2: Preparing initial files
- Step3: Preparing input file for PaCS-MD
- Step4: Running PaCS-MD
- Step5: Making new fitted trajectories from pacsmd results
- Step6: Extracting collective-variables from fitted trajectories
- Step7: Building MSM and predicting free energy
Step1: Installing by pip or conda
- Running git clone for getting source code
git clone https://github.com/Kitaolab/PaCS-Toolkit.git
- Installing with your favorite method
# Install by pip
pip install -e ".[mdtraj]"
# Or install by conda and pip
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
pip install -e ".[mdtraj]"
- The above methods are better, but sometimes they are not suitable for a special situation.
- For more information about installation, Please refer to Install page.
Step2: Preparing initial files
- On this page, we will use gromacs as simulator.
- In gromacs, gro, top, mdp and ndx files are necessary. Please get these files ready.
- Sample jobscripts and input files are here
Step3: Preparing input file for PaCS-MD
- PaCS-MD requires a toml format input file. Please make input.toml
- Please adjust the above settings as needed.
- For more information about this file, please refer to input file.
vim input.toml
input.toml
# Input file for PaCS-MD
## basic
# pacsmd settings
## basic
max_cycle = 2 # Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
n_replica = 3 # Number of replica. (ex. 1, ..., 123, ..., 999)
n_parallel = 3 # Number of replica which are calculated at a time
skip_frame = 1 # Frequency of frames used for ranking among trajectories
centering = true # Whether to move the molecule to the center
centering_selection = "protein" # Name of molecule to move in the center
working_dir = "./." # Directory where pacsmd will run
## simulator
simulator = "gromacs" # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4" # Commands for MPI such as mpirun, blank is OK
cmd_serial = "gmx_mpi mdrun -ntomp 6" # Commands to run the simulator serially
cmd_parllel = "gmx_mpi mdrun -ntomp 6" # Commands to run the simulator parallelly
structure = "input.gro" # Structural file such as gro, pdb, rst7, etc.
topology = "topol.top" # Topology file such as top, parm7, psf, etc.
mdconf = "parameter.mdp" # Parameter file such as mdp, mdin, namd, etc.
index_file = ".index.ndx" # Gromacs index file
trajectory_extension = ".xtc" # Trajectory file extension. ("." is necessary)
## analyzer
type = "dissociation" # Evaluation type
threshold = 100 # CV threshold for determining to terminate a trial
skip_frame = 1 # How many frames to skip when ranking CVs
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5" # Selection string for specified group in trajectories
selection2 = "resid 6 to 10" # Selection string for specified group in trajectories
## postprocess
genrepresent = true # Whether genrepresent is executed after trial
rmmol = true # Whether rmmol is executed after each cycle
keep_selection = "not water" # Molecular name or index group to be left in the trajectory when rmmol
rmfile = true # Whether rmfile is executed after trial
Step4: Running PaCS-MD
- Finally, we are ready to perform PaCS-MD
- Specifying the trial id in argument t and the input file in argument f.
pacs mdrun -t 1 -f input.toml
CAUTION
In this case, the total core will be 24.So, 8 cores will be used in each 3 replica at once. (24 / 3 = 8 cores)
-
If you want to continue the simulation from the middle of a cycle or trial, simply run pacsmd again. Completed cycles will be skipped, and the simulation will resume from the the point of interruption.
-
You will get the results
$ ls
trial001/
Supplement: Runing other trial
- Usually, a single trial is not sufficient for sampling, so multiple trials are required.
- We can run trial2 PaCS-MD
pacs mdrun -t 2 -f input.toml
- Output will be the following
$ ls
trial002/
Step5: Making new fitted trajectories from pacsmd results
- For some analyses, fitting trajectories can make subsequent calculations smoother.
- For fitting, we prepare
pacs fit mdtraj
command. But you can use existing software such asgmx trjconv
orcpptraj
as usual.
$ pacs fit trial mdtraj -t 1 -tf prd_rmmol.xtc -top rmmol_top.pdb -r ref.gro -ts protein -rs protein
Step6: Extracting collective-variables from fitted trajectories
- After making fitted trajectories, you need to extract CV to build MSM, where CV denotes collective variables such as "distance" and "inter COM vector" and "PCA" and etc.
- For extracting CVs, we got
pacs genfeature
command. But this command provides you only frequently used CV such as com-distance, com-vector and rmsd. - So if you want to use other specific CVs, you need to write a code by yourself.
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ ls
comdist-CV/
Step7: Building MSM and predicting free energy
- After extracting CVs, various analyses can be performed on them.
- PaCS-MD is especially compatible with analyses using MSM.
Install
Content
Requirements
- Python >= 3.7 (but python >= 3.8 is recommended because of deeptime)
- PaCS-Toolkit currently supports 3 simulator
1. Install by pip
1.1 Install by conda and pip
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
- if using whole pacstk function
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif using "pacs mdrun" and analyzer == "mdtraj"
pip install "pacs[mdtraj] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif using "pacs mdrun" and analyzer == "gromacs"
pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif performing MSM
- python >= 3.8 is recommended because of deeptime
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
1.2. Install by pip
- if using whole pacstk function
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif using "pacs mdrun" and analyzer == "mdtraj"
pip install "pacs[mdtraj] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif using "pacs mdrun" and analyzer == "gromacs"
pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
- elif performing MSM
- python >= 3.8 is recommended because of deeptime
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
2. Get source code from github
- git clone
git clone https://github.com/Kitaolab/PaCS-Toolkit.git
cd pacsmd
- get latest release
verion="x.x.x"
wget https://github.com/Kitaolab/PaCS-Toolkit/archive/refs/tags/${version}.tar.gz
tar -xvzf ${version}.tar.gz
cd pacsmd-${version}
2.1. Install by conda and pip locally
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
- if using whole pacstk function
- python >= 3.8 is recommended because of deeptime
pip install -e ".[all]"
- elif using "pacs mdrun" and analyzer == "mdtraj"
pip install -e ".[mdtraj]"
- elif using "pacs mdrun" and analyzer == "gromacs"
pip install -e "."
- elif using "pacs mdrun" and analyzer == "cpptraj"
pip install -e "."
- elif performing MSM
pip install -e ".[msm]"
2.2. Install by pip locally
- if using whole pacstk function
- python >= 3.8 is recommended because of deeptime
pip install -e ".[all]"
- elif using "pacs mdrun" and analyzer == "mdtraj"
pip install -e ".[mdtraj]"
- elif using "pacs mdrun" and analyzer == "gromacs"
pip install -e "."
- elif using "pacs mdrun" and analyzer == "cpptraj"
pip install -e "."
- elif performing MSM
- python >= 3.8 is recommended because of deeptime
pip install -e ".[msm]"
mdrun
- This command is used to execute PaCS-MD.
- Unlike other commands in pacskit, many variables are set within pacsmd's input file.
- It is assumed that PaCS-MD will first be executed by this command and then analyzed using other commands.
- See Input file for details on input files.
Example
pacs mdrun -t 1 -f input.toml
Arguments
usage: pacs mdrun [-h] [-t] [-f]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-f, --file
(str):- input file path for PaCS-MD (e.g.
-f input.toml
)
- input file path for PaCS-MD (e.g.
Overview
generated files
- mdrun reads the file specified by -f and performs MD calculations according to its contents.
- After the command is executed, the following directory structure is created.
- tiral001 /
- cycle000 /
- replica001 /
- trajectory file
- summary /
- cv.log
- cv_ranked.log
- progress.log
- replica001 /
- cycle001 /
- replica001 /
- replica002 / ...
- replica
n_replica
/ - summary /
- cycle002 / ...
- cycle
max_cycle
/
- cycle000 /
- The directory structure is created as shown above, and the following files are created.
trajectory file
- The trajectory file is a file that contains the coordinates of the atoms in the system at each time step.
- file name without extension is
prd
and the extension is determined by the input file.
cv.log
- The cv.log file contains the calculated collective variable (CV) values for each frame in the trajectory file.
- first column: replica number (1-indexed)
- second column: frame number (1-indexed)
- third column: CV value
cv_ranked.log
- The cv_ranked.log file contains the sorted CV values.
progress.log
- The progress.log file contains the progress of the simulation in each cycle.
Overall diagram
- PaCS-MD runs multiple short simulations, from which it selects a snapshot that meets its objectives and serves as the initial structure for the next cycle. Therefore, the diagram is as follows.
sequenceDiagram participant cycle as cycle participant simulator as simulator participant analyzer as analyzer loop 0, max_cycle Note over cycle: prepare initial structures par run short MD simulations cycle ->>+ simulator: run short MD simulations simulator ->>- cycle: trajectory files end par calculate CV cycle ->>+ analyzer: calculate CV Note over analyzer: calculate CV based on the input file. Note over analyzer: sort the snapshots based on the CV. analyzer ->>- cycle: cv.log, cv_ranked.log end break the top CV reaches threshold analyzer -->> cycle: finish PaCS-MD end Note over cycle: export top N snapshots to next cycle. end
-
simulator
- The simulator performs one short MD on the starting structure in cycle 0.
- In the other cycles, it executes n_replica short MDs, starting from the top-ranking structure of each cycle.
- Currently, only GROMACS, AMBER and NAMD are supported.
- See Simulator for more information
-
analyzer
- The analyzer calculates the collective variable (CV) based on the type specified in the input file.
- The analyzer sorts the structures in each trajectory based on the CV.
- User-defined types are also available.
- See Analyzer for more information
Basic
There are common settings for any PaCS-MD run. These include the maximum number of cycles to run, how many MD simulations to run in each cycle, and so on. These settings are described in the input file under the #basic
keyword.
Following are the details of each keyword to be set. For simplicity, we chose "replica" as the MD simulation to be executed in each cycle.
keywords
-
max_cycle: int, default=1
- Maximum number of cycles to run. (e.g. 100)
- Minimum value is 1.
- Maximum value is 999.
-
n_replica: int, default=1
- Number of replicas to be excuted in one cycle. (e.g. 50)
- Minimum value is 1.
- Maximum value is 999.
-
n_parallel: int, default=1
- Number of replicas to be simulated and analyzed in parallel at a time. (e.g. 10)
- It is recommended that
n_parallel
set to be a divisor ofn_replica
.
-
centering: bool, default=True
- This flag controls whether the initial structure of the next cycle is moved to the center of the box at the end of each cycle.
- if
centering
is true, the initial structure of the next cycle is moved to the center of the box. - if
centering
is false, the initial structure of the next cycle will not be moved to the center of the box. - For longer MD, the proteins may move away from the center of the box and the features may behave non-intuitively. Therefore, it is recommended to set
centering
to true.
-
centering_selection: str, default="protein" or "Protein" or "@CA,C,O,N,H"
- If
centering
is true, the group specified by this keyword moves to the center of the box. - If
centering
is false, there is no need to specify this keyword. - default value will be changed to match
analyzer
.- if
analyzer
== "mdtraj", default="protein" - if
analyzer
== "gromacs", default="Protein" - if
analyzer
== "cpptraj", default="@CA,C,O,N,H"
- if
- If
-
working_dir: str, default="./."
- Directory where pacsmd will run.
-
rmmol: bool, default=false
- This flag controls whether the unnecessary molecules will be removed from the trajectory after each cycle.
- if
rmmol
is true, atoms not specified in thekeep_selection
are removed from the trajectory file. - The operation of removing atoms is irreversible.
-
keep_selection: str, (required if
rmmol=true
)- if
rmmol
is true, atoms specified in thekeep_selection
are kept in the trajectory file. (e.g. "protein")
- if
-
rmfile: bool, default=false
- This flag controls whether the unnecessary files are removed from the working directory after trial.
- Backup files and other files are deleted.
Example
An example of a basic option in an input file is shown below. This can also be found on this page.
max_cycle = 20
n_replica = 8
n_parallel = 4
centering = true
centering_selection = "protein"
working_dir = "/work/"
rmmol = true
keep_selection = "not water"
rmfile = true
Simulator
- In PaCS-MD, simulations corresponding to the
simulator
are executed. Gromacs, Amber, and NAMD are supported. - In each cycle,
n_replica
simulations are executed.n_parallel
simulations are will run in parallel. Ifn_replica
is not a multiple ofn_parallel
, or in cycle 0, the remainder of the simulations will run in series. - If you want to run multiple replica simulations in parallel using MPI, set
cmd_mpi
. Ifcmd_mpi
is not set, parallel execution will be performed using the multiprocessing module of python.
Content
When are cmd_serial
& cmd_parallel
used ?
- See the table below for
cmd_serial
andcmd_parallel
in the input file.cmd_serial
is used if cycle0 orn_parallel
=1. Ifn_replica
is not divisible byn_parallel
, thencmd_serial
is used for the remaining replicas. Otherwise,cmd_parallel
is used.
GPU | MPI | n_parallel | command |
---|---|---|---|
x | x | 1 | cmd_serial |
x | x | n | cmd_serial |
x | o | 1 | cmd_serial |
x | o | n | cmd_parallel |
o | x | 1 | cmd_serial |
o | x | n | cmd_serial |
o | o | 1 | cmd_seiral |
o | o | n | cmd_parallel |
Gromacs
To run the simulation using gromacs, write in the inputfile as in this example. The details of each keyword are as follows.
keywords
- simulator: str, required
- Software used inside PaCS-MD. e.g. "gromacs"
- cmd_mpi: str, default=""
- Command for MPI parallelizataion. e.g. "mpirun -np 4"
- cmd_serial: str, required
- Command to run simulation in serial. e.g. "gmx_mpi mdrun -ntomp 6"
- cmd_parallel: str, default=cmd_serial
- Command to run simulation in parallel. e.g. "gmx_mpi mdrun -ntomp 6"
- structure: str, required
- Structure file path for MD simulation. e.g. "./input.gro"
- This is also used as the initial structure of PaCS-MD
- topology: str, required
- Topology file path for MD simulation. e.g. "./topol.top"
- mdconf: str, required
- Parameter file path for MD simulation. e.g. "./parameter.mdp"
- index_file: str, required
- Gromacs index file path. e.g. "./index.ndx"
- trajectory_extension: str, required
- Trajectory file extension. (The "." is necessary.) e.g. ".trr"
Amber
To run the simulation using amber, write in the inputfile as in this example. The details of each keyword are as follows.
keywords
- simulator: str, required
- Software used inside PaCS-MD. e.g. "amber"
- cmd_mpi: str, default=""
- Command for MPI such as mpirun. e.g. "mpirun -np 4"
- cmd_serial: str, required
- Command to run simulation in serial. e.g. "pmemd.cuda"
- cmd_parallel: str, default=cmd_serial
- Command to run simulation in parallel. e.g. "pmemd.cuda"
- structure: str, required
- Structure file path for MD simulation. e.g. "./input.rst7"
- This is also used as the initial structure of PaCS-MD.
- topology: str, required
- Topology file path for MD simulation. e.g. "./topology.parm7"
- mdconf: str, required
- Parameter file path for MD simulation. e.g. "./parameter.mdin"
- trajectory_extension: str, required
- Trajectory file extension. (The "." is necessary.) e.g. ".nc"
NAMD
To run the simulation using NAMD, write in the inputfile as in this example. The details of each keyword are as follows.
keywords
- simulator: str, required
- Software used inside PaCS-MD. e.g. "namd"
- cmd_mpi: str, default=""
- Command for MPI parallelizataion. e.g. "mpirun -np 4"
- cmd_serial: str, required
- Command to run simulation in serial. e.g. "namd2 +p4"
- cmd_parallel: str, default=cmd_serial
- Command to run simulation in parallel. e.g. "namd2 +p4"
- structure: str, required
- Structure file path for MD simulation. e.g. "./input.pdb"
- This is also used as the initial structure of PaCS-MD.
- topology: str, required
- Topology file path for MD simulation. e.g. "./topology.psf"
- mdconf: str, required
- Parameter file path for MD simulation. e.g. "./parameter.conf"
- trajectory_extension: str, required
- Trajectory file extension. (The "." is necessary.) e.g. ".dcd"
Analyzer
- In PaCS-MD, the analyzer is a component that evaluates the snapshot of each trajectory in each cycle and ranks them based on the evaluation type specified in the input file.
- Some of the top ranked frames become the initial structure of each replica in the next cycle.
- You can use the following evaluation functions by setting the
#analyzer
keyword in your input file. Depending on the phenomenon of interest, the appropriatetype
must be selected. - If you do not find a
type
that meets your purpose, you can implement it yourself by settingtype
= template. - The following are some commonly used evaluation types.
Content
Dissociation
- Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are far apart.
- Usage Example
- Ligand dissociating from a protein
- Separate the distance between the two domains of a protein
- For more information, see here.
- See here for an example input file.
Papers
[1] Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics, https://doi.org/10.1021/acs.jctc.7b00504
[2] Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model, https://doi.org/10.1021/acs.jpcb.8b10309
[3] Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model, https://doi.org/10.2142/biophysico.bppb-v18.037
[4] High pressure inhibits signaling protein binding to the flagellar motor and bacterial chemotaxis through enhanced hydration, https://doi.org/10.1038/s41598-020-59172-3
[5] Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical Roles of Key Residues Elucidated by dPaCS-MD/MSM, https://doi.org/10.1021/acs.jcim.1c01508
Association
- Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are close together.
- Usage Example
- Bringing a ligand closer to a protein
- Bringing two domains of a protein closer together
- For more information, see here.
- See here for an example input file.
Papers
Target
- Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes smaller.
- Usage Example
- Finding transitions between two known structures.
- Evaluate to follow the
reference
structure using RMSD. - For more information, see here.
- See here for an example input file.
Papers
[1] Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway, https://doi.org/10.1063/1.4813023
RMSD
- Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes larger.
- Usage Example
- Sampling a wide range of structures
- Sampling a wide range of ligand binding modes
- For more information, see here.
- See here for an example input file.
Papers
[1] Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer, https://doi.org/10.1038/s41598-022-07792-2
EdgeExpansion
- Evaluate snapshots of each trajectory so that the frame forms the convex hull of the 4-dimentional principal component space(PCs).
- Usage Example
- Sampling a wide range of phase space without knowing the reference structure
- For more information, see here.
- See here for an example input file.
Papers
[1] Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins, https://doi.org/10.1063/5.0004654
A_D
- Evaluate snapshots of each trajectory so that the distance between the centers of mass of the two selected groups fluctuates significantly within a certain range.
- Usage Example
- Observing large movements of proteins, such as opening and closing.
- Observing the binding and unbinding of ligands.
- For more information, see here.
- See here for an example input file.
Papers
[1] Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding, https://doi.org/10.1021/acs.jctc.9b01203
Template
- Evaluate snapshots of each trajectory using a user-defined evaluation function.
- Additional variables can be specified in
input.toml
- Usage Example
- If you want to use a different evaluation function than the ones provided.
- For more information, click here
RMSD
- Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes larger.
- The structure of the groups specified by
selection1
andselection3
are superimposed by the least squares method and the RMSD is calculated with the structure of the groups specified byselection2
andselection4
. - Usage Example
- Sampling a wide range of structures
- Sampling a wide range of ligand binding modes
- See here for an example input file.
keywords
- type: str, required
- evaluation type, "rmsd"
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value exceeds this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- reference: str, required
- Structure file path for the reference structure of RMSD calculation.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (RMSD calculation)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indice are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(RMSD calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
Papers
[1] Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer, https://doi.org/10.1038/s41598-022-07792-2
RMSD
- Evaluate snapshots of each trajectory so that the RMSD from the reference structure becomes smaller.
- The structure of the groups specified by
selection1
andselection3
are superimposed by the least squares method and the RMSD is calculated with the structure of the groups specified byselection2
andselection4
. - Usage Example
- Finding transitions between two known structures.
- See here for an example input file.
keywords
- type: str, required
- evaluation type, "target".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- reference: str, required
- Structure file path for the reference structure of RMSD calculation.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (RMSD calculation)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indice are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(RMSD calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
Papers
[1] Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway, https://doi.org/10.1063/1.4813023
Dissociation
-
Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are far apart.
-
The two selected groups are
selection1
andselection2
in the input file. -
Usage Example
- Ligand dissociating from a protein
- Separate the distance between the two domains of a protein
-
See here for an example input file.
keywords
- type: str, required
- evaluation type, "dissociation".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value exceeds this threshold.
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
Papers
[1] Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics, https://doi.org/10.1021/acs.jctc.7b00504
[2] Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model, https://doi.org/10.1021/acs.jpcb.8b10309
[3] Binding free energy of protein/ligand complexes calculated using dissociat
ion Parallel Cascade Selection Molecular Dynamics and Markov state model, http
s://doi.org/10.2142/biophysico.bppb-v18.037
[4] High pressure inhibits signaling protein binding to the flagellar motor an
d bacterial chemotaxis through enhanced hydration, https://doi.org/10.1038/s41
598-020-59172-3
[5] Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical
Roles of Key Residues Elucidated by dPaCS-MD/MSM, https://doi.org/10.1021/acs.
jcim.1c01508
Association
-
Evaluate snapshots of each trajectory so that the centers of mass of the two selected groups are close together.
-
The two selected groups are
selection1
andselection2
in the input file. -
Usage Example
- Bringing a ligand closer to a protein
- Bringing two domains of a protein closer together
-
See here for an example input file.
keywords
- type: str, required
- evaluation type, "association".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
Papers
A_D
- Evaluate snapshots of each trajectory so that the distance between the centers of mass of the two selected groups fluctuates significantly within a certain range.
- Usage Example
- Observing large movements of proteins, such as opening and closing.
- Observing the binding and unbinding of ligands.
- See here for an example input file.
keywords
- type: str, required
- evaluation type, "a_d".
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- d_threshold: float, required
- Threshold for switching from dissociation to association mode. If the maximum value of cv calculated while in dissociation mode exceeds this value, it switches to association mode.
- frame_sel: int, required
- One of the conditions used to switch from association to dissociation.
- If the highest ranking frame in association mode is less than this value, it will be assumed that there has been no change from the previous cycle (i.e., it has come as close as possible). Hence, we consider the molecule to be within bound.
- bound_threshold: int, required
- One of the conditions used to switch from association to dissociation.
- When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0.
Papers
[1] Kinetic Selection and Relaxation of the Intrinsically Disordered Region of
a Protein upon Binding, https://doi.org/10.1021/acs.jctc.9b01203
EdgeExpansion
- Evaluate snapshots of each trajectory so that the frame forms the convex hull of the 4-dimentional principal component space(PCs).
- Snapshots of each cycle are projected onto a 4-dimensional principal component spaces (PCs). Its convex hull is computed and the initial structure of the next cycle is selected from the polygon's vertices.
- Usage Example
- Sampling a wide range of phase space without knowing the reference structure
- See here for an example input file.
keywords
- type: str, required
- Evaluation type, "ee".
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- only "mdtraj" is supported.
- reference: str, required
- Trajectory file used to compute the covariance matrix in Principal Component Analysis (PCA).
- If this value is not set, the covariance matrix is computed based on the trajectory of cycle 0.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection.
- If you use mdtraj (
- Selection string or name of index group for specified group in trajectories (least squares fit)
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (PCA calculation)
- No correction by atomic weights or number of atoms is made when calculating the covariance matrix or projecting.
- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection.
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories) - Otherwise, you don't need to specify this option.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(PCA calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories) - Otherwise, you don't need to specify this option.
- Selection string or name of index group for specified group in
Papers
[1] Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins, https://doi.org/10.1063/5.0004654
Template
- Template type is a type that can be defined by the user yourself. It is possible to include user-specific variables in input.toml, and these variables can be used in template type in which they are defined.
- This feature embodies the flexibility of PaCS-MD. This is also a feature that recognizes that the development team alone is limited in its ability to support such flexibility.
- This is for advanced users and requires familiarity with PaCS-MD and the simulator used internally (Gromacs, Amber, Namd). If you are not familiar with it, please practice with other types.
Content
Step1: Get original source code
- Go to github repository and download it.
Step2: Code template type
- You will see
template.py
is inpacsmd/md/analyzer/template.py
- Here is template type
class Template(SuperAnalyzer):
def calculate_cv(self, settings: MDsettings, cycle: int, replica: int, send_rev) -> None:
"""
TODO
1. Read trajectory based on self.cycle
2. Calculate the value that suited the evaluation type.
3. send the values by send_rev.send(ret_arr)
"""
pass
def ranking(self, settings: MDsettings, CVs: List[Snapshot]) -> List[Snapshot]:
"""
CVs: List[Snapshot]
The CVs is a list of Snapshot objects
that contain the CV for each frame in the trajectory.
Snapshot
TODO:
Arrange them in ascending, descending, etc.
order to match the PaCSMD evaluation type.
Example:
sorted_cv = sorted(CVs, key=lambda x: x.cv)
return sorted_cv
"""
pass
def is_threshold(self, settings: MDsettings, CVs: List[Snapshot] = None) -> bool:
if CVs is None:
CVs = self.CVs
return CVs[0].cv < settings.threshold
- You need to code 3 types mainly (
calculate_cv
,ranking
andis_threshold
)
calculate_cv
calcurate_cv
is the core of each evaluation type. Here collective variables(CVs) (distance, RMSD, energy, etc.) are calculated from the MD simulation results in each replica and returned in a list. The evaluation type should be well defined so that the frames are chosen in the direction you wish to sample.- We recommend you to refer to other evaluation types (e.g. dissociation.py, rmsd.py, etc.).
ranking
- In
ranking
, the list of CVs calculated bycalcurate_cv
is sorted in descending or ascending order. - If you want to separate two residues, sort in descending order as the distance increases. Conversely, if you want to bring them closer together, sort in ascending order so that the distance becomes smaller.
is_threshold
is_threshold
determines if the top of the list of CVs (the frame with the most desired CV of all frames) is above or below the threshold set by the user. The inequality should be changed according to the situation.- If you want to simulate a larger distance, the process would be such that if it goes above a certain distance, it will terminate.
Step3: Prepare input.toml
- In
input.toml
, you can define variables you want to use.
## analyzer
type = "template"
threshold = "???"
### user defined variable
user-defined-variable1 = 123
user-defined-variable2 = "hoge"
...
- If you set your variables like the above example, you can use the variables in
template.py
like the following example
class Template(SuperAnalyzer):
def calculate_cv(self, settings: MDsettings, cycle: int) -> List[float]:
...
if traj.distance > setttings.user-defined-variable1:
break
...
return list_float
Step4: Run PaCS-MD
- Now you can run your original PaCS-MD
pacs mdrun -t 1 -f input.toml
- Please make a pull_request in Github for new possible template scripts.
Input file
- PaCS-MD requires an input file. Here are the variables in the input file
- input file must be in toml format.
Contents
sample input file
- please check here
basic option
click here
- max_cycle: int, default=1
- Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
- n_replica: int, default=1
- Number of replica. (ex. 1, ..., 123, ..., 999)
- n_parallel: int, default=1
- Number of replica calculated at a time
- centering: bool, default=True
- Whether to bring the molecule selected by centering_selection to the center of the simulation cell when creating the initial structure for the next cycle
- centering_selection: str, default="protein" or "Protein" or "@CA,C,O,N,H"
- Molecules to be moved to the center
- default value will be changed to match
analyzer
- if
analyzer
== "mdtraj", default="protein" - if
analyzer
== "gromacs", default="Protein" - if
analyzer
== "cpptraj", default="@CA,C,O,N,H"
- if
- working_dir: str, default="./."
- Directory where pacsmd will run
- rmmol: bool, default=false
- Whether rmmol is executed after each cycle
- keep_selection: str, (required if
rmmol=true
)- Molecular name or index group to be left in the trajectory when rmmol
- rmfile: bool, default=false
- Whether rmfile is executed after trial
max_cycle = 2 # Maximum number of cycles to run. (ex. 1, ..., 123, ..., 999)
n_replica = 3 # Number of replica. (ex. 1, ..., 123, ..., 999)
n_parallel = 3 # Number of replica calculated at a time
centering = true # Whether to move the molecule to the center
centering_selection = "protein" # Name of molecule to move in the center
working_dir = "/work/" # Directory where pacsmd will run
rmmol = true # Whether rmmol is executed after each cycle
keep_selection = "not water" # Molecular name or index group to be kept in the trajectory when rmmol
rmfile = true # Whether rmfile is executed after trial
simulator option
analyzer \ simulator | gromacs | amber | namd |
---|---|---|---|
mdtraj | o | o | o |
gromacs | o | x | x |
cpptraj | x | o | x |
Gromacs
click here
- simulator: str, required
- Software used inside PaCS-MD
- cmd_mpi: str, default=""
- Commands for MPI such as mpirun, blank is OK
- cmd_serial: str, required
- Commands to run the simulator serially
- cmd_parallel: str, default=cmd_serial
- Commands to run the simulator parallelly
- structure: str, required
- Structural file such as gro, pdb, rst7, etc.
- topology: str, required
- Topology file such as top, parm7, psf, etc.
- mdconf: str, required
- Parameter file such as mdp, mdin, namd, etc.
- index_file: str, (required if Gromacs)
- Gromacs index file
- trajectory_extension: str, required
- Trajectory file extension. ("." is necessary)
- nojump: bool, default=false
- whether to execute
-pbc nojump
treatment for the selection feature calculation inanalayzer
, snapshot extraction inexporter
and performing rmmol - valid only when
analyzer
is also gromacs - If
true
, molecules are allowed to get out of the simulation box in order to avoid the error in MSM due to the jumping of break of the molecule over pbc box. - If
false
, molecules are just made whole by-pbc mol
and can warp across the pbc box. - Be noted that the output
prd.xtc
files are not processed with these-pbc
options. (onlyprd_rmmol.xtc
files are processed) - This option is recommended to use when a/dissociation and a_d pacsmd is performed using gromacs as simulator and analyzer
nojump=true
can lead too large coordinate value to cause overflow or loss-of-significane problem. It will not happpen in most cases, but be carefull if your ligand is very small and simulation box is very large.- When this options is applied, analyzer can consider the distance even if ligand exceeds simulation box
- This option is not present in example input in the sample input repository since this option was added in version 1.1.0
- whether to execute
simulator = "gromacs" # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4" # Commands for MPI such as mpirun, blank is OK
cmd_serial = "gmx_mpi mdrun -ntomp 6" # Commands to run the simulator serially
cmd_parllel = "gmx_mpi mdrun -ntomp 6" # Commands to run the simulator parallelly
structure = "/work/input.gro" # Structural file such as gro, pdb, rst7, etc.
topology = "/work/topol.top" # Topology file such as top, parm7, psf, etc.
mdconf = "/work/parameter.mdp" # Parameter file such as mdp, mdin, namd, etc.
index_file = "/work/index.ndx" # Gromacs index file
trajectory_extension = ".xtc" # Trajectory file extension. ("." is necessary)
nojump = true # whether to execute nojump treatment only for gmx
Amber
click here
- simulator: str, required
- Software used inside PaCS-MD
- cmd_mpi: str, default=""
- Commands for MPI such as mpirun, blank is OK
- cmd_serial: str, required
- Commands to run the simulator serially
- cmd_parallel: str, default=cmd_serial
- Commands to run the simulator parallelly
- structure: str, required
- Structural file such as gro, pdb, rst7, etc.
- topology: str, required
- Topology file such as top, parm7, psf, etc.
- mdconf: str, required
- Parameter file such as mdp, mdin, namd, etc.
- trajectory_extension: str, required
- Trajectory file extension. ("." is necessary)
simulator = "amber" # Software used inside PaCS-MD
cmd_mpi = "" # Commands for MPI such as mpirun, blank is OK
cmd_serial = "pmemd.cuda" # Commands to run the simulator serially
cmd_parllel = "pmemd.cuda" # Commands to run the simulator parallelly
structure = "/work/structure.rst7" # Structural file such as gro, pdb, rst7, etc.
topology = "/work/topology.parm7" # Topology file such as top, parm7, psf, etc.
mdconf = "/work/parameter.mdin" # Parameter file such as mdp, mdin, namd, etc.
trajectory_extension = ".nc" # Trajectory file extension. ("." is necessary)
NAMD
click here
- simulator: str, required
- Software used inside PaCS-MD
- cmd_mpi: str, default=""
- Commands for MPI such as mpirun, blank is OK
- cmd_serial: str, required
- Commands to run the simulator serially
- cmd_parallel: str, default=cmd_serial
- Commands to run the simulator parallelly
- structure: str, required
- Structural file such as gro, pdb, rst7, etc.
- topology: str, required
- Topology file such as top, parm7, psf, etc.
- mdconf: str, required
- Parameter file such as mdp, mdin, namd, etc.
- trajectory_extension: str, required
- Trajectory file extension. ("." is necessary)
simulator = "namd" # Software used inside PaCS-MD
cmd_mpi = "mpirun -np 4" # Commands for MPI such as mpirun, blank is OK
cmd_serial = "namd2 +p6" # Commands to run the simulator serially
cmd_parllel = "namd2 +p6" # Commands to run the simulator parallelly
structure = "/work/input.pdb" # Structural file such as gro, pdb, rst7, etc.
topology = "/work/ionized.psf" # Topology file such as top, parm7, psf, etc.
mdconf = "/work/production.namd" # Parameter file such as mdp, mdin, namd, etc.
trajectory_extension = ".dcd" # Trajectory file extension. ("." is necessary)
analyzer option
analyzer \ type | dissociation | association | rmsd | target | ee | a_d | template |
---|---|---|---|---|---|---|---|
mdtraj | o | o | o | o | o | o | - |
gromacs | o | o | o | o | x | o | - |
cpptraj | o | o | o | o | x | o | - |
Target
click here
- type: str, required
- evaluation type, "target".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- reference: str, required
- Structure file path for the reference structure of RMSD calculation.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (RMSD calculation)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(RMSD calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
type = "target" # Evaluation type
threshold = 0.01 # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
reference = "/work/ref.pdb" # Structure for comparison
selection1 = "backbone" # Selection string for specified group in trajectroies (least squares fit)
selection2 = "backbone" # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone and (not resid 1 to 10)" # Selection string for specified group in reference (least squares fit)
selection4 = "backbone and (not resid 1 to 10)" # Selection string for specified group in reference (RMSD calculation)
# else if analyzer == "gromacs"
# analyzer = "gromacs" # Trajectory tool used to calculate the evaluation type
# selection1 = "Backbone" # Name of index group for specified group in trajectories (least squares fit)
# selection2 = "Backbone" # Name of index group for specified group in trajectories (RMSD calculation)
# else if analyzer == "cpptraj"
# analyzer = "cpptraj" # Trajectory tool used to calculate the evaluation type
# selection1 = "@CA,N,O,C" # Selection string for specified group in trajectroies (least squares fit)
# selection2 = "@CA,N,O,C" # Selection string for specified group in trajectories (RMSD calculation)
RMSD
click here
- type: str, required
- evaluation type, "rmsd"
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value exceeds this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- reference: str, required
- Structure file path for the reference structure of RMSD calculation.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (RMSD calculation)
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(RMSD calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories.) Otherwise, you don't need to specify this option. - This option is valid when
analyzer="mdtraj"
oranalyzer="cpptraj"
. - If you use gromacs (
analyzer="gromacs"
), this option is ignored and the same selection indices are automatically used for your trajectories and thereference
.
- Selection string or name of index group for specified group in
type = "rmsd" # Evaluation type
threshold = 2 # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
reference = "/work/ref.pdb" # Structure for comparison
selection1 = "backbone" # Selection string for specified group in trajectories (least squares fit)
selection2 = "backbone" # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone and (not resid 1 to 10)" # Selection string for specified group in reference (least squares fit)
selection4 = "backbone and (not resid 1 to 10)" # Selection string for specified group in reference (RMSD calculation)
# else if analyzer == "gromacs"
# analyzer = "gromacs" # Trajectory tool used to calculate the evaluation type
# selection1 = "Backbone" # Name of index group for specified group in trajectories (least squares fit)
# selection2 = "Backbone" # Name of index group for specified group in trajectories (RMSD calculation)
# else if analyzer == "cpptraj"
# analyzer = "gromacs" # Trajectory tool used to calculate the evaluation type
# selection1 = "@CA,N,O,C" # Selection string for specified group in trajectories (least squares fit)
# selection2 = "@CA,N,O,C" # Selection string for specified group in trajectories (RMSD calculation)
Association
click here
- type: str, required
- evaluation type, "association".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value falls below this threshold (in units of nm).
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
type = "association" # Evaluation type
threshold = 0.3 # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5" # Selection string for specified group in trajectories
selection2 = "resid 6 to 10" # Selection string for specified group in trajectories
# else if analyzer == "gromacs"
# analyzer = "gromacs" # Trajectory tool used to calculate the evaluation type
# selection1 = "resid_1_to_5" # Name of index group for specified group in trajectories
# selection2 = "resid_6_to_10" # Name of index group for specified group in trajectories
# else if analyzer == "cpptraj"
# analyzer = "cpptraj" # Trajectory tool used to calculate the evaluation type
# selection1 = ":1-5" # Name of index group for specified group in trajectories
# selection2 = ":6-10" # Name of index group for specified group in trajectories
Dissociation
click here
- type: str, required
- evaluation type, "dissociation".
- threshold: float, required
- PaCS-MD terminates the calculation when the evaluation value exceeds this threshold.
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs supported only)
- If you use mdtraj (
type = "dissociation" # Evaluation type
threshold = 10 # CV threshold used to decide whether to terminate the calculation (in units of nm)
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1 to 5" # Selection string for specified group in trajectories
selection2 = "resid 6 to 10" # Selection string for specified group in trajectories
# else if analyzer == "gromacs"
# analyzer = "gromacs" # Trajectory tool used to calculate the evaluation type
# selection1 = "resid_1_to_5" # Name of index group for specified group in trajectories
# selection2 = "resid_6_to_10" # Name of index group for specified group in trajectories
# else if analyzer == "cpptraj"
# analyzer = "cpptraj" # Trajectory tool used to calculate the evaluation type
# selection1 = ":1-5" # Name of index group for specified group in trajectories
# selection2 = ":6-10" # Name of index group for specified group in trajectories
EdgeExpansion
click here
- type: str, required
- Evaluation type, "ee".
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- only "mdtraj" is supported.
- reference: str, required
- Trajectory file used to compute the covariance matrix in Principal Component Analysis (PCA).
- If this value is not set, the covariance matrix is computed based on the trajectory of cycle 0.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories (least squares fit)
- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection.
- If you use mdtraj (
- Selection string or name of index group for specified group in trajectories (least squares fit)
- selection2: str, required
- Selection string or name of index group for specified group in trajectories (PCA calculation)
- No correction by atomic weights or number of atoms is made when calculating the covariance matrix or projecting.
- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection.
- If you use mdtraj (
- selection3: str, default=
selection1
- Selection string or name of index group for specified group in
reference
(least squares fit) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories) - Otherwise, you don't need to specify this option.
- Selection string or name of index group for specified group in
- selection4: str, default=
selection2
- Selection string or name of index group for specified group in
reference
(PCA calculation) - If your
reference
structure has different topology from your trajectories, you can utilize this option. (e.g.reference
has different configuration about mutation, missing residues or modified residues from your the trajectories) - Otherwise, you don't need to specify this option.
- Selection string or name of index group for specified group in
type = "ee" # Evaluation type
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
selection1 = "backbone" # Selection string for specified group in trajectories (least squares fit)
selection2 = "backbone" # Selection string for specified group in trajectories (RMSD calculation)
selection3 = "backbone" # Selection string for specified group in reference (least squares fit)
selection4 = "backbone" # Selection string for specified group in reference (RMSD calculation)
A_D
click here
- type: str, required
- evaluation type, "a_d".
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs.
- If you set
skip_frame=2
, PaCS-MD will use every other frame.
- analyzer: str, default="mdtraj"
- Trajectory tool used to calculate the evaluation value.
- "mdtraj", "gromacs" and "cpptraj" are supported.
- selection1: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs is only supported)
- If you use mdtraj (
- selection2: str, required
- Selection string or name of index group for specified group in trajectories
- Depending on the
analyzer
, there are different ways to specifyselection
.- If you use mdtraj (
analyzer="mdtraj"
), theselection
should follow mdtraj's atom selection. e.g. "resid 5 to 100 and name CA" - If you use gromacs (
analyzer="gromacs"
), theselection
should follow Gromacs index group inindex.ndx
. (Gromacs is only supported)
- If you use mdtraj (
- d_threshold: float, required
- Threshold for switching from dissociation to association mode. If the maximum value of cv calculated while in dissociation mode exceeds this value, it switches to associtaion mode.
- frame_sel: int, required
- One of the conditions used to switch from association to dissociation.
- If the highest ranking frame in association mode is less than this value, it is judged that there has been no change from the previous cycle (i.e., it has come as close as it can). Also, at this time, we consider it to be within bound.
- bound_threshold: int, required
- One of the conditions used to switch from association to dissociation.
- When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0. When the number of bound cycles exceeds this value, the mode is switched from association to disassociation and the number of bound cycles is initialized to 0.
type = "a_d" # Evaluation type
skip_frame = 1 # How many frames to skip when ranking CVs
# if analyzer == "mdtraj"
analyzer = "mdtraj" # Trajectory tool used to calculate the evaluation type
selection1 = "resid 1" # Selection string for specified group in trajectories
selection2 = "resid 9" # Selection string for specified group in trajectories
d_threshold = 3.0 # threshold for dissociation to association
frame_sel = 5 # number of frames to be used for judging whether the system has converged
bound_threshold = 3 # threshold for association to dissociation
Template
click here
- type: str, required
- Evaluation type
- skip_frame: int, default=1
- Number of frames to skip when ranking CVs
- threshold: float, required
- CV threshold for determining to terminate a trial
Template type is a type that can be defined by the user. It is possible to include user-specific variables in input.toml, and these variables can be used in template type in which they are defined. For more information, click here.
type = "template"
threshold = "???"
# user defined variable
user-defined-variable1 = 123
user-defined-variable2 = "hoge"
...
hidden option (No need to specify)
click here
- cmd_gmx: str
- Gromacs command (ex. gmx, gmx_mpi)
- will be created from
cmd_serial
- top_mdtraj: str
- topology file for mdtraj
- will be created from
input.toml
- structure_extension: str
- Structure file extension
- will be created from
structure
Reference
- PaCS-Toolkit
The paper is under consideration. The information will be updated upon publication.
[1] Ikizawa, S, Hori, T., Wijana, T.N., Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. Submitted.
- Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD)
[2] Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. J. Chem. Phys. 139, 035103 (2013). https://doi.org/10.1063/1.4813023
- Dissociation PaCS-MD (dPaCS-MD)
[3] Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 14, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504
[4] Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. J. Phys. Chem. B , 123, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309
[5] Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. Biophysics and Physicobiology, 18, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037
- Application to protein domain motion
[6] Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. Structure, 27 , 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017
- Association and dissociation PaCS-MD (a/dPaCS-MD)
[7] Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. J. Chem. Theory Comput. 16, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203
- Edge expansion PaCS-MD (eePaCS-MD)
[8] Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. J. Chem. Phys. 152, 225101 (2020). https://doi.org/10.1063/5.0004654
[9] Takaba, K., Tran, D. P., & Kitao, A. Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. . J. Chem. Phys. 153, 179902 (2020). https://doi.org/10.1063/5.0032465
- rmsdPaCS-MD
[10] Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. Sci Rep 12, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2
- Recent applications
[12] Hata, H., Nishihara, Y., Nishiyama, M., Sowa, Y., Kawagishi, I., & Kitao, A. High pressure inhibits signaling protein binding to the flagellar motor and bacterial chemotaxis through enhanced hydration. Sci Rep 10, 2351 (2020). https://doi.org/10.1038/s41598-020-59172-3
[12] Sobeh, M. M., & Kitao, A. Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical Roles of Key Residues Elucidated by dPaCS-MD/MSM. J. Chem. Inf. Model. 62, 1294–1307 (2022). https://doi.org/10.1021/acs.jcim.1c01508
[13] Wijaya, T.N., & Kitao, A. Energetic and Kinetic Origins of CALB Interfacial Activation Revealed by PaCS-MD/MSM. J. Phys. Chem. B 2023, 127, 34, 7431–7441 (2023). https://doi.org/10.1021/acs.jpcb.3c02041
rmmol
- This command is used after executing
pacs mdrun
. - This command removes unnecessary atoms from the trajectory created by running PaCS-MD.
- This command reduces the file size and the burden on the computer capacity.
Caution
- This command is not reversible.
- This command operates on all trajectories of one trial at a time.
- If you want to apply it to more than one trial, you could easily achieve it by writing an iteration shell script.
Example
- The following example removes water molecules from the trajectory.
mdtraj
pacs rmmol mdtraj -t 1 -k "not water" -e .xtc -m trial001/cycle000/replica001/prd.gro
gromacs
pacs rmmol gmx -t 1 -k "not_water" -e .xtc -n index.ndx -g gmx --nojump
cpptraj
pacs rmmol cpptraj -t 1 -k "not water" -e .nc -p trial001/cycle000/replica001/prd.parm7
Arguments
mdtraj
usage: pacs rmmol mdtraj [-h] [-t] [-k] [-e] [-m]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-k, --keep_selection
(str):- atom selection to be retained in trajectory (e.g.
-k "not water"
)
- atom selection to be retained in trajectory (e.g.
-e, --trajectory_extension
(str):- trajectory extension (e.g.
-e .xtc
)
- trajectory extension (e.g.
-m, --top_mdtraj
(str):- topology file path for mdtraj analysis (e.g.
-m trial001/cycle000/replica001/prd.pdb
)
- topology file path for mdtraj analysis (e.g.
gromacs
usage: pacs rmmol gmx [-h] [-t] [-k] [-n] [-g] [-e]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-k, --keep_selection
(str):- index group to be retained in trajectory (e.g.
-k "not_water"
)
- index group to be retained in trajectory (e.g.
-e, --trajectory_extension
(str):- gromacs trajectory extension (e.g.
-e .xtc
)
- gromacs trajectory extension (e.g.
-n, --index_file
(str):- index file for gromacs (e.g.
-n index.ndx
)
- index file for gromacs (e.g.
-g, --cmd_gmx
(str):- gromacs command prefix (e.g.
-g gmx
)
- gromacs command prefix (e.g.
--nojump
(bool):- whether to execute
-pbc nojump
treatment for the trajectory (e.g.--nojump
) - If your PaCS-MD trajectory was generated with
nojump=true
, it is strongly strongly recommended to use this options as well - Also refer to the
mdrun/inputfile
page for more details.
- whether to execute
cpptraj
usage: pacs rmmol cpptraj [-h] [-t] [-k] [-e] [-p]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-k, --keep_selection
(str):- atom selection to be retained in trajectory (e.g.
-k :!WAT
)
- atom selection to be retained in trajectory (e.g.
-e, --trajectory_extension
(str):- trajectory extension (e.g.
-e .nc
)
- trajectory extension (e.g.
-p, --topology
(str):- topology file path for cpptraj analysis (e.g.
-p trial001/cycle000/replica001/prd.parm7
)
- topology file path for cpptraj analysis (e.g.
rmfile
- This command is used after executing
pacs mdrun
. - This command deletes backup files and other unnecessary files.
- This command reduces the number of files and the burden on the computer capacity.
Caution
- This command is not reversible.
Example
- The following example removes unnecessary files from the directory.
pacs rmfile -t 1 -s "gromacs"
Arguments
usage: pacs rmfile [-h] [-t] [-s]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-s, --simulator
(str):- simulator name (e.g.
-s "gromacs"
) - Only
"gromacs"
,"amber"
, and "namd
" are accepted.
- simulator name (e.g.
fit
- This command is used after executing
pacs mdrun
- This command makes new trajectories that fit to the reference structure.
- The newly created trajectories are placed in the same hierarchy as the original trajectories.
Caution
- Fitting the molecules by using the script could take a long time (around 1 hour for 1 trial)
Example
- In the following example, trajectories for all cycles and all replicas of trial1 are fitted with the protein, and then the output files are saved.
for single trajectory
pacs fit traj mdtraj -tf ./trial001/cycle001/replica001/prd.xtc -top ./inputs/input.gro -r ./inputs/input.gro -ts "backbone" -rs "backbone"
for single trial
pacs fit trial mdtraj -t 1 -top ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10
Arguments
for single trajectory
usage: pacs fit traj mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
-tf, --trj_file
(str):- file name of the trajectory to be fitted (e.g.
-tf prd.xtc
)
- file name of the trajectory to be fitted (e.g.
-top, --topology
(str):- topology file path for loading trajectory (e.g.
-top trial001/cycle000/replica001/prd.pdb
)
- topology file path for loading trajectory (e.g.
-r, --ref_structure
(str):- reference structure file path for fitting reference (e.g.
-r trial001/cycle000/replica001/prd.pdb
)
- reference structure file path for fitting reference (e.g.
-ts, --trj_selection
(str):- atom selection for fitting trajectory (e.g.
-ts "protein"
)
- atom selection for fitting trajectory (e.g.
-rs, --ref_selection
(str):- atom selection for fitting reference (e.g.
-rs "protein"
)
- atom selection for fitting reference (e.g.
-p, --parallel
(int):- number of parallel processes (e.g.
-p 10
) - default: 1
- number of parallel processes (e.g.
-o, --out
(str):- output file name (e.g.
-o prd_fit.xtc
) - default:
{trj_file}_fit.{ext}
- output file name (e.g.
for single trial
usage: pacs fit trial mdtraj [-h] [-t] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
-t, --trial
(int):- trial number (e.g.
-t 1
)
- trial number (e.g.
-tf, --trj_file
(str):- file name of the trajectory to be fitted (e.g.
-tf prd.xtc
)
- file name of the trajectory to be fitted (e.g.
-top, --topology
(str):- topology file path for loading trajectory (e.g.
-top trial001/cycle000/replica001/prd.pdb
)
- topology file path for loading trajectory (e.g.
-r, --ref_structure
(str):- reference structure file path for fitting reference (e.g.
-r trial001/cycle000/replica001/prd.pdb
)
- reference structure file path for fitting reference (e.g.
-ts, --trj_selection
(str):- atom selection for fitting trajectory (e.g.
-ts "protein"
)
- atom selection for fitting trajectory (e.g.
-rs, --ref_selection
(str):- atom selection for fitting reference (e.g.
-rs "protein"
)
- atom selection for fitting reference (e.g.
-p, --parallel
(int):- number of parallel processes (e.g.
-p 10
) - default: 1
- number of parallel processes (e.g.
-o, --out
(str):- output file name (e.g.
-o prd_fit.xtc
) - default:
{trj_file}_fit.{ext}
- output file name (e.g.
genrepresent
- This command is used after executing PaCS-MD.
- This command traces each cycle in reverse order, starting with the frame with the best CV in the last cycle, and generates one continuous trajectory file.
- The generated file is named
repr_complete
and located in thegenrepresent
directory. - File extension is determined by the user.
Caution
- This command operates on all of the trajectories in one trial at a time.
- If you want to apply this for the multiplie trials, you could simply write an iteration shell script to achieve it.
Example
- The following example generates a trajectory file.
mdtraj
pacs genrepresent mdtraj -t 1 -trj prd.xtc -top ./inputs/input.gro
gromacs
pacs genrepresent gmx -t 1 -trj prd.xtc -top ./inputs/input.gro -g gmx_mpi
cpptraj
pacs genrepresent cpptraj -t 1 -trj prd.nc -top inputs/input.prmtop
Arguments
mdtraj
usage: pacs genrepresent mdtraj [-h] [-t] [-trj] [-top]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-trj, --trajectory
(str):- trajectory file name (e.g.
-trj prd.xtc
)
- trajectory file name (e.g.
-top, --topology
(str):- topology file path (e.g.
-m trial001/cycle000/replica001/prd.pdb
)
- topology file path (e.g.
gromacs
usage: pacs genrepresent gmx [-h] [-t] [-trj] [-top] [-g]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-trj, --trajectory
(str):- trajectory file name (e.g.
-trj prd.xtc
)
- trajectory file name (e.g.
-top, --topology
(str):- topology file path (e.g.
-m trial001/cycle000/replica001/input.gro
)
- topology file path (e.g.
-g, --cmd_gmx
(str):- gromacs command prefix (e.g.
-g gmx_mpi
)
- gromacs command prefix (e.g.
cpptraj
usage: pacs genrepresent cpptraj [-h] [-t] [-trj] [-top]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-trj, --trajectory
(str):- trajectory file name (e.g.
-trj prd.nc
)
- trajectory file name (e.g.
-top, --topology
(str):- topology file path (e.g.
-m trial001/cycle000/replica001/input.parm7
)
- topology file path (e.g.
gencom
- This command is employed after executing PaCS-MD.
- The COMs of a selected atom group from a trajectory (generated by "gencom traj") or multiple trajectories in one trial of PaCS-MD (generated by "gencom trial") can be computed and outputted in pdb format.
- The COMs are stored as a VIRTUAL SITE in the pdb file.
- This command visualizes the migration paths of atoms and supports intuitive interpretation.
gencom traj
- This command is used after executing PaCS-MD.
- The COMs of a selected atom group from a trajectory file can be calculated and outputted in pdb format.
- This command visualizes the migration paths of atoms and supports intuitive interpretation.
Example
- The following example generates a pathway file
pathway.pdb
that contains the COMs of ligand from theprd.xtc
file.
pacs gencom traj mdtraj -trj ./trial001/cycle001/replica001/prd.xtc -top ./inputs/input.gro -s "resname ligand"
Arguments
usage: pacs gencom traj mdtraj [-h] [-trj] [-top] [-s] [-r] [-o]
-trj, --trajectory
(str):- trajectory file path (e.g.
-trj best_complete_fit.xtc
)
- trajectory file path (e.g.
-top, --top_mdtraj
(str):- topology file path for mdtraj (e.g.
-s ./trial001/cycle000/replica001/rmmol_top.pdb
)
- topology file path for mdtraj (e.g.
-s, --selection
(str):- atom selection for visualization (e.g.
-s "resname ligand"
)
- atom selection for visualization (e.g.
-r, --resid
(int):- Index of the visualized center of mass on the output pdb (e.g.
-r 1
) - default:
1
- Index of the visualized center of mass on the output pdb (e.g.
-o, --output
(str):- output file path (e.g.
-o pathway.pdb
) - default:
pathway.pdb
- output file path (e.g.
gencom trial
- This command is used after executing PaCS-MD.
- The COMs of a selected atom group from all of the trajectories within one trial of PaCS-MD can be calculated and outputted in pdb format.
- This command visualizes the migration paths of atoms and supports intuitive interpretation.
Example
- The following example generates a pathway file
pathway.pdb
that saves the COMs of ligand fromprd_fit.xtc
file in the trial 1.
pacsmd gencom trial mdtraj -t 1 -trj prd_fit.xtc -top ./inputs/input.gro -s "resname ligand"
Arguments
usage: pacs gencom trial mdtraj [-h] [-t] [-trj] [-top] [-s] [-r] [-o] [-sf]
-t, --trial
(int):- trial number without 0-fill when pacsmd was conducted (e.g.
-t 1
)
- trial number without 0-fill when pacsmd was conducted (e.g.
-trj, --trajectory
(str):- trajectory file name (e.g.
-trj prd_rmmol_fit.xtc
)
- trajectory file name (e.g.
-top, --top_mdtraj
(str):- topology file path for mdtraj (e.g.
-s ./trial001/cycle000/replica001/rmmol_top.pdb
)
- topology file path for mdtraj (e.g.
-s, --selection
(str):- atom selection for ligand (e.g.
-s "resname ligand"
)
- atom selection for ligand (e.g.
-r, --resid
(int):- Index of the visualized center of mass on the output pdb (e.g.
-r 1
) - default:
1
- Index of the visualized center of mass on the output pdb (e.g.
-o, --output
(str):- output file path (e.g.
-o pathway.pdb
) - default:
pathway.pdb
- output file path (e.g.
-sf, --stride_frame
(int):- stride to skip when extracting the com for visualization (e.g.
-sf 10
) - default:
10
- stride to skip when extracting the com for visualization (e.g.
genfeature
- This command should be executed after running
pacs mdrun
. - It generates feature data in
.npy
format, which is cconvenient for MSM analysis in Python.- Feature data files (e.g.,
t001c002r010.npy
) are stored in the directory specified with the-od
option. - Each
.npy
file has thenp.arry
in the shape as described in the table below.
- Feature data files (e.g.,
- This command supports parallel processing.
Currently implemented analysis tools and the shape of the output data in .npy
files.
feature | mdtraj | gmx | cpptraj | shape of .npy |
---|---|---|---|---|
comdist | o | x | x | (n_frames,) |
comvec | o | x | x | (n_frames, 3) |
rmsd | o | x | x | (n_frames,) |
xyz | o | x | x | (n_frames, n_atoms, 3) |
comdist
- Center of mass distance
- Calculate the distance between the centers of mass of
s1
ands2
.
Example
- The following example generates features about COM distance for MSM analysis
mdtraj
pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
Arguments
mdtraj
usage: pacs genfeature comdist mdtraj [-h] [-t] [-tf] [-top] [-od] [-p] [-s1] [-s2]
-t, --trial
(int):- trial number without 0-fill
-tf, --trj-file
(str):- trajectory file name (e.g. prd.xtc prd_rmmol.trr)
-top, --topology
(str):- topology file path (e.g. .inputs/input.gro)
-od, --outdir
(str):- output directory path
-p, --n_parallel
(int):- number of parallel
-s1, --selection1
(str):- mdtraj selection1 for calculating distance
-s2, --selection1
(str):- mdtraj selection2 for calculating distance
comvec
- Center of mass vector
- Calculate the vector between the centers of mass of
s1
ands2
- The vector is calculated as
s1
-s2
- The vector is calculated as
Example
- The following example generates features about COM vector for MSM analysis
mdtraj
pacs genfeature comvec mdtraj -t 1 -tf prd.xtc -top inputs/input.gro -ref inputs/input.gro -ft "name CA" -fr "name CA" -s1 "resid 1" -s2 "resid 9"
Arguments
mdtraj
usage: pacs genfeature comvec mdtraj [-h] [-t] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-s1] [-s2]
-t, --trial
(int):- trial number without 0-fill
-tf, --trj-file
(str):- trajectory file name (e.g. prd.xtc prd_rmmol.trr)
-top, --topology
(str):- topology file path (e.g. .inputs/input.gro)
-od, --outdir
(str):- output directory path
-p, --n_parallel
(int):- number of parallel
-ref, --reference
(str):- reference file path for fitting
-ft, --fit-trj
(str):- fitting selection for mdtraj in trajectory
-fr, --fit-ref
(str):- fitting selection for mdtraj in reference
-s1, --selection1
(str):- mdtraj selection1 for calculating distance
-s2, --selection1
(str):- mdtraj selection2 for calculating distance
RMSD
- Root Mean Square Deviation
- Calculate the RMSD relative to the structure specified in
ref
Example
- The following example generates features about RMSD for MSM analysis
mdtraj
pacs genfeature rmsd mdtraj -t 1 -tf prd.xtc -top ./inputs/input.gro -ref ./inputs/input.gro -ft "protein" -fr "protein" -ct "name CA" -cr "name CA"
Arguments
mdtraj
usage: pacs genfeature rmsd mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
-t, --trial
(int):- trial number without 0-fill
-tf, --trj-file
(str):- trajectory file name (ex. prd.xtc prd_rmmol.trr)
-top, --topology
(str):- topology file path (ex. .inputs/input.gro)
-od, --outdir
(str):- output directory path
-p, --n_parallel
(int):- number of parallel
-ref, --reference
(str):- reference file path for fitting/RMSD calculation
-ft, --fit-trj
(str):- mdtraj selection for fitting in trajectory
-fr, --fit-ref
(str):- mdtraj selection for fitting in reference
-ct, --cal-trj
(str):- mdtraj selection for calculating RMSD in trajectory
-cr, --cal-ref
(str):- mdtraj selection for calculating RMSD in reference
XYZ
- (x, y, z), 3 dimension coordinates
Example
- The following example generates features about (x, y, z) for MSM analysis
mdtraj
pacs genfeature xyz mdtraj -t 1 -tf prd.xtc -top ./trial001/cycle000/replica001/input.gro -s "residue 1 to 5"
Arguments
mdtraj
usage: pacs genfeature xyz mdtraj [-h] [-t] [-tf] [-top] [-od OUTDIR] [-p N_PARALLEL] [-s]
-t, --trial
(int):- trial number without 0-fill
-tf, --trj-file
(str):- trajectory file name (ex. prd.xtc prd_rmmol.trr)
top, --topology
(str):- topology file path (ex. .inputs/input.gro)
-od, --outdir
(str):- output directory path
-p, --n_parallel
(int):- number of parallel
-s, --selection
(str):- mdtraj selection extracted in trajectory
FAQ
Content
- Q: How can I check pacsmd progress ?
- Q: What should I do if there are not evaluation types that I want to use.
- Q: dissociation PaCS-MD takes a lot of time. Some simulations could not completed.
- Q: Minor dissociation pathway was obtained. Is is OK ?
- Q: I got errors during running PaCS-MD
Q: How can I check pacsmd progress ?
A. You check it by seeing pacsmd standard output or error.
INFO 2023-12-16 23:22:04 [ parser.py - parse ] input_ims.toml was read successfully
INFO 2023-12-16 23:22:04 [ __main__.py - <module> ] pacsmd version 0.1.1
INFO 2023-12-16 23:22:04 [ __main__.py - main ] PaCS-MD starts
INFO 2023-12-16 23:22:04 [ __main__.py - prepare_md ] MDsettings(..)
INFO 2023-12-16 23:22:04 [ __main__.py - pacs_md ] cycle000 starts
INFO 2023-12-16 23:23:15 [superSimulator.py - record_finished ] replica001 done
INFO 2023-12-16 23:23:18 [ superAnalyzer.py - analyze ] The top ranking CV is replica 1 frame 32 cv 1.864649296784787
INFO 2023-12-16 23:23:19 [ rmmol.py - make_top_gmx ] topology file rmmol_top.pdb has been created in ./trial001/cycle000/replica001
INFO 2023-12-16 23:23:19 [ __main__.py - pacs_md ] cycle000 done
INFO 2023-12-16 23:23:44 [ Cycle.py - export ] export to cycle001 is completed
INFO 2023-12-16 23:23:46 [ rmmol.py - rmmol_gmx ] trajectory file in cycle000 have been reduced
...
- But sometimes you cannot check it because some supercomputers do not generate log files. If so, you can check it by seeing
trialXXX/cycleXXX/replicaXXX/summary/progress.log
. Here is a sample command for you to check quickly.
head -n 1 trial001/cycle*/summary/cv_ranked.log | grep frame | awk '{print NR "," $6}'
And then you can get the following results. First column denotes the cycle number and second column represents CV(for example, COM distance if type == "dissociation").
1,1.864649296784787
2,1.9111852343506632
3,1.9179890510636395
4,1.9223202126596912
5,1.915256901828055
6,1.9316632729334582
7,1.9657408272709807
8,1.9565950015268871
9,1.951968749749852
10,1.9565426138983022
11,1.9765695029520212
12,2.015255318811986
13,2.047009037596073
14,2.0903023704717936
15,2.126198720722031
16,2.1582448887927432
17,2.2197207031516375
18,2.232262081387398
19,2.2752254833312677
20,2.3406157309562796
21,2.372434614483611
22,2.4393667210979166
23,2.480949213506798
24,2.575250861566694
25,2.579432883406738
Q: What should I do if there are not evaluation types that I want to use.
A: Some evaluation types are still in progress. If you need you could also develop by yourself.
- While PaCS-MD offers many evaluation types, we could not incorporate all of them. You're welcome to utilize our template as a reference and develop your own set of evaluation types. template page.
Q: dissociation PaCS-MD takes a lot of time. Some simulations could not completed.
A: Please check COM of selected molecules.
-
Sometimes wrong selections might lead to wrong dissociation pathway and free energy.
-
But there are some examples where the selections are actually correct but not dissociated based on the initial structure.
-
You can check COM quickly by employing the following command.
$ pacs gencom traj mdtraj -s init.gro -t init.gro -ls "resid 168 || resid 253" -o com_of_r168_r253.pdb
$ pacs gencom traj mdtraj -s init.gro -t init.gro -ls "resname Lig" -o com_of_ligand.pdb
$ vmd -m init.gro com_of_r168_r253.pdb com_of_ligand.pdb
Q: Minor dissociation pathway was obtained. Is is OK ?
A: It really depends on cases but you need to check carefully by yourself.
- Minor pathway may actually emerge, but it could stem from incorrect selections or initial structures, etc. Therefore it's crucial to assess their validity before proceeding with your simulation.
Q: I got errors during running PaCS-MD
A: Please read the error messages carefully.
-
Since PaCS-MD relies on various other libraries. many issues may arise due to incorrect settings. Before reporting your concerns to Github, please ensure whether the problem genuinely lies with PaCS-MD or not.
-
Additionally, since PaCS-MD is often employed on supercomputers, certain issues might be attributed to the supercomputers themselves. For example, PaCS-MD may encounter difficulties running on supercomputers, or errors might arise that are unrelated to the associated libraries.