Usage¶
Installing HyperImmunISE and Dependencies¶
To use HyperImmunISE, first clone the repository from GitHub (https://github.com/Jiwon-Lee-Lab/HyperImmunISE.git):
$ git clone https://github.com/Jiwon-Lee-Lab/HyperImmunISE.git
- HyperImmunISE requires Python 3 (https://www.python.org/) and the following dependencies:
_libgcc_mutex=0.1=conda_forge
_openmp_mutex=4.5=2_kmp_llvm
appdirs=1.4.4=pyhd3eb1b0_0
biopython=1.78=py310h7f8727e_0
blas=1.0=mkl
bzip2=1.0.8=h7b6447c_0
ca-certificates=2023.08.22=h06a4308_0
certifi=2023.11.17=py310h06a4308_0
cudatoolkit=11.8.0=h6a678d5_0
icu=73.1=h6a678d5_0
intel-openmp=2021.4.0=h06a4308_3561
ld_impl_linux-64=2.38=h1181459_1
libboost=1.82.0=h109eef0_2
libboost-python=1.82.0=py310hcb52e73_6
libffi=3.3=he6710b0_2
libgcc-ng=13.2.0=h807b86a_3
libstdcxx-ng=13.2.0=h7e041cc_3
libuuid=1.0.3=h7f8727e_2
llvm-openmp=14.0.6=h9e868ea_0
lz4-c=1.9.4=h6a678d5_0
mako=1.2.3=py310h06a4308_0
markupsafe=2.1.1=py310h7f8727e_0
mkl=2021.4.0=h06a4308_640
mkl-service=2.4.0=py310h7f8727e_0
mkl_fft=1.3.1=py310hd6ae3a3_0
mkl_random=1.2.2=py310h00e6091_0
ncurses=6.3=h7f8727e_2
numpy=1.24.3=py310hd5efca6_0
numpy-base=1.24.3=py310h8e6c178_0
openssl=1.1.1w=h7f8727e_0
pip=21.2.4=py310h06a4308_0
platformdirs=3.10.0=py310h06a4308_0
pycuda=2023.1=py310hd456308_0
pyrosetta=2023.33+release.9c16e13=py310_0
python=3.10.4=h12debd9_0
python_abi=3.10=2_cp310
pytools=2023.1.1=pyhd8ed1ab_0
readline=8.1.2=h7f8727e_1
setuptools=61.2.0=py310h06a4308_0
six=1.16.0=pyhd3eb1b0_1
sqlite=3.38.3=hc218d9a_0
tk=8.6.11=h1ccaba5_1
typing_extensions=4.7.1=py310h06a4308_0
tzdata=2022a=hda174b7_0
wheel=0.37.1=pyhd3eb1b0_0
xz=5.4.2=h5eee18b_0
zlib=1.2.13=h5eee18b_0
zstd=1.5.5=hc292b87_0
- pip:
freesasa==2.1.0
The provided “hyperimmunise.yml” file can also be used to install these packages directly. Installing them into a dedicated conda environment is recommended.
Jwalk, a tool for calculating solvent accessible surface distances (Bullock, Joshua Matthew Allen, et al. Molecular & Cellular Proteomics, 2016), can be downloaded as part of XLM-Tools (https://github.com/Topf-Lab/XLM-Tools/).
NetNGlyc, a tool that predicts whether a glycosylation sequon will carry a glycan (Gupta R, Brunak S. Pac Symp Biocomput., 2002), can be downloaded from https://services.healthtech.dtu.dk/services/NetNGlyc-1.0/. If the primary link is unavailable, use https://services.healthtech.dtu.dk/cgi-bin/sw_request?software=netNglyc&version=1.0_copy&packageversion=1.0d&platform=Linux instead. NetNGlyc may need to be patched to run on modern operating systems; a guide to a hotfix is available at https://squanderingti.me/blog/2020/10/28/extreme-debugging.html
Amber, a molecular dynamics simulation package (R. Salomon-Ferrer, D.A. Case, R.C. Walker. WIREs Comput. Mol. Sci., 2013), can be downloaded from https://ambermd.org/. The Amber simulation settings can be edited to suit user needs; the default workflow described here assumes that Amber is installed with GPU support and that a GPU is available for simulations.
Necessary System Architecture¶
This script is intended to be run on a computing cluster and is currently written to accommodate a CentOS Linux architecture with both CPU and GPU nodes. Other architectures may be compatible but have not been tested with this program. It is highly recommended that this software be run on a high-performance computing cluster, as the process is memory intensive.
Designing Constructs with HyperImmunISE¶
Constructs are designed by running the ‘main.py’ script with the following flags:
-netnglyc_loc <file path to the NetNGlyc installation>
-jwalk_loc <file path to the XLM-Tools folder>
-chain <chain in PDB file to be used for analysis, contained in quotations>
-size <integer with the targeted number of glycans to add>
-number <integer with the number of designs to create>
-priority_sites <residue numbers of sites which must contain a glycan>
-uncovered_sites <residue numbers of sites which should remain unblocked by glycans, e.g., “99-105,221-228”>
-model_glycans <whether Rosetta should model glycans, “Y” or “N”>
-pdb_list <name of PDB file, no extension (.pdb) necessary>
-path <path to the PDB construct>
-destination <path to the output folder>
Residue numbers for “priority_sites” and “uncovered_sites” can be entered as a single range (e.g., “99-105”) or as a comma-separated list of ranges and individual residues (e.g., “99-105,221-228,330”). Residue numbers follow the numbering in the PDB file.
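The range syntax above can be illustrated with a short Python sketch. The helper name parse_residue_spec is hypothetical and is not taken from the HyperImmunISE source; it only shows how such a string could expand into individual residue numbers, assuming positive integer residue numbers without insertion codes.

```python
from __future__ import annotations


def parse_residue_spec(spec: str) -> list[int]:
    """Expand a string such as "99-105,221-228,330" into sorted residue numbers.

    Hypothetical helper for illustration; HyperImmunISE may parse its flags differently.
    """
    residues: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate an empty spec ("") or a trailing comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range start exceeds range end: {part!r}")
            residues.update(range(start, end + 1))  # ranges are inclusive
        else:
            residues.add(int(part))
    return sorted(residues)


sites = parse_residue_spec("99-105,221-228,330")
print(len(sites), "residues:", sites)
```

Under these assumptions, an empty string (as passed to -priority_sites in the example input) yields an empty list.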
Output designs are written to the destination folder as .pdb files, and scoring evaluations are printed to the console.
An example input is provided below:
$ python main.py -netnglyc_loc "/NetNGlyc/netNglyc-1.0" -jwalk_loc "/XLM-Tools-master" -chain "K" -size 10 -number 10 -priority_sites "" -uncovered_sites "99-105,221-228" -model_glycans "Y" -pdb_list "mono_head" -path "/mono_head.pdb" -destination "/"
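When several designs are launched from a script, the same command can be assembled in Python. The sketch below is an illustration only: the function build_main_command is a hypothetical name, and the flag values are copied from the example input above. Building the command as a list avoids shell quoting problems with empty values such as -priority_sites "".

```python
from __future__ import annotations

import shlex


def build_main_command(options: dict[str, object], script: str = "main.py") -> list[str]:
    """Return the argument list for one run; keys are flag names without the leading dash."""
    command = ["python", script]
    for flag, value in options.items():
        command += [f"-{flag}", str(value)]
    return command


# Values copied from the example input; adjust the paths for your installation.
example_options = {
    "netnglyc_loc": "/NetNGlyc/netNglyc-1.0",
    "jwalk_loc": "/XLM-Tools-master",
    "chain": "K",
    "size": 10,
    "number": 10,
    "priority_sites": "",
    "uncovered_sites": "99-105,221-228",
    "model_glycans": "Y",
    "pdb_list": "mono_head",
    "path": "/mono_head.pdb",
    "destination": "/",
}

command = build_main_command(example_options)
print(shlex.join(command))  # shell-safe rendering of the command
# To launch the run: subprocess.run(command, check=True)
```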
Evaluating HyperImmunISE Constructs in Amber¶
Once Amber is installed, navigate to its installation directory and source the environment script:
$ source amber.sh
Make a folder called Simulations:
$ mkdir Simulations/
$ cd Simulations/
Copy your desired .pdb structure to this folder.
Navigate to the AmberTools/src/ folder and run cpptraj:
$ cpptraj
$ cd Simulations/
$ parm <path_to_file>/<filename>.pdb
$ loadcrd <path_to_file>/<filename>.pdb MyCrd
$ prepareforleap crdset MyCrd name Final out <path_to_file>/leap.<filename>.in leapunitname mol pdbout <path_to_file>/<filename>.cpptraj.pdb nowat noh keepaltloc highestocc
$ quit
Edit the resulting .in file to have the following:
At the top:
source leaprc.protein.ff19SB
source leaprc.GLYCAM_06j-1
source leaprc.water.opc3
loadAmberParams frcmod.ionslm_126_opc
and at the bottom:
solvateOct mol OPC3BOX 8.0
addIons mol Cl- 0
addIons mol Na+ 0
saveAmberParm mol <filename>.prmtop <filename>.inpcrd
quit
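The edit described above can also be scripted. The following Python sketch prepends the header lines and appends the footer lines listed above to the text of the cpptraj-generated LEaP input. The function names are hypothetical, and the sketch assumes the generated file does not already contain its own source or quit lines; check the file by hand before relying on it.

```python
from __future__ import annotations

# Header and footer lines are copied from the listing above.
LEAP_HEADER = [
    "source leaprc.protein.ff19SB",
    "source leaprc.GLYCAM_06j-1",
    "source leaprc.water.opc3",
    "loadAmberParams frcmod.ionslm_126_opc",
]


def leap_footer(filename: str, unit: str = "mol") -> list[str]:
    """Solvate, neutralize, and save the unit named by `leapunitname` in cpptraj."""
    return [
        f"solvateOct {unit} OPC3BOX 8.0",
        f"addIons {unit} Cl- 0",
        f"addIons {unit} Na+ 0",
        f"saveAmberParm {unit} {filename}.prmtop {filename}.inpcrd",
        "quit",
    ]


def wrap_leap_input(body: str, filename: str) -> str:
    """Return the LEaP input text with the header and footer added around `body`."""
    lines = LEAP_HEADER + body.strip().splitlines() + leap_footer(filename)
    return "\n".join(lines) + "\n"


# Placeholder body for demonstration; in practice, read leap.<filename>.in from disk.
demo_body = "mol = loadpdb design.cpptraj.pdb\n"
print(wrap_leap_input(demo_body, "design"))
```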
Navigate to the Amber bin folder and then run tleap and parmed:
$ tleap -f <path_to_file>/leap.<filename>.in
$ parmed
$ parm <filename>.prmtop
$ HMassRepartition
$ outparm <filename>_hmr.prmtop
$ quit
Move the resulting .prmtop, .inpcrd, and _hmr.prmtop files to the Simulations folder and navigate there.
Make the following files:
File - minimize.in
Minimize
&cntrl
imin=1,
ntx=1,
ntxo=1,
irest=0,
maxcyc=5000,
ncyc=2500,
ntpr=250,
ntwx=0,
cut=8.0,
/
File - heat.in
Heat
&cntrl
imin=0,
ntx=1,
ntxo=1,
irest=0,
nstlim=11000,
dt=0.004,
ntf=2,
ntc=2,
tempi=200.0,
temp0=310.0,
ntpr=100,
ntwx=100,
ntwr=1000,
cut=8.0,
ntb=1,
ntp=0,
ntt=2,
vrand=1000,
ig=-1,
/
&wt type='TEMP0', istep1=0, istep2=10000, value1=200.0,
value2=310.0 /
&wt type='TEMP0', istep1=10001, istep2=11000, value1=310.0,
value2=310.0 /
&wt type='END' /
File - equilibrate.in
Equilibrate w/ HMR
&cntrl
imin=0,
ntx=1,
ntxo=1,
irest=0,
nstlim=25000000,
dt=0.004,
ntf=2,
ntc=2,
tempi=310.0,
temp0=310.0,
ntpr=100000,
ntwx=50000,
ntwr=1000000,
cut=8.0,
ntb=2,
ntp=1,
ntt=2,
vrand=10000,
ig=-1,
/
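As a quick check on the input files above, the simulated time and the number of trajectory frames follow directly from nstlim, dt (in picoseconds), and ntwx. The Python sketch below works through that arithmetic with the values from heat.in and equilibrate.in; the function names are illustrative only. The 0.004 ps time step is the reason the hydrogen-mass-repartitioned topology is used in these runs.

```python
def simulated_time_ps(nstlim: int, dt_ps: float) -> float:
    """Total simulated time in picoseconds: number of steps times the time step."""
    return nstlim * dt_ps


def frames_written(nstlim: int, ntwx: int) -> int:
    """Number of coordinate frames written when one frame is saved every ntwx steps."""
    return nstlim // ntwx if ntwx > 0 else 0


# Values taken from heat.in and equilibrate.in.
heat_ps = simulated_time_ps(11000, 0.004)
heat_frames = frames_written(11000, 100)
equil_ns = simulated_time_ps(25_000_000, 0.004) / 1000.0
equil_frames = frames_written(25_000_000, 50_000)

print(f"heat: {heat_ps:.0f} ps, {heat_frames} frames")
print(f"equilibrate: {equil_ns:.0f} ns, {equil_frames} frames")
```

With these settings the heating stage covers 44 ps and the equilibration stage covers 100 ns, saved as 500 frames.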
The following commands are best run on a computing cluster:
$ module load cuda
$ cd <path_to_amber>
$ source amber.sh
$ cd Simulations
$ ../bin/sander -O -i minimize.in -o minimize.out -p <filename>_hmr.prmtop -c <filename>.inpcrd -r min.rst7
$ ../bin/pmemd.cuda -O -i heat.in -o heat.out -p <filename>_hmr.prmtop -c min.rst7 -r heat.rst7 -x heat.nc
$ ../bin/pmemd.cuda -O -i equilibrate.in -o equilibrate.out -p <filename>_hmr.prmtop -c heat.rst7 -r equilibrate.rst7 -x equilibrate.nc
GPU-based MD simulation¶
MD simulations can be run on GPU using pmemd.cuda.
For HPC or server-based workflows, enter a GPU node first and verify that the GPU is properly available by running nvidia-smi. This check should be done before launching the simulation.
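The check described above can be wrapped in a small Python sketch for use at the start of a job script. The function names are hypothetical. The sketch assumes that nvidia-smi is on the PATH of a GPU node and that pmemd.cuda is on the PATH after amber.sh has been sourced; if your site exposes them differently, adjust the names.

```python
from __future__ import annotations

import shutil
import subprocess


def find_gpu_tools(names: tuple[str, ...] = ("nvidia-smi", "pmemd.cuda")) -> dict[str, str | None]:
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def gpu_is_visible() -> bool:
    """Return True if nvidia-smi exists and exits successfully on this node."""
    path = shutil.which("nvidia-smi")
    if path is None:
        return False
    try:
        result = subprocess.run([path], capture_output=True, text=True)
    except OSError:
        return False
    return result.returncode == 0


tools = find_gpu_tools()
for name, path in tools.items():
    print(f"{name}: {path or 'not found on PATH'}")
print("GPU visible:", gpu_is_visible())
```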
In some cases, pmemd.cuda is not included in the installed Amber version; this is common with AmberTools 25. If so, load a CUDA version earlier than 13, download pmemd24.tar.bz2 from the Amber website (https://ambermd.org/), build it locally, and use the compiled executable for GPU simulations.
Once the simulation has finished, activate a conda environment that has mdtraj installed (install it with pip or conda if needed) and run the following Python code:
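import os
import mdtraj
import numpy as np
os.chdir('Simulations')
# Load the equilibration trajectory together with the HMR topology
mtraj = mdtraj.load("equilibrate.nc", top="<filename>_hmr.prmtop")
# Strip the solvent before computing surface areas
mtraj = mtraj.remove_solvent()
mtraj.save_pdb("<filename>_nosolvent.PDB")
# Per-residue solvent accessible surface area for every frame (probe_radius is in nm)
sr = mdtraj.shrake_rupley(mtraj, probe_radius=1, mode='residue')
# Average over frames and write one value per residue
avg_sr = np.mean(sr, 0)
np.savetxt("avg_sr.csv", avg_sr, delimiter=",")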
The corresponding degree of blocking will be shown in the “avg_sr.csv” file and the numbering will follow the PDB numbering on your input design.