Request account
GATK 4 can be installed through Conda with:
$ conda install -c bioconda gatk4
GATK 3 cannot be installed this easily due to licensing restrictions. Instead, you must download a licensed copy of GATK from https://software.broadinstitute.org/gatk/download/archive, for example:
[fe-open-01]$ conda activate myproject
(myproject) [fe-open-01]$ conda install -c bioconda gatk
(myproject) [fe-open-01]$ wget 'https://software.broadinstitute.org/gatk/download/auth?package=GATK-archive&version=3.8-1-0-gf15c1c3ef' -O GenomeAnalysisTK-3.8-1-0-gf15c1c3ef.tar.bz2
(myproject) [fe-open-01]$ gatk3-register GenomeAnalysisTK-3.8-1-0-gf15c1c3ef.tar.bz2
You can now call GATK with the gatk3
command:
(myproject) [fe-open-01]$ gatk3 --help
Different Python versions can be installed through the Conda package manager. For example, for a project that needs Python 2.7, create an environment like this:
$ conda activate
$ conda create -n myproject python=2.7
$ conda activate myproject
$ python --version
If you need Python 3.6 for another project, just create another environment for that project:
$ conda deactivate # exit the myproject environment
$ conda create -n mynewproject python=3.6
$ conda activate mynewproject
$ python --version
Install the jupyter
package in your environment.
One way to run a Jupyter Notebook on the cluster is to setup an SSH tunnel to the Jupyter instance.
Start an interactive job. Login to the cluster and start an interactive job where the notebook will run.
[local ~]$ ssh <user>@login.genome.au.dk
[me@genomedk ~]$ srun --pty bash
srun: job 3597082 queued and waiting for resources
srun: job 3597082 has been allocated resources
[me@node ~]$
Setup SSH tunnel. Back on your local computer open a second terminal to setup the port-forwarding from the computing node to your computer.
[local ~]$ ssh -L<UID>:<compute node>:<UID> <user>@login.genome.au.dk
You will need to replace <UID>
with your user ID on the cluster, <compute node>
with the name of the compute node you have your job on, and <user>
with your username on the cluster. You can easily get those values by running following commands on your compute node inside the interactive job you started in the previous step.
[me@node ~]$ echo $UID
1234
[me@node ~]$ hostname -s
node
[me@node ~]$ echo $USER
me
Resulting in a command that would look like this:
[local ~]$ ssh -L1234:node:1234 me@login.genome.au.dk
Start the notebook. Back on the computing node start a Jupyter notebook. For this you may have to first unset the environmental variable XDG_RUNTIME_DIR
(this could also be included in ~/.bashrc
).
[me@node ~]$ unset XDG_RUNTIME_DIR
[me@node ~]$ conda activate <jupyter-env>
[me@node ~]$ jupyter-notebook --no-browser --port=$UID --ip=0.0.0.0
Run the notebook. Back on your local computer start a web browser and paste the URL from above. But replace the part in parenthesis with localhost to get:
http://localhost:<UID>/?token=....
Cleanup. When finished, remember to log out from both sessions.
RStudio is available on the cluster as a graphical application, which can be run on both compute nodes and the frontend node. Bare in mind, the frontend node must not be used for computation or analysis. RStudio needs X-forwarding to be enabled.
When logged in, you must either activate the environment where RStudio is installed or install it into an environment yourself (see Installing and using software):
[fe-open-01]$ conda install -n my-project rstudio-desktop
[fe-open-01]$ conda activate my-project
(my-project) [fe-open-01]$ rstudio
To run an analysis or computations in RStudio you will need to run RStudio in an interactive job on a compute node.
[fe-open-01]$ srun --mem=4g -c 1 --time=10:0:0 --pty bash
srun: job 3597082 queued and waiting for resources
srun: job 3597082 has been allocated resources
[s03n11]$ conda activate my-project
(my-project) [s03n11]$ rstudio
RStudio is automatically terminated if it allocates more than the reserved 4GB, the 10 hours expires or the connection is lost. So remember to save your work!
TeXLive is available on GenomeDK in the form of TinyTeX, which is a stripped down version of TeXLive. See TinyTex for more details.
The Conda provided package is for CLI or script usage, the R integration has not been tested and should probably be done using the guide described on the TinyTeX home page.
To install TinyTeX with Conda in a new environment:
[fe-open-01]$ conda create <name of project> -c genomedk tinytex
or if you have an existing environment where you want TinyTeX installed:
[fe-open-01]$ conda activate <existing project>
[fe-open-01]$ conda install -c genomedk tinytex
Compiling documents is done using the normal TexLive commands, i.e.:
[fe-open-01]$ pdflatex test.tex
To install LaTeX packages from CTAN:
[fe-open-01]$ tlmgr install <package>
Search for packages using tlmgr:
[fe-open-01]$ tlmgr search <package>