An Overview of the Tools
One of the strengths of iVirus (thanks to its underlying CyVerse cyberinfrastructure) is its focus on bringing bioinformatic tools to the viral ecology community. Here are a few examples of using the Apps available through iVirus/CyVerse to process data.
A few quick notes:
- Guides are not intended to explain the biology behind the tools or how the tools function.
- Where possible, Apps have links to their documentation on CyVerse as well as their citations (or original home pages).
- In some cases, many Apps are available to solve a particular problem. Guides will highlight only one or two of them.
- These guides assume you’ve created a CyVerse account and can access it. Check out the getting started guide for assistance.
Several “use cases” are available at protocols.io. For nearly all of these use cases, we’ll use actual reads from Ocean Sampling Day (2014) as a basis and process them using CyVerse. Some use cases take the user from raw read files through assembly to identifying viral sequences and preliminary analysis. Others tackle ways of analyzing a viral metagenome, either reads or contigs, using traditional and non-traditional approaches. As a reminder, all of these protocols are on protocols.io, and the versions there should be considered the most up to date.
All example files can be found within the CyVerse Data Store. To find these files, log in to the Discovery Environment. Under “Data”, go to Community Data –> iVirus –> ExampleData. Alternatively, you can copy and paste the following into the “Viewing” bar under the data browser: /iplant/home/shared/iVirus/ExampleData/
All tools have “Input” and “Output” directories, so the user has both valid input data and the expected output data.
Processing a Viral Metagenome
Description: A long-standing challenge in viral metagenomics is actually processing a viral metagenome (we’re not talking about the science side!). For many reasons enumerated elsewhere, processing these datasets requires skilled bioinformaticians and computational resources not available to many researchers/labs. iVirus seeks to tackle this head-on.
Protocol “Collection”: protocols.io (collections are just that – collections of protocols)
- Cleaning up sequencing reads using Trimmomatic
- Assembling QC’d reads using SPAdes
- Identifying putative viral sequences using VirSorter
- Preparing data for vContact
- Running vContact and Visualization in Cytoscape
Mapping Metagenomic Reads to References
Description: One of the most common procedures for analyzing viral metagenomic data is to map a dataset’s reads (or reads from another dataset) against a set of references, often those from the read assembly. For example, to find out how well represented the viruses in NCBI’s Viral Reference Sequences (ViralRefSeq) are in ocean viromes, one could map reads from many ocean viral metagenomes against ViralRefSeq. This is generally done with Bowtie2 or BWA: select a reference set of sequences, then provide paired or unpaired reads to the aligner. The results must then be processed and filtered to generate coverage tables. Setting up multiple read files (10 paired metagenomes = 10 alignment runs) and then processing those read files can be challenging, not to mention the computational resources required.
- Mapping reads from multiple metagenomes to a set of references
- Filtering mapped reads and generating coverage tables
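To illustrate the bookkeeping behind “10 paired metagenomes = 10 alignment runs”, here is a minimal Python sketch that builds one Bowtie2 command per paired metagenome. It is not how the iVirus Apps are implemented. The sample names, file names, and the index prefix refs_index are hypothetical, and the sketch assumes the reference index has already been built with bowtie2-build. The options used (-x, -1, -2, -S, -p) are standard Bowtie2 options; the commands are only printed, not executed.

```python
def build_bowtie2_commands(index_prefix, paired_reads, threads=4):
    """Return one bowtie2 argument list per paired metagenome.

    index_prefix: prefix of an index built beforehand with bowtie2-build
    paired_reads: dict mapping sample name -> (forward_fastq, reverse_fastq)
    """
    commands = []
    for sample, (forward, reverse) in sorted(paired_reads.items()):
        commands.append([
            "bowtie2",
            "-p", str(threads),      # number of alignment threads
            "-x", index_prefix,      # reference index prefix
            "-1", forward,           # forward (mate 1) reads
            "-2", reverse,           # reverse (mate 2) reads
            "-S", f"{sample}.sam",   # SAM output, one file per sample
        ])
    return commands


# Hypothetical input: 10 paired metagenomes named sample01 ... sample10.
paired_reads = {
    f"sample{i:02d}": (f"sample{i:02d}_R1.fastq", f"sample{i:02d}_R2.fastq")
    for i in range(1, 11)
}

commands = build_bowtie2_commands("refs_index", paired_reads)
for command in commands:
    print(" ".join(command))
```

Each resulting SAM file would still need to be filtered and summarized into a coverage table, which is the step the second protocol covers.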
Before processing any data, users will need to upload their data to CyVerse’s Data Store. The Data Store is built on iRODS, an open source data management system. Data can be uploaded directly through the Discovery Environment’s (DE) upload menu (this is limited to 2 GB per file) or through one of the iRODS clients (click here for a list of available offerings). The easiest way to upload files securely and quickly is by using Cyberduck. Here we’ll assume you’ve installed Cyberduck and are connecting to the Data Store (a complete guide is available here):
user: your CyVerse username
password: your CyVerse password
Once you’ve logged in, you should be in your home folder. Drag and drop your read files into your home directory.
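Before uploading, it can help to check which files exceed the 2 GB per-file limit of the DE upload menu. The following Python sketch sorts files by size into the two upload routes described above. The file names and sizes are hypothetical. Reading “2 GB” as 2 × 1024³ bytes, and treating a file of exactly that size as allowed, are both assumptions, since the exact limit is not specified here.

```python
# Assumption: the DE upload limit of "2 GB" is interpreted as 2 * 1024**3 bytes.
DE_UPLOAD_LIMIT_BYTES = 2 * 1024**3


def choose_upload_route(file_sizes, limit=DE_UPLOAD_LIMIT_BYTES):
    """Map each file name to a suggested upload route based on its size in bytes."""
    routes = {}
    for name, size in file_sizes.items():
        if size <= limit:
            routes[name] = "DE upload menu or iRODS client"
        else:
            routes[name] = "iRODS client (e.g. Cyberduck)"
    return routes


# Hypothetical read files and their sizes in bytes.
file_sizes = {
    "sample01_R1.fastq.gz": 850 * 1024**2,   # about 850 MB
    "sample01_R2.fastq.gz": 5 * 1024**3,     # about 5 GB
}

routes = choose_upload_route(file_sizes)
for name, route in routes.items():
    print(f"{name}: {route}")
```

For real files, the sizes could be read with os.path.getsize instead of being typed in by hand.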