public class QiimeClassifier extends ClassifierModuleImpl
Modifier and Type | Field and Description |
---|---|
protected static String |
ALPHA_DIV_NULL_VALUE
Value output by "alpha_diversity.py" for null values: "N/A"
|
protected static String |
ALPHA_DIVERSITY_TABLE
File produced by QIIME "alpha_diversity.py" script: "alphaDiversity.txt"
|
protected static String |
COMBINED_FNA
Multiplexed fasta file produced by QIIME "add_qiime_labels.py" script: "combined_seqs.fna"
|
protected static String |
EXE_VSEARCH
Config property for vsearch executable used for chimera detection: "exe.vsearch" |
protected static String |
EXE_VSEARCH_PARAMS
Config property for "exe.vsearch" parameters (such as alternate reference database
path): "exe.vsearchParams" |
protected static String |
OTU_SUMMARY_FILE
File produced by QIIME "biom summarize-table" script: "otuSummary.txt"
|
protected static String |
OTU_TABLE
File produced by OTU picking scripts holding read taxonomy assignments: "otu_table.biom"
|
protected static String |
QIIME_PARAMS
Config List property used to obtain the QIIME executable params |
protected static String |
QIIME_PYNAST_ALIGN_DB
Config File property to define ~/.qiime_config pynast_template_alignment_fp:
"qiime.pynastAlignDB" |
protected static String |
QIIME_REF_SEQ_DB
Config File property to define ~/.qiime_config pick_otus_reference_seqs_fp and
assign_taxonomy_reference_seqs_fp: "qiime.refSeqDB" |
protected static String |
QIIME_REMOVE_CHIMERAS
Config boolean property to indicate if "exe.vsearch" is needed for chimera removal:
"qiime.removeChimeras" |
protected static String |
QIIME_TAXA_DB
Config File property to define ~/.qiime_config assign_taxonomy_id_to_taxonomy_fp:
"qiime.taxaDB" |
protected static String |
REP_SET
Directory created by "pick_de_novo_otus.py" and
"pick_open_reference_otus.py": "rep_set"
|
protected static String |
SCRIPT_ADD_ALPHA_DIVERSITY
QIIME script to add "alphaDiversity.txt" to the metadata file: "add_alpha_to_mapping_file.py"
|
protected static String |
SCRIPT_ADD_LABELS
QIIME script that produces "combined_seqs.fna", the multiplexed fasta file: "add_qiime_labels.py"
|
protected static String |
SCRIPT_CALC_ALPHA_DIVERSITY
QIIME script that creates alpha diversity metrics file in output/"alphaDiversity.txt":
"alpha_diversity.py"
|
protected static String |
SCRIPT_FILTER_OTUS
QIIME script used to remove chimeras detected by "exe.vsearch": "filter_otus_from_otu_table.py"
|
protected static String |
SCRIPT_PRINT_CONFIG
QIIME script to print environment configuration to qsub output file: "print_qiime_config.py"
|
protected static String |
SCRIPT_SUMMARIZE_BIOM
Produces output/"otuSummary.txt" summarizing dataset: "biom summarize-table"
|
protected static String |
SCRIPT_SUMMARIZE_TAXA
QIIME script used to produce taxonomy-level reports in the module output directory:
"summarize_taxa.py"
|
protected static String |
SUMMARIZE_TAXA_SUPPRESS_BIOM
QIIME script "summarize_taxa.py" parameter used to suppress the output of biom files.
|
GZIP_EXT, LOG_EXT, PDF_EXT, RETURN, SH_EXT, TAB_DELIM, TSV_EXT, TXT_EXT
SCRIPT_BATCH_SIZE, SCRIPT_DEFAULT_HEADER, SCRIPT_NUM_THREADS, SCRIPT_PERMISSIONS, SCRIPT_TIMEOUT
MAIN_SCRIPT_PREFIX, OUTPUT_DIR, TEMP_DIR
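The Config property names listed in the field descriptions can be gathered into a small properties fragment. The sketch below loads such a fragment with `java.util.Properties` and checks that "exe.vsearch" is present when "qiime.removeChimeras" is enabled. Only the property names come from this page; the class name `QiimeConfigSketch`, the helper methods, and every value are hypothetical placeholders, and the pipeline's own Config class is not used here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class QiimeConfigSketch {
    // Property names come from the field descriptions; every value is a hypothetical placeholder.
    static final String SAMPLE = String.join("\n",
        "exe.vsearch=/path/to/vsearch",
        "qiime.removeChimeras=true",
        "qiime.pynastAlignDB=/path/to/db/alignment.fasta",
        "qiime.refSeqDB=/path/to/db/ref_seqs.fasta",
        "qiime.taxaDB=/path/to/db/taxonomy.txt");

    // Parse properties text; reading from a String should never fail with an I/O error.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: the boolean is written as "true"/"false"; the real Config may use another notation.
    static boolean vsearchRequired(Properties props) {
        return Boolean.parseBoolean(props.getProperty("qiime.removeChimeras", "false"));
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        if (vsearchRequired(props) && props.getProperty("exe.vsearch") == null) {
            throw new IllegalStateException("exe.vsearch must be set when qiime.removeChimeras is enabled");
        }
        System.out.println("qiime.refSeqDB = " + props.getProperty("qiime.refSeqDB"));
    }
}
```

`Boolean.parseBoolean` is used purely for illustration; how the real Config parses boolean and List properties is not stated on this page.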
Constructor and Description |
---|
QiimeClassifier() |
Modifier and Type | Method and Description |
---|---|
protected List<String> |
buildQiimeDockerConfigLines()
Build ~/.qiime_config to define the alternate Docker qiime_classifier DB with local container path
references.
|
List<List<String>> |
buildScript(List<File> files)
Generate bash script lines to summarize QIIME results, build taxonomy reports, and add alpha diversity metrics.
|
List<List<String>> |
buildScriptForPairedReads(List<File> files)
QIIME does not support paired reads
|
void |
checkDependencies()
Validate module dependencies:
Call
ClassifierModuleImpl.getClassifierExe() to verify the executable
Call ClassifierModuleImpl.getClassifierParams() to verify the runtime parameters are valid
Call ClassifierModuleImpl.validateModuleOrder() to validate module configuration order. |
void |
cleanUp()
The cleanUp operation builds a new metadata file if alpha diversity metrics were generated by this module.
|
String |
getClassifierExe()
QIIME calls python scripts, so no special command is required
|
List<String> |
getClassifierParams()
Obtain the QIIME runtime params
|
File |
getDB()
Check DB parameters for the common parent directory path; there are 3 parameters:
"qiime.pynastAlignDB"
"qiime.refSeqDB"
"qiime.taxaDB"
|
protected File |
getDockerDB(String prop)
Return the Docker container database directory (starting with /db/...)
|
protected File |
getInputFileDir()
Module input directories are set to the previous module output directory.
To ensure we use the correct path, get path from getInputFiles() |
List<File> |
getInputFiles()
Return
SeqModuleImpl.getSeqFiles(Collection) to filter standard module input files. |
protected String |
getParams()
Subclasses call this method to check dependencies before picking OTUs to validate
Config."qiime.params" |
protected List<String> |
getPickOtuLines(String otuPickingScript,
File fastaDir,
String mapping,
File outputDir)
Subclasses call this method to add OTU picking lines by calling "add_qiime_labels.py" via OTU picking
script.
|
List<String> |
getPostRequisiteModules()
Subclasses of QiimeClassifier add post-requisite module:
QiimeClassifier. |
List<String> |
getPreRequisiteModules()
If paired reads found, add prerequisite module:
PearMergeReads. |
String |
getSummary()
This method extends the classifier summary by adding the Qiime OTU summary metrics.
|
protected String |
getVsearchParams()
Return runtime parameters for "exe.vsearchParams"
|
List<String> |
getWorkerScriptFunctions()
Method returns bash script lines used to build the functions called by the worker scripts.
|
boolean |
isValidInputModule(BioModule module)
If superclass is fed by another QiimeClassifier, it must be a subclass with biom output.
|
protected void |
validateFileNameUnique(Set<String> fileNames,
File file)
Typically we verify no duplicate file names are used, but for QIIME we may be combining multiple files with the
same name ("otu_table.biom"), so QiimeClassifier skips this validation.
|
getClassifierType, validateModuleOrder
getSeqFiles
executeTask, getJobParams, getMainScript, getRuntimeParams, getScriptDir, getScriptErrors, getTimeout, hasScripts
cacheInputFiles, compareTo, equals, findModuleInputFiles, getFileCache, getID, getModuleDir, getOutputDir, getTempDir, init, toString
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getSeqFiles
getJobParams, getMainScript, getScriptDir, getScriptErrors, getTimeout
executeTask, getID, getModuleDir, getOutputDir, getTempDir, init
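The getDB() summary describes checking the three database properties ("qiime.pynastAlignDB", "qiime.refSeqDB", "qiime.taxaDB") for a common parent directory. A minimal sketch of that idea follows. The class name, the `commonParent` helper, and the sample paths are hypothetical, and the actual method may resolve the directory differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class CommonParentSketch {
    // Returns the deepest directory that contains every given file path,
    // or null if the paths share no common root (hypothetical helper).
    static Path commonParent(List<Path> files) {
        Path common = files.get(0).toAbsolutePath().normalize().getParent();
        for (Path file : files) {
            Path p = file.toAbsolutePath().normalize();
            // Walk up from the current candidate until it is an ancestor of this file.
            while (common != null && !p.startsWith(common)) {
                common = common.getParent();
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Hypothetical values for the three QIIME database properties.
        List<Path> dbFiles = Arrays.asList(
            Paths.get("/db/qiime/align/template.fasta"),
            Paths.get("/db/qiime/seqs/ref_seqs.fasta"),
            Paths.get("/db/qiime/taxa/taxonomy.txt"));
        System.out.println("Common DB directory: " + commonParent(dbFiles));
    }
}
```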
protected static final String ALPHA_DIV_NULL_VALUE
protected static final String ALPHA_DIVERSITY_TABLE
protected static final String COMBINED_FNA
protected static final String EXE_VSEARCH
Config property for vsearch executable used for chimera detection: "exe.vsearch"
protected static final String EXE_VSEARCH_PARAMS
Config property for "exe.vsearch" parameters (such as alternate reference database path): "exe.vsearchParams"
protected static final String OTU_SUMMARY_FILE
protected static final String OTU_TABLE
protected static final String QIIME_PARAMS
Config List property used to obtain the QIIME executable params
protected static final String QIIME_PYNAST_ALIGN_DB
Config File property to define ~/.qiime_config pynast_template_alignment_fp: "qiime.pynastAlignDB"
protected static final String QIIME_REF_SEQ_DB
Config File property to define ~/.qiime_config pick_otus_reference_seqs_fp and assign_taxonomy_reference_seqs_fp: "qiime.refSeqDB"
protected static final String QIIME_REMOVE_CHIMERAS
Config boolean property to indicate if "exe.vsearch" is needed for chimera removal: "qiime.removeChimeras"
protected static final String QIIME_TAXA_DB
Config File property to define ~/.qiime_config assign_taxonomy_id_to_taxonomy_fp: "qiime.taxaDB"
protected static final String REP_SET
protected static final String SCRIPT_ADD_ALPHA_DIVERSITY
protected static final String SCRIPT_ADD_LABELS
protected static final String SCRIPT_CALC_ALPHA_DIVERSITY
protected static final String SCRIPT_FILTER_OTUS
protected static final String SCRIPT_PRINT_CONFIG
protected static final String SCRIPT_SUMMARIZE_BIOM
protected static final String SCRIPT_SUMMARIZE_TAXA
protected static final String SUMMARIZE_TAXA_SUPPRESS_BIOM
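ALPHA_DIV_NULL_VALUE records that "alpha_diversity.py" writes "N/A" for null values, and the cleanUp() notes mention Config."metadata.nullValue". One plausible reading is that "N/A" cells are swapped for the configured metadata null value when the new metadata file is built. The sketch below shows that substitution on a single tab-delimited row. The class name, the `replaceNulls` helper, the "NA" replacement value, and the sample row are all assumptions made for illustration.

```java
import java.util.StringJoiner;

public class AlphaNullValueSketch {
    // Value documented on this page for alpha_diversity.py null output.
    static final String ALPHA_DIV_NULL_VALUE = "N/A";

    // Replace every cell equal to "N/A" with the given metadata null value (hypothetical helper).
    static String replaceNulls(String row, String metadataNullValue) {
        StringJoiner out = new StringJoiner("\t");
        // A limit of -1 keeps trailing empty cells so the column count is preserved.
        for (String cell : row.split("\t", -1)) {
            out.add(ALPHA_DIV_NULL_VALUE.equals(cell.trim()) ? metadataNullValue : cell);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical row: sample ID, two alpha diversity metrics.
        System.out.println(replaceNulls("sample1\tN/A\t3.2", "NA"));
    }
}
```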
public List<List<String>> buildScript(List<File> files) throws Exception
The QiimeClassifier script begins with the following QIIME scripts:
If Config
. are defined, add lines to run
additional scripts:
Config
."metadata.filePath"
For complete list of skbio.diversity.alpha options, see http://scikit-bio.org/docs/latest/generated/skbio.diversity.alpha.html
buildScript
in interface ScriptModule
buildScript
in class ScriptModuleImpl
files - Files in the input directory that contain only forward reads
Exception - if unable to generate script lines
public List<List<String>> buildScriptForPairedReads(List<File> files) throws Exception
buildScriptForPairedReads
in interface ScriptModule
buildScriptForPairedReads
in class ScriptModuleImpl
files - Files in the input directory that contain only paired reads
Exception - if unable to generate the script lines
public void checkDependencies() throws Exception
ClassifierModuleImpl
Call ClassifierModuleImpl.getClassifierExe() to verify the executable
Call ClassifierModuleImpl.getClassifierParams() to verify the runtime parameters are valid
Call ClassifierModuleImpl.validateModuleOrder() to validate module configuration order.
Call BioModule.checkDependencies() to validate script dependencies.
checkDependencies
in interface BioModule
checkDependencies
in class ClassifierModuleImpl
Exception - thrown if missing or invalid dependencies are found
public void cleanUp() throws Exception
Config
."metadata.nullValue" if any are found.
This method also removes the redundant normalized alpha metric column and reorganizes the metadata so that alpha metric columns are moved to the first columns after the 1st ID column.
cleanUp
in interface BioModule
cleanUp
in class BioModuleImpl
Exception - thrown if any runtime error occurs
public String getClassifierExe() throws Exception
getClassifierExe
in interface ClassifierModule
getClassifierExe
in class ClassifierModuleImpl
Exception - if the classifier program is undefined or invalid
public List<String> getClassifierParams() throws Exception
getClassifierParams
in interface ClassifierModule
getClassifierParams
in class ClassifierModuleImpl
Exception - thrown if parameters defined are invalid
public File getDB() throws Exception
getDB
in interface DatabaseModule
getDB
in class ClassifierModuleImpl
Exception - if errors occur
public List<File> getInputFiles() throws Exception
SeqModuleImpl
Return SeqModuleImpl.getSeqFiles(Collection) to filter standard module input files.
getInputFiles
in interface BioModule
getInputFiles
in class SeqModuleImpl
Exception - thrown if any runtime error occurs
public List<String> getPostRequisiteModules() throws Exception
Subclasses of QiimeClassifier add post-requisite module: QiimeClassifier.
Only the QiimeClassifier itself adds the QiimeParser as a post-requisite module.
getPostRequisiteModules
in interface BioModule
getPostRequisiteModules
in class ClassifierModuleImpl
Exception - if invalid Class names are returned as post-requisites
public List<String> getPreRequisiteModules() throws Exception
If paired reads are found, add prerequisite module: PearMergeReads. If sequences are not in fasta format, add prerequisite module: AwkFastaConverter, or a similar module specified by "pipeline.defaultFastaConverter". Subclasses of QiimeClassifier add prerequisite module: BuildQiimeMapping.
getPreRequisiteModules
in interface BioModule
getPreRequisiteModules
in class BioModuleImpl
Exception - if invalid Class names are returned as prerequisites
public String getSummary() throws Exception
getSummary
in interface BioModule
getSummary
in class ClassifierModuleImpl
Exception - if any error occurs
public List<String> getWorkerScriptFunctions() throws Exception
ScriptModule
getWorkerScriptFunctions
in interface ScriptModule
getWorkerScriptFunctions
in class ScriptModuleImpl
Exception - if errors occur
public boolean isValidInputModule(BioModule module)
isValidInputModule
in interface BioModule
isValidInputModule
in class SeqModuleImpl
module - BioModule that ran before the current BioModule
protected List<String> buildQiimeDockerConfigLines() throws Exception
Exception - if errors occur while building the file
protected File getDockerDB(String prop) throws Exception
prop - QIIME database dir
Exception - if errors occur
protected File getInputFileDir() throws Exception
getInputFiles()
Exception - if propagated by getInputFiles()
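As a rough illustration of taking the input directory from the input files themselves rather than from the module configuration, the sketch below returns the parent directory of the first input file. The class name, the `inputFileDir` helper, and the sample path are hypothetical; the real method presumably works on whatever getInputFiles() returns.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class InputDirSketch {
    // Derive the input directory from the first input file (hypothetical helper).
    static File inputFileDir(List<File> inputFiles) {
        if (inputFiles.isEmpty()) {
            throw new IllegalArgumentException("No input files found");
        }
        return inputFiles.get(0).getParentFile();
    }

    public static void main(String[] args) {
        // Hypothetical path to a biom table produced by an earlier module.
        List<File> files = Arrays.asList(new File("/pipeline/01_PreviousModule/output/otu_table.biom"));
        System.out.println("Input dir: " + inputFileDir(files));
    }
}
```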
protected String getParams() throws Exception
Subclasses call this method to check dependencies before picking OTUs to validate Config."qiime.params"
Exception - if "qiime.params" contains invalid parameters
protected List<String> getPickOtuLines(String otuPickingScript, File fastaDir, String mapping, File outputDir) throws Exception
otuPickingScript - QIIME script
fastaDir - Fasta File directory
mapping - File-path of mapping file
outputDir - Directory to output "combined_seqs.fna"
Exception - if errors occur
protected String getVsearchParams() throws Exception
Exception - if errors occur
protected void validateFileNameUnique(Set<String> fileNames, File file)
validateFileNameUnique
in class BioModuleImpl
fileNames - A registry of module input file names added so far
file - Next file to validate
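The contrast described for validateFileNameUnique can be sketched as follows: a typical check rejects a repeated file name, while a QIIME-style override accepts it because several inputs may all be named "otu_table.biom". Both nested classes, the `accepts` helper, and the choice of exception are hypothetical stand-ins, not the actual BioModuleImpl code.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

public class UniqueNameSketch {
    // Stand-in for a module that enforces unique input file names.
    public static class TypicalModule {
        protected void validateFileNameUnique(Set<String> fileNames, File file) {
            // Set.add returns false when the name was already registered.
            if (!fileNames.add(file.getName())) {
                throw new IllegalStateException("Duplicate input file name: " + file.getName());
            }
        }
    }

    // Stand-in for the QIIME behavior: the validation is skipped entirely.
    public static class QiimeLikeModule extends TypicalModule {
        @Override
        protected void validateFileNameUnique(Set<String> fileNames, File file) {
            // Intentionally empty: duplicate names such as otu_table.biom are allowed.
        }
    }

    // Returns true if the module accepts all files without raising an error.
    static boolean accepts(TypicalModule module, File... files) {
        Set<String> names = new HashSet<>();
        try {
            for (File f : files) {
                module.validateFileNameUnique(names, f);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical inputs from two runs that share the same file name.
        File a = new File("/run1/otu_table.biom");
        File b = new File("/run2/otu_table.biom");
        System.out.println("Typical module accepts duplicates: " + accepts(new TypicalModule(), a, b));
        System.out.println("QIIME-like module accepts duplicates: " + accepts(new QiimeLikeModule(), a, b));
    }
}
```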