public class RarefySeqs extends JavaModuleImpl implements SeqModule
| Modifier and Type | Field and Description |
|---|---|
| `protected static String` | `INPUT_RAREFYING_MAX`: Config property "rarefySeqs.max" defines the maximum number of reads per file. |
| `protected static String` | `INPUT_RAREFYING_MIN`: Config property "rarefySeqs.min" defines the minimum number of reads per file. |
| `static String` | `NUM_RAREFIED_READS`: Metadata column name for the column that holds the number of rarefied reads per sample: "Num_Rarefied_Reads". |
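
The constraints that checkDependencies() places on these two properties (a non-negative minimum, and a positive maximum that is at least the minimum when both are set) can be illustrated with a minimal sketch. The class `RarefyLimitsSketch`, its method names, and the plain `Map` standing in for the project's `Config` class are assumptions made for illustration, not the module's actual implementation. Whether either property is mandatory is not stated on this page, so the sketch treats both as optional.

```java
import java.util.Map;

/** Hypothetical helper (not part of the documented API): checks the two rarefying limits. */
final class RarefyLimitsSketch {

    /** Parse an optional integer property; returns null if the property is absent or blank. */
    static Integer optionalInt(Map<String, String> props, String key) {
        String raw = props.get(key);
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException ex) {
            throw new IllegalArgumentException(key + " must be an integer, found: " + raw, ex);
        }
    }

    /** Apply the rules listed for checkDependencies(); throws if any rule is violated. */
    static void validate(Map<String, String> props) {
        Integer min = optionalInt(props, "rarefySeqs.min");
        Integer max = optionalInt(props, "rarefySeqs.max");
        if (min != null && min < 0) {
            throw new IllegalArgumentException("rarefySeqs.min must be a non-negative integer");
        }
        if (max != null && max < 1) {
            throw new IllegalArgumentException("rarefySeqs.max must be a positive integer");
        }
        if (min != null && max != null && max < min) {
            throw new IllegalArgumentException("rarefySeqs.max must be >= rarefySeqs.min");
        }
    }
}
```

Calling `RarefyLimitsSketch.validate(props)` returns normally for a valid pair such as min 10 and max 100, and throws `IllegalArgumentException` when, for example, the maximum is smaller than the minimum.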
Inherited fields:
BLJ_OPTIONS
GZIP_EXT, LOG_EXT, PDF_EXT, RETURN, SH_EXT, TAB_DELIM, TSV_EXT, TXT_EXT
SCRIPT_BATCH_SIZE, SCRIPT_DEFAULT_HEADER, SCRIPT_NUM_THREADS, SCRIPT_PERMISSIONS, SCRIPT_TIMEOUT
MAIN_SCRIPT_PREFIX, OUTPUT_DIR, TEMP_DIR

| Constructor and Description |
|---|
| `RarefySeqs()` |
| Modifier and Type | Method and Description |
|---|---|
| `protected void` | `buildRarefiedFile(File input, List<Long> indexes)`: Build the rarefied file for the input file, keeping only the given indexes. |
| `void` | `checkDependencies()`: Validate module dependencies: (1) Config.INPUT_RAREFYING_MIN is a non-negative integer; (2) Config.INPUT_RAREFYING_MAX is a positive integer that is greater than or equal to Config.INPUT_RAREFYING_MIN (if defined). |
| `void` | `cleanUp()`: Set "Num_Rarefied_Reads" as the number of reads field. |
| `List<String>` | `getPreRequisiteModules()`: This method always requires a prerequisite module with a "number of reads" count, such as RegisterNumReads. |
| `List<File>` | `getSeqFiles(Collection<File> files)`: Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown to list the files that cannot be matched to a metadata row. |
| `String` | `getSummary()`: Produce summary message with min, max, mean, and median number of reads. |
| `protected void` | `rarefy(File seqFile)`: Builds the rarefied file if too many seqs are found, or adds files with too few seqs to the list of bad samples. |
| `void` | `runModule()`: For each file with a number of reads outside of the Config.INPUT_RAREFYING_MIN and Config.INPUT_RAREFYING_MAX values, generate a new sequence file from a shuffled list of its sequences. |
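
The description of runModule() and the `List<Long> indexes` parameter of buildRarefiedFile suggest that rarefying selects a subset of read indexes from a shuffled list. The following is a minimal sketch of that idea only. The class `RarefySketch`, the method `selectIndexes`, and the choice to return the kept indexes in sorted order are assumptions for illustration; the module's real selection logic is not shown on this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of the documented API): picks which reads to keep. */
final class RarefySketch {

    /**
     * Return the read indexes to keep for a file holding numReads reads.
     * If the file has at most max reads, every index is kept; otherwise the
     * indexes are shuffled and the first max of them are kept.
     */
    static List<Long> selectIndexes(long numReads, int max, Random rng) {
        List<Long> indexes = new ArrayList<>();
        for (long i = 0; i < numReads; i++) {
            indexes.add(i);
        }
        if (numReads <= max) {
            return indexes; // nothing to rarefy
        }
        Collections.shuffle(indexes, rng);
        List<Long> kept = new ArrayList<>(indexes.subList(0, max));
        Collections.sort(kept); // sorted so a file can be filtered in one pass
        return kept;
    }
}
```

Passing a seeded `Random` makes the selection reproducible across runs, which can be convenient when comparing pipeline output; an unseeded `new Random()` gives a different subset each time.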
Inherited methods:
buildScript, executeTask, getSource, getWorkerScriptFunctions, isValidInputModule, markStatus, moduleComplete, moduleFailed
buildScriptForPairedReads, getJobParams, getMainScript, getRuntimeParams, getScriptDir, getScriptErrors, getTimeout, hasScripts
cacheInputFiles, compareTo, equals, findModuleInputFiles, getFileCache, getID, getInputFiles, getModuleDir, getOutputDir, getPostRequisiteModules, getTempDir, init, toString, validateFileNameUnique
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
buildScript, buildScriptForPairedReads, getJobParams, getMainScript, getScriptDir, getScriptErrors, getTimeout, getWorkerScriptFunctions
executeTask, getID, getInputFiles, getModuleDir, getOutputDir, getPostRequisiteModules, getTempDir, init, isValidInputModule

`public static final String NUM_RAREFIED_READS`
Metadata column name for the column that holds the number of rarefied reads per sample: "Num_Rarefied_Reads"

`protected static final String INPUT_RAREFYING_MAX`
Config property "rarefySeqs.max" defines the maximum number of reads per file.

`protected static final String INPUT_RAREFYING_MIN`
Config property "rarefySeqs.min" defines the minimum number of reads per file.

`public void checkDependencies() throws Exception`
Validate module dependencies:
- Config.INPUT_RAREFYING_MIN is a non-negative integer
- Config.INPUT_RAREFYING_MAX is a positive integer that is greater than or equal to Config.INPUT_RAREFYING_MIN (if defined)

Specified by: checkDependencies in interface BioModule
Overrides: checkDependencies in class ScriptModuleImpl
Throws: Exception - thrown if missing or invalid dependencies are found

`public void cleanUp() throws Exception`
Set "Num_Rarefied_Reads" as the number of reads field.
Specified by: cleanUp in interface BioModule
Overrides: cleanUp in class BioModuleImpl
Throws: Exception - thrown if any runtime error occurs

`public List<String> getPreRequisiteModules() throws Exception`
This method always requires a prerequisite module with a "number of reads" count, such as RegisterNumReads. If paired reads are found, it also returns a 2nd module: PearMergeReads.
Specified by: getPreRequisiteModules in interface BioModule
Overrides: getPreRequisiteModules in class BioModuleImpl
Throws: Exception - if invalid Class names are returned as prerequisites

`public List<File> getSeqFiles(Collection<File> files) throws Exception`
Return only sequence files for sample IDs found in the metadata file. If Config."metadata.required" = "Y", an error is thrown to list the files that cannot be matched to a metadata row.
Specified by: getSeqFiles in interface SeqModule
Parameters: files - Module input files
Throws: Exception - if no input files are found

`public String getSummary() throws Exception`
Produce summary message with min, max, mean, and median number of reads.
Specified by: getSummary in interface BioModule
Overrides: getSummary in class ScriptModuleImpl
Throws: Exception - if any error occurs

`public void runModule() throws Exception`
For each file with a number of reads outside of the Config.INPUT_RAREFYING_MIN and Config.INPUT_RAREFYING_MAX values, generate a new sequence file from a shuffled list of its sequences.
Specified by: runModule in interface JavaModule
Overrides: runModule in class JavaModuleImpl
Throws: Exception - thrown if any runtime error occurs

`protected void buildRarefiedFile(File input, List<Long> indexes) throws Exception`
Build the rarefied file for the input file, keeping only the given indexes.
Parameters: input - Sequence file; indexes - List of indexes to keep
Throws: Exception - if unable to build rarefied file
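
To make "keeping only the given indexes" concrete, here is a minimal in-memory sketch of filtering reads by index. The class `RarefiedFileSketch`, the method `keepReads`, and the assumption that every read occupies a fixed number of lines (for example 4 for FASTQ, or 2 for single-line FASTA) are hypothetical and are not taken from this page. The documented method works on a `File`, whereas this sketch works on a list of lines to stay self-contained.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of the documented API): filters reads by index. */
final class RarefiedFileSketch {

    /**
     * Keep only the lines that belong to reads whose zero-based index is listed.
     * Assumes each read spans exactly linesPerRead consecutive lines.
     */
    static List<String> keepReads(List<String> lines, int linesPerRead, List<Long> indexes) {
        Set<Long> keep = new HashSet<>(indexes); // constant-time membership checks
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            long readIndex = i / linesPerRead; // index of the read that owns line i
            if (keep.contains(readIndex)) {
                out.add(lines.get(i));
            }
        }
        return out;
    }
}
```

Reads are emitted in their original file order regardless of the order of `indexes`, so the caller does not need to sort the index list first.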