public abstract class BioModuleImpl extends Object implements BioModule, Comparable<BioModule>
Modifier and Type | Field and Description |
---|---|
static String |
GZIP_EXT
BioLockJ gzip file extension constant: ".gz"
|
static String |
LOG_EXT
BioLockJ log file extension constant: ".log"
|
static String |
PDF_EXT
BioLockJ PDF file extension constant: ".pdf"
|
static String |
RETURN
Return character constant *backslash-n*
|
static String |
SH_EXT
BioLockJ shell script file extension constant: ".sh"
|
static String |
TAB_DELIM
BioLockJ tab character constant: "\t"
|
static String |
TSV_EXT
BioLockJ tab delimited text file extension constant: ".tsv"
|
static String |
TXT_EXT
BioLockJ text file extension constant: ".txt"
|
Fields inherited from interface BioModule: MAIN_SCRIPT_PREFIX, OUTPUT_DIR, TEMP_DIR
Constructor and Description |
---|
BioModuleImpl() |
Modifier and Type | Method and Description |
---|---|
protected void |
cacheInputFiles(Collection<File> files)
Cache the input files for quick access on subsequent calls to
getInputFiles() |
abstract void |
checkDependencies()
If restarting or running a direct pipeline, execute the cleanup for completed modules.
|
void |
cleanUp()
By default, no cleanUp code is required.
|
int |
compareTo(BioModule module) |
boolean |
equals(Object o)
Compared based on ID
|
abstract void |
executeTask()
This is the main method called when it is time for the BioModule to complete its task.
|
protected List<File> |
findModuleInputFiles()
Called upon first access of input files to return a sorted list of files from all Config."input.dirPaths". Hidden files (starting with ".") are ignored. Calls isValidInputModule(BioModule) on each previous module until acceptable input files are found. |
protected List<File> |
getFileCache()
Get cached input files
|
Integer |
getID()
Some BioModules may be added to a pipeline multiple times so must be identified by an ID.
This is the same value as the directory folder prefix when run. The 1st module ID is 0 (or 00 if there are more than 10 modules). |
List<File> |
getInputFiles()
BioModule getInputFiles() initializes the input file list on the first call and caches it. |
File |
getModuleDir()
All BioModule work must be contained within the scope of its root directory.
|
File |
getOutputDir()
Returns moduleDir/output which will be used as the next module's input.
|
List<String> |
getPostRequisiteModules()
By default, no post-requisites are required.
|
List<String> |
getPreRequisiteModules()
By default, no prerequisites are required.
|
String |
getSummary()
Returns the summary message displayed by the Email module, so it must not contain confidential info.
|
File |
getTempDir()
Returns moduleDir/temp for intermediate files.
|
void |
init()
This method must be called immediately upon instantiation.
|
boolean |
isValidInputModule(BioModule module)
In the early stages of the pipeline, starting with the very 1st module
ImportMetadata , most modules expect sequence files as input. |
String |
toString() |
protected void |
validateFileNameUnique(Set<String> fileNames,
File file)
Validate that files in Constants.INPUT_DIRS have unique names. |
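The summary above describes getInputFiles() as initializing on the first call and serving cached results afterwards, with hidden files ignored and the list sorted. The following is a minimal sketch of that pattern, not the BioLockJ implementation: the class `InputFileCacheSketch`, its `candidates` list, and the `lookups` counter are hypothetical stand-ins, and the real findModuleInputFiles() reads from the Config input directories or previous modules rather than an in-memory list.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of lazy input-file caching; not part of BioLockJ. */
class InputFileCacheSketch {
    private final List<File> fileCache = new ArrayList<>();
    private final List<File> candidates; // assumed stand-in for files found on disk
    int lookups = 0;                     // counts how often the file search actually runs

    InputFileCacheSketch(List<File> candidates) {
        this.candidates = candidates;
    }

    /** Searches for input files on first access only; later calls reuse the cache. */
    List<File> getInputFiles() {
        // Note: an empty search result is not remembered, so the search repeats in that case.
        if (fileCache.isEmpty()) {
            cacheInputFiles(findModuleInputFiles());
        }
        return Collections.unmodifiableList(fileCache);
    }

    /** Returns the candidates sorted by path, skipping hidden files (names starting with "."). */
    protected List<File> findModuleInputFiles() {
        lookups++;
        List<File> found = new ArrayList<>();
        for (File f : candidates) {
            if (!f.getName().startsWith(".")) {
                found.add(f);
            }
        }
        Collections.sort(found);
        return found;
    }

    /** Replaces the cache contents with the given files. */
    protected void cacheInputFiles(Collection<File> files) {
        fileCache.clear();
        fileCache.addAll(files);
    }
}
```

Returning an unmodifiable view is a design choice of this sketch so callers cannot alter the cache by accident; whether the real class does the same is not stated in this page.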
public static final String GZIP_EXT
public static final String LOG_EXT
public static final String PDF_EXT
public static final String RETURN
public static final String SH_EXT
public static final String TAB_DELIM
public static final String TSV_EXT
public static final String TXT_EXT
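As an illustration of how these constants might be combined, the sketch below builds a tab-delimited row and strips a gzip extension. The class `ExtensionConstantsSketch` and its helper methods are hypothetical; the constant values are local copies of the values documented in the field summary (with RETURN assumed to be "\n"), so the example runs without BioLockJ on the classpath.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper class; the real constants are declared on BioModuleImpl. */
class ExtensionConstantsSketch {
    // Local stand-ins mirroring the documented constant values
    static final String TAB_DELIM = "\t";
    static final String RETURN = "\n";
    static final String TSV_EXT = ".tsv";
    static final String GZIP_EXT = ".gz";

    /** Joins cell values into one tab-delimited row terminated by RETURN. */
    static String toTsvRow(List<String> cells) {
        return String.join(TAB_DELIM, cells) + RETURN;
    }

    /** Removes a trailing ".gz" so the underlying file extension can be inspected. */
    static String stripGzip(String fileName) {
        return fileName.endsWith(GZIP_EXT)
            ? fileName.substring(0, fileName.length() - GZIP_EXT.length())
            : fileName;
    }

    public static void main(String[] args) {
        System.out.print(toTsvRow(Arrays.asList("sampleId", "numReads")));
        System.out.println(stripGzip("counts" + TSV_EXT + GZIP_EXT));
    }
}
```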
public abstract void checkDependencies() throws Exception
checkDependencies
in interface BioModule
Exception
- thrown if missing or invalid dependencies are found
public void cleanUp() throws Exception
public int compareTo(BioModule module)
compareTo
in interface Comparable<BioModule>
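The method summary states that equals(Object) compares modules by ID. A minimal sketch of ID-based equality and ordering follows. `ModuleStub` is a hypothetical stand-in class, and ordering compareTo by ID is an assumption made here for illustration; this page does not say how the real compareTo is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a module identified by its pipeline ID; not part of BioLockJ. */
class ModuleStub implements Comparable<ModuleStub> {
    private final Integer id;
    private final String name;

    ModuleStub(int id, String name) {
        this.id = id;
        this.name = name;
    }

    Integer getID() {
        return id;
    }

    /** Assumed ordering: lower IDs (earlier pipeline positions) sort first. */
    @Override
    public int compareTo(ModuleStub other) {
        return id.compareTo(other.getID());
    }

    /** Two stubs are equal when their IDs match, regardless of name. */
    @Override
    public boolean equals(Object o) {
        return o instanceof ModuleStub && id.equals(((ModuleStub) o).getID());
    }

    /** Kept consistent with equals so stubs behave correctly in hash-based collections. */
    @Override
    public int hashCode() {
        return id.hashCode();
    }

    /** Returns a copy of the given stubs sorted by ID. */
    static List<ModuleStub> sortedById(List<ModuleStub> modules) {
        List<ModuleStub> copy = new ArrayList<>(modules);
        Collections.sort(copy);
        return copy;
    }
}
```

Because the same module class may appear in a pipeline more than once, comparing by ID rather than by class name keeps the two instances distinct.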
public abstract void executeTask() throws Exception
BioModule
executeTask
in interface BioModule
Exception
- thrown if the module is unable to complete its task
public Integer getID()
BioModule
public List<File> getInputFiles() throws Exception
getInputFiles() initializes the input file list on the first call and caches it.
getInputFiles
in interface BioModule
Exception
- if unable to obtain input files
public File getModuleDir()
getModuleDir
in interface BioModule
public File getOutputDir()
getOutputDir
in interface BioModule
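The directory accessors follow a fixed layout: the module ID is the folder prefix, output lives in moduleDir/output, and intermediate files live in moduleDir/temp. The sketch below shows one way such paths could be derived. The class `ModuleDirSketch`, the "_" separator between ID and module name, and the padding rule (pad to the width of the largest ID) are assumptions for illustration; only the "output" and "temp" folder names and the 0 versus 00 prefix behavior come from this page.

```java
import java.io.File;

/** Hypothetical path helper; not part of BioLockJ. */
class ModuleDirSketch {

    /** Zero-pads the ID so every prefix has as many digits as the largest ID in the pipeline. */
    static String idPrefix(int id, int moduleCount) {
        int width = String.valueOf(Math.max(moduleCount - 1, 0)).length();
        return String.format("%0" + width + "d", id);
    }

    /** Builds the module root directory name; the "_" separator is an assumption. */
    static File moduleDir(File pipelineDir, int id, int moduleCount, String moduleName) {
        return new File(pipelineDir, idPrefix(id, moduleCount) + "_" + moduleName);
    }

    /** moduleDir/output, which serves as the next module's input. */
    static File outputDir(File moduleDir) {
        return new File(moduleDir, "output");
    }

    /** moduleDir/temp, used for intermediate files. */
    static File tempDir(File moduleDir) {
        return new File(moduleDir, "temp");
    }
}
```

With 11 modules the first module would get the prefix "00", while with 5 modules it would get "0", matching the getID() description in the method summary.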
public List<String> getPostRequisiteModules() throws Exception
getPostRequisiteModules
in interface BioModule
Exception
- if invalid Class names are returned as post-requisites
public List<String> getPreRequisiteModules() throws Exception
getPreRequisiteModules
in interface BioModule
Exception
- if invalid Class names are returned as prerequisites
public String getSummary() throws Exception
getSummary
in interface BioModule
Exception
- if any error occurs
public File getTempDir()
If Constants.PIPELINE_DELETE_TEMP_FILES = "Y", this directory is deleted after the pipeline completes successfully.
getTempDir
in interface BioModule
public void init() throws Exception
public boolean isValidInputModule(BioModule module)
In the early stages of the pipeline, starting with the very 1st module ImportMetadata, most modules expect sequence files as input. This method returns false if the previousModule only produced a new metadata file, such as ImportMetadata or RegisterNumReads.
When getInputFiles() is called, this method determines if the previousModule output is valid input for the current BioModule. The default implementation of this method returns false if the previousModule only generates a new metadata file.
isValidInputModule
in interface BioModule
module
- BioModule that ran before the current BioModule
protected void cacheInputFiles(Collection<File> files) throws Exception
Cache the input files for quick access on subsequent calls to getInputFiles()
files
- Input files
Exception
- if errors occur
protected List<File> findModuleInputFiles() throws Exception
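Called upon first access of input files to return a sorted list of files from all Config."input.dirPaths". Hidden files (starting with ".") are ignored. Calls isValidInputModule(BioModule) on each previous module until acceptable input files are found.
Exception
- if errors occur
protected List<File> getFileCache()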
protected void validateFileNameUnique(Set<String> fileNames, File file) throws Exception
Validate that files in Constants.INPUT_DIRS have unique names. BioModules that expect duplicates must override this method.
fileNames
- A registry of module input file names added so far
file
- Next file to validate
Exception
- if a duplicate file name is found
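The uniqueness check described for validateFileNameUnique can be sketched with a Set acting as the registry. This is an illustration under assumptions, not the BioLockJ source: the class `UniqueNameSketch` is hypothetical, and having the method itself add each validated name to the registry is a guess about how the registry gets populated.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of a duplicate file name check; not part of BioLockJ. */
class UniqueNameSketch {

    /**
     * Registers the file's name and fails if the same name was already registered,
     * even when the two files live in different input directories.
     */
    static void validateFileNameUnique(Set<String> fileNames, File file) throws Exception {
        // Set.add returns false when the name is already present in the registry
        if (!fileNames.add(file.getName())) {
            throw new Exception("Duplicate input file name: " + file.getName());
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> names = new HashSet<>();
        validateFileNameUnique(names, new File("dirA/sample1.fq"));
        validateFileNameUnique(names, new File("dirB/sample2.fq"));
        System.out.println(names.size() + " unique names registered");
    }
}
```

A module that expects duplicate names across input directories would override the method, as the description above notes, for example with an empty body.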