Class PDAX2MDAG
- java.lang.Object
-
- edu.isi.pegasus.planner.parser.pdax.PDAX2MDAG
-
- All Implemented Interfaces:
Callback
public class PDAX2MDAG extends java.lang.Object implements Callback
This callback ends up creating the megadag that contains the smaller dags each corresponding to the one level as identified in the pdax file generated by the partitioner.- Version:
- $Revision$
- Author:
- Karan Vahi
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description private classPDAX2MDAG.GrepCallbackAn inner class, that implements the StreamGobblerCallback to count the occurences of a word in a document.
-
Field Summary
Fields Modifier and Type Field Description static java.lang.StringCODE_GENERATOR_CLASSThe SubmitWriter that has to be loaded for now.static java.lang.StringCONDOR_DAGMAN_LOGICAL_NAMEThe logical name with which to query the transformation catalog for the condor_dagman executable, that ends up running the mini dag as one job.static java.lang.StringCONDOR_DAGMAN_NAMESPACEThe namespace to use for condor dagman.static java.lang.StringCPLANNER_LOGICAL_NAMEThe logical name with which to query the transformation catalog for cPlanner executable.static java.lang.String[][]DAGMAN_KNOBSThe dagman knobs controlled through property.static intHEAD_INDEXThe index of the head job.private PegasusProperties.CLEANUP_SCOPEmCleanupScopeThe cleanup scope for the workflows.private longmCondorVersionThe long value of condor version.private java.lang.StringmDAGManKnobsAny extra arguments that need to be passed to dagman, as determined from the properties file.private StreamGobblerCallbackmDefaultCallbackAn instance of the default stream gobbler callback implementation that is used for creating symbolic links.private booleanmDoneA flag to store whether the parsing is complete or not.private org.griphyn.vdl.euryale.FileFactorymFactoryThe handle to the file factory, that is used to create the top level directories for each of the partitions.private java.util.MapmJobMapThe internal map that maps the partition id to the job responsible for executing the partition..private LogManagermLoggerThe handle to the logging object.private java.lang.StringmMDAGPropertiesFileThe path to the properties file that is written out and shared by all partitions in the mega DAG.private ADagmMegaDAGThe abstract dag object that ends up holding the megadag.private java.text.NumberFormatmNumFormatterThe number formatter to format the run submit dir entries.private java.lang.StringmPDAXDirectoryThe directory in which the daxes corresponding to the partitions are kept.private PlannerOptionsmPOptionsThe object containing the options that were given to the concrete planner at runtime.private PegasusPropertiesmPropsThe handle to the properties file.protected static charmSeparatorThe file Separator to be used on the submit host.private java.lang.StringmSubmitDirectoryThe root of the submit directory where all the submit directories for the various partitions reside.private TransformationCatalogmTCHandleThe handle to the transformation catalog.private java.lang.StringmUserThe user name of the user running Pegasus.static java.lang.StringNAMESPACEThe namespace to which the job in the MEGA DAG being created refer to.static intNUM_OF_EXPANDED_JOBSThe number of jobs into which each job in the partition graph is expanded to.static java.lang.StringRETRY_LOGICAL_NAMEThe planner utility that needs to be called as a prescript.static java.lang.StringSUBMIT_DIRECTORY_PREFIXThe prefix for the submit directory.static intTAIL_INDEXThe index of the tail job.
-
Constructor Summary
Constructors Constructor Description PDAX2MDAG(java.lang.String directory, PegasusProperties properties, PlannerOptions options)The overloaded constructor.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description voidcbDocument(java.util.Map attributes)Callback when the opening tag was parsed.voidcbDone()Callback when the parsing of the document is done.voidcbParents(java.lang.String child, java.util.List parents)Callback for child and parent relationships from section 3.voidcbPartition(Partition partition)Callback for the partition .protected JobconstructDAGJob(Partition partition, java.io.File directory, java.lang.String dax)Constructs a job that plans and submits the partitioned workflow, referred to by a Partition.static java.lang.StringconstructDAGManKnobs(PegasusProperties properties)Constructs Any extra arguments that need to be passed to dagman, as determined from the properties file.private TransformationCatalogEntryconstructTCEntryFromEnvironment()Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATIONprivate TransformationCatalogEntryconstructTCEntryFromEnvProfiles(ENV env)Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATIONprotected java.lang.StringcreateSubmitDirectory(java.lang.String label, java.lang.String dir, java.lang.String user, java.lang.String vogroup, boolean timestampBased)Creates the submit directory for the workflow.protected booleancreateSymlink(java.lang.String source, java.io.File destDir)Returns the number of partitions referred to in the PDAX file.private TransformationCatalogEntrydefaultTCEntry(java.lang.String site)Returns a default TC entry to be used in case entry is not found in the transformation catalog.protected java.lang.StringgetAbsolutePath(Partition partition, java.lang.String directory, java.lang.String suffix)Returns the absolute path to a dagman (usually) related file for a particular partition in the submit directory that is passed as an input parameter.protected java.lang.StringgetBasename(Partition partition, java.lang.String suffix)Returns the basename of a dagman (usually) related file for a particular partition.protected java.lang.StringgetBaseName(Partition partition)Returns the base name of the submit directory in which the submit files for a particular partition reside.protected java.lang.StringgetBasenamePrefix(Job job)Returns the basename prefix of a dagman (usually) related file for a a job that submits nested dagman.protected java.lang.StringgetCacheFilePath(Job job)Returns the full path to a cache file that corresponds for one partition.private java.lang.StringgetCondorFileName(java.lang.String name, int index, java.lang.String suffix)A small utility method that constructs the name of the Condor files that are generated when a dag is submitted.private java.lang.StringgetCondorFileName(java.lang.String name, int index, java.lang.String suffix, java.lang.String separator)A small utility method that constructs the name of the Condor files that are generated when a dag is submitted.java.lang.ObjectgetConstructedObject()Returns the MEGADAG that is generatedprotected JobgetJob(java.lang.String id)Returns the job that has been constructed for a particular partition.protected intgetPartitionCount(java.lang.String pdax)Returns the number of partitions referred to in the PDAX file.protected static intparseInt(java.lang.String s)Parses a string into an integer.protected static voidsanityCheck(java.io.File dir)Checks the destination location for existence, if it can be created, if it is writable etc.protected voidsetPrescript(Job job, java.lang.String daxURL, java.lang.String log)Sets the prescript that ends up calling to the default wrapper that introduces retry into Pegasus for a particular job.protected voidsetPrescript(Job job, java.lang.String daxURL, java.lang.String log, java.lang.String namespace, java.lang.String name, java.lang.String version)Sets the prescript that ends up calling to the default wrapper that introduces retry into Pegasus for a particular job.protected java.lang.StringwriteOutBraindump(java.io.File directory, Partition partition, java.lang.String dax, java.lang.String dag)Writes out the braindump.txt file for a partition in the partition submit directory.protected java.lang.StringwriteOutProperties(java.lang.String directory)Writes out the properties to a temporary file in the directory passed.
-
-
-
Field Detail
-
CODE_GENERATOR_CLASS
public static final java.lang.String CODE_GENERATOR_CLASS
The SubmitWriter that has to be loaded for now.- See Also:
- Constant Field Values
-
SUBMIT_DIRECTORY_PREFIX
public static final java.lang.String SUBMIT_DIRECTORY_PREFIX
The prefix for the submit directory.- See Also:
- Constant Field Values
-
NUM_OF_EXPANDED_JOBS
public static final int NUM_OF_EXPANDED_JOBS
The number of jobs into which each job in the partition graph is expanded to.- See Also:
- Constant Field Values
-
HEAD_INDEX
public static final int HEAD_INDEX
The index of the head job.- See Also:
- Constant Field Values
-
TAIL_INDEX
public static final int TAIL_INDEX
The index of the tail job.- See Also:
- Constant Field Values
-
CPLANNER_LOGICAL_NAME
public static final java.lang.String CPLANNER_LOGICAL_NAME
The logical name with which to query the transformation catalog for cPlanner executable.- See Also:
- Constant Field Values
-
CONDOR_DAGMAN_NAMESPACE
public static final java.lang.String CONDOR_DAGMAN_NAMESPACE
The namespace to use for condor dagman.- See Also:
- Constant Field Values
-
CONDOR_DAGMAN_LOGICAL_NAME
public static final java.lang.String CONDOR_DAGMAN_LOGICAL_NAME
The logical name with which to query the transformation catalog for the condor_dagman executable, that ends up running the mini dag as one job.- See Also:
- Constant Field Values
-
NAMESPACE
public static final java.lang.String NAMESPACE
The namespace to which the job in the MEGA DAG being created refer to.- See Also:
- Constant Field Values
-
RETRY_LOGICAL_NAME
public static final java.lang.String RETRY_LOGICAL_NAME
The planner utility that needs to be called as a prescript.- See Also:
- Constant Field Values
-
DAGMAN_KNOBS
public static final java.lang.String[][] DAGMAN_KNOBS
The dagman knobs controlled through property. They map the property name to the corresponding dagman option.
-
mSeparator
protected static char mSeparator
The file Separator to be used on the submit host.
-
mPDAXDirectory
private java.lang.String mPDAXDirectory
The directory in which the daxes corresponding to the partitions are kept. This should be the same directory where the pdax containing the partition graph resides.
-
mSubmitDirectory
private java.lang.String mSubmitDirectory
The root of the submit directory where all the submit directories for the various partitions reside.
-
mMegaDAG
private ADag mMegaDAG
The abstract dag object that ends up holding the megadag.
-
mJobMap
private java.util.Map mJobMap
The internal map that maps the partition id to the job responsible for executing the partition..
-
mProps
private PegasusProperties mProps
The handle to the properties file.
-
mTCHandle
private TransformationCatalog mTCHandle
The handle to the transformation catalog.
-
mLogger
private LogManager mLogger
The handle to the logging object.
-
mPOptions
private PlannerOptions mPOptions
The object containing the options that were given to the concrete planner at runtime.
-
mMDAGPropertiesFile
private java.lang.String mMDAGPropertiesFile
The path to the properties file that is written out and shared by all partitions in the mega DAG.
-
mFactory
private org.griphyn.vdl.euryale.FileFactory mFactory
The handle to the file factory, that is used to create the top level directories for each of the partitions.
-
mDefaultCallback
private StreamGobblerCallback mDefaultCallback
An instance of the default stream gobbler callback implementation that is used for creating symbolic links.
-
mNumFormatter
private java.text.NumberFormat mNumFormatter
The number formatter to format the run submit dir entries.
-
mUser
private java.lang.String mUser
The user name of the user running Pegasus.
-
mDone
private boolean mDone
A flag to store whether the parsing is complete or not.
-
mDAGManKnobs
private java.lang.String mDAGManKnobs
Any extra arguments that need to be passed to dagman, as determined from the properties file.
-
mCondorVersion
private long mCondorVersion
The long value of condor version.
-
mCleanupScope
private PegasusProperties.CLEANUP_SCOPE mCleanupScope
The cleanup scope for the workflows.
-
-
Constructor Detail
-
PDAX2MDAG
public PDAX2MDAG(java.lang.String directory, PegasusProperties properties, PlannerOptions options)The overloaded constructor.- Parameters:
directory- the directory where the pdax and all the daxes corresponding to the partitions reside.properties- thePegasusPropertiesto be used.options- the options passed to the planner.
-
-
Method Detail
-
sanityCheck
protected static void sanityCheck(java.io.File dir) throws java.io.IOExceptionChecks the destination location for existence, if it can be created, if it is writable etc.- Parameters:
dir- is the new base directory to optionally create.- Throws:
java.io.IOException- in case of error while writing out files.
-
cbDocument
public void cbDocument(java.util.Map attributes)
Callback when the opening tag was parsed. This contains all attributes and their raw values within a map. This callback can also be used to initialize callback-specific resources.- Specified by:
cbDocumentin interfaceCallback- Parameters:
attributes- is a map of attribute key to attribute value
-
cbPartition
public void cbPartition(Partition partition)
Callback for the partition . These partitions are completely assembled, but each is passed separately.- Specified by:
cbPartitionin interfaceCallback- Parameters:
partition- is the PDAX-style partition.
-
cbParents
public void cbParents(java.lang.String child, java.util.List parents)Callback for child and parent relationships from section 3. This ties in the relations between the partitions to the relations between the jobs that are responsible for partitions. In addition, appropriate cache file arguments are generated.
-
cbDone
public void cbDone()
Callback when the parsing of the document is done. This ends up triggering the writing of the condor submit files corresponding to the mega dag.
-
getConstructedObject
public java.lang.Object getConstructedObject()
Returns the MEGADAG that is generated- Specified by:
getConstructedObjectin interfaceCallback- Returns:
- ADag object containing the mega daga
-
constructDAGJob
protected Job constructDAGJob(Partition partition, java.io.File directory, java.lang.String dax)
Constructs a job that plans and submits the partitioned workflow, referred to by a Partition. The main job itself is a condor dagman job that submits the concrete workflow. The concrete workflow is generated by running the planner in the prescript for the job.- Parameters:
partition- the partition corresponding to which the job has to be constructed.directory- the submit directory where the submit files for the partition should reside.dax- the absolute path to the partitioned dax file that corresponds to this partition.- Returns:
- the constructed DAG job.
-
defaultTCEntry
private TransformationCatalogEntry defaultTCEntry(java.lang.String site)
Returns a default TC entry to be used in case entry is not found in the transformation catalog.- Parameters:
site- the site for which the default entry is required.- Returns:
- the default entry.
-
constructTCEntryFromEnvironment
private TransformationCatalogEntry constructTCEntryFromEnvironment()
Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION- Returns:
- the constructed entry else null.
-
constructTCEntryFromEnvProfiles
private TransformationCatalogEntry constructTCEntryFromEnvProfiles(ENV env)
Returns a tranformation catalog entry object constructed from the environment An entry is constructed if either of the following environment variables are defined 1) CONDOR_HOME 2) CONDOR_LOCATION CONDOR_HOME takes precedence over CONDOR_LOCATION- Parameters:
env- the environment profiles.- Returns:
- the entry constructed else null if environment variables not defined.
-
writeOutBraindump
protected java.lang.String writeOutBraindump(java.io.File directory, Partition partition, java.lang.String dax, java.lang.String dag) throws java.io.IOExceptionWrites out the braindump.txt file for a partition in the partition submit directory. The braindump.txt file is used for passing to the tailstatd daemon that monitors the state of execution of the workflow.- Parameters:
directory- the directory in which the braindump file needs to be written to.partition- the partition for which the braindump is to be written out.dax- the dax filedag- the dag file- Returns:
- the absolute path to the braindump file.txt written in the directory.
- Throws:
java.io.IOException- in case of error while writing out file.
-
writeOutProperties
protected java.lang.String writeOutProperties(java.lang.String directory) throws java.io.IOExceptionWrites out the properties to a temporary file in the directory passed.- Parameters:
directory- the directory in which the properties file needs to be written to.- Returns:
- the absolute path to the properties file written in the directory.
- Throws:
java.io.IOException- in case of error while writing out file.
-
setPrescript
protected void setPrescript(Job job, java.lang.String daxURL, java.lang.String log)
Sets the prescript that ends up calling to the default wrapper that introduces retry into Pegasus for a particular job.- Parameters:
job- the job whose prescript needs to be set.daxURL- the path to the dax file on the filesystem.log- the file where the output of the prescript needs to be redirected to.- See Also:
RETRY_LOGICAL_NAME
-
setPrescript
protected void setPrescript(Job job, java.lang.String daxURL, java.lang.String log, java.lang.String namespace, java.lang.String name, java.lang.String version)
Sets the prescript that ends up calling to the default wrapper that introduces retry into Pegasus for a particular job.- Parameters:
job- the job whose prescript needs to be set.daxURL- the path to the dax file on the filesystem.log- the file where the output of the prescript needs to be redirected to.namespace- the namespace of the replanner utility.name- the logical name of the replanner.version- the version of the replanner to be picked up.
-
getBaseName
protected java.lang.String getBaseName(Partition partition)
Returns the base name of the submit directory in which the submit files for a particular partition reside.- Parameters:
partition- the partition for which the base directory is to be constructed.- Returns:
- the base name of the partition.
-
getAbsolutePath
protected java.lang.String getAbsolutePath(Partition partition, java.lang.String directory, java.lang.String suffix)
Returns the absolute path to a dagman (usually) related file for a particular partition in the submit directory that is passed as an input parameter. This does not create the file, just returns an absolute path to it. Useful for constructing argument string for condor_dagman.- Parameters:
partition- the partition for which the dagman is responsible for execution.directory- the directory where the file should reside.suffix- the suffix for the file basename.- Returns:
- the absolute path to a file in the submit directory.
-
getBasename
protected java.lang.String getBasename(Partition partition, java.lang.String suffix)
Returns the basename of a dagman (usually) related file for a particular partition.- Parameters:
partition- the partition for which the dagman is responsible for execution.suffix- the suffix for the file basename.- Returns:
- the basename.
-
getBasenamePrefix
protected java.lang.String getBasenamePrefix(Job job)
Returns the basename prefix of a dagman (usually) related file for a a job that submits nested dagman.- Parameters:
job- the job that submits a nested dagman.- Returns:
- the basename.
-
getCacheFilePath
protected java.lang.String getCacheFilePath(Job job)
Returns the full path to a cache file that corresponds for one partition. The cache file resides in the submit directory for the partition for which the job is responsible for.- Parameters:
job- the job running on the submit host that submits the partition.- Returns:
- the full path to the file.
-
createSymlink
protected boolean createSymlink(java.lang.String source, java.io.File destDir)Returns the number of partitions referred to in the PDAX file.- Parameters:
source- the source file that has to be symlinked.destDir- the destination directory where the symlink has to be placed.- Returns:
- the number of partitions in the pdax file.
-
getPartitionCount
protected int getPartitionCount(java.lang.String pdax)
Returns the number of partitions referred to in the PDAX file.- Parameters:
pdax- the path to the pdax file.- Returns:
- the number of partitions in the pdax file.
-
getJob
protected Job getJob(java.lang.String id)
Returns the job that has been constructed for a particular partition.- Parameters:
id- the partition id.- Returns:
- the corresponding job, else null if not found.
-
createSubmitDirectory
protected java.lang.String createSubmitDirectory(java.lang.String label, java.lang.String dir, java.lang.String user, java.lang.String vogroup, boolean timestampBased) throws java.io.IOExceptionCreates the submit directory for the workflow. This is not thread safe.- Parameters:
label- the label of the workflow being worked upon.dir- the base directory specified by the user.user- the username of the user.vogroup- the vogroup to which the user belongs to.timestampBased- boolean indicating whether to have a timestamp based dir or not- Returns:
- the directory name created relative to the base directory passed as input.
- Throws:
java.io.IOException- in case of unable to create submit directory.
-
constructDAGManKnobs
public static java.lang.String constructDAGManKnobs(PegasusProperties properties)
Constructs Any extra arguments that need to be passed to dagman, as determined from the properties file.- Parameters:
properties- thePegasusProperties- Returns:
- any arguments to be added, else empty string
-
parseInt
protected static int parseInt(java.lang.String s)
Parses a string into an integer. Non valid values returned as -1- Parameters:
s- the String to be parsed as integer- Returns:
- the int value if valid, else -1
-
getCondorFileName
private java.lang.String getCondorFileName(java.lang.String name, int index, java.lang.String suffix)A small utility method that constructs the name of the Condor files that are generated when a dag is submitted. The default separator _ is used.- Parameters:
name- the name attribute in the partition element of the pdax.index- the partition number of the partition.suffix- the suffix that needs to be added to the filename.- Returns:
- the name of the condor file.
-
getCondorFileName
private java.lang.String getCondorFileName(java.lang.String name, int index, java.lang.String suffix, java.lang.String separator)A small utility method that constructs the name of the Condor files that are generated when a dag is submitted.- Parameters:
name- the name attribute in the partition element of the pdax.index- the partition number of the partition.suffix- the suffix that needs to be added to the filenameseparator- the separator that is to be used while constructing the filename.- Returns:
- the name of the condor file
-
-