Class InPlace
- java.lang.Object
-
- edu.isi.pegasus.planner.refiner.cleanup.InPlace
-
- All Implemented Interfaces:
CleanupStrategy
public class InPlace extends java.lang.Object implements CleanupStrategy
This generates cleanup jobs in the workflow itself.- Version:
- $Revision$
- Author:
- Arun ramakrishnan, Karan Vahi
-
-
Field Summary
Fields Modifier and Type Field Description static java.lang.StringCLEANUP_JOB_PREFIXThe prefix for CLEANUP_JOB ID i.e prefix+the parent compute_job ID becomes ID of the cleanup job.static intDEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVELThe default value for the number of clustered cleanup jobs created per level.static java.lang.StringDEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORYThe default value for the maxjobs variable for the category of cleanup jobs.private intmCleanupJobsPerLevelThe number of cleanup jobs per level to be createdprivate intmCleanupJobsSizethe number of cleanup jobs clustered into a clustered cleanup jobprivate java.util.HashSetmDoNotCleanHashSet of Files that should not be cleaned upprivate CleanupImplementationmImplThe handle to the CleanupImplementation instance that creates the jobs for us.private LogManagermLoggerThe handle to the logging object used for logging.private intmMaxDepthThe max depth of any job in the workflow useful for a priorityQueue implementation in an arrayprivate PegasusPropertiesmPropsThe handle to the properties passed to Pegasus.private java.util.HashMapmResMapThe mapping to siteHandle to all the jobs that are mapped to it mapping to siteHandle(String) to Setprivate java.util.HashMapmResMapLeavesThe mapping of siteHandle to all subset of the jobs mapped to it that are leaves in the workflow mapping to siteHandle(String) to Set. private java.util.HashMapmResMapRootsThe mapping of siteHandle to all subset of the jobs mapped to it that are roots in the workflow mapping to siteHandle(String) to Set. private booleanmUseSizeFactorA boolean indicating whether we prefer use the size factor or the num factor-
Fields inherited from interface edu.isi.pegasus.planner.refiner.cleanup.CleanupStrategy
VERSION
-
-
Constructor Summary
Constructors Constructor Description InPlace()The default constructor.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description GraphaddCleanupJobs(Graph workflow)Adds cleanup jobs to the workflow.private voidaddCleanUpJobs(java.lang.String site, java.util.Set leaves, Graph workflow)Adds cleanup jobs for the workflow scheduled to a particular site a breadth first search strategy is implemented based on the depth of the job in the workflowprotected voidapplyJobPriorities(Graph workflow)Adds job priorities to the jobs in the workflow on the basis of the levels in the traversal order given by the iterator.private java.util.List<GraphNode>clusterCleanupGraphNodes(java.util.List<GraphNode> cleanupNodes, java.util.HashMap cleanedBy, java.lang.String site, int level)Takes in a list of cleanup nodes ,one per cleanupNode(compute/stageout job) whose files need to be deleted) and clusters them into a smaller set of cleanup nodes.private GraphNodecreateClusteredCleanupGraphNode(java.util.List<GraphNode> nodes, java.util.HashMap cleanedBy, java.lang.String site, int level, int index)Creates a clustered cleanup graph node that aggregates multiple cleanup nodes into one nodeprotected java.lang.StringgenerateCleanupID(Job job)Returns the identifier that is to be assigned to cleanup job.java.lang.StringgenerateClusteredJobID(java.lang.String site, int level, int index)Generated an ID for a clustered cleanup jobprivate intgetClusterSize(int size)Returns the number of cleanup jobs clustered into one job per level.java.lang.StringgetDefaultCleanupMaxJobsPropertyKey()Returns the property key that can be used to set the max jobs for the default category associated with the registration jobs.protected java.lang.StringgetSiteForCleanup(Job job)Returns site to be used for the cleanup algorithm.voidinitialize(PegasusBag bag, CleanupImplementation impl)Intializes the class.protected voidreduceDependency(GraphNode node)Reduces the number of edges between the nodes and it's parents.protected voidreset()Resets the internal data structures.private voidsetDepth_ResMap(java.util.List roots)A BFS implementation to set depth value (roots have depth 1) and also to populate mResMap ,mResMapLeaves,mResMapRoots which contains all the jobs that are assigned to a particular resourceprotected booleantypeNeedsCleanUp(int type)Checks to see which job types are required to be looked at for cleanup.protected booleantypeNeedsCleanUp(GraphNode node)Checks to see which job types are required to be looked at for cleanup.protected booleantypeStageOut(int type)Checks to see if job type is a stageout job type.
-
-
-
Field Detail
-
CLEANUP_JOB_PREFIX
public static final java.lang.String CLEANUP_JOB_PREFIX
The prefix for CLEANUP_JOB ID i.e prefix+the parent compute_job ID becomes ID of the cleanup job.- See Also:
- Constant Field Values
-
DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
public static final java.lang.String DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
The default value for the maxjobs variable for the category of cleanup jobs.- See Also:
- Constant Field Values
-
DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
public static final int DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
The default value for the number of clustered cleanup jobs created per level.- See Also:
- Constant Field Values
-
mResMap
private java.util.HashMap mResMap
The mapping to siteHandle to all the jobs that are mapped to it mapping to siteHandle(String) to Set
-
mResMapLeaves
private java.util.HashMap mResMapLeaves
The mapping of siteHandle to all subset of the jobs mapped to it that are leaves in the workflow mapping to siteHandle(String) to Set.
-
mResMapRoots
private java.util.HashMap mResMapRoots
The mapping of siteHandle to all subset of the jobs mapped to it that are roots in the workflow mapping to siteHandle(String) to Set.
-
mMaxDepth
private int mMaxDepth
The max depth of any job in the workflow useful for a priorityQueue implementation in an array
-
mDoNotClean
private java.util.HashSet mDoNotClean
HashSet of Files that should not be cleaned up
-
mImpl
private CleanupImplementation mImpl
The handle to the CleanupImplementation instance that creates the jobs for us.
-
mProps
private PegasusProperties mProps
The handle to the properties passed to Pegasus.
-
mLogger
private LogManager mLogger
The handle to the logging object used for logging.
-
mCleanupJobsPerLevel
private int mCleanupJobsPerLevel
The number of cleanup jobs per level to be created
-
mCleanupJobsSize
private int mCleanupJobsSize
the number of cleanup jobs clustered into a clustered cleanup job
-
mUseSizeFactor
private boolean mUseSizeFactor
A boolean indicating whether we prefer use the size factor or the num factor
-
-
Method Detail
-
initialize
public void initialize(PegasusBag bag, CleanupImplementation impl)
Intializes the class.- Specified by:
initializein interfaceCleanupStrategy- Parameters:
bag- bag of initialization objectsimpl- the implementation instance that creates cleanup job
-
addCleanupJobs
public Graph addCleanupJobs(Graph workflow)
Adds cleanup jobs to the workflow.- Specified by:
addCleanupJobsin interfaceCleanupStrategy- Parameters:
workflow- the workflow to add cleanup jobs to.- Returns:
- the workflow with cleanup jobs added to it.
-
reset
protected void reset()
Resets the internal data structures.
-
setDepth_ResMap
private void setDepth_ResMap(java.util.List roots)
A BFS implementation to set depth value (roots have depth 1) and also to populate mResMap ,mResMapLeaves,mResMapRoots which contains all the jobs that are assigned to a particular resource- Parameters:
roots- List of GraphNode objects that are roots
-
addCleanUpJobs
private void addCleanUpJobs(java.lang.String site, java.util.Set leaves, Graph workflow)Adds cleanup jobs for the workflow scheduled to a particular site a breadth first search strategy is implemented based on the depth of the job in the workflow- Parameters:
site- the site IDleaves- the leaf jobs that are scheduled to siteworkflow- the Graph into which new cleanup jobs can be added
-
reduceDependency
protected void reduceDependency(GraphNode node)
Reduces the number of edges between the nodes and it's parents.For the node look at the parents of the Node. For each parent Y see if there is a path to any other parent Z of X. If a path exists, then the edge from Z to node can be removed.
- Parameters:
node- the nodes whose parent edges need to be reduced.
-
applyJobPriorities
protected void applyJobPriorities(Graph workflow)
Adds job priorities to the jobs in the workflow on the basis of the levels in the traversal order given by the iterator. Later on this would be a separate refiner.- Parameters:
workflow- the workflow on which to apply job priorities.
-
generateCleanupID
protected java.lang.String generateCleanupID(Job job)
Returns the identifier that is to be assigned to cleanup job.- Parameters:
job- the job with which the cleanup job is primarily associated.- Returns:
- the identifier for a cleanup job.
-
generateClusteredJobID
public java.lang.String generateClusteredJobID(java.lang.String site, int level, int index)Generated an ID for a clustered cleanup job- Parameters:
site- the site associated with the cleanup jobslevel- the level of the workflowindex- the index of the job on that level- Returns:
-
typeNeedsCleanUp
protected boolean typeNeedsCleanUp(GraphNode node)
Checks to see which job types are required to be looked at for cleanup. COMPUTE_JOB , STAGE_OUT_JOB , INTER_POOL_JOB are the ones that need cleanup- Parameters:
node- the graph node- Returns:
- boolean
-
typeNeedsCleanUp
protected boolean typeNeedsCleanUp(int type)
Checks to see which job types are required to be looked at for cleanup. COMPUTE_JOB , STAGE_OUT_JOB , INTER_POOL_JOB are the ones that need cleanup- Parameters:
type- the type of the job.- Returns:
- boolean
-
typeStageOut
protected boolean typeStageOut(int type)
Checks to see if job type is a stageout job type.- Parameters:
type- the type of the job.- Returns:
- boolean
-
getSiteForCleanup
protected java.lang.String getSiteForCleanup(Job job)
Returns site to be used for the cleanup algorithm. For compute jobs the staging site is used, while for stageout jobs is used. For all other jobs the execution site is used.- Parameters:
job- the job- Returns:
- the site to be used
-
getDefaultCleanupMaxJobsPropertyKey
public java.lang.String getDefaultCleanupMaxJobsPropertyKey()
Returns the property key that can be used to set the max jobs for the default category associated with the registration jobs.- Returns:
- the property key
-
clusterCleanupGraphNodes
private java.util.List<GraphNode> clusterCleanupGraphNodes(java.util.List<GraphNode> cleanupNodes, java.util.HashMap cleanedBy, java.lang.String site, int level)
Takes in a list of cleanup nodes ,one per cleanupNode(compute/stageout job) whose files need to be deleted) and clusters them into a smaller set of cleanup nodes.- Parameters:
cleanupNodes- List of stub cleanup nodes created corresponding to a job in the workflow that needs cleanup. the cleanup jobs have content as a CleanupJobContentcleanedBy- a map that tracks which file was deleted by which cleanup jobsite- the site associated with the cleanup jobslevel- the level of the workflow- Returns:
- a set of clustered cleanup nodes
-
createClusteredCleanupGraphNode
private GraphNode createClusteredCleanupGraphNode(java.util.List<GraphNode> nodes, java.util.HashMap cleanedBy, java.lang.String site, int level, int index)
Creates a clustered cleanup graph node that aggregates multiple cleanup nodes into one node- Parameters:
nodes- list of cleanup nodes that are to be aggregatedcleanedBy- a map that tracks which file was deleted by which cleanup jobsite- the site associated with the cleanup jobslevel- the level of the workflowindex- the index of the cleanup job for that level- Returns:
- a clustered cleanup node with the appropriate linkages added to the workflow else, null if the clustered cleanup node has no files to delete
-
getClusterSize
private int getClusterSize(int size)
Returns the number of cleanup jobs clustered into one job per level.- Parameters:
size- the number of cleanup jobs created by the algorithm before clustering for the level.- Returns:
- the number of clustered cleanup jobs to be created for the level
-
-