AutoSys Introduction for Beginners
What is AutoSys?
- It is an automated, distributed job management system
- It enables the centralization and automated scheduling and management of jobs in distributed UNIX and Windows NT environments. Jobs run through AutoSys can run on any AutoSys-configured machine that is attached to the network.
- It provides users with functionality to define, report & monitor job streams
- It is based on a client/server architecture with three components:
- Event Processor: The main processor of AutoSys. The EP scans the Event Server (the database) for events that are ready to be processed and, when one is found, checks whether the starting conditions have been met. If they have, the EP opens a connection to the appropriate Remote Agent, providing it with the necessary information to perform the action or job.
- Database (the Event Server): Stores all events, job information, job status, etc. It can be an RDBMS such as Sybase or Oracle.
- Remote Agent: when notified by the EP, performs its tasks and sends the resulting job status back to the database. On a UNIX machine, the Remote Agent is a temporary process started by the Event Processor to perform a specific task on a remote (client) machine. On a Windows NT machine, the Remote Agent is a Windows NT service running on a remote (client) machine that is directed by the Event Processor to perform specific tasks.
Events:
AutoSys is an event-driven system. For any action to happen, there must be an event that triggers it.
Events can come from a number of sources, including the following:
- Job status changes, such as starting, running, finishing successfully, etc.
- Errors detected by internal AutoSys verification agents related to AutoSys components
- Events sent with the sendevent command from any of the AutoSys user interfaces
Types of Jobs:
AutoSys jobs are of three kinds:
- Box Jobs: A box job contains a group of box jobs, command jobs, and/or file watcher jobs. Box jobs allow us to manipulate the jobs as a group/batch.
- Command Jobs: Command jobs execute any command, script, executable, or batch file, with any parameters.
- File Watcher Jobs: File watchers watch for a particular file or a pattern of files. The file pattern to be watched can also contain global variables. These jobs complete their run once the file(s) being watched have arrived completely.
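As a rough illustration of the three job kinds, the sketch below defines one of each in JIL (Job Information Language, the language AutoSys uses for job definitions). All job names, the host name, and the paths are hypothetical placeholders, and the exact value formats should be checked against the reference guide for your release.

```jil
/* Box job: a container that groups the two jobs below */
insert_job: demo_box
job_type: b

/* File watcher job: completes once the watched file has arrived */
insert_job: demo_watch
job_type: f
box_name: demo_box
machine: demo_host
watch_file: /tmp/demo/input.dat

/* Command job: runs a script once the file watcher has succeeded */
insert_job: demo_load
job_type: c
box_name: demo_box
machine: demo_host
command: /opt/demo/bin/load.sh /tmp/demo/input.dat
condition: success(demo_watch)
```

Placing both jobs in the box means they can be started, held, or monitored together through the box job.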
Defining a Job in AutoSys Using JIL Attributes:
JIL stands for “Job Information Language”
- insert_job: Creates a new job of one of the following types: command job, box job, or file watcher job
- job_type: Describes the job type:
- b – box
- c – command
- f – file watcher
- command: The command attribute can be the name of any command, executable, script, or batch file, and its arguments. Input and output redirection is provided by other job attributes and cannot be part of command.
- machine: Specifies the application server name where the job will be run. The machine can be a specific real machine, a set of real machines, or a virtual machine.
- owner: Specifies the owner of the job. The owner, by default, is specified in the user@host_or_domain format. This account is used to run the command on the machine. Recommended is to use @ for Windows servers and for UNIX servers.
- std_out_file: Specifies the file to which the standard output should be redirected. Any file for which the job owner has write permission on the client machine can be specified as the standard out file.
- std_err_file: Specifies the file to which the standard error should be redirected. Any file for which the job owner has write permission on the client machine can be specified as the standard error file.
- profile: Specifies an AutoSys job profile that defines environment variables to be set before the specified command is executed.
- date_conditions: This attribute specifies whether or not there are date or time conditions for starting this job. If set to “yes,” the day can be specified using the days_of_week attribute, or the specific dates can be specified by associating this job with a custom calendar. Starting times can also be specified using the start_times attribute to request specific times per day, or using the start_mins attribute to request specific times per hour. If it is set to “no,” the remainder of the date/time related attributes will be ignored.
- days_of_week: Indicates the days of the week when the job will be run. One or more days can be selected, or all days can be selected. This attribute and the run_calendar attribute are mutually exclusive. AutoSys will schedule the job to run on every day of the week specified by this attribute, at the times specified in the start_times or start_mins attribute, one of which must be specified if this attribute is used.
- start_times: Indicates the times of day, in 24-hour format, on the specified days or dates, when the job will be started. The days or dates must be specified using one of the following attributes: days_of_week or run_calendar. This attribute overrides any times set in a run calendar.
- start_mins: Indicates the number of minutes past the hour, every hour, on the specified days or dates, when the job will be started. The days or dates must be specified using one of the following attributes: days_of_week or run_calendar. This attribute overrides any times set in a run calendar.
- run_calendar: Indicates the name of the custom calendar to be used when determining the days of the week on which a job will run. This attribute is useful for complex date specification, such as running a job on the last business day of the month. The custom calendar will list the dates and the times when the job is to be run. This attribute and the days_of_week attribute are mutually exclusive.
- exclude_calendar: Indicates the name of the custom calendar to be used for determining the days of the week on which this job will not run. This attribute should be used in conjunction with days_of_week and start_times / start_mins attributes. If an exclude_calendar is specified and the job has other implicit or explicit start conditions, the Event processor will inspect the calendar before starting the job. If the current date is on the calendar, the job will not be started. This attribute and the run_calendar attribute are mutually exclusive.
- run_window: Indicates the time span during which the job will be allowed to start. If this attribute is specified, then when the job is eligible to run (based on its starting conditions) AutoSys will check if the current time falls within the specified run window.
- condition: When using the condition attribute, any number of job dependencies can be specified. All dependencies must evaluate to “true” before the dependent job will be run. Starting conditions can be one or more of the following types of dependencies:
- Status of a job; success, failure, terminated, notrunning or done
- Exit code of a job; ex: exitcode (aut_c_job)=4
- Global variables; ex: VALUE(TODAY)=Friday
- box_name: Indicates the name of the box job in which this job is to be placed.
- max_run_alarm: Specifies the maximum runtime (in minutes) that a job should require to finish normally. If the job runs longer than this time, an alarm is generated and the job will continue running.
- min_run_alarm: Specifies the minimum runtime (in minutes) that a job should require to finish normally.
- max_exit_success: Specifies the maximum exit code with which the job can exit and still be considered a success by AutoSys. An exit code equal to or less than this value will be considered a success.
- n_retrys: Specifies how many times, if any, the job should be restarted after exiting with a FAILURE status. If a job is TERMINATED, it will not restart.
- box_success: Specifies the conditions to be interpreted as a box success. The default condition for a box to be considered successful is that every job in the box completed with a success condition.
- box_failure: Specifies the conditions to be interpreted as a box failure
- box_terminator: This attribute specifies whether the box containing this job should be terminated if the job fails or terminates.
- job_terminator: This attribute specifies whether the job should be terminated if the box it is in fails or terminates.
- term_run_time: Specifies the maximum runtime (in minutes) that a job should require to finish normally. If the job runs longer than this time, it will be automatically terminated by AutoSys.
- timezone: Allows a job to be scheduled based on a chosen time zone.
- watch_file: Specifies the full path name of the file for which this file watcher job should watch.
- watch_file_min_size: Specifies the watch file minimum size (in bytes), which determines when enough data has been written to the file to consider it complete. AutoSys does not consider a file complete until both the minimum file size is reached and the watch interval has detected a “steady state,” that is, the file size has not changed between checks.
- watch_interval: Specifies the interval (in seconds) at which the file watcher job will check for the existence and size of the watched-for file. A “steady state” is said to have been reached when the file hasn’t grown during the specified time interval.
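To tie the attributes above together, here is a minimal JIL sketch of a file watcher job and a scheduled command job that depends on it. The attribute names are the ones described above; all job names, the host and user names, the paths, and the numeric values are hypothetical placeholders. Value formats such as the day abbreviations and the quoted start time follow common usage and should be verified against your release before the definitions are loaded with the jil command.

```jil
/* File watcher: the file is complete once it is at least 1024 bytes
   and its size has not changed across a 60-second check interval */
insert_job: demo_feed_watch
job_type: f
machine: demo_host
watch_file: /tmp/demo/feed.dat
watch_file_min_size: 1024
watch_interval: 60

/* Command job: scheduled on weekdays at 22:30 (date_conditions: 1 means yes)
   and started only if the file watcher above has succeeded */
insert_job: demo_nightly_report
job_type: c
command: /opt/demo/bin/report.sh
machine: demo_host
owner: demo_user@demo_host
std_out_file: /tmp/demo_nightly_report.out
std_err_file: /tmp/demo_nightly_report.err
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "22:30"
condition: success(demo_feed_watch)
/* Exit codes 0 and 1 count as success; restart up to 2 times after FAILURE */
max_exit_success: 1
n_retrys: 2
/* Raise an alarm after 60 minutes of runtime, terminate after 120 */
max_run_alarm: 60
term_run_time: 120
```

In this sketch the command job needs both its date/time conditions and its job dependency to be satisfied before it starts. The two runtime limits are set to different values so that an alarm is raised well before AutoSys terminates the job.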
AutoSys Job states:
- Starting – calling the job on the specified machine (instantiating the remote agent)
- Running – start has succeeded and the job is now running.
- Inactive – job has not been processed; the job has never been run, or its status was intentionally altered to “turn off” its previous completion status.
- Success – job exited with an exit code equal to or less than the “maximum exit code for success.”
- Failure – job exited with an exit code greater than the “maximum exit code for success.”
- Terminated – job terminated while in running state. Typically, this indicates that a user has intervened with the job and has killed it.
- Restart – job unable to start (or failed) due to hardware or software problems
- Activated – top level box that this job is in is now in running state but the job itself has not yet started; job is now ready to run once its conditions are met.
- Que_Wait – job can logically run, but there are not enough machine resources available at the time
- On_Ice – job is removed from all conditions and logic but is still defined to AutoSys (deactivated). Any conditions that other jobs have on this particular job are treated as met.
- On_Hold – job is on hold and stays that way until it receives Job_Off_Hold. While it is on hold, jobs downstream will NOT be run. On_Hold delays anything waiting on it.
Job Statuses:
- INACTIVE: The job has not yet been processed. Either the job has never been run, or its status was intentionally altered to turn off its previous completion status.
- ACTIVATED: The top-level box that this job is in is now in the RUNNING state, but the job itself has not started yet.
- STARTING: The event processor has initiated the start job procedure with the remote agent.
- RUNNING: The job is running. If the job is a box job, this value simply means that the jobs within the box can be started (other conditions permitting). If it is a command or file watcher job, the value means that the process is actually running on the remote machine.
- SUCCESS: The job exited with an exit code equal to or less than the maximum exit code for success. By default, only the exit code 0 is interpreted as success. However, a range of values up to the maximum exit code for success can be reserved for each job to be interpreted as success. If the job is a box job, this value means that all the jobs within the box have finished with the status SUCCESS (the default), or the Exit Condition for Box Success evaluated to true. (These exit conditions are discussed further in later sections.)
- FAILURE: The job exited with an exit code greater than the maximum exit code for success. By default, any number greater than zero is interpreted as failure. If the job is a box job, a FAILURE status means either that at least one job within the box exited with the status FAILURE (the default), or that the Exit Condition for Box Failure evaluated to true. Unicenter AutoSys JM issues an alarm if a job fails.
- TERMINATED: The job terminated while in the RUNNING state. A job can be terminated if a user sends a KILLJOB event or if it was defined to terminate if the box it is in failed. If the job itself fails, it has a FAILURE status, not a TERMINATED status. A job may also be terminated if it has exceeded the maximum runtime (term_run_time attribute, if one was specified for the job), or if it was killed from the command line through a UNIX kill command. Unicenter AutoSys JM issues an alarm if a job is terminated.
- RESTART: The job was unable to start due to hardware or application problems, and has been scheduled to restart.
- QUE_WAIT: The job can logically run (that is, all the starting conditions have been met), but there are not enough machine resources available.
- ON_HOLD: This job is on hold and will not be run until it receives the JOB_OFF_HOLD event.
- OFF_HOLD: Take job off hold. (opposite of ON_HOLD)
- ON_ICE: This job is removed from all conditions and logic, but is still defined. Operationally, this condition is like deactivating the job. It will remain on ice until it receives the JOB_OFF_ICE event.
- OFF_ICE: Place job back in run queue. (opposite of ON_ICE)
- REFRESH_DEPENDENCIES: Refreshing its dependencies
Status Codes:
- AC: Activated
- FA: Failure
- IN: Inactive
- OH: On hold
- OI: On ice
- QU: Queue wait
- RE: Restart
- RU: Running
- ST: Starting
- SU: Success
- TE: Terminated
Reference: Unicenter AutoSys Job Management