Working with Jobs¶
The Jobs workspace provides a window onto the tasks that the Aunsight platform runs on the compute resources in a specific context. Because compute operations are an important part of dataflows, workflows, and processes, users need to be able to monitor the status and results of these tasks. This article explains how to manage tasks with the Jobs workspace. Readers will learn how to view, search, and understand the job records that the Aunsight platform uses to track tasks. It also discusses how to stop jobs using the Aunsight Toolbelt command line interface, should that be necessary.
The Aunsight web interface provides a Jobs workspace to users who have the AU-TRACKER:view-job permission through a role or group in a given context.
To access this workspace, log in to the web interface and select the context you wish to work in through the context selector. From that context's dashboard, click the Jobs icon in the palette on the right.
The Jobs workspace is a standard list-based view of jobs in the present context. You can search the list to find a job by clicking the search icon at the top of the list. If a job does not appear, or its status seems out of date, click the refresh icon to update the data in the list.
Searching for Jobs¶
Clicking the search icon brings up a search box with three option buttons.
Typing text into the search box dynamically filters the list by the name or ID of the task. For example, searching for a process name will show jobs related to that process.
Users can also filter the list by clicking one of the filter buttons:
- Filter by type
Displays jobs matching a specific job type.
- Filter by state
Displays jobs in a specific state, such as:
  - Pending (submitted, but not yet run)
  - Killed (stopped)
- Submitted by me only
Displays jobs submitted by the current user.
Viewing Details of a Job¶
Clicking a job on the list panel will bring up that record in the main view of the user interface. Each job record displays data in two tabs: Details and Logs.
The Details tab of each job record displays metadata related to that job. All job records contain the following fields:
- ID - The job ID, which can be used to stop the job
- Name - A descriptive name for the job
- Type - The type of job
- State - The current state of the job (e.g. pending or killed)
- Created At - Timestamp of the job's submission
- State Updated At - Timestamp of the last change to the job's state
- Updated At - Timestamp of the last update about the job from the tracker service
- Duration - Length of time from start to final completion (or the present, if the job is still active)
In addition to these basic settings that are common to all jobs, a number of other fields may be present based on the type of job selected. For file-related jobs, these fields usually refer to locks placed on files, but for computationally-focused jobs, the fields can contain details related to the execution environment (RAM and CPU core allocations, etc.).
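The Duration field above can be derived from the record's timestamps. The following is a minimal sketch in Python; the dictionary keys (created_at, state_updated_at, state) and state names are hypothetical stand-ins for the fields described above, and the actual API representation may differ:

```python
from datetime import datetime, timezone

def job_duration(job: dict) -> float:
    """Return a job's duration in seconds, mirroring the Duration field.

    Field names here are assumptions chosen to match the record fields
    described above; the real job record schema may differ.
    """
    start = datetime.fromisoformat(job["created_at"])
    # For a finished job, duration runs to the final state change;
    # for an active job, it runs to the present.
    if job.get("state") in ("completed", "failed", "killed"):
        end = datetime.fromisoformat(job["state_updated_at"])
    else:
        end = datetime.now(timezone.utc)
    return (end - start).total_seconds()

job = {
    "id": "job-123",
    "state": "killed",
    "created_at": "2024-01-01T12:00:00+00:00",
    "state_updated_at": "2024-01-01T12:05:30+00:00",
}
print(job_duration(job))  # 330.0
```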
The Logs tab displays the contents of the Aunsight Loggerstream object associated with the job. Loggerstreams are Aunsight platform objects that allow tasks to log information about their execution. They are JSON objects containing a series of messages regarding the status of the job.
If you cannot see loggerstream data for a job, make sure you have the necessary permission to view loggerstream objects in that context.
The loggerstream view can be controlled in a variety of ways by the action buttons provided in the interface:
- Sort by timestamp - Log data can be sorted using the newest/oldest first buttons.
- View mode - Log data can be viewed as date-sorted JSON, date-sorted messages, or raw JSON.
- Copy loggerstream ID - Copies the Loggerstream ID to the clipboard (useful for access from the Toolbelt and SDK interfaces).
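The newest-first and oldest-first sorting described above amounts to ordering the loggerstream's messages by timestamp. A minimal sketch in Python, assuming a hypothetical message schema (a list of entries with timestamp and message keys; the real loggerstream format may differ):

```python
import json

# Hypothetical loggerstream payload; the real Loggerstream schema may differ.
loggerstream = json.loads("""
{
  "messages": [
    {"timestamp": "2024-01-01T12:02:00Z", "message": "container started"},
    {"timestamp": "2024-01-01T12:00:00Z", "message": "job submitted"},
    {"timestamp": "2024-01-01T12:05:00Z", "message": "job completed"}
  ]
}
""")

# Oldest first: ISO-8601 timestamps in a uniform format sort correctly
# as plain strings.
oldest_first = sorted(loggerstream["messages"], key=lambda m: m["timestamp"])

# Newest first, as in the "newest first" view.
newest_first = sorted(loggerstream["messages"],
                      key=lambda m: m["timestamp"], reverse=True)

for entry in oldest_first:
    print(entry["timestamp"], entry["message"])
```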
Killing Jobs¶
Occasionally, users may find that a job is taking much longer than it should due to unexpected issues. This is common with dataset ingestion, where network difficulties between the client interface and the platform services can interrupt data transfer operations. Another common example is processes, since user-supplied code can be subject to unpredictable errors.
In these cases, it is possible to stop a job, but this ability should not be overused, so there is no support for doing so from within the web app. Users who wish to kill jobs can do so from the Toolbelt command line interface if they have the appropriate permission (such as AU-DISPATCHER:kill-any-project-job) in the desired context.
Killing a job with the Toolbelt requires either the ID of the object submitted for compute (e.g. a dataflow, process, or workflow) or the job ID itself.
To kill a job by referring to the object that is being run, type au2 dataflow job kill <object ID> (substitute workflow for dataflow to kill those types of objects by their IDs).
To kill a job by referring to the job ID itself, type au2 dataflow job kill --job <job ID> (again substituting workflow for dataflow as needed).
Killing a job instructs Aunsight to stop any containers running it and to collect and clean up the job's environment. The job record itself remains, but its state should update to killed within a few moments.
Understanding Job Types¶
Both user and system tasks are tracked by means of "Job" records. The Aunsight web interface provides a Jobs workspace that allows users to view information about compute tasks run in an organization context. Job records contain different information based on the type of task.
Tokamak dataflow jobs are specialized Docker containers that perform ETL functions using Pig Latin operations on a Hadoop filesystem.
Workflow jobs are containers that manage the execution of a workflow. As such, workflow containers will frequently start up other jobs (dataflows, queries, processes, etc.) or even start other workflows.
Process jobs are Docker containers running custom code uploaded by users to perform some specific function. Processes usually interact with the Aunsight platform through one of the SDKs.
Aunsight queries are containers that execute queries on an Apache Drill query resource and return data. These queries and the containers that run them are created and managed by the Aunsight Query service.
The Metrodispatcher (Dispatcher) service is an Aunsight platform service that facilitates large data transfer within the Aunsight platform. Because many data transfers can take hours or even days and involve terabytes of data moving slowly over encrypted connections through the public Internet, the dispatcher service listens for incoming requests and immediately assigns a worker node to serve as a liaison for the remainder of the data transfer. The service itself merely delegates and monitors these tasks so that it can remain responsive to further incoming requests.
Dispatcher handles a variety of tasks involving the transfer of big data. For this reason, Metrodispatcher tasks fall into one of the following categories:
Sightglass Source Publisher¶
Sightglass data sources are pushed to a public cloud so that data can be distributed efficiently across the public internet and mobile data networks. Sightglass source publisher jobs are containers responsible for performing the necessary data transformation and uploads into the public cloud to push a new version of Sightglass data to the cloud.
Peeper Reports¶
Peeper Reports are statistical analyses of Aunsight datasets. Generating these reports requires intensive computational work on the entire dataset. Peeper report jobs are containers responsible for reading through an entire dataset to generate the statistics included in these reports.