July Release Notes

Aunalytics is excited to announce the July 2021 release to our clients. These release notes describe the model and site enhancements in this release, along with fixes to existing functionality.

Daybreak

Data Builder Improvements

This month we're releasing significant improvements to the Daybreak Data Builder user interface, including grouped conditions and a column selector that now allows you to display fields from foreign tables in a query result. Learn more by reading on, or watch our feature release video.

Condition Groups

Previously, the Query Wizard in Daybreak Data Builder let users build queries with one or more filter expressions (conditions), all of which had to match (i.e. they were combined with the SQL AND keyword) for a result to be included. This month, conditions can be grouped, and those groups are evaluated together to either restrict (AND) or expand (OR) the results that are included when the query is run.

For example, previously you could not create a single query that would include customers whose ages were either between 18-35 or 65 or older. Now, that query can be created with two condition groups joined inclusively: results will be retrieved for any record that matches either condition.
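As a rough illustration of how two condition groups joined with OR might translate to SQL, the sketch below runs such a filter against an in-memory SQLite table. The customers table, its age column, and the sample rows are assumptions made for illustration; they are not the actual Daybreak schema, nor the SQL that the Query Wizard generates.

```python
import sqlite3


def customers_in_age_groups(conn):
    """Return names matching either condition group: age 18-35 OR age 65+."""
    sql = """
        SELECT name FROM customers
        WHERE (age >= 18 AND age <= 35)   -- condition group 1
           OR (age >= 65)                 -- condition group 2
        ORDER BY name
    """
    return [row[0] for row in conn.execute(sql)]


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", 24), ("Ben", 47), ("Cy", 70), ("Di", 35)],
)

print(customers_in_age_groups(conn))  # ['Ana', 'Cy', 'Di']
```

Within each group the conditions are combined with AND, while the groups themselves are combined with OR, so a record needs to satisfy only one group to be returned.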

The inclusion of condition groups has changed the look and feel of the Query Wizard, so take some time to familiarize yourself with the new interface.

Column Selector

Daybreak data marts employ a relational entity relationship model (ERM), sometimes called a relational database. For example, Daybreak for Financial Services includes different tables for customers, transactions, accounts, and branches. Each transaction is linked to a single account, and each customer can be linked to one or more accounts---and by extension, to the transactions for those accounts.

The Data Builder has always been able to understand these relationships in order to build complex queries, such as "Give me a list of customers whose credit card monthly balance was more than $5,000." However, when this query was run, the Data Builder's column selector could only include fields from the Customer table in the query results.

This month, the column selector has been updated to allow fields from related records in another table to be included. This feature, which required extensive logic to generate the appropriate SQL JOIN statements, enables users to create more comprehensive datasets that combine data from multiple tables in the data mart.
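To make the idea concrete, here is a minimal sketch of a query whose results include a column from a related table by way of a JOIN. The table names, column names, and the simplified one-customer-to-many-accounts link are assumptions chosen for illustration; the real Daybreak data mart schema and the SQL generated by the Data Builder may differ.

```python
import sqlite3

# Hypothetical two-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, customer_id INTEGER,
                           account_type TEXT, monthly_balance REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO accounts VALUES
        (10, 1, 'credit card', 6200.0),
        (11, 1, 'checking', 900.0),
        (12, 2, 'credit card', 1500.0);
""")


def high_balance_customers(conn, threshold):
    """List customers with a credit card balance above `threshold`,
    returning account fields alongside the customer name."""
    sql = """
        SELECT c.name, a.account_type, a.monthly_balance
        FROM customers AS c
        JOIN accounts AS a ON a.customer_id = c.id
        WHERE a.account_type = 'credit card' AND a.monthly_balance > ?
        ORDER BY c.name
    """
    return conn.execute(sql, (threshold,)).fetchall()


rows = high_balance_customers(conn, 5000)
print(rows)  # [('Ana', 'credit card', 6200.0)]
```

Before this release, only the equivalent of c.name could appear in the result; the JOIN is what makes the account columns available in the same row.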

Query results in the Daybreak Data Builder will now display URLs in data field results as clickable hyperlinks. This will allow Daybreak to support a new Daybreak application being developed by Qumulus Solutions for document storage and retrieval. With this change, Daybreak data marts will be able to link to data or documents stored outside of the data mart via URL hyperlinks.

Natural Language Answers Model Retrain

This month, the Innovation Lab has retrained the Natural Language Answers™ model responsible for translating natural language questions into SQL queries. This retrain has enhanced support for fields that had previously not been well integrated into the model's training:

  • ClosestBranchDistance: "Give me a list of customers that live less than five miles from a branch."
  • DistanceToHQ: "Give me a list of customers who live fifty miles or less from headquarters."
  • IsEmployee: "Show me customers with investment accounts who are employees."
  • IsDeceased: "List customers with mortgages who are deceased."
  • ClosedCode: "Show me accounts with the closed code of 'BLOC'"

This new addition to the training templates will provide new ways to ask questions that had previously not been well supported by the language model.

This month's model also addresses questions that users previously flagged for follow-up:

  • Customers with wealth acct - This question has been resolved and is now understood by the model.
  • Active employee customers with a checking account - This question is problematic because "active" could modify either "employee" or "customers", and the model cannot easily disambiguate it. Suggested workaround: "Active customers who are employees with a checking account"
  • Accounts that are overdrawn in the past week - Suggested workaround: "Accounts that are overdrawn less than 7 days"
  • Customers with transactions last 30 days - Suggested workaround: "Customer with more than 0 transactions last 30 days"

Aunsight

New Dataflow Builder Operations Panel UI

This month, we're revealing a new Dataflow Builder interface centered on a new operators and expressions panel. The Dataflow Builder, which is used to edit dataflow objects, previously featured a sidebar menu on the left of the Builder screen. A redesigned panel modeled on the component sidebar menu in the Workflow Builder will replace the old interface. The new interface features a responsive design that allows for a better flow of elements, which is especially important for complex dataflow operators or expressions that were difficult to display in the previous panel.

This new interface is being released as a "beta" feature side-by-side with the existing version to allow users a few months to acclimate to the new interface. Users can open existing dataflows using either Builder interface, but are encouraged to take some time to get to know the new tool as soon as possible so that they are familiar with it by the time the old interface is removed in a few months.

New Capabilities for Aunsight Formations

Aunsight Formations are an infrastructure-as-code solution management tool for the Aunsight platform. Formations enable solutions engineers to create structured templates of a solution, manage versions of those templates as software code, and deploy them in different contexts. As such, Formations are a critical feature of our solution implementation toolkit.

This month, just over a year after Formations were first introduced, a number of improvements are being released.

Support for New Object Types

Two types of Aunsight objects that did not exist when Formations was created over a year ago are now supported: Datamarts and Daybreak Webapp Configurations. Users can now add these objects with the au2 formation object add command by specifying their type and ID with the --type and --id options.

Object Discovery

Adding objects to a formation manually can be a tedious and time-consuming task for all but the smallest solutions. Aunsight object discovery provides new commands that either add all objects or “crawl” from an initial object or job, recursively adding everything related to it. These commands provide a faster and more powerful way to populate formations with intelligently selected objects.
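The “crawl” idea can be pictured as a graph traversal: start from one object and keep following relationships until nothing new turns up. The sketch below is a generic breadth-first traversal, not the Aunsight implementation; the crawl function, the related mapping, and the object identifiers are all hypothetical.

```python
from collections import deque


def crawl(start, related):
    """Return every object reachable from `start` by following `related` links."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for nxt in related.get(obj, []):
            if nxt not in seen:  # visit each object once, even in cyclic graphs
                seen.add(nxt)
                queue.append(nxt)
    return order


# Hypothetical relationships between objects in a solution.
related = {
    "workflow:nightly": ["dataflow:clean", "dataset:raw"],
    "dataflow:clean": ["dataset:raw", "dataset:curated"],
    "dataset:curated": ["datamart:reporting"],
}

discovered = crawl("workflow:nightly", related)
print(discovered)
```

Tracking the set of already-seen objects is what keeps the traversal from adding the same object twice or looping forever when objects reference each other.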

Object Artifacts

Formations capture object metadata for a solution, but some types of objects merely track actual data that is stored outside of Aunsight in some specialized resource. Objects whose artifactual data is different from the Aunsight objects themselves include:

  • Dataset File Data (i.e. CSV data)
  • Process Docker Images and/or Source Code Packages
  • Serialized Machine Learning Model Data
  • Memento Series Data

For example, a dataset object's metadata could previously be copied via Formations, but since the contents of the dataset itself were stored outside of Aunsight on some dedicated resource (e.g. an NFS volume or Hadoop cluster), capturing that dataset's contents was not possible. Formation artifacts now add support for capturing the associated data in these “artifacts” so that the objects can be populated with it during deployment.

Local Deployments

Although formations are designed for mass-deployment of solutions, users most frequently create Formations for testing minor changes to a single client's production solution as is required by quality control/assurance processes. In these instances, users create formations in order to replicate a production solution to a separate development environment, but do not ever intend for changes made there to be applied outside of a single, local context. Local deployments are a new type of deployment that streamlines many of the formations functions to better accommodate these simple, single-context use cases.

Deployment Cleanup

Deployment cleanup provides a simple solution for "backing out" changes made during a deployment. The cleanup command makes it easy to remove the objects created during a deployment, without altering the formation or impacting other deployments. In this way, deployment cleanup provides a much more effective option than simply deleting a project context and leaving behind a large number of orphaned objects in the parent organization.

New Connector: Load to Datamart from DSV

Aunsight Datamarts are schemas that define the finished structure of a published dataset consumed by clients via different end-user tools (Daybreak, Tableau, etc.). The Datamart service enables engineers to "publish" (load or migrate) data from Aunsight into the data mart storage engine (Exasol) where end users can access it. Previously, Datamarts only supported loading data from Parquet-formatted Atlas records. Now, engineers can use a Datamart DSV connector to read data from a DSV (delimiter-separated values) file on an accessible filesystem. With this connector, users can load data into Exasol directly from a DSV file using the Datamart component of the Aunsight web interface.
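For readers unfamiliar with the format, a DSV file is simply a text file whose fields are separated by a chosen delimiter (a comma, tab, pipe, and so on). The sketch below parses pipe-delimited text with Python's standard csv module; it illustrates the file format only and does not represent the Datamart connector itself. The read_dsv helper, the column names, and the sample rows are assumptions.

```python
import csv
import io


def read_dsv(text_stream, delimiter="|"):
    """Parse delimiter-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(text_stream, delimiter=delimiter)
    return list(reader)


# Hypothetical pipe-delimited sample; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "customer_id|branch|balance\n"
    "1|Main St|1200.50\n"
    "2|Oak Ave|87.10\n"
)

rows = read_dsv(sample)
print(rows[0]["branch"])  # Main St
```

Note that every value is read as a string; any type conversion (for example, turning balance into a decimal) has to happen in a later step.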

New Dataflow Operator: WeekdaysBetween

This month, a new dataflow operator, WeekdaysBetween, allows engineers to easily calculate the number of weekdays between two arbitrary dates. Because counting weekdays across calendar dates is complex to do by hand, this operator simplifies use cases where an engineer needs to create a derived field involving the number of business days between two dates (for example, the number of business days between a loan application and its closing). This operator will greatly speed up our ability to deliver new data mart fields and machine learning datasets that can help deliver insight and data intelligence to our clients.
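The calculation can be sketched in plain Python as follows. This is an illustration of the general technique, not the operator's actual implementation: the weekdays_between function is hypothetical, and it assumes that the start date is counted, the end date is not, the order of the two dates does not matter, and holidays are ignored. The real operator's conventions may differ.

```python
from datetime import date, timedelta


def weekdays_between(start, end):
    """Count Monday-Friday dates in the half-open range [start, end)."""
    if end < start:
        start, end = end, start  # treat the two dates as unordered
    total_days = (end - start).days
    full_weeks, extra = divmod(total_days, 7)
    count = full_weeks * 5  # every full week contains exactly five weekdays
    for offset in range(extra):
        # weekday() returns 0 for Monday through 6 for Sunday
        if (start + timedelta(days=offset)).weekday() < 5:
            count += 1
    return count


# Thursday 2021-07-01 to Thursday 2021-07-15 spans two full weeks.
print(weekdays_between(date(2021, 7, 1), date(2021, 7, 15)))  # 10
# Friday 2021-07-02 to Tuesday 2021-07-06 covers Fri, Sat, Sun, Mon.
print(weekdays_between(date(2021, 7, 2), date(2021, 7, 6)))   # 2
```

Splitting the span into full weeks plus a remainder avoids looping over every day, which matters when the two dates are years apart.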

Peeper Reports Now Available for HiC Datasets

Peeper reports are statistical analyses of the data in Atlas records, which are useful for data quality control and data exploration. Previously, these reports were created by specialized workloads that ran in our dedicated Hadoop cluster, but as most workloads have moved over to a new infrastructure based on Hadoop in Containers (HiC), this service has been updated to execute Peeper jobs on the HiC infrastructure as well. This new capability will enable us to continue to offer quality control and assurance and to assist data scientists in understanding the structure and contents of training datasets.

Workflows Can Now Kill All Child Jobs

This month, workflow managers can now kill (stop) a workflow job and also all jobs started by that workflow (child jobs). Previously, killing of child jobs had to be done manually, or was simply not done since it was sometimes too difficult to trace all child jobs manually. In these cases, platform compute resources could be wasted as the jobs would run even though their output would ultimately be lost. With this feature, data engineers can now successfully terminate all jobs associated with a given workflow in order to more easily manage resource limits and fix failing workflows.

Currently, killing workflow child jobs is only supported in the Toolbelt CLI and the Python and JavaScript SDKs. However, support in the Aunsight Web Interface will come next month!

Aunsight™ Golden Record

This month, Aunsight Golden Record has a number of new enhancements to the user experience:

  • Users can now review all changes to a domain and revert them if the changes are not what they intended. Previously, users had to revert each change manually.
  • Transactional workflows can now load into Exasol, Aunsight's data mart storage engine.
  • Data Profiling is now possible with transactional workflows, as it has been for Golden Records for some time.
  • Delete detection can now be turned off to enable specific use cases such as custom queries with incremental reads. When turned off, users are notified by a tooltip.

Release Contents

Issue ID Description
AUN-14655 Add kill button for AuQL evaluate-script jobs
AUN-14862 Move workflow mail service from Google to Microsoft
AUN-14870 Create models database migration for NFS artifact storage
AUN-14871 Create process package database migration for NFS artifact storage
AUN-14891 DF STORE Parquet cannot handle datetime types
AUN-15008 Prevent infinite workflow recursion
AUN-15011 Resources details should show full filesystem path
AUN-15021 Move Walleye Tasks to K8s & Alluxio Internal Storage
AUN-15035 Add error message when secret is failing to load
AUN-15139 Job Tracing UI improvements
AUN-15148 Browse tool for loading datasets into datamarts should show both DSV and Parquet records
AUN-9389 Make all job type pages use 'updated_at' for updated date.
CLOUD-1652 Create global credentials for datamarts
CLOUD-1653 Script for datamart resource provisioning
DATAINT-464 Merging MDM Into Nucleus
DATAINT-585 SBTI Custom Branding - Update to Aunsight
DEVOPS-114 Certificate Manager
ILT-319 Return insights when aggregation is predicted
ILT-339 Adding Synonyms for Account_Status in Trios
PZ-1018 Filter to remove closed branches from Distance to Closest Branch Calculation
PZ-1020 Customer_Status issue in Customer Dataflow
PZ-1021 Join change on Lending Template Dataflow

Bug Fixes

Issue ID Description
AUN-13899 Viewing a user that has no global perms in the system dashboard returns a 500 error
AUN-14994 Daybreak - expired token leaves modal backdrop when redirected to login
AUN-15042 DF Cached Mode not confirming output is unchanged
AUN-15069 Dispatcher tasks too sensitive to slow logging
AUN-15072 hotfix - load datamart and datamart migration jobs are failing in production
AUN-15082 Daybreak Config Datamart Tables Don't Populate
AUN-15104 Unsanitized Credentials within errors in Exasol Connector
AUN-15127 WF asks for confirmation before leaving even in view mode
AUN-15135 Daybreak - Conditions groups are not rendering correct text
AUN-15153 Negative Duration in Workflow Job
AUN-15158 Daybreak - condition group option "does not include any of" not properly "OR" 'd in the wizard UI
AUN-15159 Daybreak: Data results column selector should be disabled when the query needs to be refreshed
AUN-15174 Fixes for pre-release DF Ops panel
DATAINT-486 Monitor Graphs - Inconsistent window timeframe
DATAINT-495 Auto-provisioning should set the UI host suffix specific to the cluster.
DATAINT-576 Negative pending record count
DATAINT-605 Range - PW Reset issues
DATAINT-609 Nucleus Dataflow history timestamp display is humanizing timestamps wrong
DATAINT-610 Text disappearing on hover
DATAINT-614 Agent Button hides behind agent tiles
DATAINT-615 AzureAD SSO Config issue
DATAINT-616 June 2021 Retro
DATAINT-618 Repro attempt - Nulls in a datatype string field not converting to decimal 0.00
ILT-334 Matching build fail if table not exist dict
ILT-335 Build Failure in one context impact all other context because of update_when_change
ILT-352 Return Insights not working if no insights config in nlp_config
ILT-355 Failure to collect data in Query-Collection
ILT-376 Matching build fail for empty vocab
WAT-37 T20210701.0624 - Callon: DQ Exception Download
WAT-38 DQ Exception - Loading/Export issues
WAT-39 Cannot sort DQ check table
WAT-6 Update Basic Authentication in Proxy to use Identity Service