July Release Notes¶
Aunalytics is excited to announce the July 2021 release to our clients. These release notes describe model and site enhancements, along with any fixes to existing functionality we have included.
Data Builder Improvements¶
This month we're releasing significant improvements to the Daybreak Data Builder user interface, including grouped conditions and a column selector that now allows you to display fields from foreign tables in a query result. Learn more by reading on, or watch our feature release video.
Previously, the Query Wizard in Daybreak Data Builder allowed users to build queries with one or more filter expressions (conditions) that were combined so that all conditions had to match (i.e. using the SQL AND keyword) for a result to be included. This month, conditions can now be grouped, and those groups will be evaluated to restrict (AND the condition) or expand (OR the condition) the results that will be included when the query is run.
For example, previously you could not create a single query that would include customers whose ages were either between 18-35 or 65 or older. Now, that query can be created with two condition groups joined inclusively: results will be retrieved for any record that matches either condition.
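The age example above can be sketched with Python's built-in sqlite3 module. This is only an illustration of how two condition groups combine with OR: the customers table, its columns, and the sample rows are hypothetical, and the SQL is not necessarily what the Query Wizard generates.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", 22), ("Ben", 40), ("Cy", 70), ("Dee", 35), ("Eli", 17)],
)

# Two condition groups joined with OR: a row is returned if it
# matches either group. Inside the first group, conditions are ANDed.
query = """
    SELECT name FROM customers
    WHERE (age >= 18 AND age <= 35)
       OR (age >= 65)
    ORDER BY name
"""
matching_names = [row[0] for row in conn.execute(query)]
print(matching_names)
```

Without grouping, the three comparisons would all be ANDed together and no single customer could satisfy them, which is why this query was previously impossible to build in one step.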
The inclusion of condition groups has changed the look and feel of the Query Wizard, so take some time to familiarize yourself with the new interface.
Daybreak data marts employ a relational entity relationship model (ERM), sometimes called a relational database. For example, Daybreak for Financial Services includes different tables for customers, transactions, accounts, and branches. Each transaction is linked to a single account, and each customer can be linked to one or more accounts---and by extension, to the transactions for those accounts.
The Data Builder has always been able to understand these relationships in order to build complex queries, such as "Give me a list of customers whose credit card monthly balance was more than $5,000." However, when this query is run, the Data Builder's column selector could only include fields from the Customer table in the query results.
This month, the column selector has been updated to allow fields from related records in another table to be included. This feature, which required extensive logic to create the appropriate SQL JOIN statements, enables users to create more comprehensive datasets that combine data from multiple tables in the data mart.
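As a rough sketch of what including fields from a related table means in SQL terms, the following uses Python's built-in sqlite3 module. The table names, column names, and data are hypothetical and do not reflect the actual Daybreak for Financial Services schema or the JOIN statements the Data Builder produces.

```python
import sqlite3

# Hypothetical two-table schema: each account belongs to one customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        account_type TEXT,
        balance REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO accounts VALUES
        (10, 1, 'credit_card', 6200.0),
        (11, 1, 'checking', 150.0),
        (12, 2, 'credit_card', 900.0);
""")

# The JOIN lets the result include account columns next to customer columns.
joined_rows = conn.execute("""
    SELECT c.name, a.account_type, a.balance
    FROM customers AS c
    JOIN accounts AS a ON a.customer_id = c.customer_id
    WHERE a.account_type = 'credit_card' AND a.balance > 5000
""").fetchall()
print(joined_rows)
```

The result row carries the customer's name together with the account type and balance, which is the kind of cross-table output the updated column selector makes available.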
Clickable Hyperlinks in Daybreak Data Builder¶
Query results in the Daybreak Data Builder will now display URLs in data field results as clickable hyperlinks. This will allow Daybreak to support a new Daybreak application being developed by Qumulus Solutions for document storage and retrieval. With this change, Daybreak data marts will be able to link to data or documents stored outside of the data mart via URL hyperlinks.
Natural Language Answers Model Retrain¶
This month, the Innovation Lab has retrained the Natural Language Answers™ model responsible for translating natural language questions into SQL queries. This retrain has enhanced support for fields that had previously not been well integrated into the model's training:
- ClosestBranchDistance: "Give me a list of customers that live less than five miles from a branch."
- DistanceToHQ: "Give me a list of customers who live fifty miles or less from headquarters."
- IsEmployee: "Show me customers with investment accounts who are employees."
- IsDeceased: "List customers with mortgages who are deceased."
- ClosedCode: "Show me accounts with the closed code of 'BLOC'."
This new addition to the training templates will provide new ways to ask questions that had previously not been well supported by the language model.
This month's model also addresses questions that users previously flagged for follow-up:
- "Customers with wealth acct": This question has been resolved and is now understood by the model.
- "Active employee customers with a checking account": This question is problematic since "active" could modify either "employee" or "customers", and the model cannot easily disambiguate this. Suggested workaround: "Active customers who are employees with a checking account"
- "Accounts that are overdrawn in the past week". Suggested workaround: "Accounts that are overdrawn less than 7 days"
- "Customers with transactions last 30 days". Suggested workaround: "Customer with more than 0 transactions last 30 days"
New Dataflow Builder Operations Panel UI¶
This month, we're revealing a new Dataflow Builder interface focusing on a new operators and expressions panel. The Dataflow Builder used to edit dataflow objects previously featured a sidebar menu on the left of the Builder screen. This month, a redesigned panel modeled on the component sidebar menu in the Workflow Builder will replace the old interface. The new interface features a responsive design that allows for a better flow of elements, which is especially important for complex dataflow operators or expressions that were difficult to display in the previous panel.
This new interface is being released as a "beta" feature side-by-side with the existing version to allow users a few months to acclimate to the new interface. Users can open existing dataflows using either Builder interface, but are encouraged to take some time to get to know the new tool as soon as possible so that they are familiar with it by the time the old interface is removed in a few months.
New Capabilities for Aunsight Formations¶
Aunsight Formations are an infrastructure-as-code solution management tool for the Aunsight platform. Formations enable solutions engineers to create structured templates of a solution, manage versions of those templates as software code, and deploy them in different contexts. As such, Formations are a critical feature of our solution implementation toolkit.
This month, just over a year after Formations were first introduced, a number of improvements are being released.
Support for New Object Types¶
Two types of Aunsight objects that did not exist when Formations was created over a year ago are now supported: Datamarts and Daybreak Webapp Configurations. Users can now add these objects with the au2 formation object add command by specifying their type and ID.
Adding objects to a formation manually can be a tedious and time-consuming task for all but the smallest solutions. Aunsight object discovery provides new commands to add all objects or “crawl” from some initial object or job and add everything related to it recursively. These commands provide a faster and more powerful way to populate formations with objects selected intelligently.
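The idea of crawling from an initial object can be sketched as a breadth-first traversal of a relationship graph. This is a generic illustration, not the au2 implementation: the crawl function, the related mapping, and the object IDs below are all hypothetical.

```python
from collections import deque

def crawl(start_id, related):
    """Return every object ID reachable from start_id, including itself.

    `related` is a hypothetical mapping from an object ID to the IDs of
    the objects it refers to directly.
    """
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for neighbor in related.get(current, []):
            if neighbor not in seen:  # guards against cycles and repeats
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Hypothetical solution: a workflow runs two dataflows that share a dataset.
related = {
    "workflow-1": ["dataflow-1", "dataflow-2"],
    "dataflow-1": ["dataset-a"],
    "dataflow-2": ["dataset-a", "dataset-b"],
    "dataset-b": ["workflow-1"],  # a cycle back to the start
}
discovered = crawl("workflow-1", related)
print(sorted(discovered))
```

Tracking the set of visited IDs is what keeps a recursive discovery like this from looping forever when objects refer back to each other.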
Formations capture object metadata for a solution, but some types of objects merely track actual data that is stored outside of Aunsight in some specialized resource. Objects whose artifactual data is different from the Aunsight objects themselves include:
- Dataset File Data (i.e. CSV data)
- Process Docker Images and/or Source Code Packages
- Serialized Machine Learning Model Data
- Memento Series Data
For example, a dataset object's metadata could previously be copied via Formations, but since the contents of the dataset itself were stored outside of Aunsight on some dedicated resource (e.g. an NFS volume or Hadoop cluster), capturing that dataset's contents was not possible. Formation artifacts now add support for capturing the associated data in these “artifacts” so that the objects can be populated with it during deployment.
Although Formations are designed for mass deployment of solutions, users most frequently create Formations to test minor changes to a single client's production solution, as required by quality control/assurance processes. In these instances, users create Formations in order to replicate a production solution to a separate development environment, but never intend for changes made there to be applied outside of a single, local context. Local deployments are a new type of deployment that streamlines many Formations functions to better accommodate these simple, single-context use cases.
Deployment cleanup provides a simple solution for "backing out" changes made during a deployment. The cleanup command makes it easy to remove the objects created during a deployment, without altering the formation or impacting other deployments. In this way, deployment cleanup provides a much more effective option than simply deleting a project context and leaving behind a large number of orphaned objects in the parent organization.
New Connector: Load to Datamart from DSV¶
Aunsight Datamarts are schemas that define the finished structure of a published dataset consumed by clients via different end-user tools (Daybreak, Tableau, etc.). The Datamart service enables engineers to "publish" (load or migrate) data from Aunsight into the data mart storage engine (Exasol) where end users can access it. Previously, Datamarts only supported the loading of data from Parquet-formatted Atlas records. Now, engineers can use a Datamart DSV connector to read data from a DSV (delimiter-separated values) file on an accessible filesystem. With this connector, users can now load data into Exasol directly from a DSV file from the Datamart component of the Aunsight web interface.
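For readers unfamiliar with the DSV format, the following minimal sketch parses pipe-delimited text with Python's standard csv module. It only illustrates the file format: the read_dsv helper, the column names, and the sample data are hypothetical, and this is not how the Datamart DSV connector itself is implemented or configured.

```python
import csv
import io

def read_dsv(text, delimiter="|"):
    """Parse delimiter-separated text into a list of dicts keyed by header."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)

# Hypothetical sample: the first line is the header, fields are split on "|".
sample = "customer_id|name|balance\n1|Ana|6200.00\n2|Ben|900.00\n"
rows = read_dsv(sample)
print(rows[0]["name"], rows[1]["balance"])
```

CSV is simply the special case where the delimiter is a comma; all values arrive as strings, so type conversion is a separate step in any loader.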
New Dataflow Operator: WeekdaysBetween¶
This month, a new dataflow operator, WeekdaysBetween, allows engineers to easily calculate the number of weekdays between two arbitrary dates. Because of the complexity of calculating calendrical weekdays, this new operator simplifies use cases where an engineer needs to create a derived field involving the number of business days between two dates (for example, the number of business days between a loan application and its closing). This operator will greatly speed up our ability to deliver new data mart fields and machine learning datasets that can help deliver insight and data intelligence to our clients.
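To show the kind of calculation involved, here is a small Python sketch that counts Monday-to-Friday dates between two dates. It is not the operator's implementation: the weekdays_between function is hypothetical, and the choices to count the start date, exclude the end date, and ignore holidays are assumptions, since these notes do not specify how the operator treats endpoints.

```python
from datetime import date, timedelta

def weekdays_between(start, end):
    """Count Monday-Friday dates in the half-open range [start, end).

    Assumption: the start date is counted, the end date is not, and
    holidays are not considered.
    """
    if end < start:
        start, end = end, start
    count = 0
    current = start
    while current < end:
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            count += 1
        current += timedelta(days=1)
    return count

# Hypothetical loan: application on Thursday 2021-07-01,
# closing on Thursday 2021-07-15 (two full calendar weeks).
application = date(2021, 7, 1)
closing = date(2021, 7, 15)
business_days = weekdays_between(application, closing)
print(business_days)
```

A day-by-day loop is the simplest correct approach; a production operator would more likely use a closed-form calculation over full weeks plus a remainder.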
Peeper Reports Now Available for HiC Datasets¶
Peeper reports are statistical analyses of the data in Atlas records which are useful for data quality control and data exploration. Previously, these reports were created by specialized workloads that ran in our dedicated Hadoop cluster, but as most workloads have moved over to a new infrastructure based on Hadoop in Containers (HiC), this service has been updated to be able to execute Peeper jobs on the HiC infrastructure as well. This new capability will enable us to continue to offer quality control and assurance and assist data scientists in understanding the structure and contents of training datasets.
Workflows Can Now Kill All Child Jobs¶
This month, workflow managers can now kill (stop) a workflow job and also all jobs started by that workflow (child jobs). Previously, killing of child jobs had to be done manually, or was simply not done since it was sometimes too difficult to trace all child jobs manually. In these cases, platform compute resources could be wasted as the jobs would run even though their output would ultimately be lost. With this feature, data engineers can now successfully terminate all jobs associated with a given workflow in order to more easily manage resource limits and fix failing workflows.
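The difficulty of tracing child jobs by hand comes from the fact that a workflow can start other workflows, which start their own jobs in turn. The sketch below shows one way such a cascade could work. It is purely illustrative: the functions, the children mapping, the job IDs, and the choice to stop the parent first are assumptions, not the Aunsight implementation.

```python
def collect_descendants(job_id, children):
    """Return IDs of all jobs started directly or indirectly by job_id."""
    descendants = []
    seen = set()
    stack = list(children.get(job_id, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        descendants.append(current)
        stack.extend(children.get(current, []))
    return descendants

def kill_workflow(job_id, children, kill):
    """Kill the workflow job first so it cannot start more jobs,
    then kill every descendant job."""
    kill(job_id)
    for child in collect_descendants(job_id, children):
        kill(child)

# Hypothetical job tree: wf-1 started a dataflow and a nested workflow.
children = {"wf-1": ["df-1", "wf-2"], "wf-2": ["df-2", "df-3"]}
killed = []
kill_workflow("wf-1", children, killed.append)
print(killed)
```

Passing the kill action in as a callable keeps the traversal separate from whatever mechanism actually stops a job.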
Aunsight™ Golden Record¶
This month, Aunsight Golden Record has a number of new enhancements to the user experience:
- Users can now review all changes to a domain and revert them if the changes are not what they intended. Previously, users had to undo each change manually.
- Transactional workflows can now load into Exasol, Aunsight's data mart storage engine.
- Data Profiling is now possible with transactional workflows, as it has been for Golden Records for some time.
- Delete detection can now be turned off to enable specific use cases such as custom queries with incremental reads. When turned off, users are notified by a tooltip.
|AUN-14655||Add kill button for AuQL evaluate-script jobs|
|AUN-14862||Move workflow mail service from Google to Microsoft|
|AUN-14870||Create models database migration for NFS artifact storage|
|AUN-14871||Create process package database migration for NFS artifact storage|
|AUN-14891||DF STORE Parquet cannot handle datetime types|
|AUN-15008||Prevent infinite workflow recursion|
|AUN-15011||Resources details should show full filesystem path|
|AUN-15021||Move Walleye Tasks to K8s & Alluxio Internal Storage|
|AUN-15035||Add error message when secret is failing to load|
|AUN-15139||Job Tracing UI improvements|
|AUN-15148||Browse tool for loading datasets into datamarts should show both DSV and Parquet records|
|AUN-9389||Make all job type pages use 'updated_at' for updated date.|
|CLOUD-1652||Create global credentials for datamarts|
|CLOUD-1653||Script for datamart resource provisioning|
|DATAINT-464||Merging MDM Into Nucleus|
|DATAINT-585||SBTI Custom Branding - Update to Aunsight|
|ILT-319||Return insights when aggregation is predicted|
|ILT-339||Adding Synonyms for Account_Status in Trios|
|PZ-1018||Filter to remove closed branches from Distance to Closest Branch Calculation|
|PZ-1020||Customer_Status issue in Customer Dataflow|
|PZ-1021||Join change on Lending Template Dataflow|
|AUN-13899||Viewing a user that has no global perms in the system dashboard returns a 500 error|
|AUN-14994||Daybreak - expired token leaves modal backdrop when redirected to login|
|AUN-15042||DF Cached Mode not confirming output is unchanged|
|AUN-15069||Dispatcher tasks too sensitive to slow logging|
|AUN-15072||hotfix - load datamart and datamart migration jobs are failing in production|
|AUN-15082||Daybreak Config Datamart Tables Don't Populate|
|AUN-15104||Unsanitized Credentials within errors in Exasol Connector|
|AUN-15127||WF asks for confirmation before leaving even in view mode|
|AUN-15135||Daybreak - Conditions groups are not rendering correct text|
|AUN-15153||Negative Duration in Workflow Job|
|AUN-15158||Daybreak - condition group option "does not include any of" not properly "OR" 'd in the wizard UI|
|AUN-15159||Daybreak: Data results column selector should be disabled when the query needs to be refreshed|
|AUN-15174||Fixes for pre-release DF Ops panel|
|DATAINT-486||Monitor Graphs - Inconsistent window timeframe|
|DATAINT-495||Auto-provisioning should set the UI host suffix specific to the cluster.|
|DATAINT-576||Negative pending record count|
|DATAINT-605||Range - PW Reset issues|
|DATAINT-609||Nucleus Dataflow history timestamp display is humanizing timestamps wrong|
|DATAINT-610||Text disappearing on hover|
|DATAINT-614||Agent Button hides behind agent tiles|
|DATAINT-615||AzureAD SSO Config issue|
|DATAINT-616||June 2021 Retro|
|DATAINT-618||Repro attempt - Nulls in a datatype string field not converting to decimal 0.00|
|ILT-334||Matching build fail if table not exist dict|
|ILT-335||Build Failure in one context impact all other context because of update_when_change|
|ILT-352||Return Insights not working if no insights config in nlp_config|
|ILT-355||Failure to collect data in Query-Collection|
|ILT-376||Matching build fail for empty vocab|
|WAT-37||T20210701.0624 - Callon: DQ Exception Download|
|WAT-38||DQ Exception - Loading/Export issues|
|WAT-39||Cannot sort DQ check table|
|WAT-6||Update Basic Authentication in Proxy to use Identity Service|