August Release Notes

Aunalytics is excited to announce the August 2021 release to our clients. These notes describe this month's model and site enhancements, along with the fixes to existing functionality included in the release.

Daybreak

Aggregated filters

This month, we're excited to announce a new capability for the Daybreak Data Builder tool: Aggregated filters. Previously, Daybreak Data Builder conditions (filters) could only be evaluated against the value of a specific field and record. Now, aggregated filters provide a way to build queries that evaluate a condition against an aggregation of one or more records.

For example, if a user wants a list of customers with a total balance across all accounts over a certain amount, they can now build an aggregated filter off of the sum of the CurrentBalance field for all records in the Account table linked to that customer.

Aggregated filters support six methods for generating an aggregation from two or more records in the data mart:

  • Min - The lowest value in the set
  • Max - The highest value in the set
  • Sum - The sum of all values in the set
  • Average - The average (mean) of all values in the set
  • Count - The number of records in the set
  • Count Distinct - The number of distinct values in the set

The last method is useful when working with enumerated fields. For example, a user could search for customers whose Account ProductType records have three or more distinct values, e.g. Checking, Savings, and Mortgage. This type of aggregation differs from the Count method: a query looking for customers with a Count aggregation of three or more on the ProductType field would also return customers who had three different savings accounts, or two mortgages and a checking account. Count Distinct, by contrast, would yield a value of 1 for the first customer and 2 for the second, because that is the number of distinct product type values for those customers.
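The difference between Count and Count Distinct can be illustrated with a minimal Python sketch. This is not the Data Builder's implementation: the sample rows, the `aggregate` and `aggregated_filter` helpers, and the customer IDs are all hypothetical, and only the six method names and the ProductType and CurrentBalance field names come from the notes above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Account rows: (customer_id, ProductType, CurrentBalance)
accounts = [
    ("A", "Savings", 1000.0),
    ("A", "Savings", 2500.0),
    ("A", "Savings", 500.0),
    ("B", "Mortgage", 150000.0),
    ("B", "Mortgage", 90000.0),
    ("B", "Checking", 1200.0),
    ("C", "Checking", 300.0),
    ("C", "Savings", 4000.0),
    ("C", "Mortgage", 200000.0),
]

# The six aggregation methods, expressed as plain Python callables.
AGGREGATIONS = {
    "Min": min,
    "Max": max,
    "Sum": sum,
    "Average": mean,
    "Count": len,
    "Count Distinct": lambda values: len(set(values)),
}


def aggregate(rows, column, method):
    """Group rows by customer and apply one aggregation to one column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[column])
    return {cust: AGGREGATIONS[method](vals) for cust, vals in grouped.items()}


def aggregated_filter(rows, column, method, predicate):
    """Return the sorted customers whose aggregated value satisfies the predicate."""
    result = aggregate(rows, column, method)
    return sorted(cust for cust, value in result.items() if predicate(value))


# Count >= 3 matches every customer with three accounts of any type.
count_hits = aggregated_filter(accounts, 1, "Count", lambda n: n >= 3)
# Count Distinct >= 3 matches only the customer with three different types.
distinct_hits = aggregated_filter(accounts, 1, "Count Distinct", lambda n: n >= 3)
# Sum of CurrentBalance across all of a customer's accounts over 100,000.
balance_hits = aggregated_filter(accounts, 2, "Sum", lambda total: total > 100000)

print(count_hits)     # ['A', 'B', 'C']
print(distinct_hits)  # ['C']
print(balance_hits)   # ['B', 'C']
```

Customer A (three savings accounts) and customer B (two mortgages and a checking account) pass the Count condition but have Count Distinct values of 1 and 2, so only customer C passes the Count Distinct condition.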

To learn more about how to use aggregated filters, watch this month's five-minute feature premiere video.

New Data Builder Operators

Daybreak's Data Builder now features a set of new operators for string fields: begins with, contains, and ends with. Conditions can now be built for string fields using these operators to match values based on the user-supplied input and the operator's logic rules. For example, if a user supplies "son" as the input, results will vary based on the matching pattern selected:

  • begins with: Matches "Sondra Jones", "Sontag Industries LLC", etc.
  • ends with: Matches "Sarah Johnson", "Jake Davison", etc.
  • contains: Matches the previous two examples as well as others like "James Sonderheim", "Arthur Smith and Sons Construction LLC", etc.
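The matching rules listed above can be sketched in a few lines of Python. The `matches` helper is hypothetical, and case-insensitive comparison is an assumption inferred from the examples (the input "son" matching "Sondra"); the Data Builder's actual matching behavior may differ.

```python
def matches(value: str, operator: str, user_input: str) -> bool:
    """Evaluate one string condition against a field value."""
    # Assumption: matching ignores case, as the examples above suggest.
    text, needle = value.lower(), user_input.lower()
    if operator == "begins with":
        return text.startswith(needle)
    if operator == "ends with":
        return text.endswith(needle)
    if operator == "contains":
        return needle in text
    raise ValueError(f"Unknown operator: {operator}")


names = [
    "Sondra Jones",
    "Sarah Johnson",
    "James Sonderheim",
    "Arthur Smith and Sons Construction LLC",
]

for operator in ("begins with", "ends with", "contains"):
    hits = [name for name in names if matches(name, operator, "son")]
    print(operator, "->", hits)
```

With this sample list, begins with selects only "Sondra Jones", ends with selects only "Sarah Johnson", and contains selects all four names.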

Insights From Foreign Table Fields

Last month we released a new update to the column selector that enables users to include fields from related foreign tables in the data results. This month, those additional foreign table fields are now available in the Insights dashboard so that users can make visualizations of those fields.

Natural Language Answers™ improvements

Natural Language Answers™ Insights

Earlier this year, we released Natural Language Answers to provide the ability to get answers by posing questions about data in natural language, and Insights, which allow users to create charts and graphs to display data results visually. This month, we've combined both techniques so that Natural Language Answers will now feature a dashboard populated with Insights suggested by our machine learning model.

Now, users can ask a question about the two most frequently used tables (Customer and Account) and immediately receive a dashboard composed of elements that the model deems most relevant to understanding that question visually. For example, if a user asks for a list of accounts opened in 2020, the model will populate the dashboard with the following suggested Insights:

  • A Summary metric showing the number of accounts in the result set
  • A line chart showing the number of new accounts broken down by month
  • A bar chart showing new accounts by region
  • A donut chart showing new accounts by current status as of today (e.g. how many are active, inactive, closed, etc.)
  • A bar chart showing new accounts by type (e.g. Mortgage, HELOC, checking, etc.)

As before, users can edit the dashboard by editing the suggested Insights, removing items, or manually adding new ones to customize the generated dashboard further. And as always, these dashboards will be saved when the query is saved, and can be shared and exported as JPEG images for download.

Model Retrain

This month's model retrain for Natural Language Answers adds more robust support for two fields on which the model had previously performed poorly when presented with natural language questions: CurrentDTI from the Customer table, and AcquisitionType from the Account table. With this improved support, users will experience better results when asking questions like:

  • "Customers with debt to income less than 0.60"
  • "Customers who are older than 18 years with auto loan balance greater than 50000 and debt to income less than 0.30"
  • "Accounts acquired with indirect campaign"

Data Header Row Freezes

This month, we've made a minor update to the data results table in the Data Builder to freeze the header row when scrolling through query results. Previously, as a user scrolled through the result rows, the header row would disappear, sometimes making it difficult to interpret data results because the field name was not visible. Now, the header row will remain fixed at the top of the results table regardless of how far into the dataset a user has scrolled.

Aunsight

Automated Generation of Daybreak Enum Values

This month, enumerations for Daybreak web app configuration objects will now be generated automatically by examining the values present in fields flagged as enumerations. Previously, the process for configuring an enum field was done manually and implementations engineers were required to examine dataset reports to generate the list of unique values that should be present in that field. This change will streamline the process of updating enumerated fields (for example, when a new product type is added to the ProductType field in the Account table).
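As a rough illustration of the idea (not the platform's actual code), generating an enumeration amounts to collecting the unique values found in a flagged field. The `generate_enum_values` helper and the sample records below are hypothetical.

```python
def generate_enum_values(records, field):
    """Collect the sorted unique, non-null values present in one field."""
    values = {record.get(field) for record in records}
    values.discard(None)  # missing or null values are not enum members
    return sorted(values)


# Hypothetical rows from the Account table.
records = [
    {"AccountId": 1, "ProductType": "Checking"},
    {"AccountId": 2, "ProductType": "Savings"},
    {"AccountId": 3, "ProductType": "Checking"},
    {"AccountId": 4, "ProductType": None},
    {"AccountId": 5, "ProductType": "Mortgage"},
]

print(generate_enum_values(records, "ProductType"))
# ['Checking', 'Mortgage', 'Savings']
```

When a new product type appears in the data, rerunning the same scan picks it up without anyone editing the list by hand.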

Terminate All Child Jobs - Web Interface Support

Last month, we released a new platform capability for terminating (killing) all child jobs created by a workflow. Initially, this option was only accessible via the Toolbelt and SDKs. This month, we have added a UI in the Aunsight Web Interface for killing all child jobs from the parent workflow's job status page.

Expression Builder chararray Inputs Can Select a Field as a Value

We've updated the Dataflow Builder's expression builder interface to allow users to select a field from a loaded table as the input for a chararray expression. This new capability enables Dataflow creators to build more dynamic expression logic by evaluating an expression based on variable input from a dataset field.

New dispatcher-compact Component

This month, we've created a new workflow component: dispatcher-compact. This component provides a utility for automating the cleanup of melted log data stored on our NFS storage resources. This cleanup is necessitated by the introduction of log-based source data files, which are created each time a transactional workflow (TXWF) copies data from a client source into Aunsight. Because these workflows are run repeatedly (in some cases, even every few minutes), they create a large number of small files containing timestamped records of updates to data tables. The dispatcher-compact component helps to organize these files by combining the data within multiple files into one file. And because each record row in a melted data log contains a timestamp, these compacted files do not impact our ability to provide data warehousing and archiving functions to our Daybreak™ for Financial Services customers.
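The general compaction technique can be sketched as follows. This is an illustration only, not the dispatcher-compact component: the `compact_logs` function, the CSV layout (timestamp in the first column), and the file naming are all assumptions made for the example.

```python
import csv
from pathlib import Path


def compact_logs(log_dir: Path, output_file: Path) -> int:
    """Merge every small *.csv log in log_dir into one file, ordered by timestamp.

    Assumes each row starts with an ISO-8601 timestamp in a single, consistent
    format, so that plain string ordering equals chronological ordering.
    The output file should live outside log_dir so it is not re-read later.
    Returns the number of rows written.
    """
    rows = []
    for path in sorted(log_dir.glob("*.csv")):
        with path.open(newline="") as handle:
            # Skip blank lines, which csv.reader yields as empty lists.
            rows.extend(row for row in csv.reader(handle) if row)
    rows.sort(key=lambda row: row[0])
    with output_file.open("w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return len(rows)
```

Because every row keeps its own timestamp, merging the files loses no history: the order of changes can still be reconstructed from the compacted file alone. This sketch leaves the original small files in place; removing them after a successful merge would be a separate step.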

Dataset Versioning

Many Aunsight objects have supported versioning in order to better track changes to pipeline configurations and to revert if changes need to be backed out to support a client. Aunsight Atlas Record datasets have always been a marginal case, since they represent merely the structure (schema) for storing data, whereas the data itself exists in large filesystems in the Aunsight platform infrastructure. This month, the platform now supports versioning for Atlas Record schemas, enabling users to better understand the evolution of data pipeline schemas.

It is important to note that while versioning of Atlas Records applies to the schemas, it does not involve versioning of the data contained in them. In most cases, Atlas datasets are too large (on the order of gigabytes) to version as a whole. For clients in need of data retention/dataset snapshots, we can support these in different ways but not directly through versioning Atlas Records. For example, Daybreak for Financial Services clients on the 3.0 model have dataset warehousing through data logs that maintain a history of all changes to records. Those logs can be "unmelted" into a warehoused dataset using the most recent recorded changes, or a different time frame, to populate the value of each field.
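To make the "unmelting" idea concrete, here is a minimal Python sketch under stated assumptions. The row layout (timestamp, record key, field, value), the `unmelt` function, and the sample log are hypothetical; the actual log format and tooling are not described in these notes.

```python
def unmelt(log_rows, as_of=None):
    """Rebuild one record per key from melted (timestamp, key, field, value) rows.

    Later changes overwrite earlier ones. If as_of is given, changes recorded
    after that timestamp are ignored, yielding the dataset as of that moment.
    Timestamps are assumed to be ISO-8601 strings in one consistent format.
    """
    records = {}
    for timestamp, key, field, value in sorted(log_rows, key=lambda row: row[0]):
        if as_of is not None and timestamp > as_of:
            break  # rows are in time order, so everything after this is newer
        records.setdefault(key, {})[field] = value
    return records


# Hypothetical melted log for a single account.
log = [
    ("2021-08-01T09:00:00", "acct-1", "CurrentBalance", 100.0),
    ("2021-08-01T09:00:00", "acct-1", "Status", "Active"),
    ("2021-08-02T09:00:00", "acct-1", "CurrentBalance", 250.0),
    ("2021-08-03T09:00:00", "acct-1", "Status", "Closed"),
]

latest = unmelt(log)
snapshot = unmelt(log, as_of="2021-08-02T12:00:00")

print(latest)    # {'acct-1': {'CurrentBalance': 250.0, 'Status': 'Closed'}}
print(snapshot)  # {'acct-1': {'CurrentBalance': 250.0, 'Status': 'Active'}}
```

The same log therefore serves both purposes: the most recent changes give the current dataset, and an earlier cutoff gives a historical snapshot without storing a separate copy of the data.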

Aunsight™ Golden Record

Transactional Workflow Enhancements

Transactional workflows (TXWF) are a lightweight "lift and shift" tool for the movement of bulk data across cloud systems. TXWFs move data from various types of databases into cloud-native data storage engines without engaging the matching and merging features used to clean data in Aunsight™ Golden Record, which makes these bulk data migrations much faster and reduces their resource footprint. This month, we're excited to announce two new improvements to this tool.

Amazon Aurora Option for TXWF

TXWFs now support a new output option: Amazon Aurora. This new storage option enables our clients to connect data from any of our supported data sources to an Amazon Aurora database in the AWS cloud using the lightweight TXWF tool.

Profiling for TXWF

Aunsight Golden Record users creating Golden Records have long been able to use the Golden Record profiling tool to see source schema information and statistics on the data populating a table. This month, the profiling tool has been extended to TXWFs to enable users to connect to a source, examine its data profile, and edit the query code accordingly as part of a data QA/QC process.

Data Dictionary Download

Aunsight Golden Record users have been able to view an interactive data dictionary showing the transformations used to derive the fields in a Golden Record. Previously, the only way to view this was to use the Aunsight Golden Record dashboard, which meant this data dictionary could not be widely shared outside of active users of the application. Now, Golden Record Data Dictionaries can be downloaded and shared like any other document, which will provide additional value to our clients who need a data dictionary for compliance or quality assurance purposes.

Updates to Custom Query Editor

This month, several small updates to the Custom Query SQL editor have rolled out. Notable among them is the ability to save a query without validating it. Previously, users would have to ensure that their query code was valid SQL prior to saving, which in many cases prevented users from saving their work before it was completely finished or debugged.

Release Contents

Issue ID Description
WAT-30 Converting DQ API to .NET Core API
DATAINT-693 TXWF Partition Sync Refactor
DATAINT-668 Aunalytics Idea Portal - AuGR - Add persistent storage for cloud agents
DATAINT-655 Aunalytics Idea Portal - AuGR - Ability to see transactional workflow/golden record query even when the agent is down
DATAINT-632 Job control API - Trigger Resource + status 'completed'
DATAINT-622 SSO Error Handling - App Permissions
DATAINT-607 Allow saving of non-validated custom schema queries
DATAINT-582 Allow delete detection to be toggled off
DATAINT-565 Profiling for TXWF schemas
DATAINT-521 Query Validation - Schema Key Uniqueness check
DATAINT-517 Amazon Aurora - Bulk Load Destination option
DATAINT-499 Improve User Level Errors When Configuring A Connection
AUN-15177 dslab-image name format update
AUN-15105 Workflow: Cascade kill option in Workflow kill jobs
AUN-15098 Make 'Disambiguate only conflicting fields' the default option for joins
AUN-15067 Datamart migrate actions for table and view metadata
AUN-14977 Transactions for Migrate Tasks for Datamarts
AUN-14927 WF: Change "Retry attempts" to "Attempts" in Job Description Graph
AUN-14871 Create process package database migration for NFS artifact storage
AUN-14870 Create models database migration for NFS artifact storage
AUN-14862 Move workflow mail service from Google to Microsoft
AUN-14848 Insights will be able to use fields from all tables
AUN-14828 Show Parameter Values in Dataflow Job Logs
AUN-14754 Formations, Small Enhancements
AUN-13917 Atlas Versioning UI
AUN-11478 Account for renamed fields
DATAINT-674 Transactional Workflow - Date/Time Formatting Issue - Missing Milliseconds

Bug Fixes

Issue ID Description
WAT-45 DQ Package Export Naming Issue
WAT-40 Connection mappings not importing correctly for DQ Rule imports
ILT-411 verify data type in the structured query output
ILT-406 Fix spelling in Insights card
ILT-399 Model prediction bug in Interra in July release
ILT-398 Content get_column_names order mismatch with table_dict
DATAINT-709 T20210816.0183 - Aunsight - configuration clarification
DATAINT-675 Leading Zero in literal CRON breaks schedule
DATAINT-660 Profiling for TXWF - Count/Frequency Issue for Null and Empty Strings
DATAINT-659 Formatting Issue with Cell Content Space
DATAINT-650 Clear nonexist property mapping from draft when publish domain.
DATAINT-624 Improve data formatting failure handling
DATAINT-606 Profiling an empty schema results in confusing error
DATAINT-537 Make append-only resilient against job stopping
DATAINT-504 Improve warning logging for data point issues with matching
DATAINT-503 Do not auto-discover on Input edit
DATAINT-183 [BUG UAT] type of a property in a discovered schema changes
DATAINT-20 [PROD-ERROR] MDM Activities getting cancelled and throwing exception across tenants
AUN-15268 V2 daybreak configs don't load if there are no enumerable properties
AUN-15260 Daybreak, NLP result view not showing up in rc-aug21
AUN-15235 Dataset Versioning is Inaccessible in the Project Context
AUN-15233 Fix daybreak link format specification
AUN-15232 [QA] Op Panel doesn't handle if no relation is set for the Expression Builder
AUN-15156 T20210707.0321 - update_schema flag not working when called from workflow
AUN-15152 Aunsight Send Mail to Member component doesn't respect project permissions
AUN-15151 T20210702.0259 - Clone Job keeps Failing unless Ingest empty data first
AUN-15138 Encompass Connector - Not handling Null for GUID field
AUN-15071 T20210608.0411 - issue in toolbelt while trying to use 'query job' stem
AUN-15037 Dataflow did not respect Resource ReadOnly flag
AUN-14833 Daybreak-Download Insights-sometimes the image has the bottom cut-off
AUN-10509 Cross Org shared dataset not showing up in import selection list in dataflow builder.