August Release Notes¶
Aunalytics is excited to announce the August 2021 release to our clients. These release notes describe model and site enhancements, along with the fixes to existing functionality included in this release.
Daybreak¶
Aggregated filters¶
This month, we're excited to announce a new capability for the Daybreak Data Builder tool: Aggregated filters. Previously, Daybreak Data Builder conditions (filters) could only be evaluated against the value of a specific field and record. Now, aggregated filters provide a way to build queries that evaluate a condition against an aggregation of one or more records.
For example, if a user wants a list of customers with a total balance across all accounts over a certain amount, they can now build an aggregated filter off of the `sum` of the `CurrentBalance` field for all records in the `Account` table linked to that customer.
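The logic behind this example can be sketched in a few lines of Python. This is only an illustration of what an aggregated `sum` filter evaluates, not how Data Builder implements it; the `CustomerId` link field, the sample rows, and the function name are assumptions made for the sketch.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the Account table.
# "CustomerId" is an assumed name for the field linking accounts to customers.
accounts = [
    {"CustomerId": 1, "CurrentBalance": 4000.0},
    {"CustomerId": 1, "CurrentBalance": 7500.0},
    {"CustomerId": 2, "CurrentBalance": 9000.0},
    {"CustomerId": 3, "CurrentBalance": 2500.0},
    {"CustomerId": 3, "CurrentBalance": 3000.0},
]


def customers_with_total_balance_over(rows, threshold):
    """Return IDs of customers whose summed CurrentBalance exceeds threshold."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["CustomerId"]] += row["CurrentBalance"]
    return sorted(cid for cid, total in totals.items() if total > threshold)


matching_customers = customers_with_total_balance_over(accounts, 10000.0)
```

In this sample no single account exceeds 10,000, so a per-record condition would match nothing, while the aggregated condition matches customer 1, whose two accounts together exceed the threshold.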
Aggregated filters support six different methods for generating an aggregation from two or more records in the data mart:
- `Min` - The lowest value in the set
- `Max` - The highest value in the set
- `Sum` - The sum of all values in the set
- `Average` - The average (mean) of all values in the set
- `Count` - The number of records in the set
- `Count Distinct` - The number of records with a unique value in the set
The last method is useful when working with enumerated fields. For example, a user could search for customers whose `Account` `ProductType` records have three or more distinct values, e.g. `Checking`, `Savings`, and `Mortgage`. This type of aggregation differs from the `Count` method: a query looking for customers with a `Count` aggregation of three or more on the `ProductType` field would also return customers who had three different savings accounts, or two mortgages and a checking account. `Count Distinct`, by contrast, would yield the value of `1` for the first customer and `2` for the second, because that is the number of distinct product type values for those customers.
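To make the difference concrete, here is a small Python sketch that computes both aggregations for the three kinds of customer described above. The sample rows, the `CustomerId` link field, and the helper name are assumptions for illustration and do not reflect Daybreak internals.

```python
# Hypothetical Account rows; "CustomerId" is an assumed link field.
accounts = [
    # Customer 1: three savings accounts
    {"CustomerId": 1, "ProductType": "Savings"},
    {"CustomerId": 1, "ProductType": "Savings"},
    {"CustomerId": 1, "ProductType": "Savings"},
    # Customer 2: two mortgages and a checking account
    {"CustomerId": 2, "ProductType": "Mortgage"},
    {"CustomerId": 2, "ProductType": "Mortgage"},
    {"CustomerId": 2, "ProductType": "Checking"},
    # Customer 3: one checking, one savings, and one mortgage account
    {"CustomerId": 3, "ProductType": "Checking"},
    {"CustomerId": 3, "ProductType": "Savings"},
    {"CustomerId": 3, "ProductType": "Mortgage"},
]


def aggregate_product_types(rows):
    """Compute Count and Count Distinct of ProductType per customer."""
    by_customer = {}
    for row in rows:
        by_customer.setdefault(row["CustomerId"], []).append(row["ProductType"])
    return {
        cid: {"count": len(values), "count_distinct": len(set(values))}
        for cid, values in by_customer.items()
    }


stats = aggregate_product_types(accounts)

# "Count >= 3" matches every customer; "Count Distinct >= 3" matches only customer 3.
count_matches = sorted(c for c, s in stats.items() if s["count"] >= 3)
distinct_matches = sorted(c for c, s in stats.items() if s["count_distinct"] >= 3)
```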
To learn more about how to use aggregated filters, watch this month's five-minute feature premiere video.
New Data Builder Operators¶
Daybreak's Data Builder now features a set of new operators for string fields: `begins with`, `contains`, and `ends with`. Conditions on string fields can now use these operators to match values based on the user-supplied input and the operator's logic rules. For example, if a user supplies `"son"` as the input, results will vary based on the operator selected:
- `begins with`: Matches "Sondra Jones", "Sontag Industries LLC", etc.
- `ends with`: Matches "Sarah Johnson", "Jake Davison", etc.
- `contains`: Matches the previous two examples as well as others like "James Sonderheim", "Arthur Smith and Sons Construction LLC", etc.
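The matching rules above can be approximated with Python's built-in string methods. This sketch assumes matching is case-insensitive, which is inferred only from the examples (the input "son" matching "Sondra" and "Sons"); the `matches` function and the extra sample name are hypothetical.

```python
def matches(value, operator, user_input):
    """Evaluate one string condition. Case-insensitivity is an assumption."""
    haystack = value.casefold()
    needle = user_input.casefold()
    if operator == "begins with":
        return haystack.startswith(needle)
    if operator == "ends with":
        return haystack.endswith(needle)
    if operator == "contains":
        return needle in haystack
    raise ValueError(f"Unsupported operator: {operator}")


names = [
    "Sondra Jones",
    "Sontag Industries LLC",
    "Sarah Johnson",
    "Jake Davison",
    "James Sonderheim",
    "Arthur Smith and Sons Construction LLC",
    "Maria Lopez",  # extra sample value that should never match "son"
]

begins = [n for n in names if matches(n, "begins with", "son")]
ends = [n for n in names if matches(n, "ends with", "son")]
contains = [n for n in names if matches(n, "contains", "son")]
```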
Insights From Foreign Table Fields¶
Last month we released a new update to the column selector that enables users to include fields from related foreign tables in the data results. This month, those additional foreign table fields are now available in the Insights dashboard so that users can make visualizations of those fields.
Natural Language Answers™ improvements¶
Natural Language Answers™ Insights¶
Earlier this year, we released Natural Language Answers to provide the ability to get answers by posing questions about data in natural language, and Insights, which allow users to create charts and graphs to display data results visually. This month, we've combined both techniques so that Natural Language Answers will now feature a dashboard populated with Insights suggested by our machine learning model.
Now, users can ask a question about the two most frequently used tables (`Customer` and `Account`) and immediately receive a dashboard composed of elements that the model deems most relevant to understanding that question visually. For example, if a user asks for a list of accounts opened in 2020, the model will populate the dashboard with the following suggested Insights:
- A Summary metric showing the number of accounts in the result set
- A line chart showing the number of new accounts broken down by month
- A bar chart showing new accounts by region
- A donut chart showing new accounts by current status as of today (e.g. how many are active, inactive, closed, etc.)
- A bar chart showing new accounts by type (e.g. Mortgage, HELOC, checking, etc.)
As before, users can edit the dashboard by editing the suggested Insights, removing items, or manually adding new ones to customize the generated dashboard further. And as always, these dashboards will be saved when the query is saved, and can be shared and exported as JPEG images for download.
Model Retrain¶
This month's model retrain for Natural Language Answers adds two fields on which the model previously performed poorly when presented with natural language questions: `CurrentDTI` from the `Customer` table, and `AcquisitionType` from `Account`. With more robust support for these fields, users will experience better results when asking questions like:
- "Customers with debt to income less than 0.60"
- "Customers who are older than 18 years with auto loan balance greater than 50000 and debt to income less than 0.30"
- "Accounts acquired with indirect campaign"
Data Header Row Freezes¶
This month, we've made a minor update to the data results table in the Data Builder to freeze the header row when scrolling through query results. Previously, as a user scrolled through the result rows, the header row would disappear, sometimes making it difficult to interpret data results because the field name was not visible. Now, the header row will remain fixed at the top of the results table regardless of how far into the dataset a user has scrolled.
Aunsight¶
Automated Generation of Daybreak Enum Values¶
This month, enumerations for Daybreak web app configuration objects will now be generated automatically by examining the values present in fields flagged as enumerations. Previously, configuring an enum field was a manual process: implementations engineers had to examine dataset reports to generate the list of unique values that should be present in that field. This change will streamline the process of updating enumerated fields (for example, when a new product type is added to the `ProductType` field in the `Account` table).
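Conceptually, generating enum values amounts to collecting the distinct values found in each flagged field. The Python sketch below illustrates that idea only; the function name, the row format, and the decision to skip missing values are assumptions, and the platform's actual implementation is not described in these notes.

```python
def generate_enum_values(rows, enum_fields):
    """Collect the sorted distinct non-null values for each flagged field."""
    enums = {field: set() for field in enum_fields}
    for row in rows:
        for field in enum_fields:
            value = row.get(field)
            if value is not None:  # assumption: missing values are skipped
                enums[field].add(value)
    return {field: sorted(values) for field, values in enums.items()}


# Hypothetical Account rows, including a newly added product type (HELOC).
account_rows = [
    {"AccountId": "A1", "ProductType": "Checking"},
    {"AccountId": "A2", "ProductType": "Savings"},
    {"AccountId": "A3", "ProductType": "Checking"},
    {"AccountId": "A4", "ProductType": "HELOC"},
    {"AccountId": "A5", "ProductType": None},
]

enum_config = generate_enum_values(account_rows, ["ProductType"])
```

Because the list is derived from the data, a new value such as `HELOC` appears in the result as soon as a record containing it is present.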
Terminate All Child Jobs - Web Interface Support¶
Last month, we released a new platform capability for terminating (killing) all child jobs created by a workflow. Initially, this option was only accessible via the Toolbelt and SDKs. This month, we have added a UI in the Aunsight Web Interface for killing all child jobs from the parent workflow's job status page.
Expression Builder `chararray` Inputs Can Select a Field as a Value¶
We've updated the Dataflow Builder's expression builder interface to allow users to select a field from a loaded table as the input for a `chararray` expression. This new capability enables Dataflow creators to build more dynamic expression logic by evaluating an expression based on variable input from a dataset field.
New `dispatcher-compact` Component¶
This month, we've created a new workflow component: `dispatcher-compact`. This component provides a utility for automating the cleanup of melted log data stored on our NFS storage resources. This cleanup is necessitated by the introduction of log-based source data files, which are created each time a transactional workflow (TXWF) copies data from a client source into Aunsight. Because these workflows are run repeatedly (in some cases, even every few minutes), a large number of small files containing timestamped records of updates to data tables are created. The `dispatcher-compact` component helps to organize these files by combining the data within multiple files into one file. And because each record row in a melted data log contains a timestamp, these compacted files do not impact our ability to provide data warehousing and archiving functions to our Daybreak™ for Financial Services customers.
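The idea of compaction can be illustrated with a short Python sketch that merges several small log files into one while keeping every timestamped row. The in-memory file representation, the file names, and the row layout are assumptions for illustration; this is not the `dispatcher-compact` implementation.

```python
# Hypothetical small log files, each holding a few timestamped rows.
log_files = {
    "accounts-0001.log": [
        {"record_id": "A1", "field": "CurrentBalance", "value": 100.0,
         "timestamp": "2021-08-01T00:00:00Z"},
    ],
    "accounts-0002.log": [
        {"record_id": "A1", "field": "CurrentBalance", "value": 250.0,
         "timestamp": "2021-08-01T00:05:00Z"},
        {"record_id": "A2", "field": "CurrentBalance", "value": 75.0,
         "timestamp": "2021-08-01T00:05:00Z"},
    ],
    "accounts-0003.log": [
        {"record_id": "A2", "field": "CurrentBalance", "value": 80.0,
         "timestamp": "2021-08-01T00:10:00Z"},
    ],
}


def compact(files):
    """Combine the rows of many small files into one timestamp-ordered list."""
    combined = []
    for name in sorted(files):
        combined.extend(files[name])
    # ISO 8601 timestamps in one uniform format sort correctly as strings.
    combined.sort(key=lambda row: row["timestamp"])
    return combined


compacted = compact(log_files)
```

No rows are dropped and each keeps its own timestamp, which is why the history of changes can still be reconstructed from the single compacted file.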
Dataset Versioning¶
Many Aunsight objects support versioning so that changes to pipeline configurations can be tracked and reverted if they need to be backed out to support a client. Aunsight Atlas Record datasets have always been a marginal case, since they represent merely the structure (schema) for storing data, whereas the data itself exists in large filesystems in the Aunsight platform infrastructure. This month, the platform adds versioning for Atlas Record schemas, enabling users to better understand the evolution of data pipeline schemas.
It is important to note that while versioning of Atlas Records applies to the schemas, it does not involve versioning of the data contained in them. In most cases, Atlas datasets are too large (on the order of gigabytes) to version as a whole. For clients who need data retention or dataset snapshots, we can support these in other ways, but not directly through versioning Atlas Records. For example, Daybreak for Financial Services clients on the 3.0 model have dataset warehousing through data logs that maintain a history of all changes to records. Those logs can be "unmelted" into a warehoused dataset using the most recent recorded changes, or a different time frame, to populate the value of each field.
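As a rough sketch of what "unmelting" means, the Python below rebuilds one record per ID from a change log, taking for each field the most recent value at or before an optional cutoff. The long-format row layout (`record_id`, `field`, `value`, `timestamp`) and the `unmelt` function are assumptions for illustration; the actual log format is not specified in these notes.

```python
def unmelt(log_rows, as_of=None):
    """Rebuild records from a change log, optionally as of a cutoff timestamp."""
    latest = {}  # (record_id, field) -> (timestamp, value)
    for row in log_rows:
        # ISO 8601 timestamps in one uniform format compare correctly as strings.
        if as_of is not None and row["timestamp"] > as_of:
            continue
        key = (row["record_id"], row["field"])
        if key not in latest or row["timestamp"] >= latest[key][0]:
            latest[key] = (row["timestamp"], row["value"])

    records = {}
    for (record_id, field), (_, value) in latest.items():
        records.setdefault(record_id, {})[field] = value
    return records


# Hypothetical change log for a single account.
change_log = [
    {"record_id": "A1", "field": "CurrentBalance", "value": 100.0,
     "timestamp": "2021-08-01T00:00:00Z"},
    {"record_id": "A1", "field": "Status", "value": "Active",
     "timestamp": "2021-08-01T00:00:00Z"},
    {"record_id": "A1", "field": "CurrentBalance", "value": 250.0,
     "timestamp": "2021-08-10T00:00:00Z"},
    {"record_id": "A1", "field": "Status", "value": "Closed",
     "timestamp": "2021-08-20T00:00:00Z"},
]

current_snapshot = unmelt(change_log)
mid_august_snapshot = unmelt(change_log, as_of="2021-08-15T00:00:00Z")
```

With no cutoff the snapshot reflects the latest change to every field; with the mid-August cutoff the account still shows as `Active` because the closing change had not yet been logged.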
Aunsight™ Golden Record¶
Transactional Workflow Enhancements¶
Transactional workflows (TXWF) are a lightweight "lift and shift" tool for the movement of bulk data across cloud systems. TXWFs move data from various types of databases into cloud-native data storage engines without engaging the matching and merging features used to clean data using Aunsight™ Golden Records, thereby greatly improving the speed and reducing the resource footprint of these bulk migrations. This month, we're excited to announce two new improvements to this tool.
Amazon Aurora Option for TXWF¶
TXWFs now support a new output option: Amazon Aurora. This new storage option enables our clients to connect data from any of our supported data sources to an Amazon Aurora database in the AWS cloud using the lightweight TXWF tool.
Profiling for TXWF¶
Aunsight Golden Record users have long been able to use the Golden Record profiling tool to see source schema information and statistics on the data populating a table. This month, the profiling tool has been extended to TXWFs to enable users to connect to a source, examine its data profile, and edit the query code accordingly as part of a data QA/QC process.
Data Dictionary Download¶
Aunsight Golden Record users have been able to view an interactive data dictionary showing the transformations used to derive the fields in a Golden Record. Previously, the only way to view this was to use the Aunsight Golden Record dashboard, which meant this data dictionary could not be widely shared outside of active users of the application. Now, Golden Record data dictionaries can be downloaded and shared like any other document, which will provide additional value to clients who need a data dictionary for compliance or quality assurance purposes.
Updates to Custom Query Editor¶
This month, several small updates to the Custom Query SQL editor have rolled out. Notable among them is the ability to save a query without validating it. Previously, users would have to ensure that their query code was valid SQL prior to saving, which in many cases prevented users from saving their work before it was completely finished or debugged.
Release Contents¶
Issue ID | Description |
---|---|
WAT-30 | Converting DQ API to .NET Core API |
DATAINT-693 | TXWF Partition Sync Refactor |
DATAINT-668 | Aunalytics Idea Portal - AuGR - Add persistent storage for cloud agents |
DATAINT-655 | Aunalytics Idea Portal - AuGR - Ability to see transactional workflow/golden record query even when the agent is down |
DATAINT-632 | Job control API - Trigger Resource + status 'completed' |
DATAINT-622 | SSO Error Handling - App Permissions |
DATAINT-607 | Allow saving of non-validated custom schema queries |
DATAINT-582 | Allow delete detection to be toggled off |
DATAINT-565 | Profiling for TXWF schemas |
DATAINT-521 | Query Validation - Schema Key Uniqueness check |
DATAINT-517 | Amazon Aurora - Bulk Load Destination option |
DATAINT-499 | Improve User Level Errors When Configuring A Connection |
AUN-15177 | dslab-image name format update |
AUN-15105 | Workflow: Cascade kill option in Workflow kill jobs |
AUN-15098 | Make 'Disambiguate only conflicting fields' the default option for joins |
AUN-15067 | Datamart migrate actions for table and view metadata |
AUN-14977 | Transactions for Migrate Tasks for Datamarts |
AUN-14927 | WF: Change "Retry attempts" to "Attempts" in Job Description Graph |
AUN-14871 | Create process package database migration for NFS artifact storage |
AUN-14870 | Create models database migration for NFS artifact storage |
AUN-14862 | Move workflow mail service from Google to Microsoft |
AUN-14848 | Insights will be able to use fields from all tables |
AUN-14828 | Show Parameter Values in Dataflow Job Logs |
AUN-14754 | Formations, Small Enhancements |
AUN-13917 | Atlas Versioning UI |
AUN-11478 | Account for renamed fields |
DATAINT-674 | Transactional Workflow - Date/Time Formatting Issue - Missing Milliseconds |
Bug Fixes¶
Issue ID | Description |
---|---|
WAT-45 | DQ Package Export Naming Issue |
WAT-40 | Connection mappings not importing correctly for DQ Rule imports |
ILT-411 | verify data type in the structured query output |
ILT-406 | Fix spelling in Insights card |
ILT-399 | Model prediction bug in Interra in July release |
ILT-398 | Content get_column_names order mismatch with table_dict |
DATAINT-709 | T20210816.0183 - Aunsight - configuration clarification |
DATAINT-675 | Leading Zero in literal CRON breaks schedule |
DATAINT-660 | Profiling for TXWF - Count/Frequency Issue for Null and Empty Strings |
DATAINT-659 | Formatting Issue with Cell Content Space |
DATAINT-650 | Clear nonexist property mapping from draft when publish domain. |
DATAINT-624 | Improve data formatting failure handling |
DATAINT-606 | Profiling an empty schema results in confusing error |
DATAINT-537 | Make append-only resilient against job stopping |
DATAINT-504 | Improve warning logging for data point issues with matching |
DATAINT-503 | Do not auto-discover on Input edit |
DATAINT-183 | [BUG UAT] type of a property in a discovered schema changes |
DATAINT-20 | [PROD-ERROR] MDM Activities getting cancelled and throwing exception across tenants |
AUN-15268 | V2 daybreak configs don't load if there are no enumerable properties |
AUN-15260 | Daybreak, NLP result view not showing up in rc-aug21 |
AUN-15235 | Dataset Versioning is Inaccessible in the Project Context |
AUN-15233 | Fix daybreak link format specification |
AUN-15232 | [QA] Op Panel doesn't handle if no relation is set for the Expression Builder |
AUN-15156 | T20210707.0321 - update_schema flag not working when called from workflow |
AUN-15152 | Aunsight Send Mail to Member component doesn't respect project permissions |
AUN-15151 | T20210702.0259 - Clone Job keeps Failing unless Ingest empty data first |
AUN-15138 | Encompass Connector - Not handling Null for GUID field |
AUN-15071 | T20210608.0411 - issue in toolbelt while trying to use 'query job' stem |
AUN-15037 | Dataflow did not respect Resource ReadOnly flag |
AUN-14833 | Daybreak-Download Insights-sometimes the image has the bottom cut-off |
AUN-10509 | Cross Org shared dataset not showing up in import selection list in dataflow builder. |