Enhancement: Cached Metadata

Ajilius enables browsing of source system metadata, to identify and load data into the warehouse. Here is a typical display:

metadata01

A click on a table, on the left side, shows the columns for that table on the right side.

Until now, each update of this screen represented a round trip to the database. That was fine for a data source located close by the Ajilius server, but users in hybrid environments reported slow screen updates when pulling cloud metadata to on-premise Ajilius. This was particularly apparent with large Oracle systems, which add thousands of system table entries to a metadata set.

We’ve now added a feature to cache metadata within Ajilius. This means the delay of updating metadata happens only once, and subsequent interactions are as fast as an on-premise solution.

The context menu for a data source now contains an option to Refresh Metadata:

metadata02

When you select this option, you will be prompted with a screen warning that a metadata update might be slow. In this case, “slow” means a few seconds.

metadata03

On running this refresh, an internal metadata cache is added to your data warehouse metadata repository, and subsequent calls to browse metadata or load metadata into the warehouse will be drawn from this cache.

You may refresh the cache at any time.

An added bonus is that once your data source metadata is cached, you no longer need to be connected to the source whilst working with source metadata. That’s a great feature for companies with high risk data, and for people who like to take their work home with them 🙂

Ajilius. More flexible metadata.

 

Multi-generational upgrades

Lately we’ve found that our pace of delivery has outstripped the ability of some users to keep up with upgrades.

We have been expecting users to apply each upgrade as it is issued. In practice, that hasn’t always been possible. Emails might have been missed, user priorities might have been elsewhere, and it has sometimes led to a situation where support was needed to work through the issues of upgrading multiple generations at a time.

We also had a problem this week with a customer who restored a metadata repository from a backup that was several months old. This meant that their metadata was out of step with their current application and repositories.

We’re fixing that problem this week. Release 2.2.14 will bring a new method of versioning upgrades. The version of the repository will be recorded, and the upgrade patches will include multiple generations of upgrade history.

On applying a patch to Ajilius, any repository that is not at the current level will be upgraded, even if every repository is at a different release.

Only one patch will be required to bring your metadata up to the latest version, even if it is several generations old. Further, any metadata that has been restored from a backup will also be brought up to date, simply by rerunning the upgrade process.

Ajilius. Flexible Upgrades.

Enhancement: MongoDB Data Source

Release 2.2.12 includes support for MongoDB as a data source.

MongoDB is a popular NoSQL DBMS, and Ajilius is the first data warehouse automation platform to support MongoDB as a first-class citizen.

Want proof? Here’s a shot of Ajilius displaying data from the MongoDB hosted TCPH database:

mongodb2

 

More? Here’s Ajilius showing data from the Restaurants sample database:

mongodb3

Ajilius makes it easy to work with MongoDB, as simple as working with any other data source we support.

Ajilius. MongoDB to your data warehouse.

Enhancement: Log to Data Warehouse

This week’s Release 2.2.12 includes a feature to write ELT log entries to the data warehouse.

Previous releases logged ELT performance to a log file. This file could be named through the Warehouse Edit screen, and defaulted to ajilius.log.

Now, in addition to the log file, a table AJILIUS_LOG is created in the AJILIUS schema of the data warehouse. On job completion, as long as a data warehouse connection is available, the results of the job will be inserted to this table.

The logged columns are:

  • log_stamp / Timestamp
  • log_script / Script name
  • log_elapsed / Elapsed time in seconds
  • log_status / Job status, 0=success
  • log_files / Number of files processed during job
  • log_inserts / Number of rows inserted during job
  • log_updates / Number of rows updated during job
  • log_deletes / Number of rows deleted during job
  • log_message / Descriptive error message if log_status<>0

Ajilius. Keeping track of performance.

 

Enhancement: User / Tech Documentation

This week’s download for Release 2.2.9 brings separation of User and Technical documentation.

Previously, Sources, Tables and Columns had one documentation panel. For tables and columns, that documentation was carried forward to subsequent tables in the ELT process. While editable, it gave rise to misleading documentation if a technical note from a previous table was overlooked.

We’ve now provided separate tabs for User and Tech(nical) documentation.

UserTechDoc

User documentation is carried from table-table, and column-column. Technical documentation is specific to the source, table or column for which it is defined.

Also, we’ve provided a feature to tune the size of the documentation panel. User Preferences now contains a field for Documentation Lines. Modify this value to increase or decrease the number of lines displayed by the documentation panel, tuning the fit for screen size and zoom level.

DocLines

Keep the requests coming, your ideas make Ajilius better for every user.

Ajilius. Listening to our users.

Enhancement: Load Metadata Columns

Release 2.2.8, out today, includes the ability to add metadata columns to load tables.

The columns which may be added are:

  • DateTime
  • GUID
  • Hash
  • Server
  • Database
  • Table

The availability of these columns is dependent on the capabilities of the source DBMS. Where a given column is not supported by the data source, an empty string will be supplied in its place.

Metadata columns may be added from the Column List screen for a given table. When adding a column, all you need to do is select the appropriate Column Role. Column names, data types, etc. will be added automatically.

Ajilius. Enriching source data.

Query Based Load Dependencies

The new Ajilius scheduler uses metadata to figure out the dependencies and run sequence of your ELT jobs.

These dependencies can currently break through custom code.

Here’s an example, contrived from a customer query earlier today:

 

cblDependencies

 

In this query there is a dependency between the tables load.load_from_table and load.load_earlier_table. Unless load.load_earlier_table has been populated before processing load.load_from_table, then duplicate rows may be selected.

In the long term we will use a SQL parser to extract these dependencies from custom queries, but we’re still working on the evaluation of parsers.

Meanwhile, these dependencies can be resolved through multiple steps in the scheduler. Here is an example of scheduler parameters that would ensure the correct sequence of loads in a full data warehouse refresh:

-w <dwname> -b reset
-w <dwname> -l load_earlier_table
-w <dwname> -f all

This could also be written as:

-w <dwname> -b reset -l load_earlier_table -f all

These commands will ensure that load_earlier_table is the first table processed by the scheduler, and it will be available when needed by load_from_table.

Remember, any time you’re not sure how to do something using Ajilius, you are most welcome to contact us and discuss.

Enhancement: Table and Column Comments

Ajilius now generates database comments for tables and columns, improving the self-documentation of your data warehouse.

We use the description field from table and column metadata to add the comments to the create script for each table.

Example: PostgreSQL

// Add table and column comments.

section = 'Comments'

dw.execute ("""comment on table load.load_album is 'Album';""");

dw.execute ("""comment on column load.load_album.album_id is 'Album ID';""");
dw.execute ("""comment on column load.load_album.title is 'Album Title';""");
dw.execute ("""comment on column load.load_album.artist_id is 'Artist ID';""");

Example: SQL Server

// Add table and column comments.

section = 'Comments'

dw.call ("""
exec sys.sp_addextendedproperty 
@name       = N'MS_DescriptionExample', 
@value      = N'Album', 
@level0type = N'SCHEMA', @level0name = 'load', 
@level1type = N'TABLE',  @level1name = 'load_album'; 
""");

dw.call ("""
exec sys.sp_addextendedproperty 
@name       = N'MS_DescriptionExample', 
@value      = N'Album ID', 
@level0type = N'SCHEMA', @level0name = 'load', 
@level1type = N'TABLE',  @level1name = 'load_album', 
@level2type = N'COLUMN', @level2name = 'album_id'; 
""");

A side comment. Look at the elegant syntax of the PostgreSQL version, then look at the SQL Server version. Shake your head, and mutter “WHAT were they thinking???”.

Ajilius. Continuous value.

Enhancement: Schemas

Rounding out the enhancements in Release 2.2.7, Ajilius now automatically organises tables into schemas.

We have implemented four schemas:

ajilius
Ajilius-specific control tables

load
Data loaded from external sources

stage
Data undergoing transformation

dw
Data warehouse dimensions and facts for public access

The main benefit of this feature is to simplify end-user access to fact and dimension tables. Previously, when a user accessed the database using any visualisation or query tool, all tables were visible to the user, including temporary load and staging tables. This was confusing to the user, and required work from a database administrator to selectively hide tables from the user.

Now, end-users need only be granted access to the dw schema by default, and they will only see dimensions and facts that have been created specifically for analytics and visualisation. Internal working tables will no longer be visible, unless access is specifically granted.

Ajilius. Simplifying end user analytics.