What’s In Ajilius 3

We’ve been working on the next major release for Ajilius for several months now, and it is time to let you know what we’ll deliver starting next month.

Adaptive ELT

Q: “What is the fastest algorithm to load a dimension?”

A: “It depends.”

That’s right, the best algorithm for a load process depends on your platform and your data. MPP platforms prefer CTAS algorithms, SMP may do better with UPDATE algorithms for low volatility tables.

What we often see, though, is that design decisions made during development are no longer valid when the DW has been in production for several months. Assuming good design choices in the first place, the most common reason is that the volume and patterns of data in production are not matching those that drove the initial design decision.

Ajilius already supports a choice of CTAS or UPDATE algorithms depending on the platform, but in Ajilius 3 we’re taking an industry-leading leap forward and introducing a dynamic choice of algorithm at run time. We’re calling it Adaptive ELT.

Based on the combination of platform and data, we’ll make an intelligent choice about the best algorithm to use for each and every batch run. Does the data suggest you’ll be updating the majority of rows in a billion row table? We’ll generate CTAS for that batch. Are you only updating a few  rows on the next run? We’ll use an UPDATE statement.

No other ETL or ELT platform gives you this flexibility. It is like having your developers rewrite and tune their code, each night, with the data for that night’s batch driving their decisions. You’ll get an expert ELT system that dynamically monitors and adjusts your processing for maximum performance, on every single run.

SQL+Script

Until now, Ajilius has generated scripts for ELT processing. Ajilius 1 generated Python, Ajilius 2 generates Groovy, Ajilius 3 is throwing away the single script constraints.

In Ajilius 3 we’ve separated generation of SQL from generation of the wrapping script. We’re calling this SQL+Script, because we’ll be generating SQL for every transform, supported by a wrapping Script of a type requested by the user.

A new template option will enable the generated SQL code that effects transformations to be wrapped in multiple scripting languages. We’ll continue to support Groovy as our primary scripting option, but here are just some of the options that will be available in Ajilius 3:

  • Plain SQL. Every script will be able to be generated as just SQL statements in the target platform dialect. This will delight customers who want to use Ajilius in a hybrid environment with existing data warehouses.
  • Stored Procedures. SQL Scripts will be wrapped in stored procedure creation scripts for customers who prefer to use tools like SQL Agent and SSIS for job execution.
  • PowerShell. The rise of PowerShell as an automation language, and its growing popularity in CI/CD environments, makes it a great option for an Ajilius target. We’ll generate wrappers that emulate our current Groovy scripts, but using PowerShell as the scripting language.

BEAM Support

Being great fans of Lawrence Corr’s Agile Data Warehouse Design methodology, we’re adding BEAM support to Ajilius.

Our new “Model” module will implement the BEAM methodology. Here’s what you’ll see in the first release:

  1. Capture and presentation of 7-Ws
  2. Generation of template fact/s and dimensions from Ws
  3. Mapping of staging tables to fact/s and dimensions
  4. BEAM-compatible short codes for all object types.
  5. Incorporation of BEAM artefacts into Ajilius-generated documentation

Ajilius currently works “left-to-right”, in that we start from data sources and eventually transform to facts and dimension. We’ll keep that capability in Ajilius 3, but we’ll also provide “right-to-left” model driven implementations, where you’ll have the option to create facts and dimensions first, then link them back to transformations. In other words, we’ll support model-driven, data-driven and hybrid approaches to building a dimensional data warehouse.

All of this is an extension to the current Ajilius. You’ll be able to retrofit BEAM to an existing Ajilius DW, as well as integrate new BEAM-designed stars in to a conformed set of existing dimensions.

Once again, Ajilius is demonstrating innovation and leadership in data warehouse automation, being the first to directly integrate BEAM into the automation process.

The Fine Print

We’ll be rolling out the first release of Ajilius 3 some time before June 30.

The Ajilius 3 upgrade will preserve all your existing metadata, but we’ll be making some adjustments to the metadata structure that will break the incremental update flow. In other words, we’ll be expecting all users to upgrade to 3, then restarting incremental upgrades from that point.

We will continue to support Ajilius 2 for six months from the date of release of Ajilius 3, in order to give users time to evaluate and test the migration process.

There is, of course, no additional charge to users for Ajilius 3, all upgrades are covered for the life of your license agreement.

We hope you’re as excited as we are by these changes!

Ajilius. Committed to innovation in data warehouse automation.

 

Ajilius 2.4.10

This week we’ve got a mix of incremental enhancements and error corrections.
Here’s what you’ll find in Ajilius 2.4.10:
  • Enhancements:
    • Salesforce Sandbox. Ajilius now supports sandbox accounts for Salesforce.
    • Tableau Connections. Tableau connections have now been validated against SQL Server, PostgreSQL, Snowflake, Redshift, Azure SQL DW, Exasol and MariaDB ColumnStore.
    • Dedup Calculations. We now support the use of Transforms and Calculations in Dedup transformations.
    • Redshift Sort Keys. If you’re a Redshift user, you now have interleaved sortkey support to improve performance of queries against facts and dimensions.
    • Column Sorting. The Warehouse List and Extract List forms now support column sorting, but clicking the column name. We’ll roll out more sortable lists in future releases.
    • BI Lineage. We now show the lineage of a column on the tooltips in Tableau and Yellowfin.
    • Table Page Numbering. Table page numbers now show at the bottom of lists, to show your position in the list.
  • Error Corrections
    • Better Locale Support. We’ve made some adjustments that give better support for locale-specific number and date formatting.
    • Coalesce Empty. The Coalesce Empty transform, which replaces null values with an empty string, was not working correctly on Snowflake.

Registered users should already have received notification of the download location, please let us know if you’ve not received your email.

 

Ajilius 2.4.7

This week’s release 2.4.7 brings the following new features:

  • Salesforce Adapter. Connect to Salesforce, browse and profile your data, then quickly load it to your choice of data warehouse for transformation to facts and dimensions.
  • Double-Bar delimiter. Delimited files now accept a double bar (||) as a column delimiter.

The following errors have been corrected:

  • Historic Persistent Staging. CTAS versions were not correctly updating the current row flag for this variant of persistent staging.
  • Blank lines in log. Occasional blank lines were being appended to the console, and have now been suppressed.

Registered users will receive an email over the next 1-2 days with download instructions.

Go build some data warehouses 🙂

 

Release 2.4.2

This week’s release makes the following changes:

  • New data type “char” for single character columns. Used internally for SCD_FROM column in type-2 slowly changing dimensions.
  • New TRUNCATE option for the command line scheduler, cleans up transient data after loads to minimise storage costs on cloud platforms.
  • PostgreSQL interface on Windows was not always being fed UTF-8 encoded data. Added encoding parameter to byte conversion.
  • Deleting the active data warehouse repository on Windows would sometimes fail. Now triggering call to garbage collector to ensure closed connections are purged from memory.

Registered users will receive an email in the next 1-2 days with download instructions.

Interestingly, the last two items only showed up when we shifted our development platforms from OSX to Windows. Really liking the change – like moving back home after living on the road for a while – but it does show up the quirks of the Windows internals compared to OSX and Linux.

The list is lighter than usual because we’re working hard on new features for Version 3 … stay tuned for updates!

 

Release 2.4.1

We’re back in the normal release cycle, with our first update to 2.4.

Here’s what you’ll find in this release:

  • New batch scheduler command line parameter “-x” to extract one or more tables to CSV.
  • Printing of script messages in batch output.
  • Licensing revisions making it easier and faster for us to create evaluation license keys.
  • Support for delimited files with spaces in file name.
  • Error corrections.

Internally, we added six new automated test cases this week, around delimited file names, and SCD2 dimension processing. Our automated testing process is really starting to pay off, none of the errors we fixed this week were reported by users, all were found through automated testing.

Registered users will receive an email in the next 24 hours with the download details.

 

Ajilius 2.4.0

We’re a month late, and we’re sorry. But Ajilius 2.4.0 has finally been delivered.

We’ll blog more about specific features, but here’s what you have to look forward to in this release:

  • BI Accelerators for Yellowfin, Tableau and Qlik. Full business-friendly metadata generation at the click (Qlik?) of a mouse. We’re the only data warehouse automation platform to support all three platforms.
  • BI Views for a unified query experience across any BI tool, including PowerBI and Excel. Now your PowerPivot users can share the same business-friendly table and column names as their Yellowfin and Tableau colleagues.
  • A revised, high-performance CTAS engine for MPP platforms, including Microsoft APS/PDW, Azure SQL Data Warehouse, Redshift and Snowflake. And we’re still the only data warehouse automation platform to support three click migration between supported platforms.
  • Full support for Microsoft APS/PDW. That makes us the only data warehouse automation platform that supports EVERY Microsoft RDBMS, SMP and MPP, on-premise and cloud.
  • A new Inference engine for managing early arriving facts. You have the one-click choice of automatically inferring dimension rows when new values are found in fact processing, or simply assigning those rows to the “Unknown” value.
  • Integrated Authentication for all Microsoft sources and targets.

Now the work begins on the Version 3.0 series of releases!

 

Announcement: Ajilius 2.3

I’m proud to announce that, today, we achieved our goals for Ajilius 2.3.

The hero feature of this release is data profiling. We now profile data sources faster than Trifacta, with more valuable information than Pentaho, and with more variety than SSIS. We give you the real data you need to make quality decisions about the content of your data sources.

To deliver this feature, we first added persistent metadata caching to Ajilius, as discussed in this earlier post.

Now, we’ve completed the feature by implementing the profiling and presentation features.

You profile a source table from the Extract Source Tables screen. The following screen shot shows that we are about to profile the Chinook Customer table.

profile1

We see any previous profile that is cached for this table, and we can refresh the profile at any time by pressing the Profile button.

profile2

We profile any number of rows, at a rate of around 4million rows per minute.

Every column is profiled in every row.

For columns of less than 64 characters in length, we profile up to 1,000,000 discrete values per column.

For columns of up to 256 characters in length, we profile up to 1,000,000 discrete patterns per column.

For columns of up to 4,000 characters in length, we profile the minimum and maximum values in a column.

Not only do we profile values and patterns, we examine your data for characters that might cause problems in your data warehouse. Null values? Got it. Control characters? Check. Extended ASCII characters? That too. Unicode characters? Again, check.

This is real, valuable profiling data for data warehouse professionals. And it is now included in your Ajilius licence.

So, once again, Ajilius provides real value through the addition of the features you need.

Ajilius. The real innovators in data warehouse automation.

 

Enhancement: Cached Metadata

Ajilius enables browsing of source system metadata, to identify and load data into the warehouse. Here is a typical display:

metadata01

A click on a table, on the left side, shows the columns for that table on the right side.

Until now, each update of this screen represented a round trip to the database. That was fine for a data source located close by the Ajilius server, but users in hybrid environments reported slow screen updates when pulling cloud metadata to on-premise Ajilius. This was particularly apparent with large Oracle systems, which add thousands of system table entries to a metadata set.

We’ve now added a feature to cache metadata within Ajilius. This means the delay of updating metadata happens only once, and subsequent interactions are as fast as an on-premise solution.

The context menu for a data source now contains an option to Refresh Metadata:

metadata02

When you select this option, you will be prompted with a screen warning that a metadata update might be slow. In this case, “slow” means a few seconds.

metadata03

On running this refresh, an internal metadata cache is added to your data warehouse metadata repository, and subsequent calls to browse metadata or load metadata into the warehouse will be drawn from this cache.

You may refresh the cache at any time.

An added bonus is that once your data source metadata is cached, you no longer need to be connected to the source whilst working with source metadata. That’s a great feature for companies with high risk data, and for people who like to take their work home with them 🙂

Ajilius. More flexible metadata.

 

Multi-generational upgrades

Lately we’ve found that our pace of delivery has outstripped the ability of some users to keep up with upgrades.

We have been expecting users to apply each upgrade as it is issued. In practice, that hasn’t always been possible. Emails might have been missed, user priorities might have been elsewhere, and it has sometimes led to a situation where support was needed to work through the issues of upgrading multiple generations at a time.

We also had a problem this week with a customer who restored a metadata repository from a backup that was several months old. This meant that their metadata was out of step with their current application and repositories.

We’re fixing that problem this week. Release 2.2.14 will bring a new method of versioning upgrades. The version of the repository will be recorded, and the upgrade patches will include multiple generations of upgrade history.

On applying a patch to Ajilius, any repository that is not at the current level will be upgraded, even if every repository is at a different release.

Only one patch will be required to bring your metadata up to the latest version, even if it is several generations old. Further, any metadata that has been restored from a backup will also be brought up to date, simply by rerunning the upgrade process.

Ajilius. Flexible Upgrades.

Enhancement: MongoDB Data Source

Release 2.2.12 includes support for MongoDB as a data source.

MongoDB is a popular NoSQL DBMS, and Ajilius is the first data warehouse automation platform to support MongoDB as a first-class citizen.

Want proof? Here’s a shot of Ajilius displaying data from the MongoDB hosted TCPH database:

mongodb2

 

More? Here’s Ajilius showing data from the Restaurants sample database:

mongodb3

Ajilius makes it easy to work with MongoDB, as simple as working with any other data source we support.

Ajilius. MongoDB to your data warehouse.