Documentation belongs …

… IN THE CODE.

I recently had to review some old ETL code where the developer was a firm believer in documentation. Unfortunately, his belief was that documentation belonged in Confluence.

Not that I’ve got anything against Confluence; it is a good product. But I got very tired and very grumpy about having to look up wiki page after wiki page to figure out what was in his twisted little brain when the code was written.

Ajilius saves developers from having to search in unreasonable places for documentation. Not only do we capture both user and technical documentation for all artifacts, but we put the technical documentation right where it belongs … IN THE CODE.

Here’s a screen shot of the technical documentation panel for a table:

… and here’s how it surfaces in code, neatly translated from HTML to text:
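For readers who want a feel for that translation step, here is a minimal sketch in Python. It is not the Ajilius implementation: the function name, the list of block-level tags and the `-- ` comment prefix are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, starting a new line at block-level tags."""

    # Assumed set of tags that should start a new line in the output.
    BLOCK_TAGS = {"p", "br", "li", "div", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_sql_comment(html_doc):
    """Render HTML documentation as '-- ' prefixed SQL comment lines."""
    parser = _TextExtractor()
    parser.feed(html_doc)
    parser.close()
    text = "".join(parser.parts)
    lines = [line.strip() for line in text.splitlines()]
    # Drop blank lines so the comment header stays compact.
    return "\n".join("-- " + line for line in lines if line)


doc = "<p>Location dimension.</p><ul><li>Source: CRM</li><li>Type 1 SCD</li></ul>"
print(html_to_sql_comment(doc))
```

The resulting comment block can be written at the top of a generated script, so the documentation travels with the code.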

Ajilius. Better data warehouses … and better documentation.

Ajilius 3: Adaptive ELT Performance

The new Adaptive ELT feature in Ajilius 3 uses intelligent algorithm selection at runtime to optimise ELT performance.

Combined with the new SQL+Scripts architecture for code generation and execution, we’re seeing orders of magnitude improvement in some use cases.

Here’s a sample of batch output from the standard Ajilius demonstration:

The first two jobs – stage_location and dim_location – used to take around 0.06 and 0.08 seconds to run. Now they’re taking less than 0.001 seconds.

They’re trivially small tables, but we’re using them as an example because you can replicate the experience in an Ajilius demonstration.

Not every job will benefit to this extent. It depends on factors like the algorithm and the number of rows in the source and target tables, but every job that processes different data on each run will be faster.

Ajilius. Faster.

 

Ajilius 3: SQL + Scripts

In Ajilius 1, we used Python as a scripting language. Ajilius 2 switched to Java and Groovy. We generated scripts that mixed the scripting language and SQL to emulate stored procedures.

We got pushback on this approach from a number of prospects, probably because we didn’t always do a good job of explaining the relationship between the scripting language and the SQL. Prospects saw the Groovy syntax, for example, and thought that they would have to learn a new language to build their data warehouse.

Looking back, we can see why.

Here is an Ajilius 2 script:

Here is the Ajilius 3 script for the same table:

As you can see, the SQL script in our new version is immediately familiar to a data warehouse developer.

This isn’t to say that we’re abandoning scripting, because Ajilius 3 introduces this feature as SQL + Scripts.

You’ll still have the option to export and package the SQL for execution as Groovy scripts, Stored Procedures (where supported), PowerShell scripts and more. The difference to earlier versions is that this is an option, not the standard.

An added bonus from this approach is that scripts run through the Ajilius scheduler no longer have to go through the overhead of instantiating the scripting engine and loading/compiling the script before execution. As a result, you get much faster execution of your transformation processes.

Ajilius. Listening to customer feedback to make us better.

 

Take the SAS language challenge

Can your data warehouse automation solution correctly handle international character sets?

I was reading an interesting blog post from SAS at the weekend. It shows how they visualise data containing character sets from many languages around the world.

http://blogs.sas.com/content/sastraining/2017/05/12/map-of-idioms-from-around-the-world/

The blog post discusses loading and visualising data that looks like this:

Given that the data is contained in a spreadsheet, I thought it would be fun to see how that data was handled by Ajilius.

Create a data source that points to the spreadsheet:

Ingest the metadata from the spreadsheet:

Run the load script, and here is the result:

There it is. Fast, full-fidelity representation of the SAS data. If you’re building a data warehouse in Korea, or Armenia, or Israel, or any other country with a non-ASCII language, then Ajilius is your best solution for data warehouse automation.
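If you want to run a similar check against your own tooling, the test boils down to loading text and reading it back unchanged. Here is a minimal sketch using Python and an in-memory SQLite database. The table name and the sample greetings are assumptions for illustration; they are not rows from the SAS spreadsheet, and this is not how Ajilius performs its load.

```python
import sqlite3

# Illustrative stand-ins only, not data from the SAS spreadsheet.
SAMPLES = [
    ("Korean", "안녕하세요"),
    ("Armenian", "Բարեւ"),
    ("Hebrew", "שלום"),
]


def round_trip(rows):
    """Load rows into an in-memory staging table and read them back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE stage_idiom (language TEXT, phrase TEXT)")
        conn.executemany("INSERT INTO stage_idiom VALUES (?, ?)", rows)
        return conn.execute(
            "SELECT language, phrase FROM stage_idiom ORDER BY rowid"
        ).fetchall()
    finally:
        conn.close()


loaded = round_trip(SAMPLES)
# Full fidelity means every character survives the load unchanged.
assert loaded == SAMPLES, "text changed during load"
```

Swap the in-memory database for your own target platform and the same assertion tells you whether characters are being lost or mangled along the way.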

Ajilius. Data warehouse automation for the global community.

 

What’s In Ajilius 3

We’ve been working on the next major release for Ajilius for several months now, and it is time to let you know what we’ll deliver starting next month.

Adaptive ELT

Q: “What is the fastest algorithm to load a dimension?”

A: “It depends.”

That’s right, the best algorithm for a load process depends on your platform and your data. MPP platforms prefer CTAS (CREATE TABLE AS SELECT) algorithms, while SMP platforms may do better with UPDATE algorithms for low-volatility tables.

What we often see, though, is that design decisions made during development are no longer valid when the DW has been in production for several months. Assuming good design choices in the first place, the most common reason is that the volume and patterns of data in production do not match those that drove the initial design decision.

Ajilius already supports a choice of CTAS or UPDATE algorithms depending on the platform, but in Ajilius 3 we’re taking an industry-leading leap forward and introducing a dynamic choice of algorithm at run time. We’re calling it Adaptive ELT.

Based on the combination of platform and data, we’ll make an intelligent choice about the best algorithm to use for each and every batch run. Does the data suggest you’ll be updating the majority of rows in a billion row table? We’ll generate CTAS for that batch. Are you only updating a few rows on the next run? We’ll use an UPDATE statement.
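To make the idea concrete, here is a minimal sketch of that kind of decision in Python. The function name, the platform labels and the threshold values are hypothetical; the real selection rules in Ajilius 3 are not described here and are likely to weigh more factors.

```python
# Hypothetical cut-over points: the fraction of target rows changed above
# which a full rebuild (CTAS) is preferred. These are not Ajilius values.
CTAS_THRESHOLD = {"MPP": 0.10, "SMP": 0.50}


def choose_load_algorithm(platform, changed_rows, total_rows):
    """Pick 'CTAS' or 'UPDATE' for one batch run of one table."""
    if changed_rows < 0 or total_rows < 0:
        raise ValueError("row counts must be non-negative")
    if total_rows == 0:
        # Nothing to update in place, so build the table from scratch.
        return "CTAS"
    volatility = changed_rows / total_rows
    return "CTAS" if volatility > CTAS_THRESHOLD[platform] else "UPDATE"


# Majority of a billion-row table changing: rebuild it.
print(choose_load_algorithm("SMP", 600_000_000, 1_000_000_000))
# Only a few rows changing: update in place.
print(choose_load_algorithm("SMP", 12, 1_000_000_000))
```

The point of the sketch is that the choice is made per batch from that batch's row counts, rather than being fixed once at design time.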

No other ETL or ELT platform gives you this flexibility. It is like having your developers rewrite and tune their code, each night, with the data for that night’s batch driving their decisions. You’ll get an expert ELT system that dynamically monitors and adjusts your processing for maximum performance, on every single run.

SQL+Script

Until now, Ajilius has generated scripts for ELT processing. Ajilius 1 generated Python and Ajilius 2 generates Groovy. Ajilius 3 throws away the single-script constraint.

In Ajilius 3 we’ve separated generation of SQL from generation of the wrapping script. We’re calling this SQL+Script, because we’ll be generating SQL for every transform, supported by a wrapping Script of a type requested by the user.

A new template option will enable the generated SQL code that effects transformations to be wrapped in multiple scripting languages. We’ll continue to support Groovy as our primary scripting option, but here are just some of the options that will be available in Ajilius 3:

  • Plain SQL. Every script will be able to be generated as just SQL statements in the target platform dialect. This will delight customers who want to use Ajilius in a hybrid environment with existing data warehouses.
  • Stored Procedures. SQL Scripts will be wrapped in stored procedure creation scripts for customers who prefer to use tools like SQL Agent and SSIS for job execution.
  • PowerShell. The rise of PowerShell as an automation language, and its growing popularity in CI/CD environments, makes it a great option for an Ajilius target. We’ll generate wrappers that emulate our current Groovy scripts, but using PowerShell as the scripting language.
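The separation of SQL from its wrapper can be pictured as one SQL body dropped into different templates. The sketch below is an assumption-laden illustration in Python, not the Ajilius template engine: the template texts, the `wrap_sql` function and the `DW_SERVER` environment variable are all hypothetical. The stored procedure form uses T-SQL style syntax, and the PowerShell form assumes the SqlServer module's `Invoke-Sqlcmd` cmdlet is installed.

```python
from string import Template

# Hypothetical wrapper templates keyed by target type.
# In string.Template, "$$" produces a literal "$" in the output.
WRAPPERS = {
    "sql": Template("$sql\n"),
    "stored_procedure": Template(
        "CREATE PROCEDURE $name AS\nBEGIN\n$sql\nEND;\n"
    ),
    "powershell": Template(
        "# Assumes the SqlServer module's Invoke-Sqlcmd cmdlet is available.\n"
        "$$query = @'\n$sql\n'@\n"
        "Invoke-Sqlcmd -ServerInstance $$env:DW_SERVER -Query $$query\n"
    ),
}


def wrap_sql(name, sql, target="sql"):
    """Wrap one generated SQL statement in the requested script type."""
    return WRAPPERS[target].substitute(name=name, sql=sql.strip())


sql = "UPDATE dim_location SET location_name = 'Sydney' WHERE location_id = 1;"
for target in WRAPPERS:
    print(wrap_sql("load_dim_location", sql, target))
```

Because the SQL is generated once and the wrapper is chosen afterwards, adding another target language only means adding another template.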

BEAM Support

Being great fans of Lawrence Corr’s Agile Data Warehouse Design methodology, we’re adding BEAM support to Ajilius.

Our new “Model” module will implement the BEAM methodology. Here’s what you’ll see in the first release:

  1. Capture and presentation of 7-Ws
  2. Generation of template fact/s and dimensions from Ws
  3. Mapping of staging tables to fact/s and dimensions
  4. BEAM-compatible short codes for all object types
  5. Incorporation of BEAM artefacts into Ajilius-generated documentation
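As a rough illustration of the second item, the sketch below turns a set of captured Ws into a template star in Python. It reflects one reading of BEAM, in which "how many" details become measures on the fact and the other Ws become candidate dimensions. The function name and the `fact_`/`dim_` prefixes are assumptions, not Ajilius naming rules.

```python
SEVEN_WS = ("who", "what", "when", "where", "how_many", "why", "how")


def template_star(event, details):
    """Build a template fact and dimensions from (w, name) pairs."""
    unknown = {w for w, _ in details} - set(SEVEN_WS)
    if unknown:
        raise ValueError(f"not one of the 7 Ws: {sorted(unknown)}")
    # "How many" details are treated as measures; the rest as dimensions.
    dimensions = [f"dim_{name}" for w, name in details if w != "how_many"]
    measures = [name for w, name in details if w == "how_many"]
    return {
        "fact": f"fact_{event}",
        "dimensions": dimensions,
        "measures": measures,
    }


star = template_star(
    "order",
    [
        ("who", "customer"),
        ("what", "product"),
        ("when", "order_date"),
        ("how_many", "quantity"),
    ],
)
print(star)
```

The generated names are only starting points; mapping staging tables onto them is a separate step.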

Ajilius currently works “left-to-right”, in that we start from data sources and eventually transform to facts and dimensions. We’ll keep that capability in Ajilius 3, but we’ll also provide “right-to-left” model-driven implementations, where you’ll have the option to create facts and dimensions first, then link them back to transformations. In other words, we’ll support model-driven, data-driven and hybrid approaches to building a dimensional data warehouse.

All of this is an extension to the current Ajilius. You’ll be able to retrofit BEAM to an existing Ajilius DW, as well as integrate new BEAM-designed stars into a conformed set of existing dimensions.

Once again, Ajilius is demonstrating innovation and leadership in data warehouse automation, being the first to directly integrate BEAM into the automation process.

The Fine Print

We’ll be rolling out the first release of Ajilius 3 some time before June 30.

The Ajilius 3 upgrade will preserve all your existing metadata, but we’ll be making some adjustments to the metadata structure that will break the incremental update flow. In other words, we’ll expect all users to upgrade to Ajilius 3, then restart incremental upgrades from that point.

We will continue to support Ajilius 2 for six months from the date of release of Ajilius 3, in order to give users time to evaluate and test the migration process.

There is, of course, no additional charge to users for Ajilius 3. All upgrades are covered for the life of your license agreement.

We hope you’re as excited as we are by these changes!

Ajilius. Committed to innovation in data warehouse automation.