Continuous Deployment with Ajilius

Many data warehouse automation products are good at single database development, but struggle when asked to cope with multiple environments such as Dev, Test, SIT, UAT and Production. When you’re striving to be Agile, you need the ability to handle CI / CD in any environment, in any state.

Release 3.2 of Ajilius brings full CI / CD to data warehouse automation.

Within the browser UI, we now support immediate generation of alignment scripts that preserve data across multiple generations of deployment. The following picture shows the new UI menu options for deployments, as well as an alignment script in the left-hand pane.

We’ve also extended the Ajilius command line to support scripted deployments across multiple environments. The following screen shot shows the initial deployment of a data warehouse to a new server.

We’ve then modified the structure of one table, and added a new table, through the browser UI. The second command line invocation alters the data warehouse to align with the new table structure.

But what happens when you have multiple environments, in multiple stages of deployment? Ajilius persists the current deployment state for each environment, and has IMPORT and EXPORT functions to reload that state before alignment. This means, for example, that your DEV, UAT and PROD environments can be progressively deployed in different states.

Because these features are implemented on the command line, as well as in the UI, Ajilius supports full automation of the CI / CD process. For example, our own internal CI / CD test server uses CRON to implement an hourly deployment, based on any changes to the metadata in the last hour. Of course, we fully integrate with products like GoCD, Control-M, etc. for enterprise-level automation.

The metadata changes that can trigger an alignment include:

  • Add table or column
  • Rename table or column
  • Change column data type
  • Change column role (ie, SCD2 to SCD0)
  • Change table distribution method
  • Change column sequence

And remember, every change preserves your existing data. We even fully re-key fact tables if an SCD2 dimension changes to any other type.

Ajilius. Enterprise maturity in data warehouse CI / CD.

DWA: Data Warehouse Abandonware?

Frequency of updates is a good indication of the on-going viability of a data warehouse automation platform.

From what we see and hear, too many competitors are good at getting their products released, but not so good at error correction and enhancement.

This came to mind while I was packaging our latest update – huge changes in incremental deployment – and looked at the following:

That’s the releases we’ve made during just the past four months.

  • Every update is cumulative, you don’t have to apply them all
  • Every update contains one or more new features, usually driven directly by customer requests
  • Most updates contain performance enhancements
  • Many updates contain code enhancements and corrections to errors reported by users

Here’s what a couple of weeks looks like in Mercurial:

All our release planning is done through a Trello board. At present it has a backlog of around 18 months for the current development team. New target platforms. New data sources. New transformations. New BI interfaces. New methodology support. New features we’re not yet ready to discuss. We’re going to continue, or even accelerate the rate of new feature deployment over the next  year.

The one thing that won’t change is our licence policy. We’ll continue to be a site-licensed product, with no lock-ins, including the guarantee that your data warehouse will continue to function even if you don’t renew your Ajilius licence, and we’ll never increase your annual renewal for as long as you use Ajilius.

Ajilius. USD 15k / site. The best value in data warehouse automation.

 

Documentation belongs …

… IN THE CODE.

I recently had to review some old ETL code where the developer was a firm believer in documentation. Unfortunately, his belief was that documentation belonged in Confluence.

Not that I’ve got anything against Confluence, it is a good product, but I got very tired and very grumpy about having to look up page after wiki page to figure out what was in his twisted little brain when the code was written.

Ajilius saves developers from having to search in unreasonable places for documentation. Not only do we capture both user and technical documentation for all artifacts, but we put technical document right were it belongs … IN THE CODE.

Here’s a screen shot of the technical documentation panel for a table:

… and here’s how it surfaces in code, neatly translated from HTML to text:

Ajilius. Better data warehouses … and better documentation.

Ajilius 3: Adaptive ELT Performance

The new Adaptive ELT feature in Ajilius 3 uses intelligent algorithm selection at runtime to optimise ELT performance.

Combined with the new SQL+Scripts architecture for code generation and execution, we’re seeing orders of magnitude improvement in some use cases.

Here’s a sample of batch output from the standard Ajilius demonstration:

The first two jobs – stage_location and dim_location – used to take around 0.06 and 0.08 seconds to run. Now they’re talking less than .001 seconds.

They’re trivially small tables, but we’re using them as an example because you can replicate the experience in an Ajilius demonstration.

Not every job will benefit to this extent, it depends on factors like the algorithm, and the number of rows in the source and target tables, but every job that processes different data on each run will be faster.

Ajilius. Faster.

 

Ajilius 3: SQL + Scripts

In Ajilius 1, we used Python as a scripting language. Ajilius 2 switched to Java and Groovy. We generated scripts that mixed the scripting language and SQL to emulate stored procedures.

We got pushback on this approach from a number of prospects, probably because we didn’t always do a good job of explaining the relationship between the scripting language and the SQL. Prospects saw the Groovy syntax, for example, and thought that they would have to learn a new language to build their data warehouse.

Looking back, we can see why.

Here is an Ajilius 2 script:

Here is the Ajilius 3 script for the same table:

As you can see, the SQL script in our new version is immediately familiar to a data warehouse developer.

This isn’t to say that we’re abandoning scripting, because Ajilius 3 introduces this feature as SQL + Scripts.

You’ll still have the option to export and package the SQL for execution as Groovy scripts, Stored Procedures (where supported), PowerShell scripts and more. The difference to earlier versions is that this is an option, not the standard.

An added bonus from this approach is that scripts run through the Ajilius scheduler no longer have to go through the overhead of instantiating the scripting engine and loading/compiling the script before execution. As a result, you get much faster execution of your transformation processes.

Ajilius. Listening to customer feedback to make us better.

 

Take the SAS language challenge

Can your data warehouse automation solution correctly handle international character sets?

I was reading an interesting blog post from SAS at the weekend. It shows how they visualise data containing character sets from many languages around the world.

http://blogs.sas.com/content/sastraining/2017/05/12/map-of-idioms-from-around-the-world/

The blog discuss loading and visualising data that looks like this:

Given that the data is contained in a spreadsheet, I thought it would be fun to see how that data was handled by Ajilius.

Create a data source that points to the spreadsheet:

Ingest the metadata from the spreadsheet:

Run the load script, and here is the result:

There it is. Fast, full-fidelity representation of the SAS data. If you’re building a data warehouse in Korea, or Armenia, or Israel, or any other country with a non-ASCII language, then Ajilius is your best solution for data warehouse automation.

Ajilius. Data warehouse automation for the global community.

 

What’s In Ajilius 3

We’ve been working on the next major release for Ajilius for several months now, and it is time to let you know what we’ll deliver starting next month.

Adaptive ELT

Q: “What is the fastest algorithm to load a dimension?”

A: “It depends.”

That’s right, the best algorithm for a load process depends on your platform and your data. MPP platforms prefer CTAS algorithms, SMP may do better with UPDATE algorithms for low volatility tables.

What we often see, though, is that design decisions made during development are no longer valid when the DW has been in production for several months. Assuming good design choices in the first place, the most common reason is that the volume and patterns of data in production are not matching those that drove the initial design decision.

Ajilius already supports a choice of CTAS or UPDATE algorithms depending on the platform, but in Ajilius 3 we’re taking an industry-leading leap forward and introducing a dynamic choice of algorithm at run time. We’re calling it Adaptive ELT.

Based on the combination of platform and data, we’ll make an intelligent choice about the best algorithm to use for each and every batch run. Does the data suggest you’ll be updating the majority of rows in a billion row table? We’ll generate CTAS for that batch. Are you only updating a few  rows on the next run? We’ll use an UPDATE statement.

No other ETL or ELT platform gives you this flexibility. It is like having your developers rewrite and tune their code, each night, with the data for that night’s batch driving their decisions. You’ll get an expert ELT system that dynamically monitors and adjusts your processing for maximum performance, on every single run.

SQL+Script

Until now, Ajilius has generated scripts for ELT processing. Ajilius 1 generated Python, Ajilius 2 generates Groovy, Ajilius 3 is throwing away the single script constraints.

In Ajilius 3 we’ve separated generation of SQL from generation of the wrapping script. We’re calling this SQL+Script, because we’ll be generating SQL for every transform, supported by a wrapping Script of a type requested by the user.

A new template option will enable the generated SQL code that effects transformations to be wrapped in multiple scripting languages. We’ll continue to support Groovy as our primary scripting option, but here are just some of the options that will be available in Ajilius 3:

  • Plain SQL. Every script will be able to be generated as just SQL statements in the target platform dialect. This will delight customers who want to use Ajilius in a hybrid environment with existing data warehouses.
  • Stored Procedures. SQL Scripts will be wrapped in stored procedure creation scripts for customers who prefer to use tools like SQL Agent and SSIS for job execution.
  • PowerShell. The rise of PowerShell as an automation language, and its growing popularity in CI/CD environments, makes it a great option for an Ajilius target. We’ll generate wrappers that emulate our current Groovy scripts, but using PowerShell as the scripting language.

BEAM Support

Being great fans of Lawrence Corr’s Agile Data Warehouse Design methodology, we’re adding BEAM support to Ajilius.

Our new “Model” module will implement the BEAM methodology. Here’s what you’ll see in the first release:

  1. Capture and presentation of 7-Ws
  2. Generation of template fact/s and dimensions from Ws
  3. Mapping of staging tables to fact/s and dimensions
  4. BEAM-compatible short codes for all object types.
  5. Incorporation of BEAM artefacts into Ajilius-generated documentation

Ajilius currently works “left-to-right”, in that we start from data sources and eventually transform to facts and dimension. We’ll keep that capability in Ajilius 3, but we’ll also provide “right-to-left” model driven implementations, where you’ll have the option to create facts and dimensions first, then link them back to transformations. In other words, we’ll support model-driven, data-driven and hybrid approaches to building a dimensional data warehouse.

All of this is an extension to the current Ajilius. You’ll be able to retrofit BEAM to an existing Ajilius DW, as well as integrate new BEAM-designed stars in to a conformed set of existing dimensions.

Once again, Ajilius is demonstrating innovation and leadership in data warehouse automation, being the first to directly integrate BEAM into the automation process.

The Fine Print

We’ll be rolling out the first release of Ajilius 3 some time before June 30.

The Ajilius 3 upgrade will preserve all your existing metadata, but we’ll be making some adjustments to the metadata structure that will break the incremental update flow. In other words, we’ll be expecting all users to upgrade to 3, then restarting incremental upgrades from that point.

We will continue to support Ajilius 2 for six months from the date of release of Ajilius 3, in order to give users time to evaluate and test the migration process.

There is, of course, no additional charge to users for Ajilius 3, all upgrades are covered for the life of your license agreement.

We hope you’re as excited as we are by these changes!

Ajilius. Committed to innovation in data warehouse automation.

 

Three-Click Migration

A technical sales manager at Microsoft recently argued that our claim of three-click migration between supported platforms was impossible. “It can’t be done,” he argued, “they’re completely different syntax and architectures.”

So we demonstrated it.

We took a data warehouse we’d built on AWS Redshift.

We ported it to Azure SQL Data Warehouse in three clicks.

We demonstrated the differences in the code that was generated for each platform, and the best practices it entailed.

We then ported the Azure SQL Data Warehouse version to SQL Server on Linux, in three more clicks, as a low-cost development and test platform where the results could be automatically deployed back to Azure SQL Data Warehouse.

Again, we demonstrated how it was done in code.

Finally, we added tables from Dynamics365 and SharePoint Online to the data warehouse on SQL Server, and deployed it back to Azure SQL Data Warehouse.

“OK,” he said, “I’m convinced.”

Ajilius. We walk what we talk.

Faster CRM Reporting

One thing we’ve noticed from our CRM integrations with Salesforce, Dynamics and NetSuite is that their APIs are glacially slow. If your reporting solution gets its data from an application interface rather than a database, your reports could take minutes to complete rather than seconds.

That isn’t surprising, an API gets in the way between your report and the database, but just how slow has been a real eye opener. We’ve seen data volumes that would move in less than a second on a direct database read take 5-6 minutes to retrieve from an application interface.

You might think that one report taking 6 minutes instead of a second or two isn’t too bad, but what if there are 100 reports to be run? Or you need to surface that report on an interactive dashboard?

Not only are there performance issues, but some APIs also have concurrency and volume limits. You might not be able to run more than two reports at once, and if you have a large group of users, their report queries may exceed the extract limits imposed by your CRM provider.

The solution is a data warehouse, where you extract the data from your application API just once, then all your reports interact with a faster database. You can integrate your CRM data with other systems in your organisation; and optimise the data structure into facts and dimensions, the best solution for reporting and analytics.

Here’s what it might look like from a time perspective:

Keep in mind, you also get the benefit of fast, interactive analytics and visualisation, for the entire day. Interactive dashboards are a reality, not a dream.

With our pre-built interfaces to Salesforce, Dynamics365 and NetSuite (and more coming soon), Ajilius makes it fast and simple to integrate your CRM into a data warehouse.

Ajilius profiles your data sources, screens your data to ensure its quality, ingests it into your data warehouse, transforms it into facts and dimensions, and creates direct integrations in business terminology with great products like Yellowfin, Tableau, Qlik, PowerBI and Excel. We’re the best value solution in data warehouse automation.

Ajilius. Faster CRM reporting.

Terminal Emulators

When your software lives in the cloud, you get used to the world of terminal emulators. We run over 50 cloud servers on 5 different cloud platforms, and the right terminal emulator is a real boost to productivity.

When I was using OSX as my development platform, I used iterm2 as my terminal. Far better than the standard Mac terminal program, it offers features like paned windows, search, autocomplete, hotkeys, and more.

On switching back to Windows I was stumped. Putty was the most often recommended terminal, but it just felt ancient after iterm2, and I started looking for alternatives. After much searching, I found the wonder that is ZOC Terminal.

Tabbed windows! 3270! 5250! Windows and OSX! This product isn’t just a Putty replacement, but Attachmate as well.

The file transfer feature means that I never have to scratch my head trying to remember SCP syntax. There’s a full scripting language built-in, that we’re using to automate some of our automated deployment processes. Think of something you want to do in a terminal session, and it is probably built into the extensive feature list.

ZOC Terminal is a paid, commercial product. Its productivity and usability gains are worth every cent.

Strongly recommended, if you’re deploying Ajilius on a cloud / Linux environment.