Azure SQL or Azure SQL Data Warehouse?

Customers committed to the Microsoft stack often ask which product they should target for their cloud data warehouse – Azure SQL or Azure SQL Data Warehouse.

Today I heard a Senior Program Manager from the ASDW team give the following answer:

  • If your data is measured in gigabytes, or your fact tables have less than 100 million rows, we’re probably not the right solution

I also read an interesting article the other day, that suggested 70% of analytics data sets – the data normally found in a data warehouse – are less than 1Tb. Yes, the article is a little old, but that 70% figure was at the 500Gb mark, which meant it could double and still hold true today.

https://www.kdnuggets.com/2015/11/big-ram-big-data-size-datasets.html

In other words, the typical SME-scale data warehouse is probably better off on Azure SQL. If you don’t have tens of millions of customers or products, and you don’t transact hundreds of millions of line items per year, you should probably start your data warehouse on Azure SQL.

Ajilius supports both Azure SQL and Azure SQL Data Warehouse, with three-click migration between them. That means you’re not locked in by your choice of database, and you can easily change your mind if your data grows beyond initial expectations.

You can’t go wrong building your Microsoft data warehouse with Ajilius.

 

New Script Form

The Ajilius script forms used to look like this:

Now, in Ajilius 3.5.0, we’ve moved to a tabbed presentation of scripts:

There are three main benefits from this new interface:

  • Scripts are more readable
  • You can choose between a CREATE and ALTER script for a table
  • We’ve got room to show¬†all the scripts for a table, including some in up-coming releases

Please note, if the ALTER tab is empty, it means that the table metadata matches its deployment, and no alteration is required.

Ajilius. Simpler scripting.

Role-based Permissions

Release 3.4.0 of Ajilius brings support for role-based permissions in the data warehouse.

Permissions are granted to schemas by role.

A given role may be granted read, write or read-write access to each schema.

Normal end-users of the data warehouse will only require read access on the DW and REPORT schemas.

A simple list shows the permissions granted by role.

Permissions are maintained through form-based selections.

When scripts are generated for a table, the appropriate permissions are automatically maintained.

Ajilius. Simplifying role-based permissions.

 

Ajilius in 2018

As 2017 draws to a successful close, let’s take a few moments to look at some up-coming changes to Ajilius.

Licensing

You’ll notice from our Pricing page that Ajilius is splitting into two Editions – Standard and Enterprise. Standard is for smaller, SMP data warehouses. Enterprise is for large data organisations wanting the ultimate in scale and performance.

This will provide a lower-cost entry point for small-to-medium organisations, for whom the full benefits of Enterprise Edition may not be realised.

All existing users will be automatically converted to Enterprise licences. As usual, you will pay no more than your initial licence price.

All existing opportunities will reflect our current pricing and licensing. Only opportunities commencing after January 1, 2018, will be affected by the changes.

Features

We’ve also got some major changes coming in V4, to be delivered by phase during 2018. Here’s a brief summary of the highlights:

  • Data Vault 2.0 support, for both Standard and Enterprise Edition.
  • Data Lake support, for Enterprise Edition.
  • Star Schemas built from source, vault or lake.
  • Data Vaults built from source or lake.
  • Data pseudonymisation on ingestion
  • Streaming ingestion
  • Forward chaining to drive near real time fact tables
  • Major enhancements to metadata store, including versioning and copying/cloning.
  • Containerisation, with full support for Docker deployments
  • Support for OpenJDK/JRE as well as Oracle JDK/JRE

Please contact us if you’d like further details of any of these changes.

Ajilius. Striving to be the best in data warehouse automation.

Ajilius Release Numbers

We recently finished an RFP submission with a question that asked:

Describe your version numbering system, and how well your product keeps to it

I thought the answer might be useful to all customers.

Our version numbering system

The Ajilius release number is comprised of three elements – major version, minor version, and point version. For example, in version 3.2.1, the major version is 3, the minor version is 2, and the point version is 1. But what do the various types of version mean?

A major version represents an architectural change. For example, Python to Java was a major version. the CTAS engine was a major version.

A minor version represents a metadata change, or a module-level change. Metadata changes are where we alter the structure of the metadata repositories used by Ajilius. Module changes are where we introduce major blocks of functionality. Incremental deployments were a minor version, SSAS integration was a major version.

Point versions are where we make relatively minor changes to the application code. These may be error corrections, or enhancements that build on existing functions or metadata.

How well do we keep it?

When I look back over the Mercurial logs for Ajilius, I’d have to say that we’ve only been 60-70% compliant in our numbering scheme. We’re always correct with Major versions, always correct with minor versions, but we’ve often let small metadata changes slip through in point versions.

From now on, though, we’re going to be 100% compliant with this structure.

Why bother? Simpler upgrades.

Better adherence to our release structure will help us to simplify our upgrades. We’ll make it faster and easier for you to download and apply new code.

  • We’ll always offer a full installer. This will create a completely new installation.
  • We’ll always offer an upgrade installer. This will upgrade any earlier release to the current release, including metadata structure changes and library dependency updates.
  • Now, we’re going to simplify point releases, as just the Ajilius executable program. Point releases will comprise only ajilius.jar program, currently around 400k in size. This will be faster to download, and more simple to copy to the Ajilius folder.

So when you get release 3.3.2 later this week, you’ll know that it is a code-only change to Release 3.3, and makes no changes to metadata or libraries.

Ajilius. The best development cycle in data warehouse automation.

Ajilius 3.3.0

Our blog has been a little quiet, lately, as we’ve been concentrating our efforts on new development.

That work has culminated in the release of Ajilius 3.3.0.

Besides enhancements, here’s a list of all the new features we’ve delivered since our last blog post:

3.3.0

  • Dimension Hierarchies
  • Measure Aggregates
  • SSAS Tabular Model Generation
  • Cross-database Custom Query Loads

3.2.12

  • DB2 data source driver
  • Per-table lineage analytics

3.2.10

  • Fast, royalty-free Excel data source driver

3.2.8

  • Fast, royalty-free XML data source driver

Ajilius. No other Data Warehouse Automation platform delivers as much new value every month.

New Transform: Persist Static

Ajilius 3.2.7 includes a new type of transform, labelled Persist Static.

The intent of this transform is to persist a table that will not be automatically reloaded.

Ajilius has a dependency-based scheduler. When we process a table, we recursively check and process its dependencies if required. Until now, that meant that ALL tables could potentially be reloaded.

We found this wasteful in cases where the table was relatively static. A business calendar, for example, might be updated once per year by the Finance department. A translation table aligning common data from two systems could be updated only when new products are added. In both examples, under the old process, these table would be reloaded every time a batch that referenced them was run.

Now, however, tables can with the type Persist Static will trim the dependency tree when included in a batch. The latest version of the table will be used, and its dependencies ignored.

A Persist Static table can still be reloaded from the command line. Assume that we have the following chain of tables, and that stage_source is Persist Static.

source.csv -> load_source -> stage_source

Any job that references stage_source as a dependency will use the latest contents of the table.

To reload this table, the following command might be used:

java -jar ajilius.jar -w MyWarehouse -t stage_source

That command will cause Ajilius to start processing with stage_source. It will be found to have a dependency of load_source. A load job will then be triggered to import data from source.csv. On completion of this job, the staging of data from load_source to stage_source will be performed. The table is then completely updated, and the job will terminate.

Ajilius. More effective loads and transforms.

 

Snowflake-Snowflake Data Loads

When we built the Snowflake adapter for Ajilius, around two years ago, we saw Snowflake as a Data Warehouse target.

We’re now seeing many Snowflake customers using its capabilities as a Data Lake as well as a Data Warehouse. Customers are landing their data in one Snowflake database, then migrating and structuring it for analysis and reporting.

We have responded to this shift by implementing cross-database queries for data ingestion by Ajilius.

Previously, customers could extract data from one Snowflake database, then reload it to another. If your Ajilius instance was running on-premise, this carried a significant overhead as the data was extracted, transferred to your Ajilius instance, transferred back to the Snowflake host, and reloaded.

Now, a new load type of “Cross-Database” can be selected for a table. When the table load job is executed, we create a three-part name query instead of an extract/load query.

Here’s how it works.

Create a data source in the normal way. Let’s use the SNOWFLAKE_SAMPLE_DATA that comes with a new account:

After refreshing metadata, we can see the tables and columns from the demonstration database. Now we can load the CALL_CENTER table:

Set the characteristics of the table. The Load Method will default to Bulk Import, which uses the Snowflake PUT/COPY method. Change this to Cross-Database Query:

Now, when you generate scripts for this table, you’ll see that a cross database query is used instead of a PUT/COPY. Note that some columns have been dropped in order to show the full query.

Of course, incremental loads are also supported for cross-database loads.

Ajilius. Faster loads for Snowflake-Snowflake.

 

SQL Server Integrated Authentication

From time to time we get a customer reporting an error like this:

The key to this error is the text “Faled to load the sqljdbc_auth.dll”

This file is required by the Microsoft SQL Server JDBC driver, in order to handle integrated authentication.

The file must be placed in the Java bin folder. Upgrades to Java can change the location of the Java in folder, by changing the version number higher up the path. This means that the file can no longer be found.

Here are some instructions to resolve this error if it occurs on your server.

Instructions

Download the SQL Server JDBC driver from https://www.microsoft.com/en-au/download/details.aspx?id=11774. Make sure you get the 64-bit version.

Install the driver to any location. It doesn’t matter where, it basically just unzips a file structure and you’ll only be taking one file from its directory.

Go to the folder to which you installed the driver, and find the following path:

\sqljdbc_6.0\enu\auth\x64

Copy the file sqljdbc_auth.dll from this location

Find the location of your Java Runtime Environment using the WHERE command:

where java

It should be something like this:

C:\Program Files\Java\jdk1.8.0_121\bin\java.exe

Copy the sqljdbc_auth.dll into the java folder. Given the result I’ve just shown, on my machine I would copy it to:

C:\Program Files\Java\jdk1.8.0_121\bin\sqljdbc_auth.dll

Now, when you restart the Ajilius application, integrated authentication should work on your machine.

Custom Code Injection

Ajilius has always supported a rich set of transformations and custom calculations to support complex ELT. Occasionally, however, we’ve had requests for custom code injection before or after ELT transformations. That request is now satisfied.

All Load and Transform jobs now support blocks of user-defined code at the beginning and end of the job.

Code injected into Load jobs is in the Groovy scripting language, while code injected into Transform jobs is pure SQL.

To add custom code to a job, select the Change menu item for the table being processed:

The table form now contains two new fields, named Custom Code – Pre-SQL and Custom Code – Post-SQL. These fields contain any legitimate SQL statements.

When you run the transform script for this table, the custom code will be injected into the appropriate place in the script:

Please note that custom code may not be supported across target data warehouse platforms, and may need to be reviewed and edited after migration.

Ajilius. Round-trip customisation.