Ajilius & Java

We’ve been a little quiet, lately.

The reason is that, for the last few months, we’ve had a team of people working across three countries to complete something exciting, a new version of Ajilius that shifts us from a Python platform to Java. For the last month in particular, we’ve been on an incredible sprint to deliver the Java version by the end of May.

This wasn’t an easy decision to make, as Python was critical to our first year of operation. We couldn’t have built Ajilius as quickly as we did on another product, and we grew to like Python as a development language.

Sadly, though, we found two problems with Python – distribution and database drivers.

Distribution management for Python is handled by a utility called PIP. Our installation scripts used PIP to acquire (and update) our library dependencies. Unfortunately, PIP struggled behind firewalls, and a number of corporate and government customers were unwilling to make the changes necessary for it to work.

That meant we had to bundle dependencies ourselves, which became a nightmare of dependencies-on-dependencies, and a maintenance and distribution headache for our team. If you’ve heard of DLL-hell in the Windows world, Python’s site-packages hell is at least three layers closer to Lucifer himself.

The second problem was database drivers. While Snowflake released a great quality Python driver for their DBMS, most other databases are supported by the Python community, which means that features can lag well behind current vendor releases. A great example is SQL Server, where Python’s interfaces are completely variable in their support. One doesn’t support Unicode. One doesn’t support Integrated Authentication. None support AlwaysOn intents. None support the Bulk Copy API. It wasn’t just a SQL Server problem, we found that as we supported more and more databases, the issues of finding current drivers became a support nightmare.

So both of these problems meant we needed an alternative. After much searching and testing, we settled on the combination of Java and JDBC, and after several man years of effort we finally have a Java version ready for beta testing.

Java solves the packaging and distribution problem because we can distribute Ajilius with the library JAR files already set up. Once you have installed Java, our installation process is as simple as unzipping a folder into a known location. Maintaining current versions of library dependencies is as simple as copying a file. Further, we no longer have to go through a platform-dependent compilation and packaging step as we did with our Python version, the code and packaging is identical across our supported platforms of Windows, Linux and OSX.

Java and JDBC have also enabled us to solve our database driver problem. Every mainstream DBMS has a vendor-supported JDBC driver available, we no longer have to test multiple community drivers then make compromises on the availability of features vs. the driver quality. Even better, we now have drivers available for popular application platforms and web services including CRM, Marketing, Accounting, Collaboration, Social Networking and more. We’ll be talking a lot more about this in coming posts.

So what is the benefit for you, the customer and partner?

Firstly, you get a much faster installation experience, that is more familiar to your operations team. Install Java, copy the Ajilius folder, and you’re done. It can even be fully automated for internal software packaging.

Secondly, updates are now much easier to deploy. You’ll get more features, faster.

Finally, the incredible breadth of JDBC drivers means that we’ll have connectors for more data sources than you can possibly imagine, both cloud and on-premise.

Our demonstration servers have now been switched to Java, and we’ll be ready for on-premise evaluations by the end of the week. Migration packages for existing customers will follow by the end of the month.

Now, we move on with the development of the huge list of features that will be coming to Ajilius in the next twelve months. Hang on, it is going to be one heck of a ride!


Version 2.1c – Snowflake TIME Data Type

On Wednesday night we got notification of the latest changes in Snowflake Elastic Data Warehouse. Of particular interest was the implementation of a TIME data type. Here we are on Friday night, and we’re releasing support for this latest feature.

We were one of the customers to request this data type early in the Snowflake development process, so we’re delighted to see it implemented.

Having a TIME data type makes it easy to implement dimensions that incorporate concepts like business/after hours, and morning/lunch/afternoon periods. Want to know the blocks of time in which most of your transactions occur across the week? You need a time dimension.

Without a TIME data type, Ajilius represents time values using a character string in the format HH:MM:SS. This works for building the star schema, but it doesn’t support analytic queries involving time arithmetic. Now we can store and calculate time values in Snowflake data warehouses.

The installer package for V2.1c also updates the Snowflake connector to the latest version.

Ajilius. Keeping up with Snowflake.


Version 2.1 – Data Quality Automation

The hero feature of Ajilius 2.1 is Data Quality Automation.

This is yet another unique feature brought to data warehouse automation by Ajilius.

In the 2.1 release, Ajilius adds three types of data quality screens to the extract process:

  • Data type validation, where values are tested for conformance to the column data type.
  • Range validation, where values are tested for set and range boundaries.
  • Regex validation, where values are tested against regex regular expressions.

In Version 2.3 (due September 2016) we will be adding Lookup validation to data quality rules, to check the existence of values in data warehouse tables.

Rows breaking validation are logged to an error file, along with the reason/s for row rejection.

A new return code from the extract job signals that validation errors have occurred, enabling the scheduler to choose to continue or suspend the batch pending user remediation of the errors.

And once again, we’re adding this as a standard feature of the Ajilius platform. If you’re licensed for Ajilius, upgrade to the latest version and you can immediately identify and screen data quality problems before they hit your data warehouse.

Ajilius. Committed to innovation in data warehouse automation.

Version 2.1 – Pivotal Greenplum

greenplumAjilius is pleased to announce full support for Pivotal Greenplum in Version 2.1, available now.

Greenplum is an open source MPP data warehouse, available on-premise and cloud. Based on an earlier version of PostgreSQL, Greenplum will shortly be upgraded to the latest PostgreSQL code base for even faster loads and transformations.

The advantage of Greenplum is that, being open source, it gives anyone the opportunity to use a real MPP data warehouse platform. All you need is hardware, or cloud, with realisable savings of hundreds of thousands of dollars over commercial offerings.

You now have a great, free, scalability path when your workload or data grows to exceed PostgreSQL capabilities. With the unique “3 click migration” feature of Ajilius, you can move your entire data warehouse at any time with just a few clicks of the mouse.

Ajilius. Keeping the value in data warehouse automation.

Release 1.3.11

This release fixes a couple of minor errors, and makes some productivity improvements to existing functions.

Stage Column Documentation
If a column was manually added to a stage table after initial creation, the documentation entered at this point was not updated. This has been fixed.

First Column Calculation / Literal
If a calculated column, or literal, was moved to the first column position in a table, Ajilius would generate invalid transform SQL. This has been fixed.

We’ve made the following changes to existing functions.

Enable Transformations on New Columns
We made an early design decision that transformations were applied by changing a column. When adding a column, we did not know if it was capable of being transformed until after the addition. This means that if you wanted to re-add a previously deleted column, and upper-case it at the same time, that you had to go through two steps – add the column, then change it to add the transformation.
We have now modified the Add Column function to enable you to enter your choice of source-column, role, transform and calculation. Roles and transforms can only be applied to selected source-columns, not to calculations. If both a source-column and a calculation have been entered, the source-column selection will take precedence.

Enable Column Selection on Join Tables
You might recall a post, many months ago, titled Our Join Editor Sucks. Well, Join Editor V2 was a big improvement, but still had some workflow issues. When you joined a table, all columns from that table were copied into the join set. You could delete columns from the column management functions, but users (self included) did not like this interruption to the thought process that typically happens when joining tables.
The new Join Editor enables column selection at the point of join definition. A toggle supports select-all functionality on the list, or you may select individual columns. This is a small improvement when you want all columns from a joined table, but a huge leap forward in productivity when you only want one or two columns from a larger table.
You can still used the column management functions to add, change and delete individual columns from the join set, as well as sort the columns into your preferred sequence in the table.

Registered users will be notified by email of the download location. If you’re reading this and have not received your email, please contact support@ajilius.com.

Release 1.3.10

This release is a quick maintenance update, with a minor enhancement to built-in transforms.

WHERE Test Error
The Test button on the WHERE clause editor may have triggered a fatal error screen. The clause was correctly written, but could not be tested. ELT jobs worked correctly. This error has been corrected.

We’ve made the following changes to existing functions.

New Transforms
We’ve added the following transforms to the Change Column screen:
– Cast to Integer
– Cast to Numeric

Registered users will be notified by email of the download location. If you’re reading this and have not received your email, please contact support@ajilius.com.

Release 1.3.09

The devs have been busy!

This release brings a bunch of new features to make it easier to re-run whole sections of the data warehouse during development and testing, as well as some functions to keep your development environment neat and tidy.

Load | Create All
Create, or drop and re-create, all Load tables.

Load | Process All
Execute the extract, push (for cloud DW), and load processes for all Load tables.

Stage | Create All
Create, or drop and re-create, all Stage tables.

Stage | Process All
Execute the transform processes for all Stage tables.

Dimension | Create All
Create, or drop and re-create, all Dimension tables.

Dimension | Process All
Execute the transform processes for all Dimension tables.

Fact | Create All
Create, or drop and re-create, all Fact tables.

Fact | Process All
Execute the transform processes for all Fact tables.

Warehouse | Clear Scripts
Erases the contents of the SCRIPTS directory.

Warehouse | Clear Extracts
Erases the contents of the EXTRACTS directory.

Warehouse | Drop Tables
This function lists all the tables in your data warehouse. You may check one or more tables, then press the DROP button to remove those tables from your DW database. Use this function with care, once dropped, tables can only be restored from backup.

We’ve also made one small change to the Load Tables list, to bring it into line with the other table displays.

Load | List Load Tables
The column role has been added to the table columns list at the right side of the screen.

There are no metadata changes in this release, and no known errors to be fixed.

Registered users will be notified by email of the download location. If you’re reading this and have not received your email, please contact support@ajilius.com.

Release 1.3.06

We’re now growing fast enough that we need to formalise the release update process.

Each time we publish an update, we’ll post the details of the update in this topic.

This release brings the following user changes:

  • Amazon RDS PostgreSQL as DW.
    We always supported RDS as a data source, now we support it as a first-class target platform as well. This means that you can use PostgreSQL on premise, in the cloud on IAAS, or in the cloud as PAAS. Three-click migration means that you quickly and easily migrate between these platforms, as well as Snowflake, Redshift and Azure SQL Data Warehouse.
  • New Oracle driver offering more simple setup.
    The previous Oracle driver was extremely complex to install, and caused more support issues than any other aspect of installation and configuration. We have now switched to a driver that does not depend on the Oracle client software, with much better results.
  • Optional command line credentials.
    ELT scripts use environment variables to pass credentials and paths. Some users reported difficulty setting environment variables in complex server environments, so we have added the option of command-line credentials to all jobs.
  • Deleting a table or column no longer deletes descendants.
    If a table or column is deleted, we now set the descendant columns to NULL. This leaves downstream tables intact, while enabling you to restructure earlier tables in the ELT flow.
  • A new Fast Map feature to quickly adjust table and column sourcing.
    If you have restructured precedent tables or columns, or changed your mind about the sourcing of a table, you can now quickly update the column sources through this function.
  • A new Fast Delete feature to rapidly delete large numbers of columns.
    You might only require a handful of columns from a set of joined tables. This feature makes it quick and easy to delete all the unwanted columns in one pass, rather than separately deleting each column.
  • Documentation enhancements.
    We have added more obvious references to column roles, transformations and calculations in the data warehouse documentation generated by Ajilius.

All users will receive an email with the new code and instructions. There are no database changes in this release, just copy the executable to your Ajilius directory

Roadmap update

We’ve just updated the Ajilius Roadmap to reflect the imminent release of Version 1.3.

The major features of this release are:

  • Cloud to Cloud Data Warehouse, with interoperable support for Snowflake Elastic DW, Amazon Redshift and Microsoft Azure SQL DW.
  • Upgraded documentation generation, with interactive lineage navigation on all elements.
  • Enhanced support for JSON files and REST services

You might note that the Roadmap shows Azure SQL DW coming in Version 1.4, due at the end of Q4. The code is working, but Microsoft currently has the platform in preview only. We’ll update our level of support once it is released to production.

We’ve also added two new datasheets:

You’ll find all this content on our Bookshelf page.