Azure SQL or Azure SQL Data Warehouse?

Customers committed to the Microsoft stack often ask which product they should target for their cloud data warehouse – Azure SQL or Azure SQL Data Warehouse.

Today I heard a Senior Program Manager from the ASDW team give the following answer:

  • If your data is measured in gigabytes, or your fact tables have less than 100 million rows, we’re probably not the right solution

I also read an interesting article the other day, that suggested 70% of analytics data sets – the data normally found in a data warehouse – are less than 1Tb. Yes, the article is a little old, but that 70% figure was at the 500Gb mark, which meant it could double and still hold true today.

https://www.kdnuggets.com/2015/11/big-ram-big-data-size-datasets.html

In other words, the typical SME-scale data warehouse is probably better off on Azure SQL. If you don’t have tens of millions of customers or products, and you don’t transact hundreds of millions of line items per year, you should probably start your data warehouse on Azure SQL.

Ajilius supports both Azure SQL and Azure SQL Data Warehouse, with three-click migration between them. That means you’re not locked in by your choice of database, and you can easily change your mind if your data grows beyond initial expectations.

You can’t go wrong building your Microsoft data warehouse with Ajilius.

 

Documentation belongs …

… IN THE CODE.

I recently had to review some old ETL code where the developer was a firm believer in documentation. Unfortunately, his belief was that documentation belonged in Confluence.

Not that I’ve got anything against Confluence, it is a good product, but I got very tired and very grumpy about having to look up page after wiki page to figure out what was in his twisted little brain when the code was written.

Ajilius saves developers from having to search in unreasonable places for documentation. Not only do we capture both user and technical documentation for all artifacts, but we put technical document right were it belongs … IN THE CODE.

Here’s a screen shot of the technical documentation panel for a table:

… and here’s how it surfaces in code, neatly translated from HTML to text:

Ajilius. Better data warehouses … and better documentation.

Faster CRM Reporting

One thing we’ve noticed from our CRM integrations with Salesforce, Dynamics and NetSuite is that their APIs are glacially slow. If your reporting solution gets its data from an application interface rather than a database, your reports could take minutes to complete rather than seconds.

That isn’t surprising, an API gets in the way between your report and the database, but just how slow has been a real eye opener. We’ve seen data volumes that would move in less than a second on a direct database read take 5-6 minutes to retrieve from an application interface.

You might think that one report taking 6 minutes instead of a second or two isn’t too bad, but what if there are 100 reports to be run? Or you need to surface that report on an interactive dashboard?

Not only are there performance issues, but some APIs also have concurrency and volume limits. You might not be able to run more than two reports at once, and if you have a large group of users, their report queries may exceed the extract limits imposed by your CRM provider.

The solution is a data warehouse, where you extract the data from your application API just once, then all your reports interact with a faster database. You can integrate your CRM data with other systems in your organisation; and optimise the data structure into facts and dimensions, the best solution for reporting and analytics.

Here’s what it might look like from a time perspective:

Keep in mind, you also get the benefit of fast, interactive analytics and visualisation, for the entire day. Interactive dashboards are a reality, not a dream.

With our pre-built interfaces to Salesforce, Dynamics365 and NetSuite (and more coming soon), Ajilius makes it fast and simple to integrate your CRM into a data warehouse.

Ajilius profiles your data sources, screens your data to ensure its quality, ingests it into your data warehouse, transforms it into facts and dimensions, and creates direct integrations in business terminology with great products like Yellowfin, Tableau, Qlik, PowerBI and Excel. We’re the best value solution in data warehouse automation.

Ajilius. Faster CRM reporting.

Better BI Lineage

Ajilius generates metadata-based models for Yellowfin, Tableau, Qlik and PowerBI. We use the descriptions and notes that you maintain in Ajilius to generate business-friendly models and documentation.

The quality of the model depends on the capabilities of the BI product. Yellowfin and Tableau have a  richer metadata capability than Qlik and PowerBI.

We’ve taken advantage of that metadata capability to add a great new feature to the models for Yellowfin and Tableau – data lineage. Note the last two lines of the highlighted tooltips in the following screenshots:

Not only do we present the documentation you provided for the column, but did you see the last two lines? We track right back through the data warehouse lineage to give you the data sources, tables and columns from which each dimension or measure was sourced.

Now there can be no arguments about the source of data in reports, it is right in the model so that users know exactly where each number and value in their report originated.

Ajilius. Better BI Governance.

Dynamics 365 Data Warehouse

Ajilius 2.4.8 brings full Dynamics 365 integration to your data warehouse.

Connect to Dynamics:

Browse metadata to select your tables and columns:

Profile your data for a better understanding:

Load and blend your data:

Ajilius. Integrating CRM for faster, better reporting and analytics.

Why Data Warehouse Automation kicks the crap out of SSIS

Here’s a great explanation of how to do incremental loads in SSIS. It uses third-party components from Pragmatic Works, and a combination of hand coding and SSIS tasks.

http://blog.pragmaticworks.com/incrementally-loading-data-from-salesforce.com

See how complex it is? How long it takes?

Here’s how to do the same thing in Ajilius:

Simply check the box on the column/s that govern incremental change.

Under the covers we do almost exactly the same tasks as described in the blog post, but the difference is that we do them, not you. You make a decision in the UI, and we instantly generate all the code and logic that implements that decision.

That’s it … a perfect example of why Data Warehouse Automation kicks the crap out of  great ETL tools like SSIS, and why we deliver data warehouses in a fraction of the time spent in those products.

Ajilius. Data warehouses. Faster.

Ajilius 2.4.0

We’re a month late, and we’re sorry. But Ajilius 2.4.0 has finally been delivered.

We’ll blog more about specific features, but here’s what you have to look forward to in this release:

  • BI Accelerators for Yellowfin, Tableau and Qlik. Full business-friendly metadata generation at the click (Qlik?) of a mouse. We’re the only data warehouse automation platform to support all three platforms.
  • BI Views for a unified query experience across any BI tool, including PowerBI and Excel. Now your PowerPivot users can share the same business-friendly table and column names as their Yellowfin and Tableau colleagues.
  • A revised, high-performance CTAS engine for MPP platforms, including Microsoft APS/PDW, Azure SQL Data Warehouse, Redshift and Snowflake. And we’re still the only data warehouse automation platform to support three click migration between supported platforms.
  • Full support for Microsoft APS/PDW. That makes us the only data warehouse automation platform that supports EVERY Microsoft RDBMS, SMP and MPP, on-premise and cloud.
  • A new Inference engine for managing early arriving facts. You have the one-click choice of automatically inferring dimension rows when new values are found in fact processing, or simply assigning those rows to the “Unknown” value.
  • Integrated Authentication for all Microsoft sources and targets.

Now the work begins on the Version 3.0 series of releases!

 

Yellowfin Integration

One of the challenges in self-service BI is that you can quickly end up in your BI tool’s version of Excel Hell. That is, every user has their own definition of data, sources and aggregations, leading to the pre-DW situation where no two reports ever agreed.

Ajilius now brings a new level of governance to self-service, by directly integrating the data warehouse metadata with the BI tool. Over the 2.4 development cycle we’re bringing integration to a number of popular BI platforms, beginning today with Yellowfin.

yellowfin1

Select a fact table, select a destination, and a full Yellowfin model for your star schema will be written to that destination.

Use the Import function in Yellowfin, and your model is imported as a Yellowfin connection and view.

Not only do we generate the metadata, but Yellowfin makes round-trip integration a breeze. In our demonstrations we show how models can be enriched in line with changes to the data warehouse, preserving your existing reports and dashboards.

Now your data warehouse and your BI solution can be closer than ever before.

Ajilius. Self-service and governance.

DW DBMS Changes

DW DatabasesIn line with DW market developments, we have made some changes to the databases supported by Ajilius as target data warehouse platforms.

The revised list of targets is:

  • SQL Server (On-Premise & IAAS)
  • PostgreSQL / EnterpriseDB
  • MariaDB Column Store
  • Exasol
  • Azure SQL
  • Azure SQL Data Warehouse
  • Snowflake Elastic Data Warehouse
  • AWS Redshift

With the release of SQL Server 2016 SP1, and the CTP of SQL Server on Linux, we’re excited by the potential for major cost savings in mid-market customers. Features like database compression and column store make it feasible to run a small data warehouse / data mart on even SQL Server Express Edition; stepping up through Standard and Enterprise Editions as storage and performance expectations grow.

We have been working with Microsoft on SQL Server on Linux for some time now, and see a bright future for this platform. Installation is easy, performance is great (although very slightly behind Windows at this time), and it has been rock-solid for over 2,000 cycles of DW deployments using Ajilius.

We’ve also been impressed by MariaDB’s new Column Store engine. We’re seeing great performance and scale from this platform, making it a natural target for any business running MariaDB / MySQL as its standard DBMS.

Exasol recently announced V6 of its in memory DBMS, and we’re busy upgrading our Exasol adapter to take advantage of its new features.

With all these new and upgraded platforms comes a harder decision, to drop one of our supported databases. While we introduced Greenplum support earlier this year, we have had just one enquiry during that time, and no customers. Time to retire it, and put the support time to better use.

We’ve still got one platform that we want to add as a target, but I’ll save that announcement until development is further advanced. A hint, it will be a BIG announcement.

Ajilius. Supporting tomorrow’s DW platforms, today.

SharePoint Data Warehouse

Reference data – the common lookup tables that drive many data warehouses – often has no clear home in an organisation. It usually gets dumped in spreadsheets, XML files or text files, and ends up in a mess. Recently we’ve found that SharePoint Online makes a great repository for DW reference data. It can be managed and maintained in a clean, multi-user environment, then neatly integrated into the data warehouse.

This was supposed to be a post about the relative merits of SharePoint and Google Sheets for the maintenance of reference data. Unfortunately, Google’s approach to browser-based OAUTH authentication doesn’t play well on servers, so we’re temporarily shelving Google Sheets as an Ajilius data source until we figure out a workable solution.

On with SharePoint!

Adding a SharePoint Online data source to Ajilius is a breeze. Simply enter your site, user name and password:

sharepoint01

Refresh your SharePoint metadata, and you’ll see the libraries, lists and other data sources that can be loaded into Ajilius. Here we’ve created a simple list in which we can maintain details of public holidays.

sharepoint02

You’ll notice that there are many metadata columns added by SharePoint. In fact, we’re only interested in three columns, and we’ll clean that up once we’ve imported the metadata.

Import your metadata just like any other Ajilius data source, by selecting the Load Metadata option from the context menu for the SharePoint.PublicHolidays table. Here is the screen to complete the metadata:

sharepoint03

When added, you’ll see a screen like this, showing the columns for which metadata has been imported:

sharepoint04

Let’s clean up those extra columns we don’t need. Click the Change link at the top of the right-hand panel, and you’ll be taken to the column list.

sharepoint05

We’re going to use the Delete link, highlighted in the previous picture, to quickly delete a large number of columns from the table. When you click it, you will see a simplified form of the column mapper. You can choose to delete a column by changing the drop-down to Delete, either by selection, or most browsers support simply tabbing into the drop-down and pressing the ‘D’ key to set the value.

sharepoint06

When you’ve deleted the columns you don’t need, you should be left with a Date, Store and Title column. Clean up their metadata with some better descriptions, set the Date and Store as business keys to the table, and it should look like this:

sharepoint07

Next, use the Scripts option from the load_public_holiday context menu, and Create and Load your new table. Here is how your screen should look after loading the data.

sharepoint08

And when we view the data, we can see that it has been successfully loaded from our source in SharePoint Online.

sharepoint09

SharePoint Online makes a great multi-user tool for managing the small reference data sets that often get overlooked in the data warehouse governance process. Ajilius now includes a SharePoint connector as a standard feature, at no additional cost.

Ajilius. Now with SharePoint.