Delimited file formats

Delimited files, often referred to as “CSV” or text files, represent tables of data in a simple text format.

Rows are separated by newline characters, or by newline and carriage-return pairs.

Columns within rows are separated by delimiters. The delimiters supported by Ajilius are comma, tab, semi-colon and vertical bar (‘|’). All rows must contain all columns, although empty columns may be represented as consecutive delimiters.

Text fields containing delimiters must be enquoted using single (‘) or double (“) quote characters.

Where a text field contains a character used as a delimiter, that character must be preceded by an escape character, typically backslash (\). If the text field contains the escape character, it must be preceded by another escape character (ie, ‘\\’).

Delimited files must have column names contained in the first row of the file.

The file may be encoded in any format supported by Ajilius (link).

Detection of encoding, delimiters, quotation and escape characters is automatically performed on file access. You don’t need to specify anything other than the file name.

Ajilius. Flexible delimited file formats.

File Transcoding

Because Ajilius is a multi-lingual and cross-platform product, we sometimes get asked about our ability to handle specific file encodings.

XML, JSON and Delimited files may be any of the following:

  • ASCII
  • windows-1252
  • UTF-8 (with or without a BOM)
  • UTF-16 BE or LE (with a BOM)
  • UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)
  • Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)
  • EUC-KR and ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)
  • ISO-8859-2 and windows-1250 (Hungarian)
  • ISO-8859-5 and windows-1251 (Bulgarian)
  • ISO-8859-7 and windows-1253 (Greek)
  • ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)

Ajilius will transcode these file types to the encoding of the target DBMS, assuming that the target encoding is capable of supporting the characters in the input file.

Ajilius. We can read your writing.

New feature: Separate extract / load

We’ve now separated Extract and Load processes in Ajilius.

Previous versions coupled the extract and load processes for a table. To process the LOAD_PRODUCT table, for example, we would connect to the source database, extract the required rows, and load those rows to the warehouse, all within the same process.

We faced a situation recently where the largest extract for a warehouse came from a system that finished its end-of-day processing four hours ahead of the window allocated for the data warehouse load. The extract from this system was the longest-running task in the ELT process.

By separating the extract and load processes, we are now able to schedule the extract from this system (and others) to complete as early as possible, with the load to the warehouse occurring at a later time.

The data warehouse load window is made significantly smaller, giving the operations team more headroom if any errors occur during end-of-day processing in upstream systems.

You can, of course, continue to run extract and load processes at the same time if you prefer.

Ajilius. Making ELT fast and flexible.

New feature: Warehouse Script All

We’ve just released a new feature for Ajilius, a one-click generation of all scripts for a warehouse.

In previous releases you ran each script manually. This was a little tedious if you wanted to run an entire load process from within the Ajilius application.

Now, you can select the Script All option from the Warehouse menu, and a full set of ELT scripts will be generated from the session warehouse metadata.

You will be shown a script screen with the full warehouse DDL in the left pane, and a script to run all ELT processes in the right pane. Use the Create and Run buttons to perform an end-to-end build of your data warehouse.

Ajilius. Eliminating repetition in Data Warehouse Automation.

New competitor: BI Builders

A new week, and a new competitor. Norwegian company BI builders has popped into view with a very pretty looking dimensional warehouse solution for SQL Server users.

It is a desktop solution, generating SSIS, which puts it in the same category as Dimodelo, TimeXtender and (perhaps) WhereScape. LeapfrogBI is a little different, being a web-based solution.

This is becoming a very competitive section of the market!