How would you like your data stored: Hot, Warm or Cold?
Matt Potts
By Matt Potts
Blog

Following on from the blog around operating BW on HANA and what it means from an operational perspective, I wanted to discuss some of the functionality that BW powered by HANA brings in further detail.

The aim of this blog is to discuss data management within the system focusing on active data management (aka temperature data management) and the role of SAP IQ (previously known as Sybase IQ) as a Near Line Storage (NLS) solution for BW, along with how it can be used for extended storage.

Active Data Management

Firstly, let’s talk about the profile of data. We can classify data warehouse data into three different profiles - Hot, Warm and Cold, according to its access frequency. The characteristics of these profiles are summarised below:

Data Profiles

Warm data typically represents 40-60% of data volume within a data warehouse, this data does not need to occupy the main memory in HANA all of the time. In BW running on HANA, the active/non-active data concept was introduced to allow a displacement of tables’ data from main memory in case a memory shortage occurs, as well as generally managing the memory consumed at any given point. Active Data Management is a standard feature of BW on HANA which was introduced with SP5 of HANA (and BW 7.3 SP8).

HANA of course doesn't have infinite amounts of memory (based mainly on cost of both hardware and licences), neither would you want to store all of your BW data (across all of the tables in the database) in memory all of the time. HANA will automatically swap tables in and out of memory, based on the last access time of the data, if it reaches its memory capacity. Active Data Management allows a functional level of control on the BW tables being placed in HANA memory.

InfoProviders and PSA (define) tables in BW on HANA can be marked with an 'early unload' flag. Once an object has been marked with the flag all of its tables and partitions are displaced from memory when a memory shortage occurs ahead of other objects. If data is displaced from memory, it is loaded back into memory and processed as normal when next required/requested. BW will automatically mark all PSA tables and write-optimised DSOs as 'early unload' by default - this can be overwritten, just as you may decide to have the 'early unload' turned on for standard DSOs and InfoCubes (though not recommended).

Standard HANA memory displacement is performed on a table’s columns, based on the ‘last used’ date concept. The 'early unload' flag multiples the last used time statistic with a displacement factor (a default of 27) to effectively age the last used date of tables even further (making them older, and therefore at the top of the list to be displaced). The cockpit to manage the 'early unload' flag also allows the manual load and unload of data in and out of the HANA memory, so for example, after system restart you may push all reporting based InfoProviders into memory for first query requests (if you don’t do this, it will get loaded at the point of first request).

IQ for NLS

SAP IQ is a column store database (like HANA, but with disk based persistence) which is built for analytics. Its primary purpose is to be used as a database in its own right, for analytics, but we'll be focusing here on discussing it as NLS solution for BW.

It's also worth noting that you can use IQ as a smart data access source for HANA if you are using HANA as a platform or a database independent of BW. The concepts and reasons of using IQ as a NLS option for BW are extremely similar to using IQ as a smart data access option for non BW HANA.

NLS has been available in BW for a while, and supports a number of databases other than IQ (such as PBS, Informatica and Outboard) but as IQ is a SAP product and contains many technical similarities with HANA it makes it one of the front runners for a NLS option. NLS allows cold data to be stored in an optimal format, outside of the BW database, but still makes it accessible when required, without the need to make any functional adjustments. For BW, whilst cold reporting data may be stored in NLS, if a user requests it via a query then it is read without any need to adjust the query design (so therefore still appears to be online and a totally transparent process to the user).

IQ for BW on HANA is an SAP owned NLS solution which delivers an ABAP-based implementation of the BW NLS interface (see figure 2) and like HANA, the column store in IQ means large data volumes can be significantly compressed for storage (i.e Petabytes). The expected data compression with IQ is around 90%.

IQ for BW NLS

IQ can only be used for the NLS solution of BW and not as the standard operating database (which HANA of course can be).

IQ for Extended Storage

With HANA SPS7 a new feature was introduced, called extended storage. Extended storage are tables that logically sit in HANA and can be used as if they were normal HANA database tables. However, they physically sit in IQ which is tied to the HANA system. This is completely transparent to the user. In essence for the DBA or BW developer they will simply manage one table. There will be no need to move data into IQ from HANA and then delete it from HANA, they manage this themselves.

Extended Storage

BW Objects (DSOs and Write-Optimised DSOs) can have a property set of ‘extended storage’ and no change to applications or processes are required. It is also a simple conversion to move existing data to extended storage. Write and read operations are re-directed to the extended storage, all BW standard operations will be supported and HANA memory will only be required temporarily at runtime.

Figure 3 shows an overview of extended storage.

Fundamentally extended storage will mean BW running on HANA will be able to handle PB-scale data volumes at an attractive price point for customers.

The extended table feature is available with the following product versions:
• SAP NetWeaver BW 7.4 SPS05 and higher
• SAP HANA SPS 7 and higher
• SAP IQ 16.0 SP03 and higher

So which solution is recommended for each data profile?

Below is a summary table which shows which technologies should be used to store the data profiles that were discussed in the opening section.

Active data management in BW on HANA is further complemented by IQ and offers customers a cost effective option to managing their ever growing data volumes in an in-memory environment. Together they offer a scalable and long term solution to manage the separation of frequently and infrequently accessed data to reduce overall TCO of data management.

To talk about BW, HANA and IQ and how this technology fits into your organisations technology roadmap please get in touch.

Matt Potts
By Matt Potts

Matt Potts is Solution Principal for Business Insights, specialising in Business Intelligence and Analytics. Matt has worked in the BI domain for over 8 years, helping organisations turn their data into actionable business insights through the use of agile BI platforms and delivery models. He is involved with multiple clients across varying lines of business and industries, working with them to ensure they realise the value from their investment in BI technologies. Outside of work Matt enjoys running, cycling and watching a range of sports and is kept busy by DIY projects at home.

Follow us on Social

And get access to even more digital insights

 

Just want to talk, call us on:
+44 (0)844 5610930

Join us

Life doesn't wait and neither should you. If you want to join a bunch of people intent on changing the world, you've come to the right place.

 

Latest opportunities