November 28, 2015

Implementing Rolling Window concept of IBM Informix TimeSeries technology


Have you been looking for a database configuration that enables you to only see the data for a given period, no matter, what's your data retention period is? 

In the process, have you been working on making a simple delete / purge query, as complex as possible, trying to identify the markers to delete based on the time stamp, calculating the difference in hours / days / months, and ended up deleting more than necessary? leading to non-compliance or other legal terms?

IBM Informix TimeSeries has the Rolling Window feature, a concept that involves Active and Dormant windows (Partitions on DBSpaces) as its key components, enables an organization to manage time series datasets based on defined time intervals, with just a minor configuration setting. 

A user can configure Rolling Window by defining the number of Active and Dormant windows, facilitating the applications to only view the data in Active window(s), while the data in the Dormant window(s) are still within the database, they are inaccessible to the queries. Thus Rolling Window feature effectively allows you to query the data belonging to a certain window ( time period ) with ease and automatically manages the data ( delete / purge), that's dormant in nature, based on the configuration settings, as set by the user.

Consider the scenario, where an organization working on Sensor Data Management and is required to:
  • Consolidating the data from various sensors, whose data insertion frequency varies from milliseconds to minutes, into the database. 
  • BI analytics is performed against these datasets, that are not older than 2 days. 
  • Any data older than 2 days, needs to be archived / deleted / purged and has to be managed on database side, reducing application efforts.

With Informix TimeSeries Rolling Window, you can plan to define 2 Active windows, to hold the data of last 2 days, then 2 Dormant windows, to hold the data for next 2 days ( as a buffer for any ad-hoc queries) and then define to delete any data older than 4 days. 

The configuration for Rolling Window needs to be defined, while creating the TimeSeries Containers:

execute procedure tscontainercreate('tscont1','dbspace1','row1',1024,1024, '2015-07-01 00:00:00.00000'::datetime year to fraction(5), 'day', 2, 2, 'dbspace1, dbspace2', 1, 16, 16);

Here:
'day'                      - Is the Rolling Window Time Interval. You can define HOUR / DAY / MONTH / YEAR as the time period for data retention
2                               - Is the number of Active Windows, whose data shall be accessible by the queries
2                               - Is the number of Dormant Windows, whose data shall be in database, but inaccessible to queries
'dbspace1,dbspace2' - The list of DBSpace(s), where the Active and Dormant window partitions reside.

The illustration is as follows:

1. Data for Day 1 gets inserted into the database and is available for BI queries


    2. Data for Day 2 gets inserted. BI queries access data from Day 1 & 2

    3. Data for Day 3 gets inserted. Data of Day 1 gets moved into the Dormant window and is inaccessible. BI queries continue to access the data that's available in last 48 hours

    4. Data for Day 4 gets inserted. Data from Day 2 gets moved into the Dormant window, as we had space for 2 Dormant windows and that Active window can accommodate data for last 48 hours.

    5. Data for Day 5 gets inserted. Data from Day 3 gets moved to Dormant window and now, the data from Day 1 gets automatically deleted from the database.


    Now, we have put together a simple shell script, that shall take you through the implementation of TimeSeries Rolling Window feature, in under 10 minutes. The shell script has been designed to help understand the basic objects of TimeSeries viz. Calendar, Container, TimeSeries, Virtual Table Interface (VTI) and SQL API and understand Informix TimeSeries syntax.

    1.      The shell script is well tested on Linux and might need minor changes on Unix platform. It won’t work on Windows platform. However, you can copy paste SQL statements high lighted in blue in following section and execute on windows manually.
    2.      Ensure that Informix engine is online. You can use “onstat –“  command
    3.      Download the shell script available in the zip file ( Or Copy - Paste the following link on to the browser:
    https://drive.google.com/open?id=0B4mxiMcgUauaMENkNVdxZXJRQlU )
    4.      Unzip the file on linux/Unix machine under some test directory
    5.      You might need to execute dos2unix command for each file before execution
    6.      Ensure that 'main.sh' and 'func.sh' have Execute Privileges
    7.      Execute main.sh and just follow the instructions.

    Output of shell script should look like this :


    ################################################################
    Welcome to TimeSeries demo.
    This program has sections that demonstrate capabilities of TimeSeries.
    Flow in each section is Setup, Data Load & Data Retrieval
    please enter your choice of TimeSeries demo
     1      : TimeSeries Demo
     2      : JSON Demo
     3      : TimeSeries with JSON Demo
     4      : TimeSeries with Rolling Window Concept
     5      : TimeSeries with Hertz Frequency Concept
     6      : TimeSeries with Swinging Door Compression Algorithm

    4

    ################################################################
     You have chosen a Demo on Informix TimeSeries with Rolling Window Concept
    ################################################################

    Please read the options carefully :-
    If you are executing the TimeSeries Demo for the first time, then
            Press '1'       : To Perform the Setup and create the Demo Environment

    If you have already performed the setup for TimeSeries Demo, then
            Press ANY NUMBER: To Execute Queries against the prepared Setup

    1

    Cleaning up old files...
    Clean up is over. Please presss Enter to continue

    ################################################################
    Creating dbspace dbspace1, dbspace2, dbspace3

    execute function admin('create dbspace','dbspace1',
    '/home/amprasan/IDS1210/storage/dbspace1','100 MB','0');
    execute function admin('create dbspace','dbspace2',
    '/home/amprasan/IDS1210/storage/dbspace2','30 MB','0');
    execute function admin('create dbspace','dbspace3',
    '/home/amprasan/IDS1210/storage/dbspace3','30 MB','0');
    45107280         6        0x60001    6        1        2048     N  BA    informix dbspace1
    479f9ce0         7        0x60001    7        1        2048     N  BA    informix dbspace2
    47de7cc8         8        0x60001    8        1        2048     N  BA    informix dbspace3
    471ba028         6      6      0          10000      4851                  PO-B-D /home/amprasan/IDS1210/storage/dbspace1
    4800f028         7      7      0          15360      14283                 PO-B-D /home/amprasan/IDS1210/storage/dbspace2
    47603028         8      8      0          15360      14795                 PO-B-D /home/amprasan/IDS1210/storage/dbspace3

    dbspace creation .. passed

    Please press Enter to continue

    ################################################################
    Creating database demo_db

    drop database if exists demo_db;
    create database demo_db in dbspace1 with buffered log

    database creation .. passed

    Please press Enter to continue

    ################################################################
    Creating row type row1

    create row type row1
    (
     time_stamp datetime year to fraction(5),
     value float
    );

    Row creation .. passed

    Please press Enter to continue

    ################################################################
    Creating calendars calsec, cal1min and cal15min

    execute procedure ifx_allow_newline('t');
    insert into calendarpatterns values
    (
     'sec','{1 on},second'
    );
    insert into calendarpatterns values
    (
     'min','{1 on},minute'
    );
    insert into calendarpatterns values
    (
     'min15','{1 on ,14 off},minute'
    );
    insert into CalendarTable
    (
     c_name,c_calendar
    )
    values ('calmin','startdate(2015-07-01 00:00:00.00000),
    pattstart(2015-07-01 00:00:00.0000),pattname(min)');

    insert into CalendarTable
    (
     c_name,c_calendar
    )
    values ('calsec','startdate(2015-07-01 00:00:00.00000),
    pattstart(2015-07-01 00:00:00.0000),pattname(sec)');

    insert into CalendarTable
    (
     c_name,c_calendar
    )
    values ('cal15min','startdate(2015-07-01 00:00:00.00000),
    pattstart(2015-07-01 00:00:00.0000),pattname(min15)');

    Calendar creation .. passed

    Please press Enter to continue

    ################################################################
    Creating table test_ts

    create table test_ts
     (
       tagid char(20),
       series timeseries(row1),
       PRIMARY KEY (tagid)
    ) in dbspace1;
    alter table test_ts
    lock mode(row);

    Table creation .. passed

    Please press Enter to continue

    ################################################################
    Enable Rolling Window, while creating the Container 'tscont1', by extending the 'TSCONTAINERCREATE' definition to include
    INTERVAL - Which is 'DAY' in the following command
    Active Window - Which is defined as numeric '4' indicating 4 Active Windows
    Dormat Window - Which is defined as numeric '2' indicating 2 Dormant Windows
    DBSpaces spread across - Which is defined by 'dbspace1, dbspace2, dbspace3'

    Thus enabling the TimeSeries Container to hold 4 Days of Active Data at any point of time and hold 2 Days of Dormat Data that will be inaccessible to queries
    ################################################################

    Creating container tscont1

    execute procedure tscontainercreate('tscont1','dbspace1','row1',1024,1024, '2015-07-01 00:00:00.00000'::datetime year to fraction(5), 'day', 4, 2, 'dbspace1, dbspace2, dbspace3', 1, 16, 16);

    Container creation .. passed

    Please press Enter to continue

    ################################################################
    Creating virtual table test_v

    execute procedure TSCreateVirtualTab('test_v','test_ts');

    Virtual table creation .. passed

    Please press Enter to continue

    ################################################################
    Initializing all tags tag1 to tag5

    BEGIN;
    insert into test_ts values('tag1',TSCreateIrr('calsec', '2015-07-01 00:00:00.00000',0,0,0,'tscont1'));
    insert into test_ts values('tag2',TSCreateIrr('calsec', '2015-07-01 00:00:00.00000',0,0,0,'tscont1'));
    insert into test_ts values('tag3',TSCreateIrr('calsec', '2015-07-01 00:00:00.00000',0,0,0,'tscont1'));
    insert into test_ts values('tag4',TSCreateIrr('calsec', '2015-07-01 00:00:00.00000',0,0,0,'tscont1'));
    insert into test_ts values('tag5',TSCreateIrr('calsec', '2015-07-01 00:00:00.00000',0,0,0,'tscont1'));
    COMMIT;

    Tags initialization .. passed

    Please press Enter to continue

    ################################################################
    Load data for Tags 1 to 5 through VTI for a duration of 4 Days

    load from rw_load.unl insert into test_v;

    Loading data through VTI .. passed

    Please press Enter to continue

    Congratulations!!! Your setup is successful. Please press Enter to refresh the screen and be ready to see certain data retrieval queries

    ################################################################
    Simple select on VTI table to see range of data

    select * from test_v
    where tagid='tag1'
    and time_stamp between '2015-07-02 00:00:00.00000' and
    '2015-07-02 00:00:00.20000';

    Press enter to see output

    Database selected.

    tagid                time_stamp                         value

    tag1                 2015-07-02 00:00:00.00000 100.0000000000
    tag1                 2015-07-02 00:00:00.02000 101.0000000000
    tag1                 2015-07-02 00:00:00.04000 102.0000000000
    tag1                 2015-07-02 00:00:00.06000 103.0000000000
    tag1                 2015-07-02 00:00:00.08000 104.0000000000
    tag1                 2015-07-02 00:00:00.10000 105.0000000000
    tag1                 2015-07-02 00:00:00.12000 106.0000000000
    tag1                 2015-07-02 00:00:00.14000 107.0000000000
    tag1                 2015-07-02 00:00:00.16000 108.0000000000
    tag1                 2015-07-02 00:00:00.18000 109.0000000000
    tag1                 2015-07-02 00:00:00.20000 100.0000000000

    11 row(s) retrieved.

    Database closed.

    Press enter to continue

    ################################################################
    Simple select on TimeSeries table to see range of data

    select 'tag1',
    clip(series,'2015-07-02 00:00:01.00000'::datetime year to fraction(5) ,
    '2015-07-02 00:00:01.30000'::datetime year to fraction(5))
    from test_ts where tagid='tag1';

    Press enter to see output

    Database selected.

    (constant)    tag1
    (expression)  origin(2015-07-02 00:00:01.00000), calendar(calsec), container(tscont1), threshold(0), irregular, [(100.0000000000)@2015-07-02 00:00:01.00000, (101.0000000000)@2015-07-02 00:00:01.02000, (102.0000000000)@2015-07-02 00:00:01.04000, (103.0000000000)@2015-07-02 00:00:01.06000, (104.0000000000)@2015-07-02 00:00:01.08000, (105.0000000000)@2015-07-02 00:00:01.10000, (106.0000000000)@2015-07-0200:00:01.12000, (107.0000000000)@2015-07-02 00:00:01.14000, (108.0000000000)@2015-07-02 00:00:01.16000, (109.0000000000)@2015-07-02 00:00:01.18000, (100.0000000000)@2015-07-02 00:00:01.20000, (101.0000000000)@2015-07-02 00:00:01.22000, (102.0000000000)@2015-07-02 00:00:01.24000, (103.0000000000)@2015-07-02 00:00:01.26000,(104.0000000000)@2015-07-02 00:00:01.28000, (105.0000000000)@2015-07-02 00:00:01.30000]

    1 row(s) retrieved.

    Database closed.

    Press enter to continue

    ################################################################
    Select on TimeSeries table and output in traditional relational form

    select 'tag3',time_stamp,value from
    table (( select tssettolist(clip(series,
    '2015-07-02 00:02:00.00000'::datetime year to fraction(5),
    '2015-07-02 00:03:00.30000'::datetime year to fraction(5)))
    ::list(row1 not null)
    from test_ts
    where tagid='tag1'));

    Press enter to see output

    Database selected.

    (constant) time_stamp                         value

    tag3       2015-07-02 00:02:00.00000 109.0000000000

    1 row(s) retrieved.

    Database closed.

    Press enter to continue

    ################################################################
    Roll up data to 15min cycle using aggregateby function

    SELECT *
    FROM TABLE (( SELECT TSSetToList(AggregateBy('avg($value)', 'cal15min',
    series, 0, '2015-07-02 00:00'::datetime year to minute,
    '2015-07-03 2:30'::datetime year to minute))::list(row1 not null)
    FROM test_ts
    WHERE tagid='tag2'));

    Press enter to see output

    Database selected.

    time_stamp                         value

    2015-07-02 00:15:00.00000 104.4751381215
    2015-07-02 00:30:00.00000 109.0000000000
    2015-07-02 00:45:00.00000 109.0000000000
    2015-07-02 01:00:00.00000 109.0000000000
    2015-07-02 01:15:00.00000 109.0000000000
    2015-07-02 01:30:00.00000 109.0000000000
    ........
    ........
    2015-07-02 23:15:00.00000 109.0000000000
    2015-07-02 23:30:00.00000 109.0000000000
    2015-07-02 23:45:00.00000 109.0000000000
    2015-07-03 00:00:00.00000 109.0000000000
    2015-07-03 00:15:00.00000 104.4751381215

    97 row(s) retrieved.

    Database closed.

    Press enter to continue

    ################################################################
    Load Data for all Tags on Day 5 through VTI

    load from rw2_load.unl insert into test_v;

    Loading data through VTI .. passed
    Please press Enter to continue

    ################################################################
    Simple select on VTI table to see range of data from Day 1
     Since the example maintained only 4 days of Active window, the following query doesn't yeild data for Day 1, once the data for Day 5 is loaded

    select * from test_v
    where tagid='tag1'
    and time_stamp between '2015-07-01 01:00:00.00000' and
    '2015-07-01 01:10:00.00000';

    Press enter to see output

    Database selected.

    tagid                time_stamp                         value

    No rows found.

    Database closed.

    Press enter to continue


    Congratulations!!! Your Demo is completed Successfully.
    We hope you had a pleasant learning experience


    No comments: