Ever heard the phrases Data Driven, Data Centric, or Data Fabric? Maybe you've heard British mathematician Clive Humby's phrase, “Data is the new oil”? Yes, we live in a world where data is the most important commodity, and on IBM i we have plenty. Our issue is that data isn't staying on IBM i; rather, it's going elsewhere, off platform for reporting, analytics, machine learning, or some other purpose. I call all this copying, duplicating, extracting, and thrashing of data simply Data Motioning.
Does any of this Data Motioning sound familiar?
- move, copy, duplicate, extract, subset, or filter your data from your IBM i,
- optionally clean it by changing data types, substringing or concatenating fields, performing math calculations, etc.,
- optionally join with other tables, or perhaps with data on other IBM i LPARs or non-IBM i servers,
- optionally process journal entries for change data capture operations possibly from remote journals,
- optionally enhance with data from external services,
- optionally transform to another data format such as JSON or CSV, and
- load to another library or table, or to another destination such as the IFS, or to Microsoft SQL Server or Oracle, or to a cloud database?
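As a sketch, those steps might look like the following in Python, assuming rows have already been extracted from a source table (the column names and values here are invented for illustration; a real job would pull them over a database connection):

```python
import csv
import io
import json

# Hypothetical rows extracted from an IBM i table (names are illustrative).
extracted = [
    {"CUSTNO": "00042", "FNAME": "PAT ", "LNAME": " SMITH", "AMT": "1050"},
    {"CUSTNO": "00043", "FNAME": "LEE ", "LNAME": " JONES", "AMT": "2500"},
]

def clean(row):
    """Change data types, trim and concatenate fields, do the math."""
    return {
        "customer_id": int(row["CUSTNO"]),                         # type conversion
        "name": f'{row["FNAME"].strip()} {row["LNAME"].strip()}',  # concatenation
        "amount": round(int(row["AMT"]) / 100, 2),                 # math: cents to dollars
    }

cleaned = [clean(r) for r in extracted]

# Transform to another format such as JSON or CSV...
as_json = json.dumps(cleaned)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["customer_id", "name", "amount"])
writer.writeheader()
writer.writerows(cleaned)
as_csv = buf.getvalue()

# ...and "load" to a destination (here, just print; a real job would write
# to another table, the IFS, SQL Server, Oracle, or a cloud database).
print(as_json)
```

Trivial for two tables; the point of the article is what happens when you multiply this by hundreds of tables and destinations.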
If Only All Data Motioning Were Easy!?
You've probably written code to perform these tasks. Writing code may be okay for a couple of tables, but it quickly gets out of hand when there are hundreds or thousands of tables. The development and maintenance investments can be larger than expected, and complexity can increase quickly, consuming valuable development time and resources. Let's take a closer look at some Data Motioning scenarios.
Motioning data from one table to another on a different IBM i system while transforming data
Motioning data from multiple tables on one IBM i to multiple destinations
Motioning journaled data from one or more tables on one IBM i to multiple destinations
One advantage of journal processing is near real-time data capture (change data capture), and the business may need this type of processing to feed Business Intelligence systems. If you have data on multiple systems, remote journals provide a way to capture that remote data without having to copy it to a centralized location.
Motioning data from multiple systems, some or all non-IBM i, to multiple destinations, some or all of which may be non-IBM i systems
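At its core, change data capture means replaying journal entries against a target. Here is a minimal, self-contained sketch of that apply logic, using the IBM i record-level entry types PT (record added), UP (record updated), and DL (record deleted); the entry layout itself is simplified for illustration, since real journal entries carry sequence numbers, timestamps, before/after images, and more:

```python
# Hypothetical journal entries: (entry type, key, after-image row).
journal = [
    ("PT", 1, {"id": 1, "city": "ROCHESTER"}),   # PT = record added
    ("PT", 2, {"id": 2, "city": "AUSTIN"}),
    ("UP", 1, {"id": 1, "city": "TORONTO"}),     # UP = record updated
    ("DL", 2, None),                             # DL = record deleted
]

target = {}  # stands in for the destination table, keyed by primary key

def apply_entry(table, op, key, row):
    """Replay one journal entry against the target table."""
    if op in ("PT", "UP"):
        table[key] = row
    elif op == "DL":
        table.pop(key, None)

for op, key, row in journal:
    apply_entry(target, op, key, row)

print(target)  # the target now mirrors the source, near real time
```

With remote journals, the same replay logic runs against entries that the operating system has already shipped from another LPAR, which is why no bulk copy to a central location is needed.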
Motioning to Open-source Applications
Many IT environments are beginning to use Apache's open-source technologies like Kafka and Camel to address heterogeneous data motioning. Without knowledge of these open-source technologies, this scenario can be complicated, leading to a long development cycle.
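To make the Kafka case concrete, a producer ultimately needs a key (for partitioning) and a serialized payload per changed row. The sketch below builds those pairs; the topic name, column names, and broker address are all hypothetical, and the actual send is shown only as a comment since it requires the kafka-python package and a reachable broker:

```python
import json

# Rows changed on IBM i that we want to publish to a (hypothetical) topic.
changed_rows = [
    {"ORDERNO": 1001, "STATUS": "SHIPPED"},
    {"ORDERNO": 1002, "STATUS": "OPEN"},
]

def to_kafka_message(row, key_field="ORDERNO"):
    """Build a (key, value) pair the way a Kafka producer expects:
    key bytes for partitioning, JSON bytes for the payload."""
    key = str(row[key_field]).encode("utf-8")
    value = json.dumps(row).encode("utf-8")
    return key, value

messages = [to_kafka_message(r) for r in changed_rows]

# With kafka-python and a reachable broker, the publish would look like:
#   producer = KafkaProducer(bootstrap_servers="broker:9092")
#   for key, value in messages:
#       producer.send("ibmi.orders", key=key, value=value)
print(messages[0])
```

Even this small sketch hints at the real work: schema decisions, serialization, keying, error handling, and broker operations, which is where the long development cycle comes from.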
Motioning to a Data Warehouse
Dude, Where's My Data Going?
It's getting more common for data to be spread everywhere, anywhere, and even out of your control. Often data is extracted without internal knowledge of the IBM i database specifics (character dates, multi-member files, etc.), resulting in complaints about the quality of the database. Or data is extracted by an SQL developer who does not understand IBM i best practices for processing data, resulting in unfounded accusations of a slow and archaic IBM i.
The business needs better stewards of data to ensure the right users get the right data at the right time. I suggest that it's IBM i experts who are the stewards. So how can we be successful and keep control of our data - and maybe our sanity?
Quite simply, we need a cost-friendly, effective, and reliable option that provides:
- An automated and repeatable process
- Optionally cleaning the data
- Optionally transforming the data
- Motioning data from somewhere to somewhere
Allow Me to Introduce DataMigrator!
DataMigrator is an ETL (Extract, Transform, and Load) application that is part of the IBM Web Query family of products. DataMigrator offers capabilities to schedule processes that extract, consolidate, aggregate, clean, transform, load, and motion data across multiple platforms, not just IBM i. Near real-time capture of data is possible with DataMigrator's built-in journaling and remote journaling support. Building dimensional models is also easy with DataMigrator's built-in support for these types of databases.
DataMigrator is a useful but little-known product for IBM i environments. I think of DataMigrator as an unsung hero of IBM Db2 for i. DataMigrator could save your day, budget, and possibly your well-being.
Still Not Convinced?
- Create data marts and operational data stores
- Built in dimensional modeling for data warehouses
- Processes journals, including remote journals, and natively decomposes the journal entries into the table's format, so no more manual work
- Can be your ETL Pipeline or participate in your existing ones
- GUI
- Drag and Drop
- PC Design Client
Built-in functions and formulas
- Formulas for math operations, statistical functions, concatenation, subsetting, type conversion, data formatting, and more.
- SQL Pass-Through for direct access to SQL
- DataMigrator jobs run natively on IBM i (7.3+)
SQL-based – set-at-a-time vs. row-at-a-time processing
- All the performance and benefits of SQL without writing any SQL code
Security
- Honors all IBM i security, including object security and RCAC (Row and Column Access Control)
Low Cost
- Compared to other ETL solutions, very cost effective
- Can pay for itself within months
And much, much more!
Interested in a demo or more information? Please reach out to me, JWestcott@ibm.com or QU2@us.ibm.com.