How can recurring tasks be simplified? The question of how to automate processes in a data warehouse (DWH) keeps project teams busy.

The need for a comprehensive automation product usually arises very quickly. By now there are many products on the market: established ones, newcomers and in-house developments. Project teams are spoiled for choice, and not infrequently they experience disappointment after the product selection or product launch.

So what is it all about? Many data warehouse project teams have dealt with, or are currently dealing with, the question of how the development and execution of processes in a data warehouse (referred to as a data solution in the following) can be automated: How can recurring tasks be simplified, and how can processes be developed with as little effort as possible?

Especially in a Data Solution with many repetitive patterns - for example, with Data Vault - there is a desire for an automation tool that supports teams in development and promises an advantage in overall performance.

In this multi-part blog post, I describe what teams expect from an automation solution, what vendors promise, and what the reality ends up looking like.

Expectations on automation

Project teams expect an automation solution to support the new development, or a necessary restructuring and reorganization, of a data solution. As soon as the acquisition or development of an automation solution is being considered, a number of buzzwords pop up in the minds of team members.

 

Automated ETL

Automated creation of all ELT or ETL data logistics processes - When data logistics processes are developed manually, they are prone to implementation errors or, if the loading pattern changes, are difficult to update. This is often the starting point for considering an automation solution. After all, these are mostly recurring patterns (tables and processes) that should be easily automated.
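To make the idea of recurring patterns concrete, here is a minimal sketch of pattern-based generation, assuming a very simple Data Vault hub pattern. The metadata class HubMeta, the column names, the MD5 hash key and the SQL dialect are all assumptions made for illustration; they are not taken from any particular automation product or Data Vault standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HubMeta:
    """Hypothetical metadata record describing one Data Vault hub."""
    name: str
    business_key: str
    source_table: str


def render_hub_ddl(hub: HubMeta) -> str:
    """Render the table definition for a hub from its metadata."""
    return "\n".join([
        f"CREATE TABLE hub_{hub.name} (",
        f"    {hub.name}_hk CHAR(32) NOT NULL PRIMARY KEY,",
        f"    {hub.business_key} VARCHAR(100) NOT NULL,",
        "    load_dts TIMESTAMP NOT NULL,",
        "    record_source VARCHAR(100) NOT NULL",
        ");",
    ])


def render_hub_load(hub: HubMeta) -> str:
    """Render the load statement for a hub (assumed pattern: insert new keys only)."""
    bk = hub.business_key
    return "\n".join([
        f"INSERT INTO hub_{hub.name} ({hub.name}_hk, {bk}, load_dts, record_source)",
        f"SELECT DISTINCT MD5(src.{bk}), src.{bk}, CURRENT_TIMESTAMP, '{hub.source_table}'",
        f"FROM {hub.source_table} src",
        f"WHERE NOT EXISTS (SELECT 1 FROM hub_{hub.name} h WHERE h.{bk} = src.{bk});",
    ])


# Illustrative metadata; table and column names are made up.
HUBS = [
    HubMeta("customer", "customer_no", "stage.customer"),
    HubMeta("product", "product_no", "stage.product"),
]

if __name__ == "__main__":
    for hub in HUBS:
        print(render_hub_ddl(hub))
        print(render_hub_load(hub))
```

If the loading pattern changes, only the two render functions change; every hub is regenerated from the same metadata instead of being edited by hand.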

Simplification

Simplifying the development process - How can the project team simplify the work to be done while delivering artifacts faster? Is this possible with a suitable tool? It is expected that the automation solution will simplify the daily work.

 

Orchestration

Orchestration of data logistics processes - Many small processes (often several thousand) exist in the Data Solution and need to be executed in parallel and/or in a specific order. Entering all of them into an orchestration tool manually is very time-consuming, so the automation solution is expected to take over this task.
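As a small sketch of how such an execution order could be derived from dependency metadata instead of being maintained by hand: the example below uses Python's standard graphlib module. The process names and the dependencies dictionary are hypothetical; a real Data Solution would read them from its metadata.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency metadata: process -> processes that must finish first.
DEPENDENCIES = {
    "load_hub_customer": {"stage_customer"},
    "load_hub_order": {"stage_order"},
    "load_link_customer_order": {"load_hub_customer", "load_hub_order"},
    "load_sat_customer": {"load_hub_customer"},
}


def execution_waves(deps):
    """Group processes into waves; all processes within a wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(number, wave)
```

Each wave contains only processes whose predecessors have all finished, so the staging loads come first, then the hubs, then the link and the satellite.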

Metadata

Metadata, the DNA of the Data Solution - At this point at the latest, teams become aware of the need for metadata. Metadata is expected to be the foundation for creating, simplifying, orchestrating and finally automating the processes as far as possible.

 

Speed

Faster delivery of artifacts - Previous thoughts and buzzwords about an automation solution lead to the expectation that automating, simplifying, and orchestrating with metadata will make the delivery of artifacts faster.

Data Vault Standard

Use existing Data Vault standards - Of course, it is expected that all Data Vault standards are supported, regardless of which Data Vault standard is planned or already in use in the Data Solution. The automation solution should also be adaptable to the needs of the project, such as reusing standards from a book or another automation tool.

 

Resilience

Resilient to (external) changes - If something changes in the operational system or new data needs to be integrated, it is expected that the resulting changes or refactoring can be done very easily in the Data Solution.

Integration

Operational systems integration - It is expected that the integration of different operational systems will be easy to manage. This also applies to support for the integration of different, overlapping or competing business keys.

 

 
Top 10

High ranking - The automation solution is expected to be rated as high as possible in some ranking. Which criteria or features were evaluated does not (or must not) play a role.

 

 

These are just some of the expectations of an automation solution that I have encountered in recent years. They are also the reasons to invest in an automation solution.

Automation vendor pledges

Project teams looking for an automation solution are spoiled for choice. There are now many products on the market: established solutions and newcomers as well as countless in-house developments.

It is not uncommon for teams to experience disappointment after product selection or product launch. One of the main causes is the pledges made by salespeople during the sales process.

When selecting vendors for an automation solution, there always comes a point when project teams invite vendors to a demo or product presentation.

Presentations show the ideal world, and demos are precisely tailored to the functionalities of the automation solution on display, regardless of whether it is an established solution, a start-up or a consulting firm's own development. This is how it should be! After all, all automation solutions pursue (more or less) the same goal.

A good salesperson asks what the project team expects from the automation solution. This is exactly where the vendor's pledges come into play as a response to the existing expectations:

  • "Of course we support Data Vault modeling!" - But how? And is that what the project team expects?
  • "We can automate (everything)!" - But what exactly? A search on the Internet shows how many tools there are that have the keyword automation in their description, but are completely different than expected.
  • "Our solution works right out of the box!" - Quick install and that's it?
  • "We can do continuous delivery! You don't have to worry about anything!"
  • "All code generation 'out of the box'! No customizations needed!"

 

The answers to other expectations of the project team could be as follows:

"When it comes to orchestration, continuous delivery of artifacts, and development speed - our solution maps all of that and almost runs itself. There are just a few small things to do and that's it."

"Generate Data Vault tables? Sure, in all variations!"

If the project team asks more specifically about the capabilities of domain-oriented data modeling, the answer might be:

"Our solution supports domain-oriented data modeling!"

Everything is promised and usually much, much more. Some promises may be meant in a different context, but that doesn't matter, does it?

We all know these pledges, we all know the need to evaluate solutions objectively, and at the same time we are easily convinced that the solution being presented is the best one.

I don't want to create a wrong impression. Every solution has its raison d'être, all statements and promises made can be correct, accurate and complete in a given context. In this sense, there is no bad or good automation solution - only one that fits the situation better or less well.

This is why it is so important as a project team to be clear about what is expected from an automation solution and according to which criteria it can be objectively evaluated.

These are just some of the promises I have encountered over the years. And yet, they are also the reasons to invest in an automation solution.

Case studies

Below I present some real cases that I have experienced in recent years: first the less successful ones, then a very good example of a successful selection of an automation solution.

First case

In the first case, a good toolkit was in place from the beginning. The project team, let's call it Team A, conducted a comprehensive selection process, starting with a long list of automation vendors and whittling it down to four vendors and their four automation solutions.

Team A was quite proud, and rightly so, of having created a defined set of really good selection criteria, weighted them, and attached all information received from the vendors (in writing or in interviews) to those criteria. Team A's goal was to find, as quickly as possible, an automation solution that met the company's needs.

DWH Automation & Expectations - Case Studies - Case 1

Actually, a good starting situation, right? However, and this is the other side of the coin, Team A secretly had a preferred automation solution. This was neither transparent nor obvious throughout the entire process.

How could this have happened? The promises made by the vendors were crucial. I honestly admit that I have great respect for the performance and persuasiveness of some salespeople. In the case described here, the presentation of one automation solution was so good that the participating members of Team A adopted it as their desired solution and did not deviate from it afterwards.

Back to case 1: The results of the selection process - which was carried out really well - matched the original requirements criteria, but of course they did not point to the preferred automation solution.

What happened next? The criteria from the previous selection questionnaire were adapted and watered down until all criteria corresponded to the desired automation solution. And that's how it ended up in first place!

In addition, technology-oriented thinking again gained the upper hand. That is, Team A focused more on the technological features of the automation solutions than on the actual requirements defined by use cases, data architecture, and other criteria.

And what was the result in case 1? Today, after several years, Team A or the company still has not chosen an automation solution. The selected automation solution did not pass the subsequent proof of concept (PoC) or did not work as expected. Time passed, Team A decided on a different automation solution, the PoC failed, and so on.

From my point of view, the mistake here in the first case was that Team A no longer focused on the criteria and functions in the selection process, but instead made the decision for an automation solution driven by emotions and some fancy features.

Second case

In the second case, the project team, let's call it Team B, made a pre-selection of automation solutions. It looked like this: search Google for the keyword 'automation'. It is amazing how many of the search results do not fit a data solution at all.

The result was a huge list of over 30 vendors offering an 'automation solution' found by searching the Internet.

Team B's list included everything that was somehow related to automation. At this point, Team B enlisted external help, significantly reduced the number of vendors, and finally had a shortlist of vendors.

The really good thing about team B's initial startup and expectation phase was that they took the opportunity to exchange ideas with other companies that had already successfully implemented automation tools.

At this point, it seemed that team B was really well on their way to identifying a perfect tool for their requirements and goals. But: Team B expressed the requirements on a very high and abstract level, with a vague definition and a very strong focus on technology. And on top of that, from that point on, they did it without external support.

Team B's real goal was to find an automation solution that would minimize and support all of the developer's manual tasks. Thus, the focus was to be on domain-oriented modeling - conceptual and logical. The physical modeling and thus the creation of tables, loading processes and orchestration were also to be supported by the automation solution.

DWH Automation & Expectations - Case Studies - Case 2

In practice, team B had great difficulty in identifying, defining and weighting the criteria for the selection process. The reason for this was the vague and abstract requirements formulated in advance.

Before this criteria identification process was completed, team B had already invited the first vendors to presentations and demonstrations. And again, in case 2, the vendors were very strong in presenting their tools. They kept promising team B: Yes, our tool can do this.

And what happened? A really sad ending: team B failed with this project after one year and completely stopped working on the automation solution. The automation solution met neither the requirements nor the expectations. A wrong decision was made due to the vague and incomplete criteria, and a lot of money was wasted.

Third case

In the third case, the project team, let's call it team C, had a strong desire for an automation tool.

From the beginning, team C did not create a complete list for selecting an automation solution. There was only a very short list because the pre-selection was based on the personal views and sensitivities of a single person from team C.

On the one hand, team C did a good job by setting up a proof of concept (PoC) to gain experience with the automation solutions. On the other hand, there were no specific evaluation criteria for the PoC, only general buzzwords.

DWH Automation & Expectations - Case Studies - Case 3

In summary, team C wanted to conduct an 'independent' selection process for an automation solution without any real pre-selection of vendors, combined with a mandatory PoC that was to be evaluated against buzzword-level criteria.

The (unclear) criteria for the PoC as well as the (unclear) requirements and goals for the selection process were largely determined by the personal opinions and preferences of a single team member. Possible external help with an independent selection of vendors and with the preparation of a PoC was rejected on the grounds that team C had more than enough expertise of its own.

To date, team C has not made a decision on an automation solution. One of the reasons, in addition to the uncertainties mentioned above, was that business problems were supposed to be solved with a new technology. This usually does not work, because business problems have to be solved at the business level and not with a great new technology.

Thoughts

A few thoughts (almost) to conclude this series of posts. The most important thought I would like to share is that you always have to consider what challenges you want to solve with an automation solution.

DWH Automation & Expectations - Case Studies - Thoughts

Requirements and objectives

What are the requirements and what are the objectives? Where is the journey headed? When requirements and objectives are clearly defined, they provide an objective point of orientation.

Far-reaching decision

The choice of an automation solution is a far-reaching decision that will last for several years. Possible changes must not lead to the automation solution having to be replaced.

Proof of Concept

Define a PoC to test the defined requirements and objectives in the smallest possible timebox.

Technology is not a solution

The technology itself is not the solution to the problems. The requirements and goals are decisive for the right selection.

 

Circle of Competence

Know the actual circle of competence compared to the perceived circle for selecting an automation solution. Do not overestimate the level of competence.

Acting responsibly

Acting responsibly and sustainably for oneself, the project and the company.

Independent expert

Knowing when you need the help of an independent expert. You can't know everything as a project team, especially because software selection doesn't happen every day.

Being critical

Do not (blindly) trust the 'baseless' promises of the salesperson. The salesperson's goal is to sell their product, not to solve your problems.

 

Selection process

The requirements and objectives as well as the existing circle of competence form the basis for the actual selection process.

Structured

Start structured, use requirements and objectives to find good criteria for the selection process.

Criteria

Finding tangible and trackable criteria to evaluate the automation solution against your requirements and objectives.

Confidence

Have confidence that the solution can accommodate changes in case adjustments are needed in the future.

 

Case study - ideal-typical

The first three cases for selecting an automation solution were not successful. In the fourth case, the project team - let's call it team D - made a successful decision at the end of the selection process. The following figure shows the result of the selection process.

DWH Automation & Expectations - Case Studies - Case 4

What did they do differently? Team D accurately defined and described the requirements and goals, criteria and features they needed to make a decision on an automation solution.

Categories such as metadata, security, data quality were defined and described by team D. All detailed criteria in these categories were weighted according to the requirements.

The result was that team D was able to select the best automation solution for them through the simple visualization. It did not perform best in every category, but overall it was the best result. This allowed team D to objectively select an automation solution. In terms of the requirements and the objectives that team D had, they made a good decision. One of the key success factors was that team D carefully defined all the categories and nearly 200 criteria in advance of the selection process and stuck to them. The subsequent PoC led to success.
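The weighting approach described here can be sketched in a few lines. The following is only an illustration of weighted scoring in general, not team D's actual evaluation sheet: the category names come from the text above, but the weights, the ratings (on an assumed scale of 1 to 5) and the tool names are made up.

```python
def weighted_score(weights, ratings):
    """Return the weighted average rating across all weighted criteria."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"no rating for criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Made-up weights per category (higher = more important to the project).
WEIGHTS = {"metadata": 5, "security": 3, "data_quality": 2}

# Made-up ratings per tool on an assumed scale from 1 (poor) to 5 (very good).
RATINGS = {
    "tool_a": {"metadata": 4, "security": 2, "data_quality": 5},
    "tool_b": {"metadata": 3, "security": 5, "data_quality": 4},
}

if __name__ == "__main__":
    scores = {tool: weighted_score(WEIGHTS, r) for tool, r in RATINGS.items()}
    for tool, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
        print(f"{tool}: {score:.2f}")
```

In this made-up example tool_b comes out ahead overall even though tool_a is rated higher for metadata and data quality, which mirrors the point that the selected solution does not have to win every category. The important part is that weights are fixed before the vendors are rated and are not adjusted afterwards.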

I would be very interested in your experiences on this topic. Write to me here in the comments or contact me directly.

Be sure to check back.

Until then
Your Dirk
