Data-driven work promises a great deal, and at the intersection of big data and AI, all kinds of opportunities for business optimization are currently emerging. If your organization is able to make good use of these opportunities, it can gain a significant strategic advantage over the competition.
At the same time, many companies still find it difficult to actually capitalize on all these opportunities. Although they do have access to large amounts of data, they lack a central and versatile platform for collecting and analyzing this data – as the basis for creating more insight and making better decisions.
A modern data platform (MDP) solves this problem: it makes it possible to truly put data at the center of everything and derive concrete value from it. The self-service platform serves all audiences, with tools for both data specialists – such as data engineers, data scientists and data analysts – and business users.
The MDP has seven characteristics that together form a solid foundation for an insight-driven business. In this article, we explain what these characteristics are and why they are an asset for ambitious organizations that want to get more out of their data.
#1 A Key Enabler for Data-Driven Work
Data-driven work requires a genuine culture change, in which teams ‘learn’ to examine and validate their assumptions based on data. Data-driven work is therefore no longer the responsibility of a separate data department alone. In fact, to achieve maximum added value, it is important that data is freely available.
Democratization of Data
To make data truly work for your organization, it is important that everyone in the business can access it and work with it in an easy way. This democratization of data is a fundamental condition for data-driven success. The MDP therefore offers numerous user-friendly tools that make it easier to access data sources and run analyses on them.
But the MDP also offers advantages for more specialized work. The platform makes data more accessible and ensures that new data sources can be hooked up quickly, so data engineers can get more done in less time. At the same time, much of the tooling within the MDP consists of managed cloud services, which lowers the operational burden. For data scientists, the platform scales with current needs: as they run heavier models, the platform grows with them.
Combining Data Sources
In addition, the platform offers the possibility of combining different data sources. Within larger companies in particular, data is usually fragmented across many different systems. Take a retail chain, for example: it has data from the cash register systems in the stores and from online purchases in the webshop (and possibly also from sales via third-party online marketplaces). Ideally, you want to combine all this data so that you can build advanced models and create more insights.
For a long time, combining all this data was a considerable technical challenge, but within the MDP it becomes relatively easy. The benefits are substantial: you can better analyze and even predict what customers want, so that you can, for example, make them targeted offers.
Moreover, as a business user – thanks to the built-in self-service capabilities – you no longer have to wait for the data team for this kind of analysis, says Diederik Greveling, CTO of GDD Solutions.
“You can get started yourself right away. This also creates more room to run experiments on data quickly. Ultimately, this leads to a better workflow which allows you to deliver value faster throughout the organization – from Marketing and Sales to Finance and HR.”
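The retail example above can be sketched in a few lines. This is a minimal illustration, not the MDP's actual tooling: the sample records, the field names and the helper functions `unify` and `spend_per_customer` are all hypothetical. It shows the core idea of mapping each source onto one shared shape before analyzing across channels.

```python
from collections import defaultdict

# Hypothetical sample records; the field names differ per source on purpose.
pos_sales = [
    {"store": "Utrecht", "customer": "C1", "total_eur": 25.0},
    {"store": "Utrecht", "customer": "C2", "total_eur": 10.0},
]
webshop_orders = [
    {"customer_id": "C1", "amount": 40.0},
]
marketplace_orders = [
    {"buyer": "C2", "price": 15.5},
    {"buyer": "C3", "price": 8.0},
]


def unify(records, channel, customer_key, amount_key):
    """Map source-specific records onto one shared shape."""
    return [
        {"channel": channel, "customer": r[customer_key], "amount": r[amount_key]}
        for r in records
    ]


def spend_per_customer(sales):
    """Total spend per customer across all channels."""
    totals = defaultdict(float)
    for sale in sales:
        totals[sale["customer"]] += sale["amount"]
    return dict(totals)


all_sales = (
    unify(pos_sales, "store", "customer", "total_eur")
    + unify(webshop_orders, "webshop", "customer_id", "amount")
    + unify(marketplace_orders, "marketplace", "buyer", "price")
)
totals = spend_per_customer(all_sales)
```

Once every channel shares the same shape, a question such as "what does each customer spend in total?" no longer depends on which system recorded the sale.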
#2 The MDP as ‘Single Source of Truth’
As organizations grow, data often becomes increasingly fragmented across different systems and databases. There is often no single central point for storing data and information, which makes it extremely difficult to run accurate analyses.
Because the MDP – as we have seen above – collects all data in one place, it creates a single source of truth (SSOT): a central source, with data accessible to everyone, as a solid basis for calculating KPIs and for making well-founded decisions. In this way, the MDP prevents teams from working with different data and structurally working at cross-purposes.
Greater Confidence in Data
An SSOT promotes trust in data, outlines CTO Niels Zeilemaker of GoDataDriven.
“If data leads to different interpretations of the same KPI, it can undermine trust in this underlying data – and therefore trust in the whole data-driven way of working. You can prevent this by ensuring that everyone is using the same source through the MDP.”
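To make the KPI problem concrete, here is a small sketch under invented assumptions: the order records, the `status` field and the rule that only paid orders count as revenue are hypothetical and only serve to show how two home-grown calculations can diverge, while one shared definition cannot.

```python
# Hypothetical order records held in the shared platform.
ORDERS = [
    {"id": 1, "amount": 100.0, "status": "paid"},
    {"id": 2, "amount": 50.0, "status": "refunded"},
    {"id": 3, "amount": 75.0, "status": "paid"},
]


def net_revenue(orders):
    """The single agreed definition: only paid orders count as revenue."""
    return sum(o["amount"] for o in orders if o["status"] == "paid")


# A team that rolls its own calculation may silently include refunds...
marketing_figure = sum(o["amount"] for o in ORDERS)

# ...whereas every team calling net_revenue() gets the same number.
finance_figure = net_revenue(ORDERS)
```

Here the two figures differ by exactly the refunded order, which is the kind of discrepancy that undermines trust in a dashboard.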
#3 Platform for Both Structured and Unstructured Data
Information from applications, systems, sensors and a host of other sources: the average organization produces enormous amounts of data. In part, this will be structured data, which conforms to a predefined data model and is stored in a fixed table format. This makes it relatively easy to analyze and search, and includes data, for example, from Excel, an SQL database or from ERP and CRM systems.
In addition, organizations often have large amounts of unstructured data. This data does not fit into a spreadsheet or database or is hidden in a text field (and is therefore often much more difficult to analyze, customize and search), but it can provide valuable insights. This might include data from social media (how do your customers ‘talk’ about you?) or internal company data, such as Word documents and emails.
From ETL to ELT
The MDP can handle both structured and unstructured data. The order in which this data is processed is traditionally Extract – Transform – Load (ETL): data is first extracted from a specific source, then transformed into a file format that fits into a table or database, and finally loaded into a data warehouse.
The MDP works a little differently, explains Niels Zeilemaker. “The MDP does not work along the lines of a traditional ETL process, but according to ELT: after the data has been retrieved, it is first loaded into the platform and only then – at the moment it is needed – is it transformed into a format that can be used for that specific project. In other words, the emphasis is on easy access to and use of the data, not on simply putting it into a certain format so that it fits in your database. At the end of the day, this saves a lot of time.”
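The difference between ETL and ELT can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for the platform's storage, and the table name `raw_orders`, the event fields and the function `revenue_per_channel` are assumptions. The raw payloads are loaded untouched, and the transformation happens only when a specific question is asked.

```python
import json
import sqlite3

# Hypothetical raw events as they might arrive from a source system.
raw_events = [
    '{"order_id": 1, "channel": "webshop", "amount": "19.99"}',
    '{"order_id": 2, "channel": "store", "amount": "5.00"}',
    '{"order_id": 3, "channel": "webshop", "amount": "10.01"}',
]

conn = sqlite3.connect(":memory:")

# Extract + Load: store the payloads untouched, without forcing a schema.
conn.execute("CREATE TABLE raw_orders (payload TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?)", [(e,) for e in raw_events])


def revenue_per_channel(connection):
    """Transform: parse and aggregate only when a project needs the data."""
    totals = {}
    for (payload,) in connection.execute("SELECT payload FROM raw_orders"):
        event = json.loads(payload)
        channel = event["channel"]
        totals[channel] = totals.get(channel, 0.0) + float(event["amount"])
    return {channel: round(total, 2) for channel, total in totals.items()}


revenue = revenue_per_channel(conn)
```

Because the raw payloads stay available, a later project can apply a different transformation to the same data without a new extraction.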
#4 Cleaning Up, Aggregating, and Combining Data
To make the most of all the data present within an organization, it is important that it can be easily cleaned up, aggregated and combined.
Whereas structured data is stored in a data warehouse, large amounts of raw (unstructured) data are collected in a data lake. From this ‘reservoir’, all raw data is then converted into one file format, according to uniform definitions. Subsequently, all related data is combined into one table.
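As a rough sketch of what cleaning up, aggregating and combining can look like, consider the following. The raw records, the two date formats and the helpers `parse_date`, `clean` and `aggregate` are hypothetical; real pipelines handle far more variation.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent formats, as found in a data lake.
raw_records = [
    {"Customer": " Alice ", "date": "2024-01-05", "amount": "12,50"},
    {"customer": "alice", "date": "05/01/2024", "amount": "7.50"},
    {"customer": "Bob", "date": "2024-01-06", "amount": "3.00"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed formats for this sketch


def parse_date(text):
    """Try each known format and return an ISO date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def clean(record):
    """Clean up one record: uniform keys, names, dates and decimal amounts."""
    lowered = {key.lower(): value for key, value in record.items()}
    return {
        "customer": lowered["customer"].strip().lower(),
        "date": parse_date(lowered["date"]),
        "amount": float(lowered["amount"].replace(",", ".")),
    }


def aggregate(records):
    """Combine cleaned records into one table keyed by customer and date."""
    table = {}
    for rec in map(clean, records):
        key = (rec["customer"], rec["date"])
        table[key] = table.get(key, 0.0) + rec["amount"]
    return table


daily_totals = aggregate(raw_records)
```

The first two records describe the same customer on the same day in different notations; only after cleaning do they end up in the same row.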
Change Data Capture
From a few hundred to as many as ten thousand tables: the larger the organization, the more tables are usually in circulation. How often data is loaded into the platform depends on the type of table. For example, customer addresses will be loaded at most once a day, while your online sales data (an important control indicator on your dashboard) will be updated continuously.
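A per-table load cadence could be expressed roughly like this. The table names, the intervals and the functions `is_due` and `tables_to_load` are assumptions for illustration; the one-minute interval merely approximates continuous updates.

```python
from datetime import datetime, timedelta

# Assumed cadences for illustration; real values depend on the use case.
LOAD_INTERVALS = {
    "customer_addresses": timedelta(days=1),  # at most once a day
    "online_sales": timedelta(minutes=1),     # near-continuous updates
}


def is_due(table, last_loaded, now):
    """Return True when the table's load interval has elapsed."""
    return now - last_loaded >= LOAD_INTERVALS[table]


def tables_to_load(last_loaded_at, now):
    """List the tables that should be refreshed at this moment."""
    return sorted(
        table for table, loaded in last_loaded_at.items()
        if is_due(table, loaded, now)
    )


now = datetime(2024, 1, 2, 12, 0)
last_loaded_at = {
    "customer_addresses": datetime(2024, 1, 2, 3, 0),  # 9 hours ago
    "online_sales": datetime(2024, 1, 2, 11, 58),      # 2 minutes ago
}
due = tables_to_load(last_loaded_at, now)
```

In this example only the sales table is due for a refresh, because the address table was already loaded earlier the same day.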
#5 The MDP Is Scalable, Secure, Compliant and Cost-Effective
When it comes to scalability, security, compliance, and cost-effectiveness, a modern data platform truly provides a best-of-breed solution.
SaaS: Maximum Scalability
Managed cloud services and the associated pay-per-use model have rapidly become commonplace over the past decade. The MDP is largely built around these same scalable SaaS applications: no matter how much data you process as an organization, the platform scales up or down automatically.
Secure and Compliant
In addition, the MDP is secure and compliant by design: data access is logged and tracked, and exposure to outside access is limited, in such a way that it does not hinder accessibility. The platform is also compliant with best practices and follows all cloud security recommendations.
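The idea that data access is logged and tracked can be illustrated with a small sketch. It does not describe how the MDP implements this: the `PERMISSIONS` mapping, the in-memory `AUDIT_LOG` and the `audited` decorator are hypothetical stand-ins for a real access-control and audit service.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # in a real platform this would be an append-only store

# Hypothetical permissions: which user may read which dataset.
PERMISSIONS = {"alice": {"sales"}, "bob": {"sales", "hr"}}


def audited(func):
    """Record every access attempt, allowed or not, before running func."""
    @functools.wraps(func)
    def wrapper(user, dataset):
        allowed = dataset in PERMISSIONS.get(user, set())
        AUDIT_LOG.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} may not read {dataset}")
        return func(user, dataset)
    return wrapper


@audited
def read_dataset(user, dataset):
    """Stand-in for the actual data read."""
    return f"rows from {dataset}"


result = read_dataset("bob", "hr")
try:
    read_dataset("alice", "hr")
except PermissionError:
    pass  # the denied attempt is still in the audit log
```

Logging the attempt before the permission check means denied requests leave a trace as well, which is usually what an auditor wants to see.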
Finally, the MDP provides cost savings. The SaaS nature of the platform, with extensive cloud-native tooling, allows for an unprecedented acceleration in implementation: where building and configuring a data platform from scratch used to take about four to six months, the roll-out of the MDP is typically limited to a few hours, resulting in considerable cost savings. Only after roll-out is a choice made for each use case: what data do you need to arrive at a particular dashboard or data science model? And which data sources are and are not loaded onto the platform? This approach also keeps the way of working cost-effective.
#6 Suitable for Reporting as Well as Data Products
Many organizations that want to move toward a data-driven way of working start by creating reports and translating data into insightful dashboards. The business uses these dashboards, which are updated in real time, to gain more insight into the current state of affairs in the operation and to make targeted adjustments based on these insights.
Data Science: Predictive Analytics
The MDP supports this type of straightforward data analysis, but can also be used for more complex, data science-like applications and models. For example, the MDP enables predictive analytics, in which customer behavior, for instance, can be predicted ever more accurately. This allows a webshop to use targeted ads and recommend products that may also be of interest to customers.
But other data-driven applications will then also come within reach, outlines Diederik Greveling.
“Such as predictive maintenance, in which devices and machines are equipped with advanced sensor technology and can thus ‘self-report’ when a certain component needs to be replaced. Companies are also increasingly using data analytics to arrive at an optimal pricing strategy through continuous analysis of movements in the market.”
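The webshop recommendations mentioned earlier in this section can be approximated with a very simple heuristic. Note that this is a co-occurrence count, not a trained predictive model, and that the baskets and the functions `co_occurrences` and `recommend` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order baskets from a webshop.
baskets = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"mouse", "mouse pad"},
    {"laptop", "laptop bag"},
]


def co_occurrences(orders):
    """Count how often each ordered pair of products shares a basket."""
    counts = Counter()
    for basket in orders:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(product, orders, top_n=2):
    """Suggest the products most often bought together with `product`."""
    scores = Counter({
        other: n for (item, other), n in co_occurrences(orders).items()
        if item == product
    })
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


suggestions = recommend("laptop", baskets)
```

A production recommender would typically weigh recency, popularity and customer history, but the principle of learning from combined purchase data is the same.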
#7 Data Governance Enabled
Garbage in is garbage out. As data becomes more important to organizations, the importance of proper data management grows as well. After all, without an appropriate underlying technical infrastructure, the right tooling, and processes for data management, data quality quickly deteriorates. In addition, there are more and more laws and regulations regarding privacy that you, as an organization, will have to consider, such as the right to be forgotten (the right for EU citizens to have outdated or inaccurate privacy-sensitive information removed by processors of personal data).
The modern data platform supports organizations in shaping their policies around data governance, says Niels Zeilemaker. “A data governance layer on top of the platform allows users to better manage their data and ensures that, as an organization, you are in maximum control in terms of data. The platform also includes functionality for data observability: automated tools that allow you to quickly detect breaks in trends within your data sets – which may indicate an anomaly.”
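Detecting a break in a trend can be sketched with a rolling z-score. This is one common technique chosen for illustration, not necessarily what the platform's observability tooling uses; the function `find_anomalies`, the window size, the threshold and the row counts are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Flag indexes whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # A perfectly flat history: any change at all is a deviation.
            deviates = values[i] != mu
        else:
            deviates = abs(values[i] - mu) / sigma > threshold
        if deviates:
            anomalies.append(i)
    return anomalies


# Hypothetical daily row counts for one table; the value at index 7 drops sharply.
daily_rows = [1000, 1010, 990, 1005, 995, 1002, 998, 200, 1001, 1003]
flagged = find_anomalies(daily_rows)
```

A sudden drop in loaded rows like this often points to a failed upstream export rather than a real change in the business, which is why it is worth flagging automatically.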
The MDP: Suitable for Virtually Any SME
GoDataDriven’s modern data platform is designed with the seven characteristics above in mind, Niels Zeilemaker emphasizes.
“In this platform, we have brought together a unique combination of tools that we think organizations with serious data ambitions can manage: for example, SMEs+, scale-ups and corporates.”
The MDP has been designed based on GoDataDriven’s extensive experience within the consulting domain, adds Diederik Greveling.
“We know what works and what doesn’t work for organizations that are embarking on data-driven work. The MDP satisfies 90 percent of the use cases a customer has – and for the other 10 percent that don’t fall into this category, it’s always possible to have a custom solution developed.”