Author: Stephan M. Liozu, PhD
Garbage in, garbage out! Before you start working on data analytics, there are a variety of preparation steps that are critical to make sure you are heading in the right direction. First, you have to locate all the data sources in your organization. You will be surprised how much data is already available before you even think about adding more. Then you have to prepare your work by mapping the data points and starting the extraction process. Finally, you can think about connecting and leveraging all of these data sets. Remember that you have to learn to walk before you can run. These preparatory steps are essential for your success in data analytics. The more you invest up front, the more you will benefit from your data.
The Pricing Advisor, February 2018
Why Look for Your Data & Where Do You Find It?
We are surrounded by data. In fact, we have data everywhere in our personal and professional lives. We might not realize it, but the more connected we are, the more data we generate. While many profess that “any press is good press,” the recent “big data” trend has, in a way, taken the data phenomenon hostage.
Every other word on Twitter mentions “big” something. Big data has become a management fad in much the same way as Total Quality Management (TQM), Six Sigma and other jazzy concepts that propagate like wildfire. Various online sources define big data as “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization.” The key word here is big, I suppose. But everything is relative.
Lately, almost everything related to data analytics has been linked to big data, whether relevant or not. For small and medium businesses, it really does not matter whether your data is small, medium or extra-large. What matters is that you find the data, connect and integrate it, analyze it, and use it to create value for your business. The latter part of the equation is the most important. Much needs to be done to prove the positive impact of big data on the profit performance of firms. Although the concept has taken the world of analytics by storm, there is a need for more evidence-based research to demonstrate its profit power.
The power of data analytics is very real. Whether your data sets are big or small, here are some recommendations to get started managing your organization’s data:
- Make a commitment to start on the journey of data management: The first step of this journey is to have a discussion within your management team and gain a commitment to explore the value of a data management project(s). Do not pay too much attention to the hype about big data. Instead, focus on improving productivity in one area of your business by leveraging the mountain of information and data coming into your organization. Define a value statement of why you want to start on that journey. In other words, define how data management will create value for you and your customers.
- Do not worry about “big” – focus on the data: The key to getting started is to find all sources of data in your organization and to try to create a data map of where data is produced, stored and what it’s used for. You will be surprised by how much data your small or medium-sized business can generate in any given year. Between invoice transactions and product data alone, you probably store millions or even billions of records already across many locations. That sounds big to me.
- Leverage the integrated power of all your data … big or small: Once you’ve identified and extracted all this data, you might be ready to run some analytics. The first step is to analyze each data set on its own. The power of big data, as it’s defined above, is then to connect the data sets and create predictive and explanatory models that can help you make better decisions. That is where the rubber meets the road, as the saying goes. This is also where it might get complicated. Integrated data analytics might, for example, link your marketing data with your sales transactions and supply chain data to derive a powerful forecasting model. The value is in connecting all of the data sets and running predictive models.
- Blend intuition and science in your decision-making process: The goal is to reduce the level of uncertainty in your decision-making process, and to refine the range within which you might make these decisions. Intuition and experience still play a role when final decisions are made. However, you want to reverse the balance of 80% intuition and 20% science to a ratio of 80% science and 20% intuition.
- Walk before you run: Depending on where you are today in the analytics spectrum (from nowhere to embedded analytics), I recommend that you start slowly. Experiment and walk before you run. Before you pick up the phone and call the big gun consulting companies, establish some basic internal capabilities and run an audit of your infrastructure. If you are using IT systems from the 80s or 90s, you might have issues extracting the data and linking it all today, but the effort required is almost certainly worthwhile. Then, start working on your individual data sets and extracting the nuggets of predictive power.
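As a minimal sketch of what "connecting the data sets" can look like in practice, the Python example below links a marketing extract to a sales extract through a shared key. The field names, the monthly key, the values and the connect_by_key helper are all hypothetical and chosen only for illustration; a real project would more likely do this inside a database or an analytics tool.

```python
# Illustrative records only: field names and values are hypothetical.
marketing = [
    {"month": "2018-01", "campaign_spend": 1000.0},
    {"month": "2018-02", "campaign_spend": 1500.0},
    {"month": "2018-03", "campaign_spend": 500.0},
]
sales = [
    {"month": "2018-01", "revenue": 12000.0},
    {"month": "2018-02", "revenue": 15000.0},
    {"month": "2018-04", "revenue": 9000.0},
]


def connect_by_key(left, right, key):
    """Join two lists of records on a shared key.

    Returns the joined rows plus the key values that appear in only one
    of the two sources, so gaps are visible instead of silently dropped.
    """
    right_index = {row[key]: row for row in right}
    left_keys = {row[key] for row in left}
    joined = [
        {**row, **right_index[row[key]]}
        for row in left
        if row[key] in right_index
    ]
    unmatched = sorted(left_keys ^ set(right_index))
    return joined, unmatched


joined, unmatched = connect_by_key(marketing, sales, "month")
for row in joined:
    print(row["month"], row["campaign_spend"], row["revenue"])
print("months present in only one source:", unmatched)
```

Reporting the unmatched keys alongside the joined rows is a deliberate choice here: months that exist in only one system are often the first sign of the data quality and integration issues that surface during extraction.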
There are four Vs in big data: volume, veracity, variety and velocity. To those, I add the most important of the Vs: value. Whether you are a small business or a huge corporation, leveraging your own data intelligently will create value for your business. This is the fundamental start of your data journey.
Preparing and Extracting Your Data: Team Work and Realization of Issues
From a practitioner’s perspective, barriers to identifying and preparing your data start very early in the data preparation and extraction steps. The process poses both technical and behavioral challenges. It also requires thinking, team coordination and organizational commitment. There are critical early parameters of data preparation and extraction that must be conceptualized to ensure successful data analytics later in the process.
Some of the potential barriers to data preparation and extraction are presented in the figure below:
Barriers related to data quality and systems complexity are to be expected. For example, daily manipulation of order entry, manual modifications in master data files and manual accounting entries to address incorrect transactions will create issues with the overall reliability of data. The data may be incomplete, unstructured and inconsistent.
Issues with systems are equally problematic. The use of multiple ERP systems, large-scale upgrade projects and the dependence on outdated systems to extract historical data can lead to breakdowns in data integration and consolidation. These types of issues are to be expected in any pricing analytics and optimization project.
Other barriers to data preparation and extraction are more organizational and behavioral in nature. First, it is critical to obtain commitment to program goals from the internal department involved in the project. That requires a detailed explanation of the project scope as well as a clearly articulated business justification. Explaining the data management program within the context of the overall corporate vision is a critical step toward project buy-in. Finally, the required data must be well defined prior to the start of the project, both in terms of type and quantity, in order to eliminate the false starts and multiple extractions that frustrate busy technicians.
Other potential barriers to project success are issues related to data cleaning, preparation and extraction. Many organizations have never been faced with a single project requiring data normalization and extraction across multiple, disparate systems. Successful organizations have learned that an investment in proper documentation and training, including demonstrations, prior to a project’s start reduces stress and errors, and greatly increases the quality of the resulting analysis while reducing the time and expense required to achieve it.
We recommend the following simple steps:
- Create a multi-functional team for data preparation and extraction: Conduct a kick-off meeting and explain the vision, the purpose and a clear scope of analysis. Create a common vision for the project, and reassure the team from the very beginning that data will be secured and treated with a high level of confidentiality.
- Conduct a data audit to evaluate potential technical barriers and issues related to data quality and systems: Map out where data might come from, along with possible interface issues. Link the project purpose to the project outcome, and create a road map on how to get the best and cleanest data.
- Select the proper technical experts to address and treat all possible discrepancies through the integration process: Do not improvise on the manipulation and treatment of data as this might extend the project schedule, and, in the long run, create more problems.
- Involve the team by creating a taskforce to support the project: Create transparency around issues and solutions without finger pointing or breaking the “data kingdom.” Remember that people still believe in the axiom that information is power. Having everyone on board with a top executive champion might be the most powerful combination to ensure project support.
- Get it right the first time: Garbage in, garbage out! Avoid multiple iterations of extraction and data file versions that might confuse the team. The point is to keep the project simple but highly structured throughout its lifetime. Make proper use of consultants for this step in the overall pricing analytics process. Spending more time preparing and extracting the right data file will make the back end of the process faster and more robust.
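The data audit recommended above can start very small. The sketch below is one possible first pass, assuming a hypothetical invoice extract: it counts rows, duplicate identifiers, blank fields and values that should be numeric but are not. The record layout, the audit and is_number helpers, and the sample values are illustrative assumptions, not a prescribed method.

```python
from collections import Counter

# Hypothetical extract of invoice records; values are illustrative only.
invoices = [
    {"invoice_id": "A1", "customer": "ACME", "amount": "100.00"},
    {"invoice_id": "A2", "customer": "", "amount": "250.50"},
    {"invoice_id": "A2", "customer": "Globex", "amount": "250.50"},
    {"invoice_id": "A3", "customer": "Initech", "amount": "n/a"},
]


def is_number(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def audit(records, id_field, numeric_fields):
    """Summarize basic quality issues in a list of records."""
    id_counts = Counter(r[id_field] for r in records)
    return {
        "rows": len(records),
        "duplicate_ids": sorted(k for k, n in id_counts.items() if n > 1),
        "rows_with_blank_fields": sum(
            1 for r in records if any(str(v).strip() == "" for v in r.values())
        ),
        "non_numeric_values": sum(
            1 for r in records for f in numeric_fields if not is_number(r[f])
        ),
    }


report = audit(invoices, "invoice_id", ["amount"])
for check, result in report.items():
    print(check, result)
```

Running a summary like this before the main extraction gives the team a shared, factual list of issues to discuss, which fits the goal of avoiding repeated extractions and multiple file versions.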
For most of you, the thoughts in this paper might seem obvious and too simplistic. From a practitioner’s perspective, it is not as easy as you might think. Organizational complexity with people and systems can, and often does, slow down the process. What matters is a good vision for what data is being prepared and a mindful process that takes into account these potential issues. The results from such a project are well worth taking your time and getting it right the first time.
Connecting, Mining, and Leveraging Your Data
Your data is identified, prepared and extracted. While these are critical and challenging steps of the process, you have not yet moved into true data management. Your organization and team are ready to start connecting, mining, and leveraging the data to create intelligence.
Data intelligence is defined by Techopedia as:
“Data intelligence is the analysis of various forms of data in such a way that it can be used by companies to expand their services or investments. Data intelligence can also refer to companies’ use of internal data to analyze their own operations to make better decisions into the future. Business performance, data mining, online analytics, and event processing are all types of data that companies gather and use for data intelligence purposes.”
Data by itself has very little intelligence. You have to start mining it by creating an environment and a place where data can be hosted and connected through relevant languages and relationships.
Here are eight dimensions to consider when you start making your data speak:
- Data warehousing: This is an essential consideration of your data management process. Where will you assemble all the relevant data and how will you create a dynamic environment to integrate new data points in the future?
- Data Mapping: Maps are essential to structure your underlying databases and to establish critical data hierarchies. These maps will reflect your overall data architecture.
- Data Relationships: Data sets or blocks will be connected based on specific relationships. As with relational databases, multidimensional datasets will evolve in many ways and at multiple points in time. Data relationships will make the whole data dynamically connected and up-to-date.
- Assumptions: As you start leveraging your data through mining, you will have to keep a log of your assumptions to make sure they are constantly documented in case you are asked to justify your findings or discoveries.
- Hypotheses: Equally important for your data mining exercises is to establish relevant hypotheses to be tested in the future. Both assumptions and hypotheses are critical elements of data intelligence language. Your data is going to start talking back to you and sending you messages that need to be contextualized based on assumptions and hypotheses.
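To make the log of assumptions and hypotheses concrete, here is a minimal sketch of how such a log might be kept in Python. The LogEntry and AnalysisLog classes, their fields and the sample statements are hypothetical; a shared spreadsheet or a table in your data warehouse could serve the same purpose.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LogEntry:
    kind: str              # "assumption" or "hypothesis"
    statement: str
    owner: str
    recorded_on: date
    status: str = "open"   # for example "open", "confirmed" or "rejected"


@dataclass
class AnalysisLog:
    entries: list = field(default_factory=list)

    def record(self, kind, statement, owner, recorded_on):
        """Add an entry so that findings can be justified later."""
        if kind not in ("assumption", "hypothesis"):
            raise ValueError("kind must be 'assumption' or 'hypothesis'")
        entry = LogEntry(kind, statement, owner, recorded_on)
        self.entries.append(entry)
        return entry

    def open_items(self, kind):
        """Return entries of the given kind that are still unresolved."""
        return [e for e in self.entries if e.kind == kind and e.status == "open"]


# Sample entries are invented for illustration.
log = AnalysisLog()
log.record("assumption", "Returns are excluded from invoice totals",
           "pricing team", date(2018, 2, 1))
tested = log.record("hypothesis", "Discount depth varies by region",
                    "pricing team", date(2018, 2, 1))
tested.status = "confirmed"

print(len(log.open_items("assumption")), len(log.open_items("hypothesis")))
```

Recording an owner and a date with each entry is what lets the team trace a finding back to the assumption behind it when asked to justify a conclusion.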
The team experts in charge of your data management process have to work on all eight of these dimensions very carefully. All eight have to fit well in the overall project vision and have to be driven by the desire to create value for the business. Proper documentation and discussions are required to avoid some of the most common pitfalls of holistic data management: generous connections, over-simplifications, inappropriate interpretations, hidden agendas and over-stated conclusions. You cannot cut corners or take shortcuts. Business leaders have to remain involved in the process at this stage to encourage teams, to support the need for transparency and clarity, and to make sure that all that work is done to make better decisions and create value. Too often, these data management projects are “taken hostage” by technical people who lose focus on value. Leveraging your data means creating a process to make better business decisions and to generate greater performance. Never lose sight of that essential goal.
Conclusions
Data, whether small, medium or big, is available in your organization. Do not rush to acquire more of it. Build a strong analytical foundation based on what you have and start mining. Spend a lot of time mapping your current pockets of data, preparing it for analysis, and connecting data sets. It is worthwhile delaying the start of your analytics in order to prepare, clean, organize and connect your data sets. Even better, you can design a data architecture that resides in one centralized data warehouse. There is no magic trick in the area of data preparation. There is a need to be strategic, to be disciplined, and to be patient until you reach the “sweet spot.”