We live in an age where systems and applications evolve at a rapid pace, and technologies move from vogue to commonplace to obsolete in less than a decade. In such a dynamic and volatile world, businesses can maintain their competitive edge by ensuring that they stay relevant to the market. A good yardstick of their competitiveness is the ability to analyze the data they capture, directly or indirectly. The sheer breadth of problems that analytics has already solved, and promises to solve in the future, has driven organizations to invest increasingly in data and analytics initiatives.
Most of the time, data and analytics is discussed only at the surface level, in the form of great visualizations that meet business needs, and attributes such as self-service and ease of use are often associated with it. While those are essential for any analytics initiative to sustain itself and gain traction within business organizations, what sometimes gets less weight, or is ignored altogether, is the underlying data fabric that supports such visualizations. The data fabric refers to activities such as data extraction, data discovery, data transformation, data definition, data modelling, and data management for both structured and unstructured data. Hence, it is important to ensure that the overall data systems architecture is well defined, tailored to specific needs, and follows best practices and guidelines for data management.
This article aims to guide data leaders and technical experts in laying down an effective, future-ready, and foolproof architecture by setting out the key questions and parameters to consider before they start designing it. From my experience of designing and implementing architectures, the most important consideration is the business objective. A well-defined business objective lays the foundation for a strong and sustainable analytics architecture. However, there are several parameters that one needs to keep in mind when identifying the business objective.
Often, businesses fail to outline foundational business objectives when thinking about a new analytics set-up. Some of the most important considerations are listed below:
- The kind of insights sought from the solution: descriptive, diagnostic, predictive, or prescriptive.
- Acceptable latency between source and insights system: Does the business need real-time analytics?
- Existing analytics landscape of the organization.
- Implementation timeline.
- Budget allocated for implementation.
Clear answers to the above help the organization define an overall strategy. However, designing a good architecture requires careful deliberation on every component of the landscape, namely data acquisition and integration, data management, insights generation and consumption, and AI/ML adoption. The important factors to keep in mind for each of these components are listed in the subsequent sections.
Data Acquisition & Integration
- Identification of data sources and applications that would generate data, e.g. ERP/CRM/legacy apps, IoT/sensor data, social data, etc.
- Data formats from different sources.
- Data extraction and acquisition methods for different sources, e.g. pipeline- or API-based extraction, FTP-hosted external data, etc.
- Data volume and number of federated data sources.
- Frequency and mode of data refresh, i.e. full, incremental, or streaming.
- Security requirements.
Multiple tools and technologies are available to enable data acquisition and data integration; however, the technology should be selected on the basis of the overall business objectives.
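The parameters above can be captured in a small source inventory before any tool is chosen. The following is a minimal sketch, not a reference design: `DataSource`, `RefreshMode`, `select_rows`, and the `updated_at` field are hypothetical names introduced here for illustration, and the watermark comparison is one common way to implement an incremental refresh.

```python
from dataclasses import dataclass
from enum import Enum


class RefreshMode(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"
    STREAMING = "streaming"


@dataclass(frozen=True)
class DataSource:
    """One entry in a hypothetical source inventory."""
    name: str               # illustrative source name
    kind: str               # e.g. "ERP", "IoT", "social"
    data_format: str        # e.g. "csv", "json"
    extraction: str         # e.g. "api", "pipeline", "ftp"
    refresh: RefreshMode
    daily_volume_gb: float


def select_rows(source, rows, watermark):
    """Return the rows to load in one batch refresh cycle.

    FULL reloads everything; INCREMENTAL keeps only rows whose
    'updated_at' value (an assumed field) is later than the watermark
    recorded by the previous load.
    """
    if source.refresh is RefreshMode.FULL:
        return list(rows)
    if source.refresh is RefreshMode.INCREMENTAL:
        return [r for r in rows if r["updated_at"] > watermark]
    raise ValueError("streaming sources are not handled by batch selection")


crm = DataSource("crm_orders", "CRM", "json", "api", RefreshMode.INCREMENTAL, 2.5)
rows = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
changed = select_rows(crm, rows, watermark=10)
```

Here `changed` holds only the row with `id` 2, because the first row is not newer than the watermark. Recording the refresh mode per source makes it easier to see early on which sources need a streaming path and which can be served by scheduled batch loads.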
Data Management
- Need for data quality and master data management.
- Is there a data governance set-up in place?
- Data storage requirements, in terms of a data warehouse, data lake, or operational data store.
- Regulatory and compliance requirements.
- Data definition and cataloging requirements.
A robust data governance set-up is an important factor in the successful implementation of a data analytics platform. Assessing data quality in the source systems, defining strategies to improve it, and securing the data at the same time together create a strong foundation for data analytics.
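As an illustration of what assessing data quality in a source system can look like, the sketch below profiles two basic measures: completeness of required fields and duplicate keys. The function name `profile_quality` and the sample records are assumptions made for this example; a real assessment would cover many more dimensions, such as validity and timeliness.

```python
def profile_quality(records, key_field, required_fields):
    """Compute simple quality measures for a list of dict records.

    completeness: share of records with a non-empty value per field.
    duplicate_keys: number of records whose key repeats an earlier one.
    """
    total = len(records)
    if total == 0:
        return {"rows": 0, "completeness": {}, "duplicate_keys": 0}

    completeness = {}
    for field in required_fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        completeness[field] = filled / total

    keys = [r.get(key_field) for r in records]
    duplicate_keys = total - len(set(keys))

    return {
        "rows": total,
        "completeness": completeness,
        "duplicate_keys": duplicate_keys,
    }


# Hypothetical customer extract with one blank email, one missing
# email, and one repeated id.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": None},
]
report = profile_quality(customers, key_field="id", required_fields=["id", "email"])
```

For this sample, `report["completeness"]["email"]` is 0.5 and `report["duplicate_keys"]` is 1. Numbers like these give the governance team a baseline against which improvement strategies can be measured.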
Insights Generation and Consumption
- Method of consumption.
- Need for self-service analytics.
- Level of data literacy amongst business users.
- Need for embedded analytics.
- Data security framework to protect data across channels.
The success or failure of any data analytics initiative depends heavily on the effectiveness of the last mile of the process, i.e. data visualization. While designing the overall architecture, it is imperative to assess the existing set-up and how the business users are positioned against the factors listed above. A careful examination of analytics maturity within the organization, and of the key asks from the business team, will help establish multiple methods of data consumption for successful adoption of analytics.
AI/ML Adoption
AI/ML adoption is more relevant for organizations that have attained some maturity in their analytics journey and are ready to embrace advanced analytics. Before embarking on an AI/ML journey, or even pilots, it is recommended that businesses consider all the aspects listed below. Unless there is a clearly defined problem whose outcomes can be measured quantitatively, investment in ML initiatives is not recommended. The success of any such initiative is highly dependent on the following parameters.
- Identified business problems.
- Availability of datasets to build and train ML models.
- Availability of talent.
- Evaluation of custom model creation versus leveraging product features.
In the discussion above, we covered the various elements and considerations to weigh while defining an analytics architecture. For ready reference, we have summarized the parameters that drive the definitions needed to establish the architecture.