The Guardian recently came down heavily on Big Data and the harm it may do to democracy. Do you really wonder why? A lesser-known company, Cambridge Analytica, suddenly became one of the most talked-about companies, not only in the Big Data sphere but outside it as well. Although some observers disagree, Cambridge Analytica got the credit, or rather the blame, for hyper-personalized campaigns that helped bring about two shocking outcomes in democratic processes: Brexit and the US elections.
The CEO of Cambridge Analytica, Alexander Nix, has been very vocal when discussing the work his company does. He told Sky News: “Today in the United States, we have somewhere close to four or five thousand data points on every individual.” The suggestion is that, given a sufficiently powerful algorithm, one could steer the self-determination of an individual.
The Guardian went on to say:
“Our model of democracy is based on public campaigning followed by private voting. These developments threaten to turn this upside down, so that voting intentions are pretty much publicly known but the arguments that influence them are made in secret, concealed from the wider world where they might be contested.”
Is this the power of Big Data, or has it become too powerful to imagine?
When we look at the other side, towards our business and IT world, we are still struggling to put a Big Data hypothesis or proof of concept (POC) into production. Far too many Big Data projects fail.
So where is the gap? Why is it that only a few are able to maximize the benefits of Big Data to an unimaginable extent, while the majority fail at even a simple implementation?
Here is the answer: as analytics professionals, we approach Big Data analytics in the same way we used to approach BI projects. In reality, however, Big Data projects are starkly different from BI projects.
The two major gaps lie in:
- The expectation and
- The implementation
In many Big Data projects, we expect to hit a goldmine instantly, whereas in reality Big Data helps us find small pieces of gold which, when collected, can become a goldmine. In the democratic process, voters are those small pieces of gold: collected together, they yield the goldmine (Brexit/Trump).
The implementation and design approaches are also completely different. Big Data is challenging not only the old rules of the traditional BI world, but also its most basic ones.
A few examples are as follows:
- Do we really need the dimensional data model?
- Should we take the path of the logical model to the physical model as in Traditional BI, or rather pull in raw data and build logical views on top of that?
- Do we really need the long cycle or curves of MDM or should we only implement a cross-reference graph database?
- No central use case: analytics may have a shorter life expectancy, so build only logical views for each use case and dismantle them whenever needed
- Use metadata tagging to automate and manage data pipelines, managing metadata so that identification and tagging happen automatically
- Onboard a new data source in a week
- Use of DevOps for continuous integration and deployment
- Converging roles of data engineers, analysts and data scientists
- Do we really need sophisticated ETL tools, or will Spark alone suffice?
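To make the "raw data plus logical views" idea from the list above a little more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: SQLite (via Python's standard `sqlite3` module) stands in for whatever storage engine a real project would use, and the table name `raw_telematics`, the view name `speeding_events`, the sample rows and the helper functions are all hypothetical. The raw feed is stored untyped, a typed view is built for one use case, and the view is dismantled afterwards without touching the raw data.

```python
import sqlite3

# Hypothetical raw feed: every field lands as text, exactly as received.
RAW_ROWS = [
    ("car-1", "2017-03-01T08:00:00", "62.5"),
    ("car-1", "2017-03-01T08:01:00", "88.0"),
    ("car-2", "2017-03-01T08:00:00", "45.0"),
]


def load_raw(conn):
    """Land the raw data with no modelling: one table, all text columns."""
    conn.execute(
        "CREATE TABLE raw_telematics "
        "(vehicle_id TEXT, event_time TEXT, speed_kmh TEXT)"
    )
    conn.executemany("INSERT INTO raw_telematics VALUES (?, ?, ?)", RAW_ROWS)


def create_speeding_view(conn, limit_kmh=80.0):
    """Build a typed logical view for a single, short-lived use case."""
    # SQLite does not allow bound parameters inside a view definition,
    # so the threshold is embedded after being forced to a float.
    conn.execute(
        "CREATE VIEW speeding_events AS "
        "SELECT vehicle_id, event_time, CAST(speed_kmh AS REAL) AS speed_kmh "
        "FROM raw_telematics "
        f"WHERE CAST(speed_kmh AS REAL) > {float(limit_kmh)}"
    )


def drop_speeding_view(conn):
    """Dismantle the view once the use case is over; raw data is untouched."""
    conn.execute("DROP VIEW IF EXISTS speeding_events")


conn = sqlite3.connect(":memory:")
load_raw(conn)
create_speeding_view(conn)
speeding = conn.execute(
    "SELECT vehicle_id, speed_kmh FROM speeding_events"
).fetchall()
drop_speeding_view(conn)
raw_count = conn.execute("SELECT COUNT(*) FROM raw_telematics").fetchone()[0]
```

The point of the sketch is the lifecycle rather than the engine: typing and filtering live in the view, so a new use case means a new view, not a new physical model.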
In this new digital world, the need of the hour is two-fold – Analytics on Demand and Analytics Anywhere.
It means Analytics-as-a-Service has to be nimble but deep, and the Analytics ecosystem has to be more adaptive than ever before.
One of my most successful Big Data implementations was for telematics data: a large number of cars on the road were analysed across many telematics data points, along with driver behaviour, geographic location, traffic conditions and, more importantly, upcoming traffic conditions, in order to predict or even prevent road mishaps. The most important factor in its success was the ability of the design and implementation team to “unlearn” traditional BI ways and adapt to the new Big Data ways.
Change is constant. But Big Data, oh boy, you are asking us (analytics professionals) to change a lot. Nevertheless, what is happening in the BI world is not unusual, and we see similar changes in other parts of the IT world as well. In my opinion, the need of the hour is an off-the-shelf Big Data ecosystem that is free of the nuances of traditional BI implementation and is focused on utilizing the features and power of Big Data. This, I believe, will result in a higher conversion ratio for Big Data projects.