Big Data for Distributed Marketing

January 9, 2014 Anjan Upadya


It is practically a marketing proverb: “if you can’t measure it, you can’t manage it.” Peter Drucker had his own spin on the adage: “what is measured, improves.” These days it seems that if an article is not about social, it is about big data. The topic is usually discussed with shallow understanding and little practical application, but considered in the context of distributed marketing it provokes ideation: imagine all the unstructured data that flows between a brand, its local partners, and customers. What relationships, dependencies, or insights could be discovered at the intersection of several types of data, such as partner sales transactions, local search volume for a given keyword, tweets that include the brand name, partner marketing participation in a particular program, and online reviews?

This is a two-part article in which we will discuss (1) the basics of big data in the context of distributed marketing and (2) practical applications of big data for a distributed organization. I know not all of you have the bandwidth to take full advantage of the data that should inform marketing spend, but the problem can be addressed to varying degrees. There is something here for everybody. Let’s take big data, put it in context, and define what it is while laying the groundwork for the actionable insight that brands selling through local partners can use.

Big Data vs. Business Intelligence

Before we get into BI vs. big data, let’s not lose sight of the fact that the underlying purpose of investing in big data (for us here) is to inform marketing and to communicate better with customers in a personally relevant and timely manner over online and offline channels. To stay consistent, let’s use two definitions from a credible source (Gartner) to distinguish big data from business intelligence:

“Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.”


“Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”    

Although it is not mentioned in the definition, when talking about BI an organization is, more often than not, analyzing its own data coming in over controllable, structured channels like Google Analytics, whereas big data is an ambiguous multi-source mashup that must take into account the five Vs.

The 5 Vs That Define Big Data

Volume – Simply the large volume of data in aggregate: every email open, page view, web search, like, tweet, partner transaction, marketing program opt-in, etc.

Velocity – The speed at which data comes in and moves out, its shelf life, and its general rate of accumulation. You can imagine how much data your network generates, and some of its relevance is time sensitive.

Variety – All the types of data derived from the mediums mentioned in Volume and many more, both structured and unstructured.  

Viability – Does it actually have anything to tell you?

Value – Is the information worth what it costs you to produce? Luckily, big data tools are becoming friendlier to business users.

There is obviously some overlap between business intelligence and big data, but BI is defined by descriptive statistics that quantitatively show trends in a particular data set (the bounce rate on my product page), whereas big data finds relationships and dependencies across large and varied structured and unstructured data sets (a relationship between countrywide local search data, transaction data from my local partners, and traffic data from their mobile websites).
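To make the distinction concrete, here is a minimal Python sketch. Every number in it is invented for illustration, and the variable names are hypothetical. The first calculation is a BI-style descriptive statistic on a single data set; the second looks for a relationship across separate sources, using a simple Pearson correlation as a stand-in for the far richer analysis a real big data project would run.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# BI-style question: one metric from one structured data set.
# (Hypothetical session counts for a product page.)
product_page_sessions = 1200
single_page_sessions = 540
bounce_rate = single_page_sessions / product_page_sessions

# Big-data-style question: do separate sources move together?
# One entry per hypothetical local market, in the same order in each list.
local_search_volume = [320, 410, 150, 600, 275]
partner_sales = [41, 55, 20, 78, 30]
mobile_site_visits = [900, 1150, 400, 1700, 820]

search_vs_sales = pearson(local_search_volume, partner_sales)
visits_vs_sales = pearson(mobile_site_visits, partner_sales)

print(f"Bounce rate: {bounce_rate:.0%}")
print(f"Local search volume vs. partner sales: r = {search_vs_sales:.2f}")
print(f"Mobile site visits vs. partner sales: r = {visits_vs_sales:.2f}")
```

A strong correlation here would only be a prompt for further investigation, not proof that search volume drives sales.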

Structured vs. Unstructured Data

Structured data, regardless of size, always remains structured: dates, names, addresses, ages, and so on. All structured data can be governed by set rules because the data is uniform. Unstructured data often arrives in the catch-all format of a string: text that may be composed of dates, words, numbers, or hashtags, found in anything from tweets to internal communications to Yelp reviews. Companies large and small (you don’t have to be a Fortune 500 company to use big data) find ways to analyze structured and unstructured data sets from varied sources.
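The difference can be sketched in a few lines of Python. The record, the tweet, and the brand and partner names below are all made up for the example. The structured record can be read field by field, while the tweet has to be mined with pattern matching before it yields anything a rule can act on.

```python
import re

# A structured record: every field has a known name and type.
structured_sale = {"partner_id": 1042, "date": "2014-01-09", "amount": 249.99}

# Unstructured text: a free-form string (hypothetical tweet).
tweet = (
    "Loved the service at @AcmeDealer on 01/05/2014! "
    "5 stars #AcmeBrand #happycustomer"
)

def extract_features(text):
    """Pull a few structured fields out of free-form text."""
    return {
        "hashtags": re.findall(r"#(\w+)", text),
        "mentions": re.findall(r"@(\w+)", text),
        "dates": re.findall(r"\b\d{1,2}/\d{1,2}/\d{4}\b", text),
        "mentions_brand": "acmebrand" in text.lower(),
    }

features = extract_features(tweet)
print(structured_sale["amount"])  # direct lookup by field name
print(features)                   # fields recovered from free text
```

Regular expressions are only the simplest option; real projects often layer text analytics or sentiment scoring on top of this kind of extraction.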

Hadoop – The Open Source Big Data Backbone

Some 67% of marketers ranked “data-driven decisions” as a top three marketing challenge with respect to multi-channel optimization, according to Gleanster Research. Marketers shy away from big data at times because we assume that data must be put in order before it goes into a system, or else it won’t be useful. According to IBM, 80% of the world’s data is unstructured, and whether it is internal or external, most businesses don’t have the means to use it.

Hadoop is an open source software project that enables parallel processing over a distributed network of commodity servers. Imagine the computing power of 1000+ servers networked together, scaled up or down as the workload dictates. Any type of data can be absorbed from any number of sources and analyzed in innumerable ways; junk in is not necessarily junk out.
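For a feel of how that parallel processing works, here is a toy Python sketch of the map-and-reduce pattern that Hadoop's MapReduce engine is built around. It does not use Hadoop or any Hadoop API: it runs on one machine, threads stand in for servers, and the sample text is invented. The point is only that each chunk of data is processed independently and the partial results are combined afterward.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical records; each chunk stands in for data held on a different server.
chunks = [
    ["great service #AcmeBrand", "slow delivery"],
    ["great price", "great staff #AcmeBrand"],
]

def map_chunk(lines):
    """Map step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for line in lines for word in line.split()]

def reduce_pairs(mapped_chunks):
    """Shuffle and reduce: group the pairs by word and sum the counts."""
    totals = defaultdict(int)
    for pairs in mapped_chunks:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)

# Each chunk is mapped independently, so workers can be added as data grows.
with ThreadPoolExecutor(max_workers=2) as pool:
    mapped = list(pool.map(map_chunk, chunks))

word_counts = reduce_pairs(mapped)
print(word_counts)
```

Because no chunk depends on another during the map step, the same job can be spread across two workers or two thousand, which is what lets a cluster scale with the workload.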

Distributed Marketing Data Streams You Have

You can begin to imagine the structured and unstructured data streams that distributed marketing organizations produce. There are too many for a comprehensive list, but to give you an idea, the data streams could include:

- Sales Data
- List Growth
- Tweets mentioning the Brand
- Mobile Site Visits
- Corporate–Partner Communications
- Marketing Program Participation
- Partner–Corporate Content Syndication results
- On-Page Metrics

Hadoop makes it possible to work with the mounds of structured and unstructured data that could flow in from all corners of a distributed marketing organization, wherever they fall in the B2B2C dynamic. In 2013, purpose-built marketing technologies began to provide user-friendly interfaces for pulling graphical data from Hadoop. As with the lifecycle of all technologies valuable for business, Hadoop is progressing from a framework for developers into a business tool for non-technical end users.

In part two of this article we will discuss some examples of distributed organizations that have used big data to gain actionable insights about their organizations.

