We are all about marketing, data, analysis, innovation and technology

Friday, March 16, 2012

Big Data, What Does it Mean? What Does it Mean for Me?

Big Data, a term that has been showing up in our industry’s papers for a while, lacks a consistent definition, yet it certainly conjures up images: a mountain of information; an ocean of data; so much data that it can only be contained in the “clouds.”

These images don’t provide definitions, however, and definitions help us to understand what Big Data really is and where it is coming from. Big Data is unstructured data which, due to its nature, takes up more storage space and requires new technology to house and analyze.

Where is the big data coming from?
Big data is coming from everywhere. The social media phenomenon is one huge source, with 1 billion Facebook users globally, while Twitter has hit the 200 million mark (Source: http://mashable.com/2012/03/06/facebook-growth-slows/). According to the YouTube site, they have over 537 million videos available for viewing, with new content being added at a rate of one hour of video every second.

Besides purely social media sites, another source of big data is the typical business enterprise. With the cost of storage dropping as the number of storage options grows to meet demand, corporations can now keep data that was previously lost. What types of unstructured data do businesses have?

General business information

  • Human resource files
  • Email archives
  • Project files and documentation
  • Customer service correspondence
  • Legal documents
  • Shipping manifests

Industry specific big data

Some big data is derived from industry specific needs. Examples of specific industries known to have big data issues are:

  • Healthcare industry: Medical records, including scans and images, which can be accessed by medical professionals all over the country.
  • Insurance industry: Insurance companies now routinely take photographs for claims purposes, and these images can even travel with the claims process, allowing customers to check the status of their claims and the progress of the repairs online.
  • Shipping industry: Probably everyone has tracked a package and looked up the scanned signature of the receiver.
  • Media industry: Companies in the media industry leverage their digital assets online and thus must maintain the data in storage and in databases so it can be easily recalled when needed.
  • Travel and hospitality industry: Some hotel chains maintain that they can serve you better by keeping your individual preferences on file. In addition, now that ratings are provided for everything under the sun, the hospitality industry can maintain customer feedback on its various properties with the goal of benchmarking and improving satisfaction.
  • Energy sector: Oil and gas drilling equipment now produces data in real time to broadcast various parameters of the drill. Devices read and transmit data to allow for early warning of problems in the field.

Where is the big data going?
Currently, solutions for storing big data are outpacing the tools for analyzing it. Some companies manage their data storage in house and have merely ramped up the number of servers used to house the data.

Other companies may find that outsourcing the task of storing big data is a better option. One thing is certain, however, and that is that the mere existence of Big Data is driving innovation, and creating a lot of opportunities in the Information Technology sector.

Cloud computing is one of the outgrowths of big data. Data storage in the cloud makes sense to some businesses because the storage of the data can be centralized and the access can be distributed to all end users.

What are the leading technologies to utilize big data?
Hadoop is the name of a technology developed to manage big data. Fundamentally, Hadoop allows software to be run in a distributed manner across very large datasets, so that thousands of nodes of computing power are leveraged to process the data much more quickly than if just a single node or a small number of nodes were used (http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html). The engine of Hadoop, MapReduce, efficiently leverages the power of a network of computers by pushing work to available nodes for a processing task. This engine originated at Google to reduce the time required to create web search indexes.

Hadoop is being used at many large companies, including Facebook, LinkedIn, The New York Times, American Airlines, AOL, and Twitter, among others.
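To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps applied to a word count. It is only an illustration of the pattern, not Hadoop's actual API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and the "nodes" are simulated by an ordinary loop.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # Assumption: on a real cluster each document (or block of data) would be
    # mapped on a different node. Here one loop stands in for all the nodes.
    mapped = []
    for document in documents:
        mapped.extend(map_phase(document))
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    print(word_count(["big data is big", "data about data"]))
```

The point of the pattern is that the map step touches each document independently, so the work can be spread across as many machines as are available before the results are combined.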

The technologies developed thus far to leverage large datasets involve the ability to search and retrieve information.

How can organizations extract value from big data?
Before the value can be extracted from these enormous datasets, each organization that is capturing data must have good data management processes to scrub and store the data. Many companies lack the resources to complete this fundamental step.

A comprehensive benchmarking study from The Economist Intelligence Unit, sponsored by SAS, indicates that the experience and value derived from big data vary with the individual business, the particular business model, and the discipline the business has adopted around its data processes.

In this study, only 22% of the 586 senior executives interviewed would characterize their organization as putting nearly all of the data that is of real value to good use, while 53% said they leverage about half of their organization’s valuable data.

Also quoted in the research findings is Stan Lepeak, Research Director in KPMG’s Shared Services and Outsourcing Advisory group, who notes, “The process of capturing is actually relatively easy, and these firms have gotten very good at it over the last 10 to 15 years.” He notes that the cost of the actual data, as well as the storage and data warehousing products needed to collect them, has dropped dramatically over the last decade. “But a number of them are struggling to extract value from the data. In particular, many are failing to organize them properly so that they can be analyzed and queried. And often they don’t have people with the skills to interpret the results.” You can find the complete study in PDF format at: http://www.managementthinking.eiu.com/sites/default/files/downloads/SAS_BigData_final_0.pdf

Some data managers concede that while they believe that some of the data they capture has very high value, other information may be completely worthless. Their jobs are so fast-paced, however, that it is not possible to make the case that some data lacks value while the transactions accumulate constantly. Showing the value (or lack thereof) is not the responsibility of the IT teams.

The most common problem among companies that fail to extract value from their data is that they have too much data and too few resources; 45% of the respondents to the Economist Intelligence Unit study cite this issue as their biggest challenge.

The discipline of data science is emerging to meet the demand for data scientists that is anticipated due to the increasing prevalence of Big Data. It incorporates ideas from computer science, mathematics, statistical analysis, data visualization and social science. Hmm…maybe I can help NYU create such a program.

Does everyone believe in the value of big data?
There are data skeptics who believe that the talk of big data is a bunch of hype, and that it is never really necessary to capture all of the data associated with some process, precisely because of the volume. Since such volumes of data are unlikely to ever be mined to reveal their value, they argue, a sample of the data should be sufficient to determine the value of saving entire transactional histories. There will always be skeptics in the world, and their influence looms larger when a new concept is in the formative stages.

In addition, some IT leaders believe that for their organization, there just isn’t enough data for it to be classified as “Big Data.” For this group, it can be argued that the Big Data movement is less about the absolute size of the data being managed, and more about the new tools and practices that are being deployed to maximize the efficiency of scrubbing and processing the data.
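The skeptics' sampling argument can be made concrete with a small sketch: draw a simple random sample from a transaction history and use it to estimate an average, along with a rough standard error that indicates how far the estimate might be from the full-data answer. The function name sample_estimate and the idea of averaging transaction values are assumptions made for illustration; they do not come from any study cited here.

```python
import random
import statistics


def sample_estimate(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (sample_mean, standard_error). The standard error ignores the
    finite-population correction, so it is slightly conservative.
    """
    if not 2 <= sample_size <= len(values):
        raise ValueError("sample_size must be between 2 and len(values)")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, sample_size)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    return mean, std_err


if __name__ == "__main__":
    # Hypothetical stand-in for a transaction history: 100,000 synthetic values.
    rng = random.Random(42)
    transactions = [rng.uniform(5.0, 500.0) for _ in range(100_000)]
    mean, std_err = sample_estimate(transactions, sample_size=1_000)
    print(f"sample mean = {mean:.2f}, standard error = {std_err:.2f}")
    print(f"full-data mean = {statistics.fmean(transactions):.2f}")
```

A small standard error suggests the sample answers the question about as well as the full history would. A sample can still miss rare events, which is one reason others prefer to keep everything.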

I hope this helped clarify for you what big data is, and how we will utilize it in the near future.

Rhonda Knehans Drake
President, Drake Direct

Wednesday, March 7, 2012

What is Driving Facebook Traffic Today?

Wondering what Facebook traffic patterns look like lately, and whether they are still tied to the unemployment rate as I showed in a prior blog post? Well, wait no more.

If you recall from my prior blog entry in 2009 titled “Has Facebook Been Lucky,” I reported that unique visitors to Facebook were climbing exponentially throughout the recession we were facing at that time. I additionally noted that traffic to Facebook during that same time also had a 98% correlation with the unemployment rate. And to top it off, it all made perfect sense. During that time many people were losing their jobs and were seeking to reconnect with past friends and acquaintances. Facebook was the perfect vehicle. Individuals were reaching out to Facebook and creating accounts at lightning speed. See Figure 1 below from 2009 revealing this phenomenon (click on figure for larger image).

Figure 1: Facebook Unique Visitors Vs Unemployment Rate through September 2009

Now two years later, I felt it was time to revisit this story. In particular I wondered two things:

  • Is the relationship between Facebook traffic and the unemployment rate the same?

  • Is traffic to Facebook still skyrocketing?

Let’s take a look.

The Relationship Between Facebook and the Unemployment Rate

Facebook visitors are continuing to climb, as Figure 2 below shows (click on figure for larger image), but at a much slower rate than we saw during the height of the recession in 2008 and 2009. In October ’09, at the point where the unemployment rate began to decline, we note that the growth rate of Facebook unique visitors definitely slowed down. And, as of April ’10, how this Facebook visitor metric relates to the unemployment rate has definitely changed. So the question is why, and does this make sense?

Figure 2: Facebook Unique Visitors vs. Unemployment Rate through January 2012

Current Facebook Visitor Dynamic

The dynamic observed today makes perfect sense based on other circumstances that have transpired since 2009 and how we are using Facebook today versus two years ago. Let’s take a look:

So in other words, we are now using Facebook to interact with brands we “like” in addition to staying in touch with friends and family. We’ve developed more intimate relationships with our beloved brands on Facebook.

Other developments over the past two years also driving us to Facebook include the following facts:

So, maybe there is another connection to the unemployment rate? Could the time we spend on Facebook now be correlated with the unemployment rate? The answer is a big fat yes.

The correlation with the unemployment rate has shifted from unique visitors to time spent as Figure 3 below clearly depicts (click on figure for larger image).

Figure 3: Facebook Time Spent vs. Unemployment Rate through 2011

By examining this graph we note that, as the economy improves, the time spent on Facebook is also decreasing. The correlation between the two metrics is significant, even though unique visitors are still increasing, as was shown in Figure 2. As the population shifts back to work, our time is shifting away from leisure and as a result we spend less time on Facebook. Makes perfect sense. So cool. Love data!
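For readers who want to run this kind of check on their own numbers, here is a minimal sketch of a Pearson correlation coefficient between two monthly series. It assumes the correlation discussed here is the usual Pearson measure, which is my assumption for the example rather than a statement of how the figures were produced. The function name pearson_r is hypothetical, and the two short series are synthetic placeholders, not Facebook or unemployment data.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Synthetic placeholder series only; substitute real monthly values.
    series_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    series_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(pearson_r(series_a, series_b))
```

A value near +1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is little linear relationship. As always, a strong correlation by itself does not show that one series drives the other.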


In 2008 and 2009 we saw that Facebook was a means for people to connect with each other while the unemployment rate was high and many individuals were seeking work and reconnecting with their contacts. In 2010, we saw many businesses jump into Facebook to create brand pages and monetize this channel. We noted that the increase in overall business pages nearly doubled during 2010. This increase in business pages, coupled with the improving employment situation, led to a new dynamic on Facebook: people connecting with brands, and a shift in time spent on Facebook. As more people enter the job market, the time spent on Facebook declines, and this relationship is evident even as overall Facebook users climb, albeit at a slower rate than in 2009.


Special thanks to Yuko Ichihara, my Research Analyst, for her valuable research work and insights provided.