2.2 – Big Data analytics for marketing: technologies and applications
2.2.1 – Marketing analytics were not born with the Internet
Marketing research is a fundamental part of marketing management processes; it aims to generate insights on consumer behaviour, social dynamics and competition, in order to guide strategic decisions about market presence and relations with stakeholders, as well as operational decisions on product, pricing, branding and distribution. Analytics has long played an important role in marketing research, in recent decades increasingly in association (rather than in competition) with qualitative research approaches. Analytics is an "umbrella term for data analysis applications" (Watson H.J., 2014), that is, a general concept covering all quantitative analysis techniques that support decisions.
Analytics, in general, can be of four types:
1. Descriptive. These synthesize a phenomenon of interest and/or make it easier to understand;
2. Diagnostic. These capture relationships between data and support the development of hypotheses;
3. Predictive. These anticipate events, giving more control over their management;
4. Prescriptive. These calculate or suggest optimal solutions to general and contingent problems, based on given parameters.
American Express, for example, can predict from its customers' spending histories which of them are likely to go bankrupt before the customers themselves realize it. This knowledge supports the company's interaction and service decisions; when such solutions are suggested (or even implemented) automatically by software, we have an example of prescriptive analytics.
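To make the taxonomy concrete, here is a minimal Python sketch, with invented spend figures and an invented risk threshold, of how each of the four types maps onto a computation:

```python
# Illustrative sketch of the four analytics types on hypothetical
# monthly card-spend data; all figures and thresholds are invented.
from statistics import mean

spend = [1200, 1150, 980, 760, 500, 310]  # toy monthly spend series

# 1. Descriptive: synthesize the phenomenon of interest.
print("average monthly spend:", mean(spend))

# 2. Diagnostic: capture relationships that support a hypothesis.
deltas = [b - a for a, b in zip(spend, spend[1:])]
print("month-over-month change:", deltas)  # a steady decline emerges

# 3. Predictive: anticipate the next event (naive extrapolation).
forecast = spend[-1] + mean(deltas)
print("next-month forecast:", forecast)

# 4. Prescriptive: suggest an action based on a given parameter.
RISK_THRESHOLD = 400  # hypothetical cutoff
if forecast < RISK_THRESHOLD:
    print("suggested action: flag account for retention outreach")
```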
The data and algorithmic tools available today promise to change not only research techniques but also the potential and strategic impact of these practices. This revolution, however, actually began before digital technology became networked.
At the beginning of the eighties, with the advent of scanner data, consumer goods companies began to rely on data that were no longer aggregated, indirect, or asynchronous, as in the case of surveys or warehouse exit data. Scanner data already made it possible to observe purchasing behaviour at the granularity of individual consumer shopping trips and of different products, potentially in real time, and to associate purchases with actions such as coupons and promotions.
Digital technology and the Internet have represented an important leap in the history of innovation in marketing research, but considerable efforts to structure market knowledge and make it reliable for marketing managers had been made well before. Prior to 1995, research progressed through three phases:
1. In the first phase, research was devoted to the simplified description of markets through descriptive statistics;
2. The second phase was characterized by the development of models useful for understanding consumer behaviour;
3. The third phase focused on the evaluation of marketing policy options, predicting their effects with statistical and econometric techniques and operational research approaches, all to support marketers' decisions.
Since the 1990s (and even more since the early years of the new century), the Internet has become increasingly important in communications and in the lives of people and organizations. In the new relational logic of doing marketing, digital was crucial right from the start because it allowed companies to collect data on individual consumers and customers, store them in a customer database and use them to direct customized offers and communication decisions. The Internet has facilitated the collection of personal information and individual consumption preferences (both through the tracking of online behaviours and through user registration practices for different online services); starting from these data, and through the interactivity of the network, it has made the possibilities for customizing offers, communication and promotion enormously more powerful. In the last twenty years, digital technology and the Internet have become much more complex and have offered much more sophisticated functions to market players. Mobile Internet and social media (first Facebook and YouTube, followed by other relevant platforms such as Twitter, Instagram and Pinterest) have not only replaced the traditional web for many information search activities but have also radically changed the logic of network use.
They have made access to the network distributed and mobile. They have allowed and
stimulated sharing, interaction and collaboration both between consumers and between
consumers and companies. On these new platforms, ever more data have been generated by the tracking and measurement of online activities and studied through increasingly sophisticated analytics technologies.
2.2.2 – Big Data technologies applied to marketing
With Big Data, marketing researchers can obtain insights to better understand consumer behaviour, the configuration of preferences, reactions to marketing actions, supply and service expectations and relationships with brands. But Big Data is too big to be analysed with conventional means. To generate marketing insights, the new data need to be treated with more powerful and flexible management technologies and more advanced software for text, audio and video processing and machine learning (for instance, the open-source Hadoop framework, inspired by Google's MapReduce and the Google File System). Meanwhile, an increasingly rich and complex market of tools is being developed for marketing.
Among the phenomena that have had the most impact on the development of Big Data is certainly the strong acceleration in the capacity and flexibility of data storage and processing. In addition to new storage hardware technologies (in-memory and solid-state disks), of great importance has been the emergence of scale-out computing architectures, in which hundreds or even thousands of servers can be put to work in parallel (massively parallel processing). This increases the possible scale of the data to be processed but also the flexibility of these processes, because new servers can be added as processing volumes grow. Large data warehouses typically exploit these architectures. Their data are used for online analytics and ad hoc queries, and to create reports and maps of trends and phenomena relevant to the business, also with data visualization techniques.
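As a toy illustration of the scale-out logic (with simulated "servers" as local processes, not a real cluster), each worker below aggregates its own partition of the data and the partial results are then combined:

```python
# Toy sketch of massively parallel processing: each "server"
# aggregates its own shard, partial results are combined at the end;
# adding workers scales with data volume.
from multiprocessing import Pool

def partial_sum(partition):
    # Work done locally on one node's shard of the data.
    return sum(partition)

if __name__ == "__main__":
    data = list(range(1_000_000))            # stand-in for a large fact table
    shards = [data[i::4] for i in range(4)]  # partition across 4 "servers"
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, shards)
    print("total:", sum(partials))           # combine step
```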
On the front of data storage and management, great attention is paid to the Hadoop/MapReduce open-source technology (also based on the logic of massively parallel computing), which makes it possible to raise the bar of potential both in terms of volume and of flexibility. Flexibility here also refers to the possibility of processing data of different formats (in particular, both structured and unstructured data).
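A minimal sketch of the MapReduce programming model that Hadoop implements at cluster scale, here compressed into plain Python with invented social-media posts: map emits key/value pairs, a shuffle groups them by key, and reduce aggregates each group.

```python
# Minimal sketch of the MapReduce model: Hadoop distributes these
# phases across a cluster; here everything runs in one process.
from collections import defaultdict

def map_phase(record):
    # Map: emit a (key, value) pair for every token in the record.
    for token in record.split():
        yield token.lower(), 1

def reduce_phase(key, values):
    # Reduce: aggregate all values observed for one key.
    return key, sum(values)

posts = ["CocaCola ad was great", "great ad CocaCola", "Pepsi ad"]

# Shuffle: group emitted values by key (handled by Hadoop in practice).
groups = defaultdict(list)
for post in posts:
    for key, value in map_phase(post):
        groups[key].append(value)

counts = dict(reduce_phase(k, v) for k, v in groups.items())
print(counts)  # {'cocacola': 2, 'ad': 3, 'was': 1, 'great': 2, 'pepsi': 1}
```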
Another technological change concerns where analytics processing is performed. In the past, data had to be extracted and moved to a separate server to be processed. Today the database management software itself can run the analytics (in-database analytics), making this processing more efficient but also increasing the scale of application (see Figure 12).
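A small sketch of the in-database idea, using an in-memory SQLite database and an invented sales table: the aggregation runs inside the database engine, and only the small result set reaches the application.

```python
# Sketch of in-database analytics: the aggregation executes inside
# the database engine instead of extracting every row into Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 200.0)],
)

# The analytic work happens in the database, not in the application.
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
):
    print(region, total)
```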
A further driver of efficiency, and at the same time of flexibility, in data processing is the development of cloud computing. Cloud computing is now mainstream and describes the phenomenon whereby computing capacity (at every level) can be used as a service delivered via the Internet, rather than managed in-house. Cloud services are offered as software-as-a-service (SaaS), platform-as-a-service (PaaS) or infrastructure-as-a-service (IaaS), depending on which resource is used in service mode.
Cloud computing services can be public (managed by third parties such as Amazon, Google, Oracle and others) or private, centralized within a complex company or group. It is not only a question of lowering costs and reducing economic-organizational rigidity, but also of gaining access to data management capabilities, computing power and degrees of software sophistication that would otherwise be unthinkable for most companies or (in the case of private clouds) for individual local organizational units.
Figure 12. The integrated architecture of Big Data. Source: (Eckerson W., 2012)
For public cloud computing applied to Big Data, an example is Amazon Redshift, offered since 2013 within the Amazon Web Services package; further details will be given in the next chapter.
Once the database system is organized, queries can be made using SQL (Structured Query Language), the standard language for storing, manipulating and retrieving data in relational database systems, and
using analytics applications. To understand the impact in terms of organization and business, it is useful to consider that these services can cost as little as on the order of $1 per hour of use. Large companies are now facing the challenge of integrating these systems (internal or in the cloud) with other existing knowledge management systems.
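Since Redshift is queried with standard SQL over a PostgreSQL-compatible connection, a sketch along the following lines is plausible; the cluster host, credentials and orders table below are all invented for illustration:

```python
# Hypothetical sketch: Redshift speaks the PostgreSQL wire protocol,
# so a standard driver such as psycopg2 can run SQL against a cluster.
# The host, credentials and `orders` table are all invented here.
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.eu-west-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="marketer", password="...",
)
with conn.cursor() as cur:
    cur.execute("""
        SELECT channel, SUM(revenue) AS total_revenue
        FROM orders
        GROUP BY channel
        ORDER BY total_revenue DESC
    """)
    for channel, total in cur.fetchall():
        print(channel, total)
```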
Regarding enabling technologies, from the point of view of data analysis methods it is unquestionable that great strides have been made thanks to the data mining approach and its evolution enhanced by the artificial intelligence of machine learning. Even though some traditional business intelligence providers claim that their tools, applied to data warehouse technologies, perform data mining, oftentimes this is not the case. Data extraction tools and simple visualizations, even when applied to vast amounts of data, are not data mining. Data mining requires algorithms and data processing designed with the precise aim of finding non-hypothesized relationships in the data.
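A minimal sketch of this idea with invented shopping baskets: the code counts how often product pairs co-occur without any pair being hypothesized in advance, so the strongest associations emerge from the data.

```python
# Sketch of surfacing non-hypothesized relationships: count how often
# product pairs co-occur in hypothetical baskets, specifying no pair
# in advance.
from collections import Counter
from itertools import combinations

baskets = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"beer", "salsa"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The strongest associations emerge from the data, not from a hypothesis.
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```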
Machine learning is an analysis method, based on advanced mathematical models and artificial intelligence software, that allows the identification of patterns and regularities in pools of apparently chaotic data. It looks for relevant associations and clusters, as in the case of Amazon, which uses these techniques to suggest products to consumers based on what they have already viewed and/or purchased. It is also used in self-driving cars and in the most sophisticated fraud detection software in the financial sector.
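A small sketch of this kind of pattern-finding, assuming scikit-learn is available and using invented customer features: k-means groups customers into segments that were never labelled in advance.

```python
# Hedged sketch of unsupervised pattern-finding: k-means clusters
# customers by (invented) spend features into unlabelled segments.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features per customer: [annual spend, purchase frequency]
X = np.array([[120, 2], [130, 3], [900, 30], [950, 28], [80, 1], [1000, 25]])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)           # segment assignment per customer
print(model.cluster_centers_)  # prototypical customer in each segment
```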
Deep learning (or deep machine learning) is a branch of machine learning that applies algorithms with multiple processing layers, which are particularly effective in representing abstract observations (such as facial recognition or facial expression recognition). In marketing research, these systems are used in particular for facial recognition with customer identification objectives, but also for visual analytics, which allows the recognition of logos and products in images available online and shared by consumers on social media (as in the case of Ditto and LogoGrab).
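A minimal sketch of the "multiple processing layers" idea, assuming Keras/TensorFlow is available and an invented logo/no-logo classification task; the input shape and layer sizes are illustrative only:

```python
# Sketch of a small convolutional network with stacked processing
# layers, e.g. for a logo vs. no-logo task; all sizes are invented.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),           # small RGB image
    layers.Conv2D(16, 3, activation="relu"),  # low-level edge features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),  # more abstract features
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),    # e.g. logo / no logo
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```

Each successive layer re-represents the image at a higher level of abstraction, which is what makes these models effective on visual tasks such as logo and face recognition.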
2.2.3 – Steps to follow to exploit marketing analytics
Organizations must follow three steps to fully exploit the potential of marketing analytics and to take advantage of it (Bhandari R., Singer M., van der Scheer H., 2014):
1. Identify the best analytical approaches
Teams have different tools and methods available to them; evaluating their pros and cons helps identify the best one for their strategic goals. The prevailing choices include the following (see Table 2):
Marketing-mix modelling (MMM)
Description. This is an advanced analytics approach that uses Big Data to determine the effectiveness of spending across different channels. It statistically links marketing investments to sales drivers, including external variables such as seasonality and competitor activities, in order to discover effects such as changes in individuals' preferences over time or differences between offline, online and social media activities.
Pros. MMM can be used for both long-range strategic purposes and near-term tactical planning.
Cons. Such an approach requires high-quality data on sales and marketing spending going back over a period of years. Furthermore, it cannot measure activities that change little over time (for example, out-of-home or outdoor media), nor the long-term effects of investing in any one touchpoint, such as a new mobile app or social media feed. Lastly, MMM requires users with sufficiently deep econometric knowledge to understand the models, as well as a scenario-planning tool to model the budget implications of spending decisions.
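A hedged sketch of the statistical core of such an approach, with entirely invented weekly data: sales are regressed on per-channel spend plus a seasonality control, and the coefficients approximate each channel's incremental effect.

```python
# Sketch of the regression at the heart of marketing-mix modelling;
# data, channel names and coefficients are all invented.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 104  # two years of weekly observations
tv = rng.uniform(0, 100, n)                    # weekly TV spend
online = rng.uniform(0, 100, n)                # weekly online spend
season = np.sin(np.arange(n) * 2 * np.pi / 52) # seasonality control
sales = 50 + 0.8 * tv + 1.5 * online + 10 * season + rng.normal(0, 5, n)

X = np.column_stack([tv, online, season])
model = LinearRegression().fit(X, sales)
# Coefficients approximate each channel's incremental effect on sales.
print(dict(zip(["tv", "online", "seasonality"], model.coef_.round(2))))
```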
Reach, Cost, Quality (RCQ)
Description. RCQ disaggregates each touchpoint into its component parts (the number of target consumers reached, the cost per unique touch, and the quality of the engagement), using both data and structured judgment.
Pros. It is often used when MMM is not feasible: when data are limited; when the rate of spending is relatively constant throughout the year, as is the case with sponsorships; and with persistent, always-on media whose marginal investment effects are harder to isolate. RCQ