What is Big Data?

Question

What is Big Data?

Accepted Answer

Big Data Definition

According to Wikipedia, big data is “an all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process them using traditional data-processing applications.” At Teradata, big data is often described in terms of several “Vs” (volume, variety, velocity, variability, veracity), which speak collectively to the complexity and difficulty of collecting, storing, managing, analyzing and otherwise putting big data to work in creating the most important “V” of all: value.

In today’s high-stakes business environment, leading companies (enterprises that differentiate, outperform, and adapt to customer needs faster than competitors) rely on big data analytics. They see how the purposeful, systematic exploitation of big data, coupled with analytics, reveals opportunities for better business outcomes.

For mature organizations, big data analytics coupled with Artificial Intelligence (AI) and/or machine learning is helping solve even more complex business challenges:

Customer Experience: Find a competitive edge by being customer-centric and optimizing the customer journey

Financial Transformation: Deliver new enterprise value and strategic input through financial and accounting processes

Product Innovation: Create and iterate products that are safer, in demand, and profitable

Risk Mitigation: Minimize exposure to financial fraud and cyber security risk

Asset Optimization: Optimize asset value by leveraging IoT and sensor data

Operational Excellence: Achieve peak performance by leveraging personnel, equipment, and other resources

How to Make Big Data Work

Big data is often defined as data sets too large and complex to manipulate or query with standard tools. Even companies fully committed to big data, companies that have defined the business case and are ready to mature beyond the “science project” phase, must figure out how to make big data work.

The massive hype and the perplexing range of big data technology options and vendors make finding the right answer harder than it needs to be. The goal must be to design and build an underlying big data environment that is low cost and low complexity, and that is stable, highly integrated, and scalable enough to move the entire organization toward true data and analytics centricity. Data and analytics centricity is a state in which the power of big data and big data analytics is available to all the parts of the organization that need it, along with the underlying infrastructure, data streams and user toolsets required to discover valuable insights, make better decisions and solve actual business problems.

Big Data as an Engine

Getting started with big data requires thinking of it as an engine. Boosting performance is a matter of assembling the right components in a seamless, stable and sustainable way. Those components include:

Data Sources: operational and functional systems, machine logs and sensors, web and social media, and many other sources.

Data Platforms, Warehouses and Discovery Platforms: the systems that enable the capture and management of data, and then, critically, its conversion into customer insights and, ultimately, action.

Big Data Analytics Tools and Apps: the “front end” used by executives, analysts, managers and others to access customer insights, model scenarios, and otherwise do their jobs and manage the business.

At this level, it’s about harnessing and exploiting the full horsepower of big data assets to create business value. Making it all work together requires a strategic big data design and a thoughtful big data architecture that not only examines current data streams and repositories, but also accounts for specific business objectives and longer-term market trends. In other words, there is no single template for making big data work.

Given that big data will only become more important tomorrow, these infrastructures should be viewed as the foundation of future operations. So, yes, capital outlays may be significant. However, many forward-thinking organizations and early adopters of big data have reached a surprising – and somewhat counterintuitive – conclusion: that designing the right big data environment can lead to cost savings. Speaking of surprises: these cost savings can be pleasantly large and harvestable relatively soon.

It’s critical to note that with flexible frameworks in place, big data technologies and programs can support multiple parts of the enterprise and improve operations across the business. Otherwise, there is a real risk that even advanced and ambitious big data projects will end up as stranded investments. Gartner estimates that 90% of big data projects are not leveraged or replicated across the enterprise. Tomorrow’s big data winners are in the other 10% today, and long ago stopped thinking small.

Attributes of Highly Effective Big Data Environments

Seamlessly Use Data Sets: Much of the payoff comes through the mixing, combining and contrasting of data sets – so there’s no analytics-enabled innovation without integration.

Flexible, Low Cost: The target here is low complexity and low cost, with enough flexibility to scale for future needs, which will be both larger-scale and more targeted at specific user groups.

Stable: Stability is critical because the data volumes are massive, and users need to easily access and interact with data. In this sense, infrastructure performance holds a key to boosting business performance through big data.
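As a rough illustration of the first attribute above (mixing, combining and contrasting data sets), here is a minimal Python sketch that merges two small data sets into a single per-customer view. The records, field names (customer_id, amount, issue) and the integrate helper are all hypothetical and chosen only for illustration; a real big data environment would do this on a data platform or warehouse rather than in memory.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative assumptions.
transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 35.5},
    {"customer_id": 1, "amount": 80.0},
]
support_tickets = [
    {"customer_id": 1, "issue": "late delivery"},
    {"customer_id": 3, "issue": "billing error"},
]


def integrate(transactions, tickets):
    """Combine two data sets into one per-customer view keyed by customer_id."""
    view = defaultdict(lambda: {"total_spend": 0.0, "ticket_count": 0})
    for row in transactions:
        view[row["customer_id"]]["total_spend"] += row["amount"]
    for row in tickets:
        view[row["customer_id"]]["ticket_count"] += 1
    return dict(view)


customer_view = integrate(transactions, support_tickets)

# Contrast the two sources: customers who both spend money and raise tickets.
at_risk = sorted(
    cid
    for cid, v in customer_view.items()
    if v["total_spend"] > 0 and v["ticket_count"] > 0
)
print(at_risk)  # prints [1]
```

Neither source alone identifies customer 1 as a paying customer with an open complaint; that only becomes visible once the two data sets share a common key, which is the point the integration discussion makes.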

Big Data Integration: The Most Important Variable

Limited reusability is, to a large extent, a function of poor integration. In fact, integration may be the most important variable in the equation for big data success.

Forrester Research has written that 80% of the value in big data comes through integration. The big-picture idea is that the highest-value big data is readily accessible to the right users, with robust and clearly defined business rules and governance structures. Deeper data sets, such as legacy transactional data and long-tail customer histories, may only need reliable storage and robust data management, so data scientists and data explorers can review and model them when it makes sense to do so.

Big data integration is also about thinking big. In this instance, “big” means holistically, inclusively and multidimensionally. Dots must be connected, islands of data bridged, and functional silos plugged into each other (if not broken down entirely).

High degrees of integration. Well-designed ecosystems. Unified architectures. Data and analytics centricity. That short list doesn’t necessarily cover every component or technical detail required to make big data programs function. But these are certainly difference-making attributes that help big data programs work effectively.
