Shopzilla Implements a Cloudera Enterprise Data Hub to Enhance its EDW and Capture Unparalleled Retail Insights

Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™, today announced that Shopzilla, Inc., a leading source for connecting buyers and sellers online, has deployed a Cloudera enterprise data hub to complement its existing Oracle enterprise data warehouse (EDW). In this hybrid Big Data environment, Shopzilla now has unlimited capacity to process and deliver new insights on millions of pageviews and ten billion ad requests daily, reaching over 100 million unique visitors and gaining valuable insights in hours or minutes instead of days.

Shopzilla recently announced that it combined three marketing-centric business units that will operate under the Connexity brand name. Connexity will uniquely combine consumer insights and media buying within the same programmatic platform, helping marketers to learn more about their customers, discover valuable audiences, and activate new consumers at scale. Data from Shopzilla’s global portfolio of retail websites connects more than 40 million shoppers with over 100 million products from tens of thousands of retailers, and that crucial data is now offered through Connexity. To keep up with growing customer engagement, the data has to be processed as rapidly and efficiently as possible. With its 500-terabyte EDW growing by five terabytes a day, Shopzilla’s legacy data warehouse had outgrown its capacity, impairing the company’s ability to provide business analytics in a timely and effective manner.

“Our legacy system delivers great performance for analytics and reporting, but didn’t have the bandwidth for the intensive data transformations we needed–it would take hours to process 100 million products per day,” said Paramjit Singh, director of data for Shopzilla. “We needed enormous processing capabilities, scalability, full redundancy, and extensive storage–at a cost-effective price. Our Cloudera platform provides all that and more, while complementing our current data warehouse system. We were able to reduce latency from days to hours and soon minutes.”

“Cloudera provides an exploration environment for our data scientists that reveals tremendous insights, which would be virtually impossible to obtain otherwise,” explained Singh. “We’re able to answer complex questions on multi-structured data, such as how a user is behaving on a particular site and what ads would be most effective, as well as execute other sophisticated data mining queries. It improves Shopzilla’s ability to provide relevant results to users–a core tenet of our business. Many of the things we do as a business would not be possible without this platform running alongside our Oracle data warehouse.”

This improved processing performance also benefits Shopzilla’s search engine marketing (SEM) activities, allowing the company to score and bid on ten million keywords each day. Reaching over 100 million users, Shopzilla is able to collect billions of data points to create some of the most targeted and rich shopping-intent data available.

“By 2017, US online retail sales will total $434.2 billion. In a data-driven industry such as online retail, which is experiencing such explosive growth, providing profound and timely insights to both shoppers and retailers is key in boosting marketing ROI,” said Alan Saldich, vice president of marketing at Cloudera. “Connecting social and transactional data provides businesses with a 360-degree view of customer behaviors, interactions, interests, and activities in a way that was just not possible before.”

Shopzilla augmented its Oracle EDW with a multi-tenant Cloudera Enterprise system to create a hybrid environment. Hadoop is the primary engine for data processing and analytics, while aggregated data is moved into the EDW using Apache Sqoop, where it supports reporting on the back end. Users can access Cloudera Enterprise directly using Apache Pig and Apache Hive, and Shopzilla plans to upgrade to Cloudera Impala and Apache Spark in the near future.
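
The division of labor described above, with heavy processing on the Hadoop side and only aggregated results landing in the warehouse, can be illustrated with a minimal Python sketch. This is not Shopzilla's code: the AdRequest record, its field names, and the aggregate_daily function are hypothetical, and an in-memory list stands in for data that would really be processed with Pig or Hive and exported with Sqoop.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class AdRequest(NamedTuple):
    """Hypothetical raw event; the field names are illustrative only."""
    day: str
    keyword: str
    clicked: bool


def aggregate_daily(events: Iterable[AdRequest]) -> List[Tuple[str, str, int, int]]:
    """Collapse raw events into one (day, keyword, requests, clicks) row per group.

    Only these compact summary rows would be handed to the warehouse;
    the raw events stay on the processing side.
    """
    totals = defaultdict(lambda: [0, 0])  # (day, keyword) -> [requests, clicks]
    for event in events:
        counts = totals[(event.day, event.keyword)]
        counts[0] += 1
        counts[1] += int(event.clicked)
    # Sort by (day, keyword) so the output order is deterministic.
    return [(day, kw, reqs, clicks)
            for (day, kw), (reqs, clicks) in sorted(totals.items())]
```

The point of the pattern is volume reduction: however many raw events arrive, the reporting system only receives one row per day and keyword.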

A Cloudera-powered enterprise data hub delivers the most secure, managed, governed, and open data management platform to give customers a choice over legacy data management for storing, accessing, and analyzing any amount and any kind of data in one centralized repository. Cloudera has all of the key attributes necessary for customers to make data the true focal point of any business.

About Shopzilla

Shopzilla, Inc. is a leading source for connecting buyers and sellers online. Reaching a global audience of over 50 million shoppers each month through both its destination websites and affiliate network, Shopzilla connects shoppers with over 100 million products from tens of thousands of retailers a month. Shopzilla, Inc. manages a premier portfolio of online shopping brands in the US and Europe, consisting of Bizrate, Beso, Shopzilla, Retrevo, TaDa, PrixMoinsCher, and SparDeinGeld, as well as B2B businesses including Connexity, a consumer insights and audience activation platform, and the Shopzilla Publisher Program.