Hadoop in Practice, 2nd Edition

Publisher Manning
Author
Pages 512
Year 2014
Language English
ISBN 9781617292224
File size 9.9 MB
File format pdf

Hadoop in Practice, Second Edition provides over 100 tested, instantly useful techniques that will help you conquer big data using Hadoop. This revised edition covers changes and new features in the Hadoop core architecture, including MapReduce 2. Brand new chapters cover YARN and integrating Kafka, Impala, and Spark SQL with Hadoop. You'll also get new and updated techniques for Flume, Sqoop...

Beginning Big Data with Power BI and Excel 2013

Publisher Apress
Author
Pages 268
Year 2015
Language English
ISBN 9781484205303
File size 20.9 MB
File format pdf

In Beginning Big Data with Power BI and Excel 2013, you will learn to solve business problems by tapping the power of Microsoft's Excel and Power BI to import data from NoSQL and SQL databases and other sources, create relational data models, and analyze business problems through sophisticated dashboards and data-driven maps. While Beginning Big Data with Power BI and Excel 2013 covers prom...

Enterprise Data Workflows with Cascading

Publisher O'Reilly Media
Author
Pages 170
Year 2013
Language English
ISBN 9781449358723
File size 12.7 MB
File format pdf

There is an easier way to build Hadoop applications. With this hands-on book, you'll learn how to use Cascading, the open source abstraction framework for Hadoop that lets you easily create and manage powerful enterprise-grade data processing applications - without having to learn the intricacies of MapReduce. Working with sample apps based on Java and other JVM languages, you'll quickly le...

Pro Hadoop

Publisher Apress
Author
Pages 440
Year 2009
Language English
ISBN 9781430219422
File size 7.7 MB
File format pdf

You've heard the hype about Hadoop: it runs petabyte-scale data mining tasks insanely fast, it runs gigantic tasks on clouds for absurdly cheap, it's been heavily committed to by tech giants like IBM, Yahoo!, and the Apache Project, and it's completely open-source. But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running? From Apress, the name you'...

Fast Data Processing with Spark

Publisher Packt Publishing
Author
Pages 120
Year 2013
Language English
ISBN 9781782167068
File size 11.0 MB
File format pdf

Spark is a framework for writing fast, distributed programs. Spark solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean functional-style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Shark), large-scale graph processing and analysis (Bagel), and real-time analysis (Spark Streaming), it can be interactivel...

Programming Elastic MapReduce

Publisher O'Reilly Media
Author Kevin Schmidt, Christopher Phillips
Pages 174
Year 2013
Language English
ISBN 9781449363628
File size 19.2 MB
File format pdf

Although you don't need a large computing infrastructure to process massive amounts of data with Apache Hadoop, it can still be difficult to get started. This practical guide shows you how to quickly launch data analysis projects in the cloud by using Amazon Elastic MapReduce (EMR), the hosted Hadoop framework in Amazon Web Services (AWS). Authors Kevin Schmidt and Christopher Phillips demo...

Big Data Analytics with R and Hadoop

Publisher Packt Publishing
Author
Pages 238
Year 2013
Language English
ISBN 9781782163282
File size 3.6 MB
File format pdf

Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New methods of working with big data, such as Hadoop and MapRed...

Microsoft SQL Server 2012 with Hadoop

Publisher Packt Publishing
Author
Pages 96
Year 2013
Language English
ISBN 9781782177982
File size 2.8 MB
File format pdf

With the explosion of data, open source Apache Hadoop is gaining traction, thanks to the huge ecosystem that has arisen around the core functionalities of its distributed file system (HDFS) and MapReduce. Having SQL Server talk to Hadoop has become increasingly important because the two are complementary. While petabytes of unstructured data can be...

Learning Apache Mahout

Publisher Packt Publishing
Author
Pages 250
Year 2015
Language English
ISBN 9781783555215
File size 13.8 MB
File format pdf

In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and people with the right skills to get the information needed from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed ...

Fast Data Processing with Spark, 2nd Edition

Publisher Packt Publishing
Author
Pages 184
Year 2015
Language English
ISBN 9781784392574
File size 14.2 MB
File format pdf

Spark is a framework for writing fast, distributed programs. Spark solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean functional-style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (GraphX), and real-time analysis (Spark Streaming), it can be ...