Programming Hive (Reprint Edition)

Authors: Edward Capriolo, Dean Wampler, Jason Rutherglen
Publisher: Southeast University Press
Tags: Computers/Networking, Databases, Database Theory

About the Authors

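Edward Capriolo is a system administrator at Media6degrees, a member of the Apache Software Foundation, and a committer on the Hadoop-Hive project. Dean Wampler is a senior consultant at Think Big Analytics, where he specializes in big data problems and in tools such as Hadoop and machine learning. Jason Rutherglen is a software architect at Think Big Analytics, specializing in big data, Hadoop, search, and security.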

About the Book

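Do you need to migrate a relational database application to Hadoop? This comprehensive guide, Programming Hive (Reprint Edition) by Capriolo et al., introduces you to Apache Hive, Hadoop's data warehouse platform. You will quickly learn how to use HiveQL, Hive's SQL dialect, to summarize, query, and analyze large datasets stored in the Hadoop Distributed File System. The book shows you how to set up and configure Hive in your environment, provides an overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You will also find real-world case studies showing how companies that use Hive have solved the unique problems of working with petabytes of data.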

Table of Contents

Preface

1. Introduction

An Overview of Hadoop and MapReduce

Hive in the Hadoop Ecosystem

Pig

HBase

Cascading, Crunch, and Others

Java Versus Hive: The Word Count Algorithm

What's Next

2. Getting Started

Installing a Preconfigured Virtual Machine

Detailed Installation

Installing Java

Installing Hadoop

Local Mode, Pseudodistributed Mode, and Distributed Mode

Testing Hadoop

Installing Hive

What Is Inside Hive?

Starting Hive

Configuring Your Hadoop Environment

Local Mode Configuration

Distributed and Pseudodistributed Mode Configuration

Metastore Using JDBC

The Hive Command

Command Options

The Command-Line Interface

CLI Options

Variables and Properties

Hive "One Shot" Commands

Executing Hive Queries from Files

The .hiverc File

More on Using the Hive CLI

Command History

Shell Execution

Hadoop dfs Commands from Inside Hive

Comments in Hive Scripts

Query Column Headers

3. Data Types and File Formats

Primitive Data Types

Collection Data Types

Text File Encoding of Data Values

Schema on Read

4. HiveQL: Data Definition

Databases in Hive

Alter Database

Creating Tables

Managed Tables

External Tables

Partitioned, Managed Tables

External Partitioned Tables

Customizing Table Storage Formats

Dropping Tables

Alter Table

Renaming a Table

Adding, Modifying, and Dropping a Table Partition

Changing Columns

Adding Columns

Deleting or Replacing Columns

Alter Table Properties

Alter Storage Properties

Miscellaneous Alter Table Statements

5. HiveQL: Data Manipulation

Loading Data into Managed Tables

Inserting Data into Tables from Queries

Dynamic Partition Inserts

Creating Tables and Loading Them in One Query

Exporting Data

……

6. HiveQL: Queries

7. HiveQL: Views

8. HiveQL: Indexes

9. Schema Design

10. Tuning

11. Other File Formats and Compression

12. Developing

13. Functions

14. Streaming

15. Customizing Hive File and Record Formats

16. Hive Thrift Service

17. Storage Handlers and NoSQL

18. Security

19. Locking

20. Hive Integration with Oozie

21. Hive and Amazon Web Services (AWS)

22. HCatalog

23. Case Studies

Glossary

Appendix: References

Index