Tagged: hive

Hadoop Hive UDTF Tutorial – Extending Apache Hive with Table Functions

Source: http://beekeeperdata.com/posts/hadoop/2015/07/26/Hive-UDTF-Tutorial.html Author: Matthew Rathbone, Co-author: Elena Akhmatova. Working with both primitive types and embedded data structures was discussed in part one, but the UDF interfaces are limited to...
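As a quick reminder of what a table function buys you, here is a minimal HiveQL sketch using the built-in explode() UDTF; a custom UDTF written by following the tutorial is invoked the same way (the people table and its hobbies array column are made-up examples).

-- A UDF emits one value per input row; a UDTF can emit zero or more rows per input row.
SELECT p.name, hobby
FROM people p
LATERAL VIEW explode(p.hobbies) h AS hobby;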

How to create a Hive UDF in Scala

Source: https://community.hortonworks.com/articles/42695/how-to-create-a-hive-udf-in-scala.html This article focuses on creating a custom Hive UDF in the Scala programming language. IntelliJ IDEA 2016 was used to create the project and artifacts. Creation and testing of the UDF was performed on the Hortonworks...
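Once the Scala UDF is packaged into a jar, wiring it into Hive follows the usual pattern sketched below; the jar path, class name, and table are placeholders rather than the article's actual artifacts.

ADD JAR /tmp/scala-udf-example.jar;                               -- placeholder jar path
CREATE TEMPORARY FUNCTION my_upper AS 'com.example.MyUpperUDF';   -- placeholder Scala class
SELECT my_upper(name) FROM employees LIMIT 10;                    -- placeholder table/column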

Hive on Tez Performance Tuning – Determining Reducer Counts

Source: https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html Short description: some practical steps in Hive on Tez tuning. How does Tez determine the number of reducers, and how can I control this for performance? In this article, I will attempt to answer this while executing and tuning...
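As a quick reference, the settings below are the ones usually involved in that question; the numeric values here are only illustrative, so see the article for how to pick them.

set hive.exec.reducers.bytes.per.reducer=268435456;   -- target data per reducer (illustrative: 256 MB)
set hive.exec.reducers.max=1009;                       -- hard cap on the reducer count
set hive.tez.auto.reducer.parallelism=true;            -- let Tez shrink the reducer count at runtime
set mapreduce.job.reduces=32;                          -- or pin an explicit count (illustrative)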

Hive query tips

Date operations
Data operations
Headers in Beeline
Unlock Hive tables
Check partitions used in a Hive query
Debugging Hive: long (query length) queries submitted to Hive, occurrence of thread printing in the hiveserver2 log file, capture classes used in the hiveserver2 log...
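A few of these tips boil down to one-liners; the statements below are generic illustrations rather than excerpts from the post (the sales table and its ds partition column are made up).

set hive.cli.print.header=true;                        -- column headers in the Hive CLI
-- In Beeline the equivalent is: !set showHeader true
SELECT date_add(current_date, 7), date_sub(current_date, 7);   -- simple date operations
SHOW LOCKS sales;                                      -- inspect locks held on a table
UNLOCK TABLE sales;                                    -- release a stuck lock
EXPLAIN DEPENDENCY SELECT count(*) FROM sales WHERE ds = '2016-01-01';  -- lists the partitions the query will read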

Good looking .hiverc file

Following is the .hiverc from one of the Hadoop environments I work on:

-- additional .jar includes like the one below
add jar hdfs://ualprod/tmp/json-serde-1.3.7-jar-with-dependencies.jar;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.auto.convert.join.noconditionaltask=true;
set hive.optimize.sort.dynamic.partition=true;
set hive.exec.max.dynamic.partitions=100000;
set hive.exec.max.dynamic.partitions.pernode=10000;
-- large mem??
set hive.tez.container.size=10240;
...

Hive ORC files – Pro Tips

Extract text from ORC files (source). Hive (0.11 and up) ships with an ORC file dump utility, which can be invoked with the following command:
$ hive --orcfiledump <location-of-orc-file>
Create a Hive table definition using ORC files on HDFS:
$ hive --orcfiledump...
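The second tip usually ends in a DDL statement built from the dumped schema; a minimal sketch over made-up columns and an assumed HDFS path:

-- External table over ORC files already sitting on HDFS (schema and location are illustrative)
CREATE EXTERNAL TABLE web_logs (
  ip  string,
  ts  timestamp,
  url string
)
STORED AS ORC
LOCATION '/data/warehouse/web_logs';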

Performance of Hive tables with Parquet & ORC

Source: http://stackoverflow.com/questions/32373460/parquet-vs-orc-vs-orc-with-snappy
Datasets:
Table A - Text file format - 2.5 GB
Table B - ORC - 652 MB
Table C - ORC with Snappy - 802 MB
Table D - Parquet - 1.9 GB
Parquet was worst as far as compression for my table...
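For context, table definitions along these lines produce the storage formats being compared; the table names and the source text table are placeholders, not the ones from the question.

CREATE TABLE logs_orc STORED AS ORC AS SELECT * FROM logs_text;            -- ORC, default ZLIB compression
CREATE TABLE logs_orc_snappy STORED AS ORC
  TBLPROPERTIES ("orc.compress"="SNAPPY") AS SELECT * FROM logs_text;      -- ORC with Snappy
CREATE TABLE logs_parquet STORED AS PARQUET AS SELECT * FROM logs_text;    -- Parquet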

Can’t connect Excel to Hive using ODBC driver on MAC

So you've done everything right and still can't connect Excel to Hive using the ODBC driver on macOS? Let's see what is going on. Are you running El Capitan or Sierra? Well, I was running Sierra and tried connecting before while...

Connecting SQuirrel SQL to Hive

Prerequisites: in order to connect the SQuirrel SQL client we need the following:
Client: http://squirrel-sql.sourceforge.net/
Hive connection JARs (found in the lib directories): the Hive JDBC JAR (hive-jdbc-1.2.1-standalone.jar) and the Hadoop common JAR (hadoop-common-2.7.2.jar)
A running HiveServer2 instance
For connections use the following...

Creating Hive tables on compressed files

Stuck with creating Hive tables on compressed files? Well, the documentation on apache.org suggests that Hive natively supports compressed files: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage. Let's try that out. Store a Snappy-compressed file on HDFS... thinking, I do not have such a file... Wait!...
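The upshot of the exercise is that a plain text table definition works unchanged over compressed files, since Hive picks the codec from the file extension; a minimal sketch with made-up columns and path:

-- Same DDL as for uncompressed delimited text; the LOCATION may hold e.g. *.csv.gz files
CREATE EXTERNAL TABLE events_raw (
  event_id string,
  payload  string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/raw/events';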