Tagged: hadoop

Hadoop mapreduce not working on mac local setup

YARN jobs hang on a local Mac environment.

Missing node in Resource Manager UI
There are no nodes in the node list: https://www.robin.eu.org:8088/cluster/nodes
File: yarn-site.xml
Location: /usr/local/Cellar/hadoop/2.7.3/libexec/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>localhost</value>
</property>

Resource Manager UI displaying IPv6 for local address
Hadoop does...

How to create a Hive UDF in Scala

Source: https://community.hortonworks.com/articles/42695/how-to-create-a-hive-udf-in-scala.html

This article will focus on creating a custom Hive UDF in the Scala programming language. IntelliJ IDEA 2016 was used to create the project and artifacts. Creation and testing of the UDF was performed on the Hortonworks...

Permanently add jars to hadoop

Looking to add a custom SerDe and custom or third-party codecs to Hortonworks HDP? Only the auxlib folder trick worked for me, after I had tried a lot of alternatives. The places where we need to add that auxlib folder containing JARs are,...

Best practices for Namenode and Datanode restarts

Problems
The following are some problems we might come across while working with a large setup of Hadoop clusters:
Namenode restarts taking a long time (http://nn-host:50070/dfshealth.html#tab-startup-progress)
Namenode startup staying in safemode for a long time after restart

Best practices for Namenode &...

brew packages and cask packages

brew packages

Basic Apps
$ brew install bash
$ brew install bash-completion
$ brew install maven
$ brew install openssl
$ brew install ssh-copy-id
$ brew install wget
$ brew install gawk

Big data Apps
$ brew install hadoop
$...

Hive on Tez Performance Tuning – Determining Reducer Counts

Source: https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html

Short Description: Some practical steps in Hive Tez tuning

Article
How does Tez determine the number of reducers? How can I control this for performance? In this article, I will attempt to answer this while executing and tuning...

Hive query tips

Date operations
Data operations
Headers in Beeline
Unlock hive tables
Check partitions used in hive query

Debugging Hive
Long (query length) queries submitted to Hive
Occurrence of thread printing in hiveserver2 log file
Capture classes used in hiveserver2 log...

Adding compression codec to Hortonworks data platform

Lately I tried installing the xz/lzma codec on my local VM setup. The compression ratios are pretty awesome. I won’t do a benchmark here; try it out yourself 😉

Steps
Download codec JAR – https://github.com/yongtang/hadoop-xz or https://mvnrepository.com/artifact/io.sensesecure/hadoop-xz
Copy downloaded JAR to HDP’s libs...

Good looking .hiverc file

The following is the .hiverc from one of the Hadoop environments I work on:

-- additional .jar includes like the one below --
add jar hdfs://ualprod/tmp/json-serde-1.3.7-jar-with-dependencies.jar;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.auto.convert.join.noconditionaltask=true;
set hive.optimize.sort.dynamic.partition=true;
set hive.exec.max.dynamic.partitions=100000;
set hive.exec.max.dynamic.partitions.pernode=10000;
-- large mem??
set hive.tez.container.size=10240;...

Apache drill – No current connection

After reading multiple posts, it seems that this is a problem of conflicting JARs. My current setup has Apache Drill installed using $ brew install apache-drill, and upon executing $ drill-embedded or $ drill-localhost, I see the error below (line 10):
robin@MacBook-Pro:~$ drill-localhost
Java HotSpot(TM)...