Tagged: hadoop

Use SSH Tunneling to access Ambari web UI, ResourceManager, JobHistory, NameNode, Oozie, and other web UIs

Source: https://azure.microsoft.com/en-us/documentation/articles/hdinsight-linux-ambari-ssh-tunnel/ Original author: Larry Franks. Excerpts: The SSH tunnel command is ssh -C2qTnNf -D 9876 user-name@machine-name. This creates a connection that routes traffic sent to local port 9876 to the cluster over SSH. The options are: -D 9876: The local port that will route...
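A minimal sketch of how the tunnel command from the excerpt might be wrapped in a shell function. The function name `tunnel_cmd` is hypothetical, and the port and login are the placeholder values from the excerpt. The function only prints the command, so it can be reviewed before it is run.

```shell
# Hypothetical helper: print the SOCKS tunnel command for a local port and login.
# Usage: tunnel_cmd LOCAL_PORT USER@HOST
tunnel_cmd() {
    # -D opens a dynamic (SOCKS) forward on the given local port;
    # the remaining flags are copied unchanged from the excerpt.
    printf 'ssh -C2qTnNf -D %s %s\n' "$1" "$2"
}

# Print the command with the placeholder values from the excerpt.
tunnel_cmd 9876 user-name@machine-name
```

Once the tunnel is running, a browser or other client configured to use localhost:9876 as a SOCKS proxy sends its traffic through the SSH connection to the cluster.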

Understanding HDFS Quotas and Hadoop Fs and Fsck Tools

Source: http://www.michael-noll.com/blog/2011/10/20/understanding-hdfs-quotas-and-hadoop-fs-and-fsck-tools/ References: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html In my experience, Hadoop users often confuse the file size numbers reported by commands such as hadoop fsck, hadoop fs -dus, and hadoop fs -count -q when reasoning about HDFS space quotas. Here is...
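As a sketch of how the quota figures can be made easier to read, the following shell function labels the columns of `hadoop fs -count -q` output. It assumes the eight-column layout documented for the `count` command (QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME) and paths without spaces. The function name `label_quota` and the path `/user/alice` are made up for illustration.

```shell
# Hypothetical helper: label each column of `hadoop fs -count -q` output.
# Reads the command output on stdin, one path per line.
label_quota() {
    awk '{
        printf "path=%s name_quota=%s name_quota_left=%s space_quota=%s space_quota_left=%s dirs=%s files=%s bytes=%s\n",
               $8, $1, $2, $3, $4, $5, $6, $7
    }'
}

# Usage on a cluster (path is an example only):
#   hadoop fs -count -q /user/alice | label_quota
```

Note that the space quota counts replicated bytes, while CONTENT_SIZE reports the unreplicated size, which is one common source of the confusion the post describes.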

How to identify what is consuming space in HDFS

Source: https://community.hortonworks.com/articles/16846/how-to-identify-what-is-consuming-space-in-hdfs.html Find the directories using the most space in HDFS: For a UI showing the biggest consumers of space in HDFS, install and configure Twitter’s HDFS-DU. For a quick visual representation of HDFS disk usage with no extra tools required,...
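For a command-line view, one possible approach is to sort the output of `hadoop fs -du` by size. The sketch below assumes the first column of that output is the size in bytes (that is, the command is run without `-h`); the function name `top_consumers` is hypothetical.

```shell
# Hypothetical helper: show the N largest entries from `hadoop fs -du` output.
# Reads lines on stdin whose first field is a size in bytes.
top_consumers() {
    # $1 = number of entries to show (defaults to 10)
    sort -k1,1nr | head -n "${1:-10}"
}

# Usage on a cluster:
#   hadoop fs -du / | top_consumers 5
```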

Setting up Knox with Active Directory/LDAP

Source: https://cwiki.apache.org/confluence/display/KNOX/Using+Apache+Knox+with+ActiveDirectory This article covers using Apache Knox with ActiveDirectory. Currently Apache Knox comes set up “out of the box” with a demo LDAP server based on ApacheDS. This was a conscious decision made to simplify the initial user experience with Knox. Unfortunately,...

Setup local Hadoop dev environment on macOS

It is always convenient to have a local environment for learning and for quickly testing a scenario. If you work on macOS and want to learn or set up Hadoop locally, you are in the right place. I...

LDAP – Apache Directory Studio: A Basic Tutorial

Source: http://krams915.blogspot.com/2011/01/ldap-apache-directory-studio-basic.html In this tutorial we will set up a basic LDAP structure containing users and roles. We will be using the excellent Apache Directory Studio IDE. This tutorial will be the basis for our other Spring LDAP integration tutorials. What is...

Hive statistics using beeline and expect script

The following expect script uses the Beeline interface to fetch statistics for the tables in a database. Replace username and queuename with the values for your environment.

#!/usr/bin/expect -f
# hive_statistics, v0.1, 2016-05, [email protected]
# Usage: ./hive_statistics [database_name]
set _database [lindex $argv 0]
if {...

Hortonworks Data Platform Installation errata – Missing manual

Pre-requisites; Creating service users and databases in MySQL; JDBC connector error during ambari-server setup with MySQL; MySQL connection failing during Ambari automated installation. Some useful bash scripts: Create database and user on mysql for services like ambari, oozie, hue,...

Configure log files on HDP platform

Kafka, Storm, Ranger, HDFS, Zookeeper, Oozie, Knox, Hive & Hive metastore

1. Kafka: Kafka currently uses org.apache.log4j.DailyRollingFileAppender, which doesn’t allow us to specify the maximum backup index or the maximum file size. By default it rolls every hour, creating 24...

Insightful Hadoop administration commands

Tip #1: Quick list of operations

Where? Example: /var/log/hadoop/hdfs

cat hdfs-audit.log | awk '{cmds[$9]++}END{for (i in cmds)printf "%s %d\n",i,cmds[i]}'

Results:

[user@server hdfs]$ cat hdfs-audit.log | awk '{cmds[$9]++}END{for (i in cmds)printf "%s %d\n",i,cmds[i]}'
cmd=setTimes 52
cmd=listStatus 47422
cmd=create 36932
cmd=getfileinfo 7431182...
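The same count can be wrapped in a small function and sorted so the busiest operations appear first. This is a sketch only: the function name `count_audit_cmds` is hypothetical, and it assumes, like the one-liner above, that the cmd=... token is the 9th whitespace-separated field of each audit line, which may differ between Hadoop versions.

```shell
# Hypothetical helper: count HDFS audit-log entries per command,
# most frequent first. Takes one or more audit log files as arguments.
count_audit_cmds() {
    # Field 9 is assumed to hold the cmd=... token.
    awk '{cmds[$9]++} END {for (c in cmds) printf "%s %d\n", c, cmds[c]}' "$@" |
        sort -k2,2nr
}

# Usage:
#   count_audit_cmds /var/log/hadoop/hdfs/hdfs-audit.log
```

Passing the file name to awk directly also avoids the extra cat process.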