Author: robin


Cleanup hdfs directory having too many files and directories

At times, some directories on HDFS have too many inodes (files and folders) and are really hard to delete. Some instances also lead to out-of-memory (OOM) errors such as the following: INFO retry.RetryInvocationHandler: java.lang.OutOfMemoryError:...


Querying Hive Metastore

Querying Hive metastore tables can provide more in-depth details on the tables in Hive. This article is a collection of queries that probe a Hive metastore configured with MySQL to get details such as the list of transactional tables, etc. More...


Microsoft SQL Server query tips

Get column details of a table: select name from sys.columns where object_name(object_id)='table_name'


Cluster filesystem utilization alerts

This is a quick and raw method to set up alerts when a filesystem fills above a threshold. Prerequisites: Monitored filesystems should be consistent, meaning available across all nodes. Passwordless SSH should be set up between the nodes. Node where the alert script...
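The threshold check itself can be sketched in Python. This is a minimal, single-node illustration and not the post's script: the helper names `percent_used` and `filesystems_over_threshold`, the mount point `/`, and the 80% threshold are all assumptions, and the SSH fan-out across nodes is left out. Note that the percentage here is used/total, which can differ slightly from what `df` reports.

```python
import shutil


def percent_used(total, used):
    """Return used space as a percentage of total (0.0 for an empty filesystem)."""
    return 0.0 if total == 0 else used * 100.0 / total


def filesystems_over_threshold(paths, threshold_pct=80.0):
    """Return (path, percent) pairs for mount points at or above the threshold."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        pct = percent_used(usage.total, usage.used)
        if pct >= threshold_pct:
            alerts.append((path, round(pct, 1)))
    return alerts


if __name__ == "__main__":
    # Hypothetical mount points and threshold; adjust for your cluster.
    for path, pct in filesystems_over_threshold(["/"], threshold_pct=80.0):
        print(f"ALERT: {path} is {pct}% full")
```

Running this from cron on each node, or over SSH from one node, would be one way to turn the check into an alert.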


Check version of installed python packages

The bash command below finds the versions of packages available to your Python interpreter. Make sure you are running the correct version of the Python interpreter. Update: 2020-05-14 for i in pandas numpy sqlalchemy logging logging.handlers datetime sys re os...
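The same lookup can also be done from inside Python. The following is a minimal sketch rather than the post's own command: it asks `importlib.metadata` for the installed distribution version and falls back to the module's `__version__` attribute, because standard-library modules such as `sys` or `os` carry no distribution metadata. The helper names `package_version` and `report_versions` are hypothetical.

```python
import importlib
from importlib import metadata


def package_version(name):
    """Return the version string for `name`, or None if it cannot be determined."""
    try:
        # Works for installed distributions such as pandas or numpy.
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        pass
    try:
        # Fall back to the module's own __version__ attribute, if any.
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", None)
    return str(version) if version is not None else None


def report_versions(names):
    """Map each name to its version string (None when unknown)."""
    return {name: package_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report_versions(["pandas", "numpy", "sqlalchemy", "sys"]).items():
        print(f"{name}: {version or 'not found'}")
```

Run it with the interpreter you want to inspect, since each interpreter or virtual environment has its own set of installed packages.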


Download m3u8 URL video to local

Follow the steps below to download a video from a webpage. Step 1: Find the m3u8 URL on the webpage. Open Chrome Developer Tools and click the Network tab. Navigate to the page with the video and get it to start playing. Filter the...


Automate ports connectivity check using telnet and timeout

Automate port connectivity checks with the telnet and timeout commands. The timeout keeps a check from blocking for a long time. Adjust the timeout value on a case-by-case basis. # vi ~/ timeout 2 bash -c "echo 'exit' |...
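For comparison, the same kind of check can be sketched in Python. This swaps telnet for a plain TCP connect with a socket timeout, so it is an alternative technique and not the post's script. The helper name `is_port_open` and the host and port values are hypothetical.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical targets; replace with the hosts and ports you need to verify.
    targets = [("localhost", 22), ("localhost", 8080)]
    for host, port in targets:
        state = "open" if is_port_open(host, port) else "closed or unreachable"
        print(f"{host}:{port} {state}")
```

As with the shell version, the timeout bounds how long a filtered or unreachable port can hold up the run.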


Connecting to msaccess database from Python

Connect to an MS Access file from Python and tweak it to emit the desired format. import pyodbc conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=path where you stored the Access file\file name.accdb;') cursor = conn.cursor() cursor.execute('select * from table name') for row in...


Process JSON data in SQL Server 2012

Source: SQL Server 2016 and above support the JSON_VALUE function to parse JSON. To process JSON in older versions, add this function to the database – Then we can write queries such as the ones below. -- Query sample 1 select *...