Category: BigData

Apache Kafka installation on Linux

In this post, I will show how to install Apache Kafka on a Linux VM. Apache Kafka is an open-source stream-processing platform developed by the Apache Software Foundation and written in Scala and Java. First of all, we need a Linux VM (e.g. CentOS 7) with at least 1 GB of RAM. Connect as root and

Hadoop HDFS maximum number of files

Just a short post about HDFS directory file and memory limits. I recently ran into an issue with Hive that produced an error message. The parameter that caused the error is called dfs.namenode.fs-limits.max-directory-items. Its default value is 1048576, and it defines the maximum number of items that a directory may contain. The parameter controls how much in a directory on