We all use Hadoop as the de facto choice for a DataLake. Why? Because we can throw any data at it. In reality, though, we cannot. Most of our data can be converted to plaintext, and Hadoop accepts only plaintext, so we convert all our data to plaintext and then throw it into Hadoop.

But does plaintext storage in Hadoop solve any problem? Yes: it lets you store all your organisational data in one place. But because your JSON and XML went in as plaintext, analysing it within Hadoop now requires extra code to parse the structure back out.
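To make that parsing overhead concrete, here is a minimal sketch in plain Python. It is not Hadoop or BlobCity code. The sample records, the field names (`customer`, `order`, `total`) and the helper `total_by_customer` are all hypothetical, chosen only to illustrate what happens when JSON documents are stored one per line as text: every line has to be parsed and validated before even a simple aggregation can run.

```python
import json
from collections import defaultdict

# Hypothetical records as they might sit in a plaintext file:
# one JSON document per line, plus one corrupt line.
RAW_LINES = [
    '{"customer": "acme", "order": {"total": 120.5}}',
    '{"customer": "globex", "order": {"total": 80.0}}',
    'not valid json',
    '{"customer": "acme", "order": {"total": 30.0}}',
]


def total_by_customer(lines):
    """Parse each text line as JSON and sum order totals per customer.

    Returns the totals and the number of lines that could not be used.
    """
    totals = defaultdict(float)
    skipped = 0
    for line in lines:
        try:
            record = json.loads(line)                 # text -> structure
            customer = record["customer"]             # field may be missing
            amount = float(record["order"]["total"])  # nested field may be missing
        except (json.JSONDecodeError, KeyError, TypeError, ValueError):
            skipped += 1                              # unusable line, count and move on
            continue
        totals[customer] += amount
    return dict(totals), skipped


totals, skipped = total_by_customer(RAW_LINES)
print(totals, skipped)
```

Most of the function is parsing and error handling rather than analysis. That is the overhead a store that understands JSON natively would be expected to absorb.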

Feature Comparison

Feature	Hadoop	BlobCity DB
Distributed	Yes	Yes
Plaintext data	Yes	Yes
JSON data	No	Yes
XML data	No	Yes
PDF, Word, Excel data	No	Yes
ACID Compliance	No	Yes
Stored Procedures	Yes (Map-Reduce Only)	Yes (Java + Scala)
In-memory Processing	No (Possible with Spark)	Yes (Built-in engine)

Future-proof your DataLake

Using BlobCity as a DataLake future-proofs your DataLake infrastructure. With native support for 17 different data formats and the option of moving part of the data into memory for faster analytics, BlobCity strikes the right balance between features, performance, customer needs and cost effectiveness.

New systems may bring in data in newer formats that are not currently anticipated, and BlobCity will most likely accept those formats readily. This means new systems can report data to your DataLake with minimal integration effort.

If some queries perform slowly because of disk I/O limits, the corresponding data can be moved into memory instantly, allowing high-speed, real-time analytics over that data.