Spark New Optimization Rule - ReplaceExceptWithNotFilter

In SPARK-12660, the Spark community decided to replace the Except logical operator with a left anti-join. This lets Except take advantage of the benefits of join operations, such as managed memory, code generation, and broadcast joins (cf. SPARK-12660). [Read More]
Tags: hadoop spark sql

Spark Adding Custom Optimization Rules

One of the main benefits of Spark SQL, as mentioned in its SIGMOD paper, is the ability to easily define and plug in user-defined ad hoc rules for better optimization. Spark SQL provides an API for adding a set of ad hoc rules that can be plugged into the query... [Read More]
Tags: hadoop spark sql

Distributed Hash Table (DHT) - Database Perspective

Distributed Hash Tables (DHTs) have traditionally been used in decentralized distributed systems. In a DHT, data is distributed among several nodes in the system via hashing techniques. DHTs are now increasingly popular in modern storage systems, also known as NoSQL systems. This post reveals some of the uses and applications of DHT in today's modern... [Read More]
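
To make "distributed among several nodes via hashing" concrete, here is a minimal sketch of one common approach, a consistent-hash ring. The teaser does not say which hashing technique the post covers, so treat the ring itself as an assumption; the `HashRing` class and the node names are hypothetical and used only for illustration.

```python
import hashlib
from bisect import bisect_right


def ring_position(key: str) -> int:
    """Map a string to a position on the ring using the first 8 bytes of SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring: a key belongs to the first node clockwise."""

    def __init__(self, nodes):
        # Place every node on the ring at the hash of its name, sorted by position.
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Find the first node position strictly greater than the key's position,
        # wrapping around to the start of the ring when we run off the end.
        idx = bisect_right(self._positions, ring_position(key)) % len(self._ring)
        return self._ring[idx][1]


nodes = ["node-a", "node-b", "node-c", "node-d"]  # assumed node names
ring = HashRing(nodes)
for key in ["user:1", "user:2", "order:42"]:
    print(key, "->", ring.node_for(key))
```

A useful property of this layout is that removing a node only reassigns the keys that node owned; every other key keeps its owner.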

Cassandra CQL Client Limitations

Column-family data stores are a special class of NoSQL system that can store a wide range of data types, with a design that sits between traditional relational database systems and modern key-value stores. Cassandra is a column-family data store built on a peer-to-peer architecture. One of the important features of any data... [Read More]
Tags: cassandra