Spark New Optimization Rule - ReplaceExceptWithNotFilter
The Spark community decided to replace the Except logical operator with a left anti-join in SPARK-12660.
This lets Except take advantage of the benefits of join operations, such as managed memory,
code generation and broadcast joins.
[Read More]
Spark Adding Custom Optimization Rules
One of the main benefits of Spark SQL, as mentioned in its SIGMOD paper, is the ability to easily define and plug in user-defined ad hoc rules for better optimization. Spark SQL provides an API for adding a set of ad hoc rules that can be plugged into the query...
[Read More]
Spark Catalyst Internals
Spark Catalyst is one of the secret sauces behind Spark’s operations on structured data. Let’s take
a deep look into its internals.
[Read More]
Distributed Hash Table (DHT) - Database Perspective
Distributed Hash Tables (DHTs) have traditionally been applied in decentralized distributed systems. In a DHT, data is distributed among several nodes in the system via hashing techniques. DHTs are now increasingly popular in modern storage systems, commonly known as NoSQL systems. This post reveals some of the uses and applications of DHT in today's modern...
[Read More]
Cassandra CQL Client Limitations
Column-family data stores are a special class of NoSQL systems that can store a wide range of data types, with a design that sits between traditional relational database systems and modern key-value stores. Cassandra is a column-family data store built on a peer-to-peer architecture. One of the important features of any data...
[Read More]