Learnitweb

Author: Editorial Team

  • Cumulative Spend per Customer per Month in Oracle

    1. Introduction: In many analytical and financial systems, you often need to calculate how much a customer has spent over time, either within a month (running total) or across months (year-to-date cumulative spend). This is a classic window function (analytic function) use case in Oracle. 2. Example Table: Let’s start with a…
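
    As a rough illustration of the running-total idea in this excerpt, here is a minimal Python sketch rather than the article's Oracle query. The `(customer, order_date, amount)` tuple layout and the `running_spend` name are assumptions made for the example; the accumulation follows the spirit of `SUM(...) OVER (PARTITION BY ... ORDER BY ...)`.

```python
from collections import defaultdict
from datetime import date


def running_spend(orders):
    """Return rows of (customer, day, amount, month_to_date, year_to_date).

    `orders` is an iterable of (customer, day, amount) tuples; this layout is
    an assumption for illustration, not the table used in the article.
    """
    month_totals = defaultdict(int)  # keyed by (customer, year, month)
    year_totals = defaultdict(int)   # keyed by (customer, year)
    rows = []
    # Sort by customer, then date, so each total grows in chronological order.
    for customer, day, amount in sorted(orders, key=lambda o: (o[0], o[1])):
        month_key = (customer, day.year, day.month)
        year_key = (customer, day.year)
        month_totals[month_key] += amount
        year_totals[year_key] += amount
        rows.append((customer, day, amount,
                     month_totals[month_key], year_totals[year_key]))
    return rows


sample_orders = [
    ("C1", date(2024, 1, 5), 100),
    ("C1", date(2024, 2, 3), 30),
    ("C1", date(2024, 1, 20), 50),
    ("C2", date(2024, 1, 7), 70),
]
spend_rows = running_spend(sample_orders)
```

    The month-to-date total resets when the month changes, while the year-to-date total keeps growing across months for the same customer.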

  • Detecting gaps in sequential data

    1. Introduction: In many business applications, you may have tables containing sequential values, such as IDs or dates. However, due to missing records, data corruption, or load errors, some of these values may be missing. Detecting these gaps is a very common data quality check. In Oracle, there are several ways to identify these gaps using SQL. 2.…
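
    The core of gap detection is comparing each value with its predecessor in sorted order. The Python sketch below shows that comparison under the assumption that the sequence is a list of integer IDs; the `find_gaps` name is hypothetical, and this is not one of the Oracle SQL approaches the article describes.

```python
def find_gaps(ids):
    """Return a list of (first_missing, last_missing) ranges in an ID sequence."""
    ordered = sorted(set(ids))  # ignore duplicates and input order
    gaps = []
    # Walk neighbouring pairs; a jump larger than 1 means values are missing.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps


missing_ranges = find_gaps([1, 2, 3, 6, 7, 10])
```

    Each tuple gives the inclusive start and end of one missing run, so a single missing ID appears as a range whose two ends are equal.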

  • Transpose rows into columns using case and without case

    1. Introduction: What is Transposing Rows into Columns? When you transpose rows into columns, you’re converting row values into multiple columns. This is often needed in reporting, data aggregation, or visualization where you want each unique value in a column (like department or month) to become a separate column. Example Input Table Let’s take an…
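
    To make the idea of turning row values into columns concrete, here is a small Python sketch. The `pivot` helper and the `dept`, `month`, and `amount` field names are assumptions chosen for illustration and do not come from the article's example table.

```python
from collections import defaultdict


def pivot(rows, row_key, column_key, value_key):
    """Sum `value_key` into one column per distinct `column_key` value."""
    columns = sorted({row[column_key] for row in rows})
    # Every output row starts with a zero in each column.
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for row in rows:
        totals[row[row_key]][row[column_key]] += row[value_key]
    return {key: dict(cols) for key, cols in totals.items()}


sample_rows = [
    {"dept": "HR", "month": "Jan", "amount": 10},
    {"dept": "HR", "month": "Feb", "amount": 20},
    {"dept": "IT", "month": "Jan", "amount": 5},
    {"dept": "HR", "month": "Jan", "amount": 1},
]
pivoted = pivot(sample_rows, "dept", "month", "amount")
```

    Each distinct month becomes its own column, and departments with no rows for a month get a zero in that column.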

  • CQRS (Command Query Responsibility Segregation)

    1. Introduction: In traditional application architectures, especially those following a CRUD (Create, Read, Update, Delete) model, the same data model and service layer handle both read and write operations. While this approach works well for simple systems, it begins to show limitations as applications grow more complex, particularly in terms of scalability, performance, and…

  • MinHash

    Introduction to MinHash: MinHash, short for “Min-wise Independent Permutations Hashing”, is a probabilistic algorithm used to estimate the similarity between two sets. Unlike exact similarity measures, which require comparing every element in both sets, MinHash provides a fast and memory-efficient way to approximate similarity, making it extremely useful for large-scale datasets. MinHash is primarily used…
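
    A minimal Python sketch of the estimation idea follows. It assumes salted SHA-256 digests as a stand-in for independent hash functions; the function names and the choice of 128 hashes are illustrative assumptions, not taken from the article.

```python
import hashlib


def _seeded_hash(seed, item):
    """64-bit hash of `item`, salted with `seed` to simulate one hash function."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=128):
    """Keep the smallest hash value per hash function (items must be non-empty)."""
    return [min(_seeded_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimate_jaccard(signature_a, signature_b):
    """The fraction of matching positions approximates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(signature_a, signature_b))
    return matches / len(signature_a)


set_a = {f"item{i}" for i in range(0, 100)}
set_b = {f"item{i}" for i in range(50, 150)}
estimated_similarity = estimate_jaccard(minhash_signature(set_a),
                                        minhash_signature(set_b))
```

    The two sets share 50 of 150 distinct items, so the true Jaccard similarity is 1/3. The estimate should land near that value, with accuracy improving as `num_hashes` grows.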

  • Cuckoo Filter

    Introduction to Cuckoo Filter: A Cuckoo Filter is a probabilistic data structure used to efficiently check whether an element exists in a set. It is similar to a Bloom Filter, but provides additional benefits like support for deletions and better space efficiency in many cases. Cuckoo Filters are part of the approximate membership query (AMQ)…

  • Count-Min Sketch

    Introduction to Count-Min Sketch: Count-Min Sketch (CMS) is a probabilistic data structure used for frequency estimation of elements in a data stream. It allows you to estimate how many times an element has appeared in a large dataset or streaming data without storing every element explicitly. Count-Min Sketch is part of the streaming algorithms family,…
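
    The frequency-estimation idea can be sketched in a few lines of Python. The class below is a simplified illustration: the `width` and `depth` defaults and the salted SHA-256 row hashes are assumptions for the example, not parameters from the article.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters; estimates never undercount."""

    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Salt the hash with the row number so each row acts as its own hash.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum is the best guess.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))


sketch = CountMinSketch()
for word in ["apple", "apple", "pear", "apple"]:
    sketch.add(word)
apple_estimate = sketch.estimate("apple")
```

    Memory use depends only on `width * depth`, not on the number of distinct items, and the trade-off is that estimates may be too high when items collide.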

  • HyperLogLog

    Introduction to HyperLogLog: HyperLogLog (HLL) is a probabilistic algorithm used to efficiently estimate the number of unique elements (cardinality) in a very large dataset. Unlike traditional counting methods, which require storing every single element to ensure accurate counting, HyperLogLog achieves this using a fixed and very small amount of memory, making it extremely suitable for…

  • Raft and Paxos

    Introduction to Consensus Algorithms: In a distributed system, multiple nodes (servers) often need to agree on a single value or system state, even when some nodes crash or messages are delayed. This is called the consensus problem, and solving it correctly is critical for ensuring fault-tolerant and consistent systems. Consensus is fundamental in…

  • Merkle Tree

    Introduction to Merkle Tree: A Merkle Tree, also known as a hash tree, is a fundamental data structure used for verifying data integrity and consistency in distributed and decentralized systems. It is named after Ralph Merkle, who introduced the concept in 1979. Unlike standard data structures that store complete information, a Merkle Tree stores only…
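
    To show how a hash tree supports integrity checks, here is a short Python sketch that computes a root hash from a list of data blocks. It assumes SHA-256 and the convention of duplicating the last node on odd-sized levels; both are illustrative choices, and real systems differ in these details.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Return the hex root hash for a non-empty list of byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_sha256(block) for block in blocks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last node
        # Each parent is the hash of its two children concatenated.
        level = [_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()


original_root = merkle_root([b"a", b"b", b"c", b"d"])
tampered_root = merkle_root([b"a", b"b", b"x", b"d"])
```

    Changing any single block changes the root, so two parties can compare one short hash to check whether their copies of the data match.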