
Array
An array (or static array) organizes items sequentially, one after another in memory.
Each position in the array has an index, starting at 0.
Pros
- Fast lookups: Retrieving the element at a given index takes O(1) time, regardless of the length of the array.
- Fast appends: Adding a new element at the end of the array takes O(1) time.
Cons
- Fixed-size: You need to specify how many elements you’re going to store in your array ahead of time. (Unless you’re using a fancy dynamic array.)
- Costly inserts and deletes: You have to “scoot over” the other elements to fill in or close gaps, which takes worst-case O(n) time.
Array Operations
| Operation | Description | Cost |
|---|---|---|
| lookup | Direct lookup of an item in the array given its index | O(1) |
| append | Add an element to the end of the array | O(1) |
| insert | Insert an element in the middle or beginning of the array. First, we need to make space for the new element by moving over everything starting at the index i. | O(n) |
| delete | Delete an element anywhere in the array. We have to fill in the gap when we remove an element since array elements are stored adjacent to each other. | O(n) |
Dynamic Array
A dynamic array (ArrayList in Java) is an array that resizes itself as needed while still providing O(1) access.
A typical implementation doubles the size of the array whenever it fills up. Each doubling takes O(n) time, but happens so rarely that the amortized insertion time is still O(1). This technique is called table doubling.
- Table doubling algorithm: create a new array with double the length of the original array, and copy the items from the original array over to the new array (takes O(n)).
- Amortized analysis: “Amortize” is a fancy verb used in finance that refers to paying off the cost of something gradually. With dynamic arrays, every expensive append where we have to grow the array “buys” us many cheap appends in the future. Conceptually, we can spread the cost of the expensive append over all those cheap appends.
Pros:
- Fast lookups: Just like arrays, retrieving the element at a given index takes O(1) time.
- Variable size: You can add as many items as you want, and the dynamic array will expand to hold them.
- Cache-friendly: Just like arrays, dynamic arrays place items right next to each other in memory, making efficient use of caches.
Cons:
- Slow worst-case appends: Usually, adding a new element at the end of the dynamic array takes O(1) time. But if the dynamic array doesn’t have any room for the new item, it’ll need to expand, which takes O(n) time.
- Costly inserts and deletes: Just like arrays, elements are stored adjacent to each other. So adding or removing an item in the middle of the array requires “scooting over” other elements, which takes O(n) time.
Dynamic Array Operations
| Operation | Description | Cost |
|---|---|---|
| lookup | Direct lookup of an item in the array given its index | O(1) |
| append | Add an element to the end of the array | O(1) amortized, O(n) worst case (resize) |
| insert | Insert an element in the middle or beginning of the array (similar to a static array) | O(n) |
| delete | Delete an element anywhere in the array (similar to a static array) | O(n) |
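To make table doubling concrete, here is a minimal sketch of a dynamic array (the class name and growth factor are illustrative; real implementations such as Java's ArrayList grow by roughly 1.5x):
```js
class DynamicArray {
  constructor() {
    this.capacity = 1;                      // size of the underlying array
    this.length = 0;                        // number of items actually stored
    this.items = new Array(this.capacity);
  }

  get(index) {
    return this.items[index];               // O(1), just like a static array
  }

  append(value) {
    if (this.length === this.capacity) {
      // Table doubling: allocate an array twice as long and copy everything
      // over. This is O(n), but it happens rarely enough that appends are
      // still O(1) amortized.
      this.capacity *= 2;
      const bigger = new Array(this.capacity);
      for (let i = 0; i < this.length; i++) bigger[i] = this.items[i];
      this.items = bigger;
    }
    this.items[this.length] = value;        // O(1) in the common case
    this.length += 1;
  }
}
```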
Hash Table
An abstract data type (ADT) that maps keys to values. A hash table uses a hash function to compute a hash code, which is used as an index into an array of buckets or slots, from which the desired value can be found.
Pros:
- Fast lookups: Lookups take O(1) time on average.
- Flexible keys: Most data types can be used for keys, as long as they’re hashable.
Cons:
- Slow worst-case lookups: Lookups take O(n) time in the worst case.
- Unordered: Keys aren’t stored in a special order. If you’re looking for the smallest key, the largest key, or all the keys in a range, you’ll need to look through every key to find it.
- Single-directional lookups: While you can look up the value for a given key in O(1) time, looking up the keys for a given value requires looping through the whole dataset—O(n) time.
- Not cache-friendly: Many hash table implementations use linked lists, which don’t put data next to each other in memory.
Use cases:
- English words spelling corrections
- Network routers: store IP addresses
- Network server: Port number ⇒ socket/app
Goals:
- Minimize collisions
- Uniform distribution of hash values
- A hash function that is easy to calculate
- Resolve any collisions that do happen
Hash Table Operations
| Operation | Description | Cost |
|---|---|---|
| insert | Insert item anywhere in the hash table | O(1) average, O(n) worst case |
| lookup | Return item from hash table given its key | O(1) average, O(n) worst case |
| delete | Delete an item from the hash table given its key | O(1) average, O(n) worst case |
Implementation
- Array of LinkedLists (or Dynamic Arrays), i.e. chaining:
  - Given a key (string, integer, or any type), compute an integer hash code using a hash function, then map it to an index in the array: index = hash(key) % array_length
  - Example hashing functions:
    - For numeric keys: divide the key by the number of available slots in the array and take the remainder.
    - For alphanumeric keys: sum the ASCII codes of the characters in the key and take the remainder.
    - Folding method: divide the key into equal parts and sum the parts, e.g. the telephone number 012345 ⇒ 01 + 23 + 45.
  - Store the associated value with the key at the index. Set the value to a single-node linked list, or append to the existing linked list in case of a collision.
  - For retrieval, the same process is applied: compute the hash code from the key, then compute the index from the hash code. Then, search through the linked list for the value with this key.
- Binary Search Tree: gives us O(log(n)) lookup time. The advantage is potentially using less space, since we no longer allocate a large array. We can also iterate through the keys in order, which can be useful sometimes.
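Here is a minimal sketch of the chaining approach described above (class and method names are illustrative, and plain JavaScript arrays stand in for the per-slot linked lists):
```js
class HashTable {
  constructor(size = 16) {
    this.buckets = new Array(size); // each slot holds a list of [key, value] pairs
  }

  hash(key) {
    // Illustrative hash for string keys: sum the character codes, then take
    // the remainder to map the hash code to a bucket index.
    let code = 0;
    for (const char of String(key)) code += char.charCodeAt(0);
    return code % this.buckets.length;
  }

  insert(key, value) {
    const index = this.hash(key);
    if (!this.buckets[index]) this.buckets[index] = []; // first item: new chain
    const chain = this.buckets[index];
    for (const pair of chain) {
      if (pair[0] === key) { pair[1] = value; return; }  // update existing key
    }
    chain.push([key, value]); // collision: append to the chain
  }

  lookup(key) {
    const chain = this.buckets[this.hash(key)] || [];
    for (const [k, v] of chain) if (k === key) return v;
    return undefined; // key not found
  }
}
```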
Collision Resolution
- Open addressing: if a slot is full, search for another open slot. There are different ways to select the next available spot:
  - Linear probing: use the next available slot, cycling around to the beginning if the end is reached.
    - Linear probing applies to both insertion and searching.
    - Has problems: keys might bunch together while other slots are still free (primary clustering).
  - Plus-3 rehash: look at every third slot instead of the next one, which helps avoid clustering.
  - Quadratic probing: find the next slot by squaring the number of failed attempts (1, 4, 9, ... slots away).
  - Double hashing: apply a second hash function to the key to determine the interval between probes.
- Chaining: store a linked list in each slot of the array, and add items to the end of the list upon collisions.
The more items a hash table holds, the more likely collisions become as more data is inserted. One way to reduce collisions is to make the hash table bigger than it needs to be (e.g. at most 75% occupied), keeping the load factor low:
Load factor = (total number of items stored) / (size of the array)
A hash table can be implemented such that it grows when the load factor reaches a certain threshold.
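A minimal sketch of open addressing with linear probing, as described above (illustrative names; deletion is omitted since it requires tombstone markers, and a real table would resize before filling up):
```js
class LinearProbingTable {
  constructor(size = 16) {
    this.keys = new Array(size).fill(null);
    this.values = new Array(size);
  }

  hash(key) {
    let code = 0;
    for (const char of String(key)) code += char.charCodeAt(0);
    return code % this.keys.length;
  }

  insert(key, value) {
    let i = this.hash(key);
    // Probe forward (wrapping around) until we find a free slot or the key.
    while (this.keys[i] !== null && this.keys[i] !== key) {
      i = (i + 1) % this.keys.length;
    }
    this.keys[i] = key;
    this.values[i] = value;
  }

  lookup(key) {
    let i = this.hash(key);
    while (this.keys[i] !== null) {
      if (this.keys[i] === key) return this.values[i];
      i = (i + 1) % this.keys.length; // keep probing the same way as insert
    }
    return undefined; // hit an empty slot: the key is absent
  }
}
```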
Linked List
A linked list is a data structure that represents a sequence of nodes. In a singly linked list, each node points to the next node in the linked list. A doubly linked list gives each node pointers to both the next node and the previous node.
Unlike an array, consecutive items in a linked list are not necessarily next to each other in memory.
Pros:
- Fast operations on the ends: Adding an element at either end of a linked list takes O(1) time, assuming we have a pointer to the tail. Removing the first element is also O(1).
- Flexible size: There’s no need to specify how many elements you’re going to store ahead of time. You can keep adding elements as long as there’s enough space on the machine.
Cons:
- Costly lookups: To access or edit an item in a linked list, you have to take O(i) time to walk from the head of the list to the ith item.
Linked List Operations
| Operation | Description | Cost |
|---|---|---|
| insert | Insert an item in the middle of the list. Walking to the insertion point takes O(n); once we have a reference to the previous node, the relinking itself is O(1) | O(n) |
| prepend | Insert an item at the head of the list | O(1) |
| append | Insert an item at the tail of the list, assuming we have a reference to the tail | O(1) |
| lookup | Search for an item in the list | O(n) |
| delete | Delete an item from the list. Involves searching for the item first (O(n)), then deleting it once we have a reference to it (O(1)) | O(n) |
Implementation
- Singly Linked List
```js
class Node {
  constructor(data) {
    this.data = data;
    this.next = null;
  }
}
```
- Doubly Linked List
```js
class Node {
  constructor(data) {
    this.data = data;
    this.next = null;
    this.prev = null;
  }
}
```
Techniques
- Fast & slow pointers (see the sketch below)
- Used in recursive problems
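For instance, here is a minimal sketch of the fast & slow pointer technique: finding the middle node of a singly linked list in one pass (the function name is illustrative; it reuses the Node class above):
```js
function findMiddle(head) {
  let slow = head;
  let fast = head;
  // The fast pointer advances two steps for every one step of the slow
  // pointer, so when fast reaches the end, slow is at the middle.
  while (fast !== null && fast.next !== null) {
    slow = slow.next;
    fast = fast.next.next;
  }
  return slow;
}
```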
Stack
A stack stores items in a last-in, first-out (LIFO) order.
Pros
- Fast operations: All stack operations take O(1) time.
Uses
- The call stack is a stack that tracks function calls in a program. When a function returns, which function do we “pop” back to? The last one that “pushed” a function call.
- Depth-first search uses a stack (sometimes the call stack) to keep track of which nodes to visit next.
- String parsing — stacks turn out to be useful for several types of string parsing.
Stack Operations
| Operation | Description | Cost (Linked List) | Cost (Dynamic Array) |
|---|---|---|---|
| pop | Remove the top item from the stack | O(1) | O(1) |
| push | Add an item to the top of the stack | O(1) | O(1) amortized, O(n) worst case (resize) |
| peek | Return the item at the top of the stack | O(1) | O(1) |
| isEmpty | Return true if the stack is empty | O(1) | O(1) |
Implementation
- Linked List
  - push: insert at the head
  - pop: delete at the head
- Dynamic Array
  - push: append at the end
  - pop: remove the last element
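A minimal sketch of the dynamic array approach, using a plain JavaScript array as the backing store (the class name is illustrative):
```js
class Stack {
  constructor() {
    this.items = [];
  }
  push(item) { this.items.push(item); }                 // add to the top
  pop() { return this.items.pop(); }                    // remove from the top
  peek() { return this.items[this.items.length - 1]; }  // look at the top
  isEmpty() { return this.items.length === 0; }
}
```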
Techniques
- Used to implement recursive algorithms iteratively (BFS, DFS…)
Queue
A queue stores items in a first-in, first-out (FIFO) order.
Pros
- Fast operations: All queue operations take O(1) time.
Uses
- Breadth-first search uses a queue to keep track of the nodes to visit next.
- Printers use queues to manage jobs—jobs get printed in the order they’re submitted.
- Web servers use queues to manage requests—page requests get fulfilled in the order they are received.
- Processes wait in the CPU scheduler’s queue for their turn to run.
Queue Operations
| Operation | Description | Cost |
|---|---|---|
| enqueue | Add an item at the back of the queue | O(1) |
| dequeue | Remove the item at the front of the queue and return it | O(1) |
| peek | Return the item at the front of the queue | O(1) |
| isEmpty | Return true if the queue is empty | O(1) |
Implementation
- Linked List
  - enqueue: insert at the tail
  - dequeue: delete at the head
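A minimal sketch of a linked-list queue with head and tail pointers, so both operations stay O(1) (it reuses the Node class from the Linked List section):
```js
class Queue {
  constructor() {
    this.head = null; // front of the queue (dequeue side)
    this.tail = null; // back of the queue (enqueue side)
  }
  enqueue(data) {
    const node = new Node(data);
    if (this.tail) this.tail.next = node; // link after the current tail
    else this.head = node;                // empty queue: node is also the head
    this.tail = node;
  }
  dequeue() {
    if (!this.head) return null;
    const { data } = this.head;
    this.head = this.head.next;
    if (!this.head) this.tail = null;     // the queue became empty
    return data;
  }
}
```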
Techniques
- Used to implement BFS, keeping track of the nodes to visit next.
Tree
A tree organizes values hierarchically where each entry in the tree is called a node, and every node links to zero or more child nodes.
Use cases
- Filesystems: files inside folders inside folders
- Comments: comments, replies to comments, replies to replies
- Family trees: parents, grandparents, children, and grandchildren
Leaves, Depth, and Height
- Leaf nodes: the nodes at the bottom of the tree that have no children.
- Depth of a node: the number of links from the root to the node; each node has a depth.
- Height of a tree: the number of links from its root to the furthest leaf. (That's the same as the maximum node depth.)

Tree Traversals
- Breadth-First Search (BFS) (iterative, using a queue)
  In a BFS, you first explore all the nodes one step away, then all the nodes two steps away, etc.
- Depth-First Search (DFS) (recursive, or iterative using a stack)
  In a DFS, you go as deep as possible down one path before backing up and trying a different one.
- Pre-order traversal (Root => Left => Right)
  Visits nodes in the same order as a DFS.
- In-order traversal (Left => Root => Right)
  Most common; visits the nodes of a binary search tree in sorted, ascending order.
- Post-order traversal (Left => Right => Root)
  Rare, but it shows up in parsing algorithms or evaluating mathematical expressions.
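Minimal traversal sketches, assuming nodes with value, left, and right fields (see the BinaryTreeNode implementation below):
```js
function bfs(root) {
  const values = [];
  const queue = [root];          // nodes discovered but not yet visited
  while (queue.length > 0) {
    const node = queue.shift();  // FIFO: visit closer nodes first
    if (node === null) continue;
    values.push(node.value);
    queue.push(node.left, node.right);
  }
  return values;
}

function inOrder(node, values = []) {
  if (node === null) return values;
  inOrder(node.left, values);   // left subtree first...
  values.push(node.value);      // ...then the node itself...
  inOrder(node.right, values);  // ...then the right subtree
  return values;
}
```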
Binary Tree
A binary tree is a tree where every node has at most two children. The children are usually called left and right.

- Full binary tree: A full binary tree is a binary tree where every node has exactly 0 or 2 children. That is, no nodes have only one child.

- Complete binary tree: A complete binary tree is a binary tree where every level is full, except possibly the last, which is filled from left to right. Complete trees are the basis for heaps and priority queues.

![Complete binary tree](https://i.imgur.com/kzEdeRz.png)
- Perfect binary tree: A perfect binary tree is a tree where every level is completely full of nodes.
  - The number of nodes at depth d in a perfect binary tree is 2^d.
  - The number of nodes in a perfect binary tree of height h is 2^(h+1) - 1.
- Balanced binary tree: A balanced binary tree is a tree whose height is small relative to the number of nodes it has. By small, we usually mean the height is O(log(n)), where n is the number of nodes. This ensures O(log(n)) insert and find operations. Two common types of balanced trees are Red-Black trees and AVL trees.
- Relationship between height and number of nodes: a perfect binary tree with n nodes has height h = log2(n + 1) - 1, so for balanced trees the height grows as O(log(n)).
Implementation
```js
class BinaryTreeNode {
  constructor(value) {
    this.value = value;
    this.left = null;
    this.right = null;
  }
}
```
Binary Search Tree (BST)
A binary search tree is a binary tree where the nodes are ordered in a specific way. For every node:
- All the nodes to the left are smaller than the current node.
- All the nodes to the right are larger than the current node.
Pros
- Good performance across the board. Assuming they’re balanced, binary search trees are good at lots of operations, even if they’re not constant time for anything.
- Compared to a sorted array, lookups take the same amount of time (O(log(n))), but inserts and deletes are faster (O(log(n)) for BSTs, O(n) for arrays).
- Compared to objects, BSTs have better worst-case performance: O(log(n)) instead of O(n). But on average, objects perform better than BSTs (O(1) time complexity).
- BSTs are sorted: Taking a binary search tree and pulling out all of the elements in sorted order can be done in O(n) using an in-order traversal. Finding the element closest to a value can be done in O(log(n)) (again, if the BST is balanced!).
Cons
- Poor performance if unbalanced: Some types of binary search trees balance automatically, but not all. If a BST is not balanced, then operations become O(n).
- No O(1) operations: BSTs aren't the fastest for anything. On average, an array or an object will be faster.
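A minimal sketch of an (unbalanced) BST's insert and lookup, following the ordering property above (the class name is illustrative; duplicates go right here by convention):
```js
class BSTNode {
  constructor(value) {
    this.value = value;
    this.left = null;
    this.right = null;
  }
  insert(value) {
    if (value < this.value) {
      if (this.left) this.left.insert(value);
      else this.left = new BSTNode(value);   // found an empty left spot
    } else {
      if (this.right) this.right.insert(value);
      else this.right = new BSTNode(value);  // found an empty right spot
    }
  }
  contains(value) {
    if (value === this.value) return true;
    const next = value < this.value ? this.left : this.right;
    return next !== null && next.contains(value); // O(log(n)) if balanced
  }
}
```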
Binary Heap (Min)
A binary heap is a binary tree where the smallest value is always at the top.
Pros
- Quickly access the smallest item: Binary heaps allow you to grab the smallest item (the root) in O(1) time while keeping other operations relatively cheap (O(log(n)) time).
- Space efficient: Binary heaps are usually implemented with arrays, saving the overhead cost of storing pointers to child nodes.
Cons
- Limited interface: Binary heaps only provide easy access to the smallest item. Finding other items in the heap takes O(n) time since we have to iterate through all the nodes.
Uses
- Priority queues are often implemented using heaps. Items are enqueued by adding them to the heap, and the highest-priority item can quickly be grabbed from the top.
- Heapsort: One way to efficiently sort an array of items is to make a heap out of them and then remove items one at a time—in sorted order.
Binary Min Heap Operations
| Operation | Description | Cost |
|---|---|---|
| insert | Insert an item into a min-heap. We always start by inserting the element at the bottom. We insert at the rightmost spot to maintain the complete tree property. Then, we “fix” the tree by swapping the new element with its parent, until we find an appropriate spot for the element. We essentially bubble up the minimum element. | O(log(n)) |
| get_min | Return the min without removing it | O(1) |
| extract_min | Return the min item of the heap: it's always at the top. The trickier part is how to remove it: 1. First, we remove the minimum element and swap it with the last element in the heap (the bottommost, rightmost element). 2. Then, we bubble down this element, swapping it with one of its children until the min-heap property is restored. 3. Do we swap it with the left child or the right child? That depends on their values. There's no inherent ordering between the left and right child, but you'll need to take the smaller one to maintain the min-heap ordering. | O(log(n)) |
| heapify | Transform an array into a heap | O(n) |
Implementation
- Heaps are built using arrays.
- Navigation between nodes, for the node at index i (0-indexed):
  - Left child: 2 * i + 1
  - Right child: 2 * i + 2
  - Parent: floor((i - 1) / 2)
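A minimal min-heap sketch using the index arithmetic above (the class and method names are illustrative):
```js
class MinHeap {
  constructor() {
    this.items = [];
  }
  insert(value) {
    this.items.push(value); // add at the bottommost, rightmost spot
    let i = this.items.length - 1;
    // Bubble up: swap with the parent while we're smaller than it.
    while (i > 0) {
      const parent = Math.floor((i - 1) / 2);
      if (this.items[parent] <= this.items[i]) break;
      [this.items[parent], this.items[i]] = [this.items[i], this.items[parent]];
      i = parent;
    }
  }
  extractMin() {
    const min = this.items[0];
    const last = this.items.pop(); // the bottommost, rightmost element
    if (this.items.length > 0) {
      this.items[0] = last; // move it to the top...
      let i = 0;
      // ...then bubble down, swapping with the smaller child each time.
      while (true) {
        const left = 2 * i + 1;
        const right = 2 * i + 2;
        let smallest = i;
        if (left < this.items.length && this.items[left] < this.items[smallest]) smallest = left;
        if (right < this.items.length && this.items[right] < this.items[smallest]) smallest = right;
        if (smallest === i) break;
        [this.items[smallest], this.items[i]] = [this.items[i], this.items[smallest]];
        i = smallest;
      }
    }
    return min;
  }
}
```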
Priority Queue
A priority queue is a special queue where:
- Every item in the queue has a priority, and
- Higher-priority items are dequeued before lower-priority items.

Picture a big list of bugs for an engineering team to tackle. You want to keep the highest-priority bugs at the top of the list.
Pros:
- Quickly access the highest-priority item: Priority queues allow you to peek at the top item in O(1) time while keeping other operations relatively cheap (O(log(n)) time).
Cons:
- Slow enqueues and dequeues: Both operations take O(log(n)) time with priority queues. With normal first-in, first-out queues, these operations are O(1) time.
Uses
- Any time you want to handle things with different priority levels: triaging patients in a hospital, locating the closest available taxi, or just a to-do list.
- Operating system schedulers may use priority queues to select the next process to run, ensuring high-priority tasks run before low-priority ones.
- Certain foundational algorithms rely on priority queues:
- Dijkstra’s shortest-path
- A* search (a graph search algorithm related to BFS)
- Huffman codes (an encoding for data compression)
Implementation
- Binary Heaps
Priority queues are often implemented using binary heaps. Notice how the highest priority is right at the top of the heap, ready to be grabbed in O(1) time.
- To enqueue an item, add it to the heap using the priority as the key. (O(log(n)) time)
- To peek at the highest priority item, look at the item at the top. (O(1) time)
- To dequeue the highest priority item, remove the top item from the heap. (O(log(n)) time)
- A Sorted List
- A Sorted Linked List
Trie
Also known as a prefix tree or digital tree. A trie is a variant of an n-ary tree in which characters are stored at each node. Each path down the tree may represent a word.
Pros
- Sometimes Space-Efficient: If you’re storing lots of words that start with similar patterns, tries may reduce the overall storage cost by storing shared prefixes once.
- Efficient Prefix Queries: Tries can quickly answer queries about words with shared prefixes, like:
- How many words start with “choco”?
- What’s the most likely next letter in a word that starts with “strawber”?
Cons
- Usually space-inefficient: Tries rarely save space when compared to storing strings in a set.
- ASCII characters in a string are one byte each: Each link between trie nodes is a pointer to an address—eight bytes on a 64-bit system. So, the overhead of linking nodes together often outweighs the savings from storing fewer characters.
- Not Standard: Most languages don’t come with a built-in trie implementation. You’ll need to implement one yourself.
- Radix trees: A radix tree is like a trie, but it saves space by combining nodes that have only one child.
- Marking word endings: One word might be a prefix of another, and we still need to be able to identify both words separately. A common implementation is a boolean flag, such as isComplete, on the node where a word ends.
- Compared to a hash table: A trie can check whether a string is a valid prefix in O(k) time, the same runtime a hash table takes to look up a string of length k.
Trie Operations
| Operation | Description | Cost |
|---|---|---|
| insert | Insert a word of length k into the trie | O(k) |
| lookup | Determine if a word of length k is in the trie | O(k) |
Implementation
```js
class TrieNode {
  constructor(character) {
    this.character = character; // character stored in this node
    this.isComplete = false;    // true if a complete word ends here
    this.children = new Map();  // Map<character, TrieNode>
  }
}
class Trie {
  constructor() {
    this.root = new TrieNode(null); // the root stores no character
  }
  insert(word) {
    let node = this.root;
    for (const char of word) {
      if (!node.children.has(char)) node.children.set(char, new TrieNode(char));
      node = node.children.get(char);
    }
    node.isComplete = true; // mark the end of a complete word
  }
  lookup(word) {
    let node = this.root;
    for (const char of word) {
      node = node.children.get(char);
      if (!node) return false;
    }
    return node.isComplete;
  }
}
```
Techniques
- Tries are used for searching for words or prefixes in dictionaries.
- Many problems involving lists of valid words leverage a trie as an optimization: we can keep a reference to the current node in the trie as we do lookups, e.g: autocomplete, spell checking…
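For example, the isComplete flag is what lets the sketch above distinguish a stored word from a mere prefix:
```js
const dictionary = new Trie();
dictionary.insert("straw");
dictionary.insert("strawberry");

dictionary.lookup("straw");     // true: a complete word ends here
dictionary.lookup("strawberr"); // false: only a prefix of "strawberry"
```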
Graph
A graph organizes items in an interconnected network where:
- Each item is a node (or vertex).
- Nodes are connected by edges.
Pros
- Representing links: Graphs are ideal for cases where you’re working with things that connect to other things. Nodes and edges could, for example, respectively represent cities and highways, routers and ethernet cables, or Facebook users and their friendships.
Cons
- Scaling challenges: Most graph algorithms are O(n log(n)) or even slower. Depending on the size of your graph, running algorithms across your nodes may not be feasible.
Terminology
- Directed or undirected
In directed graphs, edges point from the node at one end to the node at the other end. In undirected graphs, the edges simply connect the nodes at each end.
- Cyclic or acyclic: A graph is cyclic if it has a cycle, an unbroken series of nodes with no repeating nodes or edges that connects back to itself. Graphs without cycles are acyclic.
- Weighted or unweighted: If a graph is weighted, each edge has a “weight.” The weight could, for example, represent the distance between two locations, or the cost or time it takes to travel between the locations.
Graph representation
- Adjacency list: A list where the index represents the node and the value at that index is a list of the node's neighbors:
```js
const graph = {
  0: [1],
  1: [0, 2, 3],
  2: [1, 3],
  3: [1, 2],
};
```
- Adjacency matrix: A matrix of 0s and 1s indicating whether node x connects to node y (0 means no, 1 means yes):
```js
const graph = [
  [0, 1, 0, 0],
  [1, 0, 1, 1],
  [0, 1, 0, 1],
  [0, 1, 1, 0],
];
```
Graph algorithms
- Breadth-first search (BFS) and Depth-first search (DFS), sketched below
  - Problem examples:
    - Is there a path between two nodes in this undirected graph?
    - What's the shortest path between two nodes in this undirected, unweighted graph?
    - Can this undirected graph be colored with two colors?
    - Does this undirected graph have a cycle?
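A minimal BFS sketch over the adjacency list format shown above; swapping the queue for a stack turns it into an iterative DFS (the function name is illustrative):
```js
function bfs(graph, start) {
  const visited = new Set([start]);
  const queue = [start]; // nodes discovered but not yet explored
  const order = [];
  while (queue.length > 0) {
    const node = queue.shift(); // FIFO: explore closest nodes first
    order.push(node);
    for (const neighbor of graph[node]) {
      if (!visited.has(neighbor)) {
        visited.add(neighbor); // mark on discovery to avoid revisiting
        queue.push(neighbor);
      }
    }
  }
  return order; // every node reachable from start, in breadth-first order
}
```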
- Dijkstra's Algorithm: Finds the shortest path from one node to all other nodes in a weighted graph.
- Topological Sort: Arranges the nodes in a directed, acyclic graph in a special order based on incoming edges.
- Minimum Spanning Tree: Finds the cheapest set of edges needed to reach all nodes in a weighted graph.