@ad0x99/algo-utils v1.0.1


JavaScript Algorithms and Data Structures

  • A collection of algorithms and data structures in JavaScript

The Big O Shorthands

  • Arithmetic operations are constant
  • Variable assignment is constant
  • Accessing elements in an array (by index) or object (by key) is constant
  • In a loop, the complexity is the length of the loop times the complexity of whatever happens inside of the loop
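
For instance, a quick sketch (illustrative only) contrasting a constant-time computation with a linear loop:

```js
// O(1): a fixed number of operations regardless of the size of n
function addUpToConstant(n) {
  return (n * (n + 1)) / 2;
}

// O(n): the loop runs n times, so the work grows with the input
function addUpToLoop(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}
```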

Space Complexity in JS

  • Most primitives (booleans, numbers, undefined, null) are constant space
  • Strings require O(n) space (where n is the string length)
  • Reference types are generally O(n), where n is the length (for arrays) or the number of keys (for objects)

The Big O of Objects

  • When to use objects

    • When you don't need order
    • When you need fast access / insertion and removal

| Big O of Objects / Object Methods | Time Complexity |
| --- | --- |
| Insertion | O(1) |
| Removal | O(1) |
| Searching | O(n) |
| Access | O(1) |
| Object.keys | O(n) |
| Object.values | O(n) |
| Object.entries | O(n) |
| hasOwnProperty | O(1) |

Big O of Arrays (Ordered Lists)

  • When to use arrays
    • When you need order
    • When you need fast access / insertion and removal

| Big O of Arrays / Array Methods | Time Complexity |
| --- | --- |
| Insertion | It depends |
| Removal | It depends |
| Searching | O(n) |
| Access | O(1) |
| push | O(1) |
| pop | O(1) |
| shift | O(n) |
| unshift | O(n) |
| concat | O(n) |
| slice | O(n) |
| splice | O(n) |
| sort | O(n * log n) |
| forEach/map/filter/reduce/etc. | O(n) |

Problem Solving Approach

What is an algorithm?

  • A process or set of steps to accomplish a certain task

How to approach a problem?

  • Devise a plan for solving problems
  • Master common problem solving patterns
  • Understand the problem
  • Explore concrete examples
  • Break it down
  • Solve/Simplify
  • Look back and refactor

Understand the problem

  1. Can I restate the problem in my own words?
  2. What are the inputs that go into the problem?
  3. What are the outputs that should come from the solution to the problem?
  4. Can the outputs be determined from the inputs? In other words, do I have enough information to solve the problem?
  5. How should I label the important pieces of data that are a part of the problem?

Explore concrete examples

  1. Start with simple examples
  2. Progress to more complex examples
  3. Explore examples with empty inputs
  4. Explore examples with invalid inputs

Break it down

  • Explicitly write out the steps you need to take. This forces you to think about the code you'll write before you write it, and helps you catch any lingering conceptual issues or misunderstandings before you dive in and have to worry about the details as well

Solve or Simplify

  • Solve the problem if you can; if you can't, solve a simpler problem
  • Find the core difficulty in what you're trying to do
  • Temporarily ignore that difficulty
  • Write a simplified solution
  • Then incorporate that difficulty back in

Look Back and Refactor

  • Refactoring questions
    • Can you check the result?
    • Can you derive the result differently?
    • Can you understand it at a glance?
    • Can you use the result or method for some other problem?
    • Can you improve performance of your solution?
    • Can you think of other ways to refactor?
    • How have other people solved this problem?

Problem Solving Patterns

  • Some patterns:
    • Frequency Counter
    • Multiple Pointers
    • Sliding Window
    • Divide and Conquer
    • Dynamic Programming
    • Greedy Algorithms
    • Backtracking

Frequency Counter

  • This pattern uses objects or sets to collect values/frequencies of values
  • This can often avoid the need for nested loops or O(n^2) operations with arrays/strings (see the sketch below)
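
A minimal sketch of the idea (the function name validAnagram is just illustrative): count character frequencies in one pass per string instead of comparing them with nested loops.

```js
// Returns true if str2 is an anagram of str1.
// Two O(n) passes instead of a nested-loop O(n^2) comparison.
function validAnagram(str1, str2) {
  if (str1.length !== str2.length) return false;

  const counts = {};
  for (const char of str1) {
    counts[char] = (counts[char] || 0) + 1;
  }
  for (const char of str2) {
    if (!counts[char]) return false; // character missing or already used up
    counts[char] -= 1;
  }
  return true;
}

validAnagram('anagram', 'nagaram'); // true
validAnagram('rat', 'car');         // false
```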

Multiple Pointers

  • Create pointers or values that correspond to an index or position and move towards the beginning, end or middle based on a certain condition
  • Very efficient for solving problems with minimal space complexity as well
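
A common illustration (a sketch that assumes the input array is sorted): find the first pair that sums to zero by moving one pointer in from each end.

```js
// Expects a sorted array; returns the first pair summing to zero, or undefined.
function sumZero(sortedArr) {
  let left = 0;
  let right = sortedArr.length - 1;
  while (left < right) {
    const sum = sortedArr[left] + sortedArr[right];
    if (sum === 0) return [sortedArr[left], sortedArr[right]];
    if (sum > 0) right--; // sum too big: move the right pointer inward
    else left++;          // sum too small: move the left pointer inward
  }
}

sumZero([-3, -2, -1, 0, 1, 2, 3]); // [-3, 3]
```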

Sliding Window

  • This pattern involves creating a window, which can either be an array or a number, from one position to another (see the sketch below)
  • Depending on a certain condition, the window either increases or closes (and a new window is created)
  • This pattern is very useful for keeping track of a subset of data in an array/string, etc.
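
A sketch of the pattern: compute the maximum sum of num consecutive elements by sliding a fixed-size window instead of re-summing every window from scratch.

```js
// Maximum sum of `num` consecutive elements, in O(n) time.
function maxSubarraySum(arr, num) {
  if (arr.length < num) return null;

  let windowSum = 0;
  for (let i = 0; i < num; i++) windowSum += arr[i]; // sum of the first window

  let maxSum = windowSum;
  for (let i = num; i < arr.length; i++) {
    // Slide the window: add the entering element, subtract the leaving one
    windowSum += arr[i] - arr[i - num];
    maxSum = Math.max(maxSum, windowSum);
  }
  return maxSum;
}

maxSubarraySum([2, 6, 9, 2, 1, 8, 5, 6, 3], 3); // 19 (8 + 5 + 6)
```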

Divide and Conquer

  • This pattern involves dividing a data set into smaller chunks and then repeating a process on a subset of the data (see the binary search sketch below)
  • This pattern can tremendously decrease time complexity
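
Binary search is the classic example: each step halves the portion of a sorted array that still has to be examined (a sketch; searching is covered again in the Searching Algorithms section).

```js
// Returns the index of `value` in a sorted array, or -1 if not found. O(log n).
function binarySearch(sortedArr, value) {
  let left = 0;
  let right = sortedArr.length - 1;
  while (left <= right) {
    const middle = Math.floor((left + right) / 2);
    if (sortedArr[middle] === value) return middle;
    if (sortedArr[middle] < value) left = middle + 1; // keep the right half
    else right = middle - 1;                          // keep the left half
  }
  return -1;
}

binarySearch([1, 3, 5, 7, 9, 11], 9); // 4
```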

Recursion

  • Recursion is a process in which a function calls itself

The Call Stack

  • A call stack is a mechanism for an interpreter (like the JavaScript interpreter in a web browser) to keep track of its place in a script that calls multiple functions
  • Any time a function is invoked, it's placed (pushed) on the top of the stack
  • When JavaScript sees the return keyword or when the function ends, the interpreter removes (pops) the function from the top of the stack

How Recursion Functions Work

  • We invoke the same function with a different input until we reach the base case
  • The base case is the condition at which the recursion ends
  • In summary, the two essential parts of a recursive function are the base case and a different input
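
A minimal sketch showing both parts, using factorial:

```js
// Base case: n <= 1 stops the recursion.
// Different input: each call recurses with n - 1.
function factorial(n) {
  if (n <= 1) return 1;
  return n * factorial(n - 1);
}

factorial(5); // 120
```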

Helper Method Recursion

  • A helper method is a recursive method that makes use of additional parameters to keep track of values. It is any method you use to help in the execution of other methods or functions and which is not used outside of that context
function outer(input) {
  let outerScopedVariable = [];

  function helper(helperInput) {
    if (helperInput <= 0) return; // base case: stop the recursion
    // Modify the outerScopedVariable here
    helper(helperInput - 1); // recurse with a smaller input
  }

  helper(input);

  return outerScopedVariable;
}
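
For instance, a concrete version of that skeleton (a sketch; the names are illustrative): collect all odd values from an array by letting the helper push into the outer-scoped result.

```js
function collectOddValues(arr) {
  const result = [];

  function helper(helperInput) {
    if (helperInput.length === 0) return; // base case
    if (helperInput[0] % 2 !== 0) {
      result.push(helperInput[0]); // mutate the outer-scoped array
    }
    helper(helperInput.slice(1)); // recurse on the rest of the array
  }

  helper(arr);
  return result;
}

collectOddValues([1, 2, 3, 4, 5]); // [1, 3, 5]
```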

Pure Recursion

  • A function is pure if it does not modify any non-local variables, does not read from files or network connections, and does not produce any other side effects
  • A function qualifies as a pure function if:

    • It always returns the same result when given the same arguments. This is also known as referential transparency: pure functions rely only on their own arguments and immutable values to determine their return value. Mathematical functions, for example, are referentially transparent.
    • It doesn't produce any side effects, meaning the function does not change the association between a name and a value within a given scope. Side effects are harmful because they introduce uncertainty into the code, which makes tracing and debugging difficult when an issue arises.
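
By contrast, a pure-recursion sketch of the odd-values task from the previous section: no outer-scoped variable is mutated, and each call returns a new array.

```js
// Pure recursion: no shared state; each call builds and returns its own result.
function collectOddValuesPure(arr) {
  let newArr = [];
  if (arr.length === 0) return newArr; // base case

  if (arr[0] % 2 !== 0) {
    newArr.push(arr[0]);
  }
  return newArr.concat(collectOddValuesPure(arr.slice(1)));
}

collectOddValuesPure([1, 2, 3, 4, 5]); // [1, 3, 5]
```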

Searching Algorithms

| Searching Algorithms | Best Case | Average Case | Worst Case | Examples |
| --- | --- | --- | --- | --- |
| Linear Search | O(1) | O(n) | O(n) | See example of linear search |
| Binary Search | O(1) | O(log n) | O(log n) | See example of binary search |
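
A minimal linear search sketch (check every element in turn); binary search looks like the divide-and-conquer sketch shown earlier.

```js
// Returns the index of `value`, or -1 if it is not present. O(n).
function linearSearch(arr, value) {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === value) return i;
  }
  return -1;
}

linearSearch([10, 15, 20, 25, 30], 25); // 3
```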

Sorting Algorithms

  • Sorting is the process of rearranging the items in a collection (e.g. an array) so that the items are in some kind of order

| Sorting Algorithms | Best Case | Average Case | Worst Case | Space Complexity | Examples |
| --- | --- | --- | --- | --- | --- |
| Bubble Sort | O(n) | O(n^2) | O(n^2) | O(1) | See bubble sort implementation |
| Insertion Sort | O(n) | O(n^2) | O(n^2) | O(1) | See insertion sort implementation |
| Selection Sort | O(n^2) | O(n^2) | O(n^2) | O(1) | See selection sort implementation |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | See merge sort implementation |
| Quick Sort | O(n log n) | O(n log n) | O(n^2) | O(log n) | See quick sort implementation |
| Radix Sort | O(nk) | O(nk) | O(nk) | O(n + k) | See radix sort implementation |

(For Radix Sort, n is the length of the array and k is the number of digits.)
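
As one illustration, a bubble sort sketch with the early-exit optimization that gives it its O(n) best case on already-sorted input:

```js
// Repeatedly swaps adjacent out-of-order elements; larger values "bubble" to the end.
function bubbleSort(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    let swapped = false;
    for (let j = 0; j < i; j++) {
      if (arr[j] > arr[j + 1]) {
        [arr[j], arr[j + 1]] = [arr[j + 1], arr[j]]; // swap
        swapped = true;
      }
    }
    if (!swapped) break; // no swaps on a full pass: already sorted
  }
  return arr;
}

bubbleSort([37, 45, 29, 8]); // [8, 29, 37, 45]
```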

Data Structure Introduction

Linked List

  • A linked list is a data structure that contains head, tail, and length properties
  • It consists of nodes, and each node has a value and a pointer to another node or null
  • A node stores a piece of data

| Linked List | Array |
| --- | --- |
| Do not have indexes | Indexed in order |
| Connected via nodes with a next pointer | Insertion and deletion can be expensive |
| Random access is not allowed | Can quickly be accessed at a specific index |

Singly Linked List

  • A singly linked list is a linear data structure in which the elements are not stored in contiguous memory locations and each element is connected only to its next element using a pointer.

  • Singly Linked Lists are an excellent alternative to arrays when insertion and deletion at the beginning are frequently required

  • Arrays contain a built-in index whereas Linked Lists do not

  • The idea of a list data structure that consists of nodes is the foundation for other data structures like Stacks and Queues

| Singly Linked List Methods | Time Complexity |
| --- | --- |
| Insertion | O(1) |
| Removal | O(1) or O(n) (it depends) |
| Searching | O(n) |
| Accessing | O(n) |
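
A minimal sketch of the structure (just the node class and a push method; a full implementation would also add pop, shift, unshift, get, etc.):

```js
class Node {
  constructor(value) {
    this.value = value;
    this.next = null;
  }
}

class SinglyLinkedList {
  constructor() {
    this.head = null;
    this.tail = null;
    this.length = 0;
  }

  // Append a new node to the end of the list; O(1) thanks to the tail pointer.
  push(value) {
    const newNode = new Node(value);
    if (!this.head) {
      this.head = newNode;
      this.tail = newNode;
    } else {
      this.tail.next = newNode;
      this.tail = newNode;
    }
    this.length++;
    return this;
  }
}

const list = new SinglyLinkedList();
list.push(1).push(2).push(3); // 1 -> 2 -> 3
```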

Doubly Linked List

  • A doubly linked list (DLL) is a special type of linked list in which each node contains a pointer to the previous node as well as the next node of the linked list. Therefore, in a doubly linked list, a node consists of three parts: the node's data, a pointer to the next node in the sequence (next pointer), and a pointer to the previous node (previous pointer).

| Doubly Linked List Methods | Time Complexity |
| --- | --- |
| Insertion | O(1) |
| Removal | O(1) |
| Searching | O(n) |
| Accessing | O(n) |

Stack

  • A stack is a linear data structure that follows a particular order in which operations are performed. The order may be LIFO (Last In First Out) or FILO (First In Last Out).
  • LIFO implies that the element that is inserted last, comes out first and FILO implies that the element that is inserted first, comes out last.

| Big O of Stack Methods | Time Complexity |
| --- | --- |
| Insertion | O(1) |
| Removal | O(1) |
| Searching | O(n) |
| Accessing | O(n) |
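
As a quick illustration, a plain JavaScript array already behaves like a stack, since push and pop both work on the end of the array in O(1):

```js
// An array used as a stack: push adds to the top, pop removes from the top.
const pageHistory = [];
pageHistory.push('page1');
pageHistory.push('page2');
pageHistory.push('page3');
pageHistory.pop(); // 'page3' - last in, first out
```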

Where Stacks are used

  • Managing function invocations
  • Undo / Redo functionality
  • Routing (the history object) is treated like a stack

Queue

  • A Queue is defined as a linear data structure that is open at both ends and the operations are performed in First In First Out (FIFO) order.

  • How do we use queues in programming?

    • Background tasks
    • Uploading resources
    • Printing / Task processing

| Queue Methods | Time Complexity |
| --- | --- |
| Insertion | O(1) |
| Removal | O(1) |
| Searching | O(n) |
| Accessing | O(n) |
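
A quick illustration using an array (note that shift is O(n) because every remaining element moves; a linked-list-based queue keeps both operations O(1)):

```js
// An array used as a queue: enqueue with push, dequeue with shift (FIFO).
const taskQueue = [];
taskQueue.push('task1');
taskQueue.push('task2');
taskQueue.push('task3');
taskQueue.shift(); // 'task1' - first in, first out
```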

Binary Search Tree

Trees

  • Tree Terminology

    • Root: the top node in a tree
    • Child: a node directly connected to another node when moving away from the Root
    • Parent: the converse notion of a child
    • Siblings: a group of nodes with the same parent
    • Leaf: a node with no children
    • Edge: the connection between one node and another

Tree

How are Trees used?

  • Trees are used in computer science for various tasks, including storing information, representing hierarchical data, and providing efficient algorithms for operations such as insertion, deletion, and searching.

  • For example: HTML DOM, Network Routing, Abstract Syntax Trees, Folders in an OS, JSON, etc.

  • Kinds of Trees: Trees, Binary Trees, Binary Search Trees

BST

  • Every parent node has at most 2 children
  • Every node to the left of a parent node is always less than the parent
  • Every node to the right of a parent node is always greater than the parent

| Big O of BST | Time Complexity |
| --- | --- |
| Insertion | O(log n) |
| Searching | O(log n) |

(O(log n) is the average case and is not guaranteed; a completely one-sided tree degrades to O(n).)
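
A sketch of BST insertion (iterative; duplicate values are simply ignored here):

```js
class TreeNode {
  constructor(value) {
    this.value = value;
    this.left = null;
    this.right = null;
  }
}

class BinarySearchTree {
  constructor() {
    this.root = null;
  }

  insert(value) {
    const newNode = new TreeNode(value);
    if (!this.root) {
      this.root = newNode;
      return this;
    }
    let current = this.root;
    while (true) {
      if (value === current.value) return this; // ignore duplicates
      if (value < current.value) {
        if (!current.left) { current.left = newNode; return this; }
        current = current.left; // smaller values go left
      } else {
        if (!current.right) { current.right = newNode; return this; }
        current = current.right; // larger values go right
      }
    }
  }
}

const bst = new BinarySearchTree();
bst.insert(10).insert(6).insert(15).insert(3);
```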

Tree Traversal

  • Tree traversal, also known as tree search, is a process of visiting each node of a tree data structure. During tree traversal, you visit each node of a tree exactly once and perform an operation on the nodes like checking the node data (search) or updating the node.

  • Tree traversal algorithms can be classified broadly in the following two categories by the order in which the nodes are visited:

    • Depth-first search (DFS) algorithm: It starts with the root node and first visits all nodes of one branch as deep as possible before backtracking. It visits all other branches in a similar fashion. There are three subtypes under this that we will cover in this article.
    • Breadth-first search (BFS) algorithm: This also starts from the root node and visits all nodes of current depth before moving to the next depth in the tree. We will cover one algorithm of BFS type in the upcoming section.

Difference between DFS & BFS

Breadth First Search (BFS)

  • The Breadth First Search (BFS) algorithm is used to search a graph data structure for a node that meets a set of criteria. It starts at the root of the graph and visits all nodes at the current depth level before moving on to the nodes at the next depth level.

Breadth First Search
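
A sketch of BFS on a binary tree (any node shaped like the TreeNode sketched earlier, with value, left, and right), using a queue to visit nodes level by level:

```js
// Visit nodes level by level, collecting their values.
function breadthFirstSearch(root) {
  const visited = [];
  const queue = [];
  if (root) queue.push(root);

  while (queue.length) {
    const node = queue.shift(); // dequeue the next node
    visited.push(node.value);
    if (node.left) queue.push(node.left);
    if (node.right) queue.push(node.right);
  }
  return visited;
}
```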

Depth First Search (DFS)

  • Depth First Traversal (or DFS) for a graph is similar to Depth First Traversal of a tree. It's a recursive algorithm for searching all the vertices of a graph or tree data structure. Traversal means visiting all the nodes of a graph. Unlike trees, graphs may contain cycles (a node may be visited twice). To avoid processing a node more than once, use a boolean visited array. A graph can have more than one DFS traversal.

Depth First Search
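
And a sketch of a recursive depth-first (pre-order) traversal of the same kind of tree:

```js
// Visit a node, then fully explore its left subtree, then its right subtree.
function depthFirstPreOrder(root) {
  const visited = [];
  function traverse(node) {
    if (!node) return;
    visited.push(node.value); // visit the current node first
    traverse(node.left);
    traverse(node.right);
  }
  traverse(root);
  return visited;
}
```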

Binary Heap

  • A binary heap is a heap data structure that takes the form of a binary tree (a tree in which each node has at most two children) which satisfies the following additional constraints:

    • The binary tree is complete, i.e. every level except the bottom-most (deepest one) level is completely filled and if the last level of the tree is not complete, then the nodes of the bottom-most level are filled from left to right.
    • Max-heap property: The key of every node is larger than or equal to its children.
    • In a Max Binary Heap, parent nodes are always larger than child nodes, and in a Min Binary Heap, parent nodes are always smaller than child nodes

| Big O of Binary Heaps | Time Complexity |
| --- | --- |
| Insertion | O(log n) |
| Removal | O(log n) |
| Searching | O(n) |
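
A sketch of insertion into a max binary heap stored in an array (the parent of index i lives at Math.floor((i - 1) / 2)): the new value is pushed to the end and "bubbled up" until the heap property holds.

```js
class MaxBinaryHeap {
  constructor() {
    this.values = [];
  }

  insert(value) {
    this.values.push(value);
    // Bubble up: swap with the parent while the new value is larger.
    let index = this.values.length - 1;
    while (index > 0) {
      const parentIndex = Math.floor((index - 1) / 2);
      if (this.values[index] <= this.values[parentIndex]) break;
      const temp = this.values[parentIndex];
      this.values[parentIndex] = this.values[index];
      this.values[index] = temp;
      index = parentIndex;
    }
    return this;
  }
}

const heap = new MaxBinaryHeap();
heap.insert(41).insert(39).insert(33).insert(55);
// heap.values[0] is now 55, the maximum
```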

Min Heap & Max Heap - Visualization

Why do we need to know about Binary Heaps?

  • A binary heap gives constant-time (O(1)) access to its maximum value (in a max-heap) or minimum value (in a min-heap), because that value is always at the root
  • Binary Heaps are used to implement Priority Queues and Schedulers, where the highest-priority (or earliest) item is needed first
  • They are also used quite a bit with graph traversal algorithms

Max Binary Heap

  • Each parent has at most 2 child nodes
  • The value of each parent node is always greater than its child nodes
  • The parent is greater than the children, but there are no guarantees between sibling nodes
  • A binary heap is as compact as possible. All the children of each node are as full as they can be and left children are filled out first

Max Heap

Min Binary Heap

  • Each parent has at most 2 child nodes
  • The value of each parent node is always less than its child nodes
  • The parent is less than the children, but there are no guarantees between sibling nodes
  • A binary heap is as compact as possible. All the children of each node are as full as they can be and left children are filled out first

Min Heap

Priority Queue

Hash Tables

  • A hash table (aka hash map) is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.
  • Hash tables are used to store key-value pairs; the keys are not kept in any order
  • Unlike arrays, hash tables are fast for all of the following operations: finding values, adding new values, and removing values
  • In a hash table, an index is computed from the key, and the element corresponding to that key is stored at that index. This process is called hashing.

| Big O of Hash Tables | Time Complexity (average case) |
| --- | --- |
| Insertion | O(1) |
| Deletion | O(1) |
| Accessing | O(1) |

Hashing

Hash Function

  • A hash function creates a mapping between a key and an index by means of a mathematical formula. The result of the hash function is referred to as a hash value or hash. The hash value is a representation of the original string of characters, but it is usually smaller than the original.

  • Using a prime number for the size of a hash table array can help to reduce collisions and improve performance. Prime numbers are not divisible by any other number other than 1 and itself, so they are less likely to have a common factor with the hash code of an object being inserted into the table. This means that there is a lower chance of two objects being assigned to the same slot in the array, which can lead to fewer collisions and faster retrieval times. Additionally, using a prime number allows for a more evenly distributed hash code, which can also help to reduce collisions.

  • Prime numbers — and why blockchains can’t exist without them! - Medium
  • Why Should the Length of Your Hash Table Be a Prime Number? - Medium
  • Does making array size a prime number help in hash table implementation? Why? - Quora
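
A toy string-hashing sketch (illustrative only) that maps a key to an index within a prime-sized bucket array; the prime multiplier helps spread keys more evenly:

```js
// Hash a lowercase string key into an index between 0 and arrayLen - 1.
function hash(key, arrayLen) {
  let total = 0;
  const WEIRD_PRIME = 31;
  for (let i = 0; i < Math.min(key.length, 100); i++) {
    const value = key.charCodeAt(i) - 96; // 'a' -> 1, 'b' -> 2, ...
    total = (total * WEIRD_PRIME + value) % arrayLen;
  }
  return total;
}

hash('pink', 13); // some index in [0, 12]
hash('cyan', 13); // usually a different index
```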

Hash Collision

  • A hash collision happens when the hash function generates the same index for multiple keys, creating a conflict over which value should be stored at that index. To resolve a hash collision, we can use one of the following techniques:

    • Collision resolution by chaining: In chaining, if a hash function produces the same index for multiple elements, these elements are stored in the same index by using a doubly-linked list
    • Open Addressing - Linear/Quadratic Probing and Double Hashing: Open addressing doesn't store multiple elements into the same slot. Here, each slot is either filled with a single key or left NIL

      • Linear Probing: In linear probing, the hash table is searched sequentially, starting from the original hashed location. If that location is already occupied, we check the next location.

      • Quadratic Probing (aka Mid-Square method): Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.

      • Double Hashing: Double hashing makes use of two hash functions

        • The first hash function, h1(k), takes the key and gives a location on the hash table. If that location is unoccupied (empty), we can simply place our key there.
        • If the location is occupied (a collision), we use a secondary hash function h2(k) in combination with the first hash function h1(k) to find a new location on the hash table.

Hash Collision

Graphs

What is a Graph?

  • A graph data structure consists of a finite (and possibly mutable) set of vertices (or nodes) together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph

  • A Graph is a non-linear data structure consisting of vertices and edges. It is a collection of nodes that have data and are connected to other nodes. The vertices are sometimes also referred to as nodes, and the edges are lines or arcs that connect any two nodes in the graph. More formally, a Graph is composed of a set of vertices (V) and a set of edges (E), and is denoted by G(V, E).

Graphs

Big O of Graphs

  • |V|: number of vertices
  • |E|: number of edges

| Big O of Graphs | Adjacency List | Adjacency Matrix |
| --- | --- | --- |
| Add Vertex | O(1) | O(V^2) |
| Add Edge | O(1) | O(1) |
| Remove Vertex | O(V + E) | O(V^2) |
| Remove Edge | O(E) | O(1) |
| Query | O(V + E) | O(1) |
| Storage | O(V + E) | O(V^2) |

| Adjacency List | Adjacency Matrix |
| --- | --- |
| Can take up less space (in sparse graphs) | Takes up more space (in sparse graphs) |
| Faster to iterate over all edges | Slower to iterate over all edges |
| Can be slower to look up a specific edge | Faster to look up a specific edge |

How are Graphs used?

  • Social Networks
  • Location / Mapping
  • Routing Algorithms
  • Visual Hierarchy
  • File System Optimization
  • etc...

Graph Terminology

  • Adjacency: A vertex is said to be adjacent to another vertex if there is an edge connecting them.
  • Path: A sequence of edges that allows you to go from vertex A to vertex B is called a path.
  • Directed Graph: A directed graph is a set of vertices (nodes) connected by edges, where each edge has a direction associated with it. In a directed graph, an edge (u, v) doesn't necessarily mean that there is an edge (v, u) as well. The edges in such a graph are represented by arrows to show the direction of the edge.
  • Undirected Graph: In an undirected graph the edges are bidirectional, with no direction associated with them. Hence, the graph can be traversed in either direction. The absence of an arrow tells us that the graph is undirected.

Directed & Undirected Graphs

  • Weighted Graph: A weighted graph is defined as a special type of graph in which the edges are assigned some weights which represent cost, distance, and many other relative measuring units.
  • Unweighted Graph: An unweighted graph is a graph in which the edges do not have weights or costs associated with them. Instead, they simply represent the presence of a connection between two vertices

Weighted & Unweighted Graphs

Graph Representation

Adjacency Matrix

  • An adjacency matrix is a 2D array of V x V vertices. Each row and column represent a vertex. If the value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex j.

Adjacency Matrix

Adjacency List

  • An adjacency list represents a graph as an array of linked lists. The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex.

Adjacency List
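
A sketch of an adjacency list built on a plain object that maps each vertex to an array of its neighbors (undirected and unweighted here):

```js
class Graph {
  constructor() {
    this.adjacencyList = {};
  }

  addVertex(vertex) {
    if (!this.adjacencyList[vertex]) this.adjacencyList[vertex] = [];
  }

  // Undirected: record the edge in both directions.
  addEdge(vertex1, vertex2) {
    this.adjacencyList[vertex1].push(vertex2);
    this.adjacencyList[vertex2].push(vertex1);
  }
}

const graph = new Graph();
graph.addVertex('A');
graph.addVertex('B');
graph.addEdge('A', 'B');
// graph.adjacencyList -> { A: ['B'], B: ['A'] }
```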

Graph Traversal

Graph Traversal Uses

  • Peer to peer networking
  • Web crawlers
  • Finding closest matching / recommendations
  • Shortest path problems
    • GPS Navigation
    • Solving mazes
    • AI (shortest path to win the game)

Dijkstra's Algorithm

What is Dijkstra's Algorithm?

  • Dijkstra's algorithm finds the shortest path between two vertices in a weighted graph (with non-negative edge weights)

Why is it useful?

  • It's commonly used in applications such as:
    • GPS - finding the fastest route
    • Network Routing - finding the open shortest path for data
    • Biology - modeling the spread of viruses among humans
    • Airline tickets - finding the cheapest route to your destination
    • Many other uses!

Dynamic Programming

Overlapping Sub-Problems

Optimal Sub-Structure

Memoization

Tabulation
