Introduction to Algorithms, Data Structures, and Their Importance

Erwin Feng
9 min read · Dec 22, 2020
An algorithm for greatest common divisor, from Wikipedia

When we think of the term "algorithms," images of complex mathematical statements or code often come to mind. However, they are not necessarily as intimidating as we make them out to be. In essence, an algorithm is a process or set of rules to be followed to solve a problem. Let's say you're trying to impress some friends at a potluck, so you find an impressive recipe online for your ham so it won't disappoint. The problem solved here is making a delicious ham, and the algorithm followed is the recipe. Data structures are a way in which we can organize, store, and manage data for efficient access and manipulation (sounds like something algorithms would love to use)! In programming, developing algorithm and data structure skills pays off: you will understand how library functions are implemented, and many technical interviews will challenge you with algorithm-based questions.

Why Bother Learning Them At All?

Image from coderust

People may argue that there are not enough real-world applications for algorithms. Most popular programming languages are well established, and 90% of the time we can just use their library functions for our needs. What's the point of learning implementation when it is usually abstracted away? I'd argue that a good programmer always understands what happens "under the hood." By learning the logic behind these abstracted library functions, you will always know how to apply them properly. Also, in day-to-day work as a programmer (no matter what kind), problems will need to be solved with logic, and having a strong foundation in algorithms will make your life that much easier.

It also doesn't hurt that a sizable portion of potential employers like to ask questions about computer science concepts such as algorithms, design patterns, and data structures! Say you are interviewing at Google as a front-end developer; your position wouldn't seem to require knowledge of the aforementioned concepts, so you don't study them. The problem is that top tech companies such as Google, Amazon, and Microsoft all put a heavy emphasis on algorithms and data structures regardless of the position the candidate is interviewing for. They want the best of the best: innovative engineers who can think outside the box and tackle challenges of all sorts. They want engineers who understand that coding is only a portion of their responsibilities, and who will optimize a project's design through algorithms to save the company resources (servers, computation power, data storage, etc.).

I'd like to provide an example of the magnitude of resources saved through optimization. We will discuss it using Big O notation, which computer science uses to classify algorithms by runtime. Say you are on the Gmail team at Google and you'd like to sort all US users. The time complexity for n users could be O(n log n) with an optimal algorithm, compared to O(n²) for a non-optimized method. With 100 million US Gmail users, method one would take roughly 800 million operations (8×10⁸, using a base-10 logarithm), while method two would take 1×10¹⁶, more than ten million times as many. The efficiency difference is astronomical and would save Google massive amounts of time and cost in keeping resources running until the sort completes. You would also have to choose the right data structure(s) and design to achieve this maximum efficiency.

Disclaimer: in this calculation we assume the O(n²) method performs exactly n² operations and the O(n log n) method exactly n log n, for the sake of the example. In reality, Big O notation is a simplification of the runtime, but the point stands: for large n, the n log n algorithm will be vastly superior.
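Since the later examples in this article are in Ruby, here is a quick back-of-the-envelope check of those numbers, sketched in Ruby. It simply evaluates the two formulas; the base-10 logarithm is chosen to match the 8×10⁸ figure above.

```ruby
# Rough operation counts for the example above (illustrative only).
# A base-10 logarithm is used to match the 8 x 10^8 figure in the text.
n = 100_000_000 # 100 million users

n_log_n   = n * Math.log10(n) # ~8.0 x 10^8
n_squared = n.to_f**2         # 1.0 x 10^16

puts "n log n: #{n_log_n.to_i}"
puts "n^2:     #{n_squared.to_i}"
puts "ratio:   #{(n_squared / n_log_n).round}" # over ten million times more work
```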

Basic Algorithms — Binary Search

Now that we've discussed what algorithms are and why they matter, let's look at some examples to further our understanding. Say you want to find your roll number (an ID number) in a 1000-page document where the numbers are arranged in increasing order. How would we proceed? Starting from page 1 and going to page 1000 is a linear process that could take a long time. Instead, we could jump straight to the middle, page 500, and take a look. If our number isn't there but is higher than the ones on page 500, we can eliminate pages 1–500 and jump to the middle of what remains, page 750. This process repeats until the page with our number is found. This algorithm is known as binary search, and it is a much faster search method than a linear scan for large collections. If our numbers weren't ordered, we would first have to decide on a data structure to store them (I'm personally thinking of an array) and then sort them before searching. Sort is a common built-in library function in many languages, but "under the hood" it is just a sorting algorithm.

Binary search example, taken from geeksforgeeks.org
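Here is a minimal sketch of binary search in Ruby over a sorted array. The variable names and the choice to return nil when the target is missing are my own, not taken from the example above.

```ruby
# Binary search over a sorted array. Returns the index of target,
# or nil if it is not present.
def binary_search(sorted, target)
  low = 0
  high = sorted.length - 1

  while low <= high
    mid = (low + high) / 2
    if sorted[mid] == target
      return mid
    elsif sorted[mid] < target
      low = mid + 1   # discard the lower half
    else
      high = mid - 1  # discard the upper half
    end
  end

  nil
end

roll_numbers = [3, 8, 15, 23, 42, 57, 91]
puts binary_search(roll_numbers, 42) # => 4
```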

Basic Algorithms — Pathfinding

Many pathfinding approaches build on the famous Dijkstra's algorithm, which finds the shortest route between two points of interest. When we use a GPS and set the route option to "shortest route," we are relying on a pathfinding algorithm. A basic example would be finding the fastest way out of a maze.

Maze, taken from Khan Academy

We would start on the goal square and map every other square with the number of moves it takes to reach it from the goal, expanding one square at a time. Once everything has been mapped this way, even in a larger, more complex maze, a program can find the best path by following a sequence of squares whose numbers decrease step by step toward the goal. This is a great example of breadth-first search, which finds the shortest path from a given source vertex to all other vertices.
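A minimal sketch of that breadth-first flood fill in Ruby follows. The maze layout, goal position, and coordinates are invented purely for illustration.

```ruby
# Breadth-first search over a grid maze, as described above.
# '#' is a wall, '.' is open space. Starting from the goal, record how
# many moves each open square is from it.
maze = [
  "#.####",
  "#....#",
  "#.##.#",
  "#....#",
  "######"
]
goal = [1, 4] # [row, col] of the goal square (made up for this example)

distances = { goal => 0 }
queue = [goal]

until queue.empty?
  row, col = queue.shift
  [[1, 0], [-1, 0], [0, 1], [0, -1]].each do |dr, dc|
    nxt = [row + dr, col + dc]
    next if distances.key?(nxt)                       # already mapped
    next unless nxt[0].between?(0, maze.length - 1) &&
                nxt[1].between?(0, maze[0].length - 1)
    next if maze[nxt[0]][nxt[1]] == '#'               # wall
    distances[nxt] = distances[[row, col]] + 1
    queue << nxt
  end
end

puts distances[[3, 1]] # => 5 moves from the goal to square (3, 1)
```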

Basic Algorithms — Recursion, Divide and Conquer, Dynamic Programming

Recursion is a method of solving a problem where the solution depends on solutions to smaller instances of the same problem; in code, a function directly or indirectly calls itself. Let's say the goal of our function is to determine whether a string is a palindrome. Some languages have library functions that can reverse a string, which we could then check against the original. However, not all languages have a reverse function, so let's write our own check from scratch using recursion. Given the example palindrome RACECAR, we could write a function that compares the first and last letters of the string. If they match, we remove those letters and call the function again; if they don't, we have determined the string is not a palindrome. The function continues to call itself until there are 0 or 1 letters left (proving the string is a palindrome) or until a pair of letters fails the comparison. Another fun problem to think about is the Tower of Hanoi and how it could be solved through recursion.

Palindrome recursion, taken from StackOverflow
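Since the code above is embedded as an image, here is a minimal Ruby version of the recursive check described in the paragraph. It is my own sketch, not the StackOverflow snippet.

```ruby
# Recursive palindrome check, following the logic described above.
def palindrome?(str)
  return true if str.length <= 1          # 0 or 1 letters left: palindrome
  return false unless str[0] == str[-1]   # first and last letters must match
  palindrome?(str[1..-2])                 # strip them off and check the rest
end

puts palindrome?("RACECAR") # => true
puts palindrome?("RACECAT") # => false
```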

An interesting point is that many algorithms build on other algorithms, as with divide and conquer (DAC). In DAC, we divide the problem into subproblems, solve the subproblems recursively, and then combine their results for the final solution. Binary search actually follows DAC: in each step, the algorithm compares the input x with the middle of the array. If it does not match, the algorithm recurses again, but this time on only half of the original array.

Example of divide and conquer, taken from Khan Academy
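To make the divide-and-conquer framing concrete, here is the earlier binary search rewritten recursively in Ruby. This is a sketch; the default low/high parameters are my own convention.

```ruby
# Binary search in divide-and-conquer style: compare against the middle
# element, then recurse into only one half of the array.
def binary_search_dac(sorted, target, low = 0, high = sorted.length - 1)
  return nil if low > high                 # empty subarray: not found

  mid = (low + high) / 2
  if sorted[mid] == target
    mid
  elsif sorted[mid] < target
    binary_search_dac(sorted, target, mid + 1, high) # recurse on upper half
  else
    binary_search_dac(sorted, target, low, mid - 1)  # recurse on lower half
  end
end

puts binary_search_dac([3, 8, 15, 23, 42, 57, 91], 15) # => 2
```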

Given the previous point, it makes sense for algorithms to build on each other to achieve optimal efficiency. If we see that our recursive solution makes repeated calls for the same inputs, we can use dynamic programming. The idea is to store the results of subproblems so we don't have to re-compute them later, greatly reducing time complexity. We will use the Fibonacci sequence as our example.

Simple recursive method (1) in Ruby for Fibonacci numbers
Time complexity for Fibonacci using the previous code, tree taken from GeeksforGeeks.org
New Fibonacci method (2), storing values into an array
Faster method (3) than storing in an array
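The three methods above are shown as images, so here are approximate Ruby reconstructions of the same three ideas. They follow the captions, not the author's exact code.

```ruby
# Method 1: simple recursion. Recomputes the same values over and over.
def fib(n)
  return n if n < 2
  fib(n - 1) + fib(n - 2)
end

# Method 2: build the sequence in an array so each value is computed
# only once (the dynamic programming idea).
def fib_array(n)
  values = [0, 1]
  (2..n).each { |i| values[i] = values[i - 1] + values[i - 2] }
  values[n]
end

# Method 3: keep only the last two values instead of the whole array.
def fib_fast(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

puts fib(10)       # => 55
puts fib_array(10) # => 55
puts fib_fast(10)  # => 55
```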

In the #fib method, if we wanted to find a number late in the sequence (n being very large), the method would call itself an enormous number of times, resulting in poor scaling. For n = 10 it takes 453 iterations; for n = 32 it takes 18,454,894. In the second method, the Fibonacci numbers are stored in an array as they are calculated, so we avoid the first method's problem. The third method is slightly faster than the second since we don't store every value in an array, but just keep what is needed to reach the desired nth number. If we use Ruby's Benchmark module (shipped with the standard library), we can compare the run time of each method.

Method 1 Benchmark
Method 2 Benchmark
Method 3 Benchmark
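A run like the following is one way to produce timings like those shown above; it assumes the sketch methods defined earlier, and the label width and choice of n are arbitrary.

```ruby
require 'benchmark'

# Compare the three Fibonacci approaches with Ruby's Benchmark module.
n = 30
Benchmark.bm(12) do |bm|
  bm.report("recursive:")  { fib(n) }
  bm.report("array:")      { fib_array(n) }
  bm.report("two values:") { fib_fast(n) }
end
```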

There's a significant reduction in runtime after incorporating dynamic programming into our first method. Between methods 2 and 3, there's a small performance gain with the third, but nowhere near as drastic as the improvement over the first.

Closing Thoughts and a Schrödinger Puzzle

Having a strong understanding of algorithms and data structures will make you a better programmer. Algorithms have countless applications, and top-tier tech companies use them as a filter to find stronger candidates. To begin your journey, I would first focus on becoming familiar with a programming language of your choice. It's important to first learn that language's data structures, then the math and logic fundamentals behind common algorithms. Next, I would learn about Big O and runtime before implementing some algorithms in code. Popular algorithm books such as Introduction to Algorithms will greatly aid your learning. Once you have achieved competency with the basics, I would start practicing problems on code challenge websites. I hope you don't feel intimidated by the path ahead; it is not an easy journey, but you will become more comfortable with practice, and the payoff is great.

Lastly, I was inspired to write this article while doing the New York Times crossword, where I stumbled upon some trivia. The famous 1996 Election Day crossword is known as a Schrödinger puzzle, and I've included a GIF to demonstrate. The clue for 39-Across is "lead story in tomorrow's newspaper, with 43-Across," and many people were outraged that the NYT would be so bold as to call the election before the results were in. However, both "CLINTON" and "BOBDOLE" fulfill 39-Across as well as 23-Down, 27-Down, 35-Down, and 42-Down, despite being different answers. I thought it was incredible for two different answers to satisfy the same five clues. I've since begun looking into algorithms for finishing/generating crosswords, though it's still a work in progress!

Famous 1996 NYT Schrödinger puzzle crossword, from Reddit
