Sorting is defined as the rearrangement of the given data in a particular order. This order can be related to numerical values i.e. ascending or descending order, alphabetical i.e. case sensitive or insensitive and can be based on length of the string or on different coding styles like ascii or unicode.
Sorting is an extra step that is carried out to increase the efficiency of the operation that is being performed. For example: sorting a data structure like an array in advance may speed up a search operation. Also, it is necessary for some algorithms to function, for example, binary search works only on sorted data.
We have a variety of sorting algorithms ranging from bubble sort, selection sort, merge sort, quick sort, counting sort, radix sort, etc. Each one of them has its own set of weaknesses and strengths, based on which they can be employed by the developers.
Why do we need sorting?
In school, the children of a class are sorted in the increasing order of height while standing in a queue for the morning assembly. Similarly, the attendance register has the names of children sorted in alphabetical order. The telephone directories also have the contact numbers sorted by owner's name alphabetically. The sorted data is preferred over unsorted data because it is easier to handle and search for.
Small amount of data, a class of 60 in school, can be manually sorted but large data, for example, employee data in big company cannot be sorted manually. So, several sorting algorithms came into existence that could help human sort the data in seconds.
The data can be sorted in various ways like an ascending order, descending order, alphabetical order (in case of names). It can also be sorted using multiple keys for example sorting the employee data first by their respective department and then in an alphabetical order.
Sorting optimizes the efficiency of various algorithms like searching and merging, etc. Some algorithms like Prim's algorithm , Kruskal's algorithm, dijkstra's algorithm require the data to be already sorted before they could be applied.
What are various properties of a sorting algorithm?
A sorting algorithm has different properties like:
Inplace or not: An in-place sorting algorithm does not use any extra space to arrange the given elements in order i.e. sorts the elements by swapping them. For example: insertion sort, selection sort, etc.
Stability: A stable sorting algorithm keeps the equal elements in same order as in the input. For example: insertion sort, merge sort, bubble sort, etc.
Comparison-based or not: As the word suggests, a comparison-based algorithm compares elements to each other to sort them like bubble sort, insertion sort, selection sort, etc. The non-comparison based sorting algorithms work on assumptions like counting sort, bucket sort and do not sort using comparison operator.
Internal or External: Internal sorting can be carried out while residing in the computer's memory whereas when there are a large number of elements external sorting has to be carried out by storing data in files. Sorting algorithms can be implemented both internally and externally depending on memory requirement.
Explain some basic sorting algorithms.
The basic sorting algorithms include bubble, insertion and selection sort which make the sorting concept easier to understand. These algorithms are capable of sorting small amount of data and are not efficient with large data.
Bubble sort swaps two consecutive out-of-order elements, do this repeatedly until no more swaps occur. It bubbles(moves) up the elements to its proper place that they belong in specified order.
Insertion sort is an in-place, stable sorting technique that maintains two sub lists i.e. sorted and unsorted and progresses by comparing each consecutive element and inserting it in the correct order.
Selection sort selects the maximum element from the data and keeps replacing it with the elements from the end for maintaining an ascending order.
What are some efficient sorting algorithms?
Some efficient sorting algorithms include merge sort, quick sort and shell sort, heap sort that work well with large amount of data.
Merge sort is based on the divide and conquer technique i.e. it partitions the given data into groups until each group contains single element and then sorts and merges each group iteratively to produce sorted list of data.
Quick sort chooses a pivot element to partition the data. Both the sides are sorted by iteratively splitting and sorting and then merged in a way that the elements before the pivot have a value lesser than that of a pivot and vice versa.
Shell sort is an improved insertion sort. An insertion sort exchanges consecutive elements but a shell sort can exchange far apart elements with each other which leads to an increase in the efficiency of the algorithm.
What are some distribution-based sorting algorithms?
Distribution based sorting algorithms are those which divide the data into structures, sort it individually and then combine to get the sorted output. These include counting sort, bucket sort and radix sort.
Counting sort is a non-comparison-based algorithm that groups all the elements into bins in the range 0 to k and then counts the number of elements less than or greater than a particular element to place it at the correct position.
Radix sort uses any stable sort to sort the input according to the least significant digit followed by the next digit iteratively until we reach the most significant digit. It assumes the input consists of d-digit integers.
Bucket sort assumes that the input is uniformly distributed and falls in the range [0, K). It uses a hash function to assign each element to a bucket. Then all the buckets are sorted using any sorting algorithm and then merged to get sorted data.
What is tree sort and heap sort?
Tree sort utilizes the property of binary search trees to sort the elements. A binary search has the property that the value of the left child node is less than the parent node and the value of the right child node is greater than that of the parent node. So, this sorting algorithm first builds a binary search tree with the elements and then ouptuts the in order traversal of the binary search tree which gives the elements in sorted order.
Heap sort first inserts the given elements into a heap and then deletes the root element to get the elements in sorted order. Max-heap has the property that the value of each node is greater than equal to its child nodes and vice versa for the min-heap. For example, a min-heap is build using the elements. The root is deleted repeatedly while maintaining heap property to get ascending order as the root is smallest element.
What is the time and space complexity of different sorting algorithms?
Time complexity is the amount of time required by algorithm to execute as a function of input size. Similarly, space complexity refers to the amount of space required by algorithm to execute as function of input size.
The symbols to denote best-case time complexity is \(Ω(n)\), average-case time complexity is \(Θ(n)\) and worst-case time complexity is \(O(n)\). Time complexity order is same as mathematics 1>logn>n>nlogn>n^2.
Bubble sort and Insertion sort offer \(Ω(n)\) best-case time complexity and \(O(n^2)\) worst-case time complexity. The worst-case space complexity is \(O(1)\) as they are in place sorting.
Selection sort offers \(Ω(n^2)\) best and worst-case time complexity as it is independent of the prior order of elements and \(O(1)\) space complexity.
Quick sort and Merge sort offer \(Ω(n \cdot log(n)\) best-case time complexity whereas worst-case time complexity of merge sort is \(O(n \cdot log(n)\) but \(O(n^2)\) for quick sort because it depends on position of the pivot element. . The worst-case space complexity of merge sort is \(O(n)\) to store an extra array and that of stable quick sort is \(O(log(n)\) because of recursive calls.
When to use which sorting algorithm?
Choosing a sorting algorithm entirely depends on the problem at hand.
Bubble sort is used for basic understanding of sorting algorithms and can detect already sorted data. . Bubble sort It is used to sort the TV channels according to user's viewing time.
Selection sort is quite easy to implement and offers in-place sorting but does not perform well when the size of data increases.
Insertion sort sorts the data as soon as it receives it and is quite adaptive i.e. works well with already sorted data.
Merge sort is stable algorithm that can be implied when an efficient algorithm is needed. It is used where the data is accessed in sequential manner like tape drives and also in field of e-commerce.
Quick sort works well with arrays and medium sized datasets. It is considered one of the fast sorting algorithms if the pivot element is chosen randomly and is employed by sport channels to sort scores.
Heap sort is used when there is restriction on space complexity as it offers O(1) worst-case space complexity, i.e. utilises no extra space. It is employed in security and embedded system problems and used to sort linked list.
- Astrachan, Owen. 2003. "Bubble Sort: An Archaeological Algorithmic Analysis." Computer Science Department, Duke University, December 09. Accessed 2022-04-07.
- Baeldung. 2020. "An Overview of QuickSort Algorithm." Baeldung University, October 09. Accessed 2022-04-19.
- Baeldung. 2021. "Which Sorting Algorithm to Use?" Baeldung University, May 28. Accessed 2022-04-06.
- Baeldung. 2021m. "Quicksort vs. Mergesort." Baeldung University, April 08. Accessed 2022-04-19.
- Bhargava, Aditya Y. 2016. "Grokking Algorithms." Manning Publications. Accessed 2022-04-06.
- BigOCheatSheet. 2016. "Know Thy Complexities!" BigOCheatSheet, August 23. Accessed 2022-02-18
- Das, Sandipan. 2022. "10 Best Sorting Algorithms You Must Know About." Crio.Do, February 08. Accessed 2022-04-11.
- Halim, Steven. 2016. "Lecture 10: Sorting." CS 1020E: Data Structures and Algorithms, National University of Singapore. Accessed 2022-04-08.
- Joshi, Vaidehi. 2017. "Getting To The Root Of Sorting With Radix Sort." Base CS, on Medium, July 24. Updated 2019-04-08. Accessed 2022-04-08.
- Karumanchi, Narasimha. 2017. "Data Structures and Algorithms Made Easy." 5th edition, CareerMonk Publications. Accessed 2022-02-18.
- Knuth, Donald E.. 1998. "The Art of Computer Programming." (vol. 3_ Sorting and Searching) (2nd ed.), Addison Wesley Longman. Accessed 2022-04-06.
- Lean, Thomas. 2011. "Tony Hoare: inventing Quicksort." British Library Board, August 09. Accessed 2022-04-07.
- Mohapatra, Subasish. 2020. "Data Structures Using C." CET, Bhubaneswar. Accessed 2022-04-06.
- Saylor Academy. 2021. "Radix Sort." CS102: Introduction to Computer Science II, Unit 7: Searching and Sorting, 7.2: Sorting Algorithms, Saylor Academy. Updated 2021-05-14. Accessed 2022-04-07.
- Tang, Daisy. 2012. "Which Sorting Algorithm to Use?" CS241 -- Lecture Notes: Sorting Algorithm, cpp.edu. Accessed 2022-04-06.
- Thareja, Reema. 2014. "Data Structures Using C." Second edition, Oxford University Press. Accessed 2022-04-06.
- Wikipedia. 2022. "Sorting algorithm." Wikipedia. Updated 2022-03-21. Accessed 2022-04-06.
- Wikipedia. 2022m. "Merge sort." Wikipedia. Updated 2022-03-31. Accessed 2022-04-04.
- Wikipedia. 2022q. "Quick sort." Wikipedia. Updated 2022-03-31. Accessed 2022-04-06.
- Champion, Kasey. 2019. "Lecture 15: Sorting Algorithms." CSE 373: Data Structures and Algorithms, Winter 2019, University of Washington. Accessed 2022-04-08.
- Gregg, Chris, and Julie Zelenski. 2020. "Lecture 5/15: Sorting." CS 106B: Programming Abstractions, Spring 2020, Computer Science Department, Stanford University. Updated 2020-04-19. Accessed 2022-04-08.
- Sedgewick, Robert, and Kevin Wayne. 2018. "2.5 Sorting Applications." Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne, Chapter 2 Sorting, Princeton University. Updated 2018-03-18. Accessed 2022-04-11.