Searching (Section B1)

UniDrill
Feb 21
6 min read

SEARCHING TECHNIQUES IN DATA STRUCTURES: COMPLETE CUET STUDY NOTES

1. Introduction to Searching

Searching is one of the most fundamental operations in computer science. It refers to the process of locating a particular element, often called the key, within a collection of data.

The result of a search operation determines whether the element is present in the collection or not. If the key is found, the search is considered successful and typically returns the position or index of that element; if not, it is declared unsuccessful.

For programmers, understanding searching is essential because the efficiency of an algorithm often depends on how quickly it can retrieve data for processing.

Just as a person might search for an item in their home by either remembering its exact location or looking through possible spots, a computer uses specific techniques like Linear Search, Binary Search, and Hashing to find information.

2. Linear Search (Sequential Search)

Linear search is the simplest and most basic searching method. It is an exhaustive technique where every element in a list is compared one by one with the key being searched.

2.1 Logic and Procedure

Sequential Comparison: The search begins at the first element of the list and moves towards the last element.
Matching Process: If an element matches the key, the search is successful and the position is returned.
Termination: If the entire list is traversed and no match is found, the search is unsuccessful.
Suitability: This method is ideal for small collections of data or lists that are unordered.

2.2 Algorithm for Linear Search

Algorithm 6.1:

Input: A list (numList), a search key (K), and the number of elements (n).
Initialize: Set index = 0.
Loop: While index < n, repeat:
- If numList[index] == K, then print "Element found at position index+1" and STOP.
- Else, index = index + 1.
Failure: If the loop ends, print "Search unsuccessful".

2.3 Analysis of Linear Search Cases

By performing a dry run on different arrangements of a list, we can identify performance variations:

Best Case: Occurs when the key is the first element in the list. In this scenario, the algorithm performs only one comparison.
Worst Case: Occurs when the key is the last element or is not present at all. In these cases, the algorithm must perform n comparisons, where n is the total number of elements.
Average Case: On average, the algorithm will find the element somewhere in the middle, requiring approximately n/2 comparisons.

2.4 Python Implementation

# Linear Search Function [8]

def linearSearch(list1, key):

for i in range(len(list1)):

if list1[i] == key:

return i + 1 # Returning 1-based position

return None

# Usage

my_list = [12, 23, 3, -45]

key = 23

pos = linearSearch(my_list, key)

if pos:

print("Element found at position", pos)

else:

print("Not found")

3. Binary Search

Binary search is a much more efficient technique, but it comes with a strict requirement: the collection of data must be sorted (either in ascending or descending order).

3.1 Logic: The Divide and Conquer Approach

Binary search takes advantage of the ordered nature of the list to narrow down the search area.

Middle Comparison: The key is first compared with the element at the middle position of the list.
Three Possibilities:
1. The middle element matches the key: Search successful.
2. The middle element is greater than the key: The key must be in the first half of the list. The second half is ignored.
3. The middle element is smaller than the key: The key must be in the second half. The first half is ignored.
Halving: This process of splitting and reducing the list size by half continues until the key is found or the list is reduced to a single non-matching item.

3.2 Algorithm for Binary Search

Algorithm 6.2:

Initialize: Set first = 0 and last = n-1.
Calculate Mid: mid = (first + last) // 2.
Loop: While first <= last:
- If numList[mid] == key, print "Found at mid+1" and STOP.
- Else if numList[mid] > key, set last = mid - 1.
- Else, set first = mid + 1.
- Recalculate mid = (first + last) // 2.
Failure: Print "Search unsuccessful".

3.3 Analysis of Binary Search Cases

Best Case: The key is found at the exact middle of the list in the very first iteration. This requires only one iteration.
Worst Case: The key is at the extreme ends (first or last position) or is not present. This requires the maximum number of iterations, which is logarithmic relative to the list size (log2n).
Advantage: Unlike linear search, each unsuccessful comparison in binary search provides information about where the key may be, allowing the algorithm to skip half the remaining elements.

3.4 Python Implementation

# Binary Search Function [19]

def binarySearch(list1, key):

first = 0

last = len(list1) - 1

while first <= last:

mid = (first + last) // 2

if list1[mid] == key:

return mid + 1 # 1-based position

elif key > list1[mid]:

first = mid + 1

else:

last = mid - 1

return -1

# Usage (List MUST be sorted) [20]

my_list = [21-28]

key = 7

pos = binarySearch(my_list, key)

4. Comparing Linear and Binary Search

Feature	Linear Search	Binary Search
List Requirement	Unordered or Ordered	Must be Ordered
Efficiency	Slow for large lists	Very fast for large lists
Comparisons	Increases linearly with size	Increases logarithmically
Time Complexity	Linear Time (n)	Logarithmic Time (logn)
Best Case	1 comparison (at index 0)	1 iteration (at middle)

Note on Sorting Overhead: While binary search is faster, sorting a large list takes substantial time (overhead). However, this overhead is usually worth it if the list needs to be searched multiple times.

5. Search by Hashing

Hashing is a powerful technique that allows for searching in just one step, regardless of the size of the list. If we know the exact index of every value in a list, we only need a single comparison to confirm its presence.

5.1 Hash Functions and Hash Tables

A hash function is a mathematical formula used to calculate the index of an element in a list. It takes elements as input and generates index values, which are then used to populate a hash table.

The Remainder Method (Modulo Division): A common and simple hash function for numeric data. It divides the element by the size of the hash table and uses the remainder as the index.

Formula: h(element)=element(modsize of hash table)

5.2 Implementation Example

Suppose we have an empty hash table of size 10 (indices 0–9). To store the element 34:

Calculation: 34(mod10)=4.
Action: Place 34 at index 4 in the hash table.

To search for 34 later, the computer simply recalculates 34(mod10), checks index 4, and finds the value in one step.

5.3 Python Implementation of Hashing

# Creating a simple hash table using a list [35]

hashTable = [None] * 10

L = [21, 36-39]

# Inserting elements using hash function [35]

for val in L:

hashTable[val % 10] = val

# Function to search using hashing [34]

def hashFind(key, hashTable):

index = key % 10

if hashTable[index] == key:

return index + 1

else:

return None

6. Collision and Resolution

6.1 The Concept of Collision

A collision occurs when two or more different elements map to the same index in the hash table.

Example: If our hash function is element % 10, then both 16 and 26 result in a hash value of 6. Since a single slot in a standard list can hold only one item, this creates a conflict.

6.2 Collision Resolution

To handle collisions, programmers must use a mechanism for placing the conflicting items in alternative slots. This process is known as collision resolution. While there are many methods for resolution (such as linear probing or chaining), these are generally considered advanced topics.

6.3 Perfect Hash Functions

A perfect hash function is one where every item in a list maps to a unique index in the hash table. If a hash function is perfect, collisions will never occur.

7. Summary of Key Concepts for CUET

Searching is about finding a key's presence and position.
Linear Search checks every item; it's slow for large data but works on unordered lists.
Binary Search is efficient but requires sorted data. It constantly halves the search area.
Time Complexity: Linear search is O(n), while algorithms with nested loops (like most sorts) are O(n2).
Hashing is the most efficient search technique, offering O(1) search time because it calculates the location directly.
Collisions are the main challenge in hashing, occurring when the hash function is not "perfect".
Search Choice: The selection of a searching algorithm depends on whether the data is sorted, the size of the dataset, and the frequency of search operations.

Searching (Section B1)

SEARCHING TECHNIQUES IN DATA STRUCTURES: COMPLETE CUET STUDY NOTES

Recent Posts

Comments