# 5-2 Searching an unsorted array

The problem examines three algorithms for searching for a value $x$ in an unsorted array $A$ consisting for $n$ elements.

Consider the following randomized strategy: pick a random index $i$ into $A$. If $A[i] = x$, then we terminate; otherwise, we continue the search by picking a new random index into $A$. We continue picking random indices into $A$ until we find an index $j$ such that $A[j] = x$ or until we have checked every element of $A$. Note that we pick from the whole set of indices each time, so that we may examine a given element more than once.

a.Write pseudocode for a procedure $\text{RANDOM-SEARCH}$ to implement the strategy above. Be sure that your algorithm terminates when all indices into $A$ have been picked.

b.Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and $\text{RANDOM-SEARCH}$ terminates?

c.Generalizing your solution to part (b), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and $\text{RANDOM-SEARCH}$ terminates? Your answer should be a function of $n$ and $k$.

d.Suppose that there are no indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we have checked all elements of $A$ and $\text{RANDOM-SEARCH}$ terminates?Now consider a deterministic linear search algorithm, which we refer to as $\text{DETERMINISTIC-SEARCH}$. Specifically, the algorithm searches $A$ for $x$ in order, considering $A[1], A[2], A[3], \ldots, A[n]$ until either it finds $A[i] = x$ or it reaches the end of the array. Assume that possible permutations of the input array are equally likely.

e.Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the average-case running time of $\text{DETERMINISTIC-SEARCH}$? What is the worst-case running time of $\text{DETERMINISTIC-SEARCH}$?

f.Generalizing your solution to part (e), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the average-case running time of $\text{DETERMINISTIC-SEARCH}$? What is the worst-case running time of $\text{DETERMINISTIC-SEARCH}$? Your answer should be a function of $n$ and $k$.

g.Suppose that there are no indices $i$ such that $A[i] = x$. What is the average-case running time of $\text{DETERMINISTIC-SEARCH}$? What is the worst-case running time of $\text{DETERMINISTIC-SEARCH}$?Finally, consider a randomized algorithm $\text{SCRAMBLE-SEARCH}$ that works by first randomly permuting the input array and then running the deterministic linear search given above on the resulting permuting array.

h.Letting $k$ be the number of indices $i$ such that $A[i] = x$, give the worst-case and expected running time of $\text{SCRAMBLE-SEARCH}$ for the cases in which $k = 0$ and $k = 1$. Generalizing your solution to handle the case in which $k \ge 1$.

i.Which of the three searching algorithms would you use? Explain your answer.

**a.**

```
RANDOM-SEARCH(x, A, n)
v = Ø
while |Ø| != n
i = RANDOM(1, n)
if A[i] = x
return i
else
v = v ∩ i
return NIL
```

$v$ can be implemented in multiple ways: a hash table, a tree or a bitmap. The last one would probabily perform best and consume the least space.

**b.** $\text{RANDOM-SEARCH}$ is well-modelled by Bernoulli trials. The expected number of picks is $n$.

**c.** In similar fashion, the expected number of picks is $n / k$.

**d.** This is modelled by the balls and bins problem, explored in section 5.4.2. The answer is $n(\ln n + O(1))$.

**e.** The worst-case running time is $n$. The average-case is $(n + 1) / 2$ (obviously).

**f.** The worst-case running time is $n - k + 1$. The average-case running time is $(n + 1) / (k + 1)$. Let $X_i$ be an indicator random variable that the $i$th element is a match. $\Pr\{X_i\} = 1 / (k + 1)$. Let $Y$ be an indicator random variable that we have found a match after the first $n - k + 1$ elements ($\Pr\{Y\} = 1$). Thus,

$$ \begin{aligned} \text E[X] & = \text E[X_1 + X_2 + \ldots + X_{n - k} + Y] \\ & = 1 + \sum_{i = 1}^{n - k}\text E[X_i] = 1 + \frac{n - k}{k + 1} \\ & = \frac{n + 1}{k + 1}. \end{aligned} $$

**g.** Both the worst-case and average case is $n$.

**h.** It's the same as $\text{DETERMINISTIC-SEARCH}$, only we replace "average-case" with "expected".

**i.** Definitelly $\text{DETERMINISTIC-SEARCH}$. $\text{SCRAMBLE-SEARCH}$ gives better expected results, but for the cost of randomly permuting the array, which is a linear operation. In the same time we could have scanned the full array and reported a result.