Extended Search Planning for Multiple Moving Targets Incorporating Search Priorities

This article deals with a one-searcher, multi-target search problem in which targets with different detection priorities move according to Markov processes over a given search area in discrete time intervals, and the total number of search time intervals is fixed. A limited search resource is available in each search time interval, and an exponential detection function is assumed. The searcher obtains a detection reward if a target is detected; the reward represents the detection priority of the target and does not increase with time. The objective is to establish the optimal search plan, which allocates the search resource over the search areas in each time interval so as to maximize the total detection reward. The analysis shows that the given problem can be decomposed into interval-wise individual search problems, each of which is treated as a single stationary target problem for its time interval. Thus, an iterative procedure is derived to solve a sequence of stationary target problems. The computational results show that the proposed algorithm guarantees optimality.


Introduction
In modern warfare, many weapon systems for intelligence, surveillance, and reconnaissance (ISR) missions have been developed and utilized in the military field. One of these is the target-acquisition system, in which the most important issue is how to detect targets accurately and in a timely way. Search theory has given reasonable answers to these problems. Search problems deal with the search plan or strategy that allocates search resources to maximize the detection probability, and they involve three major elements: probability distributions for the target's location and motion, detection functions, and constraints on the search resource. The information about the target's position at a certain time interval and its subsequent motion can be quantified in terms of a probability distribution. The detection function relates the amount of resources placed in an area to the probability of detecting a target located in that area. Generally, the searcher has a limited amount of resources available to conduct a search mission, so the search resource cannot be distributed without limit over the search areas.
As objects of interest, there are two types of targets. A stationary target is assumed to be located in one of the discrete cells that partition the search area and does not change its location. A moving target moves from one cell to another as time goes by. Earlier studies on search problems focused on detecting a single stationary target or a single moving target. However, in the real environment of modern warfare, many different types of targets move in the operational area, and the searcher is interested in detecting some or all of them, not only a particular one. The targets change their location frequently during the operation in order to secure their survivability. Furthermore, they all have different degrees of threat. Therefore, it is important for the searcher to consider the detection priorities of the targets when planning the search operation.
Motivated by these facts, this article addresses a search problem of detecting multiple moving targets with different detection rewards. A single searcher, an Unmanned Aerial Vehicle (UAV) with limited search resources, performs an ISR mission. The UAV search is conducted in discrete time intervals, and the searches in different time intervals are mutually independent. Multiple targets move among discrete cells according to a Markov process between time intervals. The probability distribution of each target's initial location, the transition probabilities of each target, and the exponential detection function are known to the UAV. The problem for the UAV is then to establish the optimal search plan, which allocates the given limited resources to the cells at each time interval in order to maximize the total detection reward.
The remainder of this article is organized as follows. The next section presents a brief review of related literature. In Section 3, the problem description and formulation are presented. Section 4 proposes an algorithm based on solution properties, and Section 5 gives the computational results of numerical examples. Finally, Section 6 concludes this work and discusses future work.

Related work
Numerous studies have addressed search problems in the military domain. Koopman [9] provided the basic probabilistic foundation. He defined the elements of the basic problem of optimal search: a prior distribution on target location, a function relating search resource and detection probability, and a constrained amount of search resource. He also stated the optimization criterion of maximizing detection probability subject to a constraint on resource, and found the optimal allocation of a fixed amount of search resource to a stationary target using an exponential detection function.
For a stationary target which emits radio frequencies, Kim [8] developed various methods of using Directional Finders (DFs) to determine the location of the stationary target in situations where there is an enemy threat. He presented six models, each appropriate for a different battlefield situation. Stone [11] found necessary and sufficient conditions for an optimal search plan for a general class of stationary target problems involving Lagrangian multipliers. These conditions were used to show that incrementally optimal, or myopic, search plans are totally optimal for the standard cases (e.g., a regular detection function), where a detection function is called regular if its first derivative is continuous, positive, and strictly decreasing.
Smith and Kimeldorf [10] considered the discrete search problem with an unknown number, $N$, of stationary targets. The objective of the problem is to minimize the expected cost associated with finding at least one target. They showed that a locally optimal plan is identical to the globally optimal plan when $N$ has a Poisson distribution.
As an extension of the stationary target problem, search problems with a single moving target have been considered. Stone [12] found solutions for special types of moving target problems, e.g., two-celled Markovian motion and conditionally deterministic target motion. For the case of target motion described by a continuous time-and-space Markov process with an exponential detection function, necessary conditions for an optimal detection search were found.
The first substantial progress in developing an algorithm for moving target detection problems was obtained by Brown [2]. For the case of an exponential detection function, he applied the Karush-Kuhn-Tucker conditions to find an algorithm for computing the optimal detection search allocation for a target moving in discrete space and time according to a Markov process. For this case, he was able to reduce the moving target problem to that of solving a sequence of stationary target problems. Later, his work was extended to the problem with generalized linear constraints by Dambreville and Le Cadre [4], who explored the management of mixed search resources, such as radar and sonar.
Washburn [15] found necessary and sufficient conditions for optimality for a general moving target problem, which lend themselves to the successive improvement of the search plan. Several years later, Washburn [16] proposed the Forward and Backward (FAB) algorithm, which generalizes Brown's algorithm to a wider class of payoff functions than the probability of detection. Tierney and Kadane [14] developed an algorithm for finding a search strategy that maximizes the detection probability of the target under a limited budget. Kadane [6] further developed such a search strategy under the restraint of the existing budget. Thomas and Eagle [13] considered a single searcher problem with a perishable target whose lifetime is geometrically distributed and proposed several heuristic methods.
While the classical search problem deals with a single stationary target [2,15,16,17], recent works on search problems have been extended to include moving and/or multiple targets. Kim et al. [7] considered a search planning and task allocation problem for multiple UAVs and multiple moving targets. Dell et al. [5] dealt with the problem of multiple searchers searching for a single target whose location probability distribution is known. They proposed a branch-and-bound algorithm and several heuristic algorithms. Berger et al. [1] used mixed-integer programming to solve a multi-target, multi-searcher search problem.

Problem description
The proposed problem considers a situation where enemy troops are moving in an operation area and a friendly force is preparing for a reconnaissance operation. The operation area is partitioned into several smaller sections, called cells. It is assumed that the terrain analysis of each cell is immediately completed. The friendly force performs the reconnaissance operation with a single UAV carrying a limited search resource, say flight time, to detect the enemy troop targets. The UAV searches consecutively for targets over the operation area during the scout flight.
The goal of the reconnaissance operation is to detect threatening moving targets, which keep changing their positions from one cell to another to secure survivability. The UAV obtains some reward if it detects a target. The detection reward depends on the priority of the target and decreases in time, which provides motivation to detect the targets as soon as possible. The objective of the proposed problem is to find an optimal search plan that allocates the flight time of the UAV to each cell during each given time interval so as to maximize the detection reward subject to the limited total flight time. The assumptions considered are as follows:
1. The targets do not respond to the UAV's search actions; that is, the targets are passive, neither evading nor hiding.
2. The number of targets, the probability distribution of each target's initial location, and its transition probabilities are assumed to be known in advance.
3. The search is conducted during each time interval and the targets move among the cells. The numbers of cells and time intervals are finite.
4. Each target's movement follows a Markov process. It is assumed that target transitions are independent of each other.
5. A target is not divided into several targets and targets are not merged together during the reconnaissance operation.
6. The detection function is assumed to be an exponential function, which gives the detection probability as a function of the search resource.
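Assumptions 4 and 6 can be illustrated with a minimal Python sketch. The function names `detection_probability` and `step_distribution`, and the convention that `trans[k][s]` holds the probability of moving from cell `k` to cell `s`, are illustrative choices rather than notation taken from the article.

```python
import math


def detection_probability(alpha, effort):
    """Exponential detection function: 1 - exp(-alpha * effort)."""
    return 1.0 - math.exp(-alpha * effort)


def step_distribution(dist, trans):
    """Advance a location distribution by one Markov step.

    dist[k] is the probability that the target is in cell k, and
    trans[k][s] is the probability of moving from cell k to cell s.
    """
    cells = range(len(dist))
    return [sum(dist[k] * trans[k][s] for k in cells) for s in cells]
```

Under the exponential detection function, each additional unit of flight time in a cell adds less detection probability than the previous one, which is why spreading effort over several cells can be better than concentrating it in one.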
Targets move in a search space that consists of a finite set of cells $C = \{1, 2, \ldots, c\}$, and $n$ discrete time intervals $T = \{1, 2, \ldots, n\}$ are considered to detect the targets. A path of target $j$, $w_j$, is defined as a sequence of cells $\{c_{ij} : i = 1, 2, \ldots, n\}$. Target $j$ takes a path $w_j$ from a finite set of paths, $\Omega$, with probability $\pi(w_j)$, where $\sum_{w_j \in \Omega} \pi(w_j) = 1$ holds. Since each target's movement follows a Markov process, it holds that

$$\pi(w_j) = p^{0}_{j c_{1j}} \prod_{i=1}^{n-1} t_{j c_{ij} c_{i+1,j}},$$

where $p^{0}_{jk}$ is the probability that target $j$ is placed in cell $k$ at the initial time and $t_{jks}$ is the probability that target $j$ moves from cell $k$ to cell $s$.

A UAV is provided with limited flight time, $R_i \ge 0$, during the $i$th time interval. This flight time may be distributed among the cells in arbitrary proportion to detect the targets; the allocation is represented by $x_{ik} \ge 0$, the flight time of the UAV allocated to cell $k$ during time interval $i$. Let the search plan $X$ be the collection of $x_{ik}$ for $k \in C$ and $i = 1, 2, \ldots, n$. The total flight time spent in all cells during each time interval cannot exceed the flight time available in that time interval, that is, $\sum_{k \in C} x_{ik} \le R_i$ for all $i$. The probability of detecting a target during the $i$th time interval, assuming that the target is in cell $k$, is described by the detection function $1 - \exp(-\alpha_{ik} x_{ik})$, where $\alpha_{ik} > 0$ is a constant reflecting the search condition in cell $k$ during time interval $i$. It is assumed that all searches during distinct time intervals are mutually independent. Let $g_j(x, i)$ be the overall probability that target $j$ remains undetected through the $i$th time interval; $g_j(x, i)$ can be derived as

$$g_j(x, i) = \sum_{w_j \in \Omega} \pi(w_j) \exp\Big(-\sum_{t=1}^{i} \alpha_{t c_{tj}} x_{t c_{tj}}\Big).$$

Let $h_j(x, i)$ be the probability of detecting target $j$ during the time interval $[1, i]$. Then it holds that $h_j(x, i) = 1 - g_j(x, i)$, and the UAV detects target $j$ during the $i$th time interval with probability

$$h_j(x, i) - h_j(x, i-1) = g_j(x, i-1) - g_j(x, i),$$

where $g_j(x, 0) = 1$. If the detection is made during the $i$th time interval, the UAV gets the detection reward $V_{ij}$. Therefore, the expected detection reward due to the detection of target $j$ during the $i$th time interval is $V_{ij} [g_j(x, i-1) - g_j(x, i)]$, and the expected total detection reward can be expressed as

$$\sum_{j=1}^{m} \sum_{i=1}^{n} V_{ij} \big[g_j(x, i-1) - g_j(x, i)\big].$$

Accordingly, the objective of the proposed problem is to maximize the expected total detection reward, which can be rearranged as follows:

$$\sum_{j=1}^{m} \Big[ V_{1j} - \sum_{i=1}^{n} \Delta V_{ij}\, g_j(x, i) \Big], \qquad \Delta V_{ij} = V_{ij} - V_{i+1,j}, \quad V_{n+1,j} = 0.$$

Since the first term is constant, it is easy to see that maximizing this objective function is equivalent to minimizing $f(X) = \sum_{j=1}^{m} \sum_{i=1}^{n} \Delta V_{ij}\, g_j(x, i)$. Consequently, the given problem can be formulated as follows:

$$\text{minimize} \quad f(X) = \sum_{j=1}^{m} \sum_{i=1}^{n} \Delta V_{ij}\, g_j(x, i)$$

$$\text{subject to} \quad \sum_{k \in C} x_{ik} \le R_i \quad \text{for all } i, \tag{4}$$

where $x_{ik} \ge 0$ for all $i$ and $k$.
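The relation between the expected total detection reward and the weighted non-detection probabilities can be checked numerically on a tiny instance. The following Python sketch is only an illustration: it assumes that $g_j(x, i)$ is the path-weighted non-detection probability, that $\Delta V_{ij} = V_{ij} - V_{i+1,j}$ with $V_{n+1,j} = 0$, and that each target is given as a hypothetical tuple `(init, trans, rewards)`. Paths are enumerated explicitly, so it is usable only for very small problems.

```python
import itertools
import math


def path_prob(init, trans, path):
    """Probability of a Markov path: initial probability times transitions."""
    prob = init[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= trans[a][b]
    return prob


def non_detection(init, trans, alpha, x, i):
    """g_j(x, i): probability of no detection through interval i (1-based)."""
    if i == 0:
        return 1.0
    total = 0.0
    for path in itertools.product(range(len(init)), repeat=i):
        # enumerate() yields 0-based interval indices into alpha and x
        effort = sum(alpha[t][k] * x[t][k] for t, k in enumerate(path))
        total += path_prob(init, trans, path) * math.exp(-effort)
    return total


def expected_reward(targets, alpha, x, n):
    """Sum over targets and intervals of V_ij * (g_j(x, i-1) - g_j(x, i))."""
    total = 0.0
    for init, trans, rewards in targets:
        for i in range(1, n + 1):
            g_prev = non_detection(init, trans, alpha, x, i - 1)
            g_curr = non_detection(init, trans, alpha, x, i)
            total += rewards[i - 1] * (g_prev - g_curr)
    return total


def weighted_non_detection(targets, alpha, x, n):
    """f(X): sum of dV_ij * g_j(x, i), with dV_ij = V_ij - V_{i+1,j}."""
    total = 0.0
    for init, trans, rewards in targets:
        padded = list(rewards) + [0.0]  # V_{n+1,j} = 0
        for i in range(1, n + 1):
            dv = padded[i - 1] - padded[i]
            total += dv * non_detection(init, trans, alpha, x, i)
    return total
```

For any plan, `expected_reward` plus `weighted_non_detection` equals the sum of the first-interval rewards $V_{1j}$, so maximizing the former and minimizing the latter select the same plans.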

Analysis
The objective of the problem is to determine the optimal distribution of the search resource among all cells in each time interval so as to maximize the total expected detection reward. For the problem analysis, consider the situation where a target with no priority moves among the partitioned search areas during finite time intervals. Let $g(x, i)$ be defined as the probability that the target remains undetected through the $i$th time interval. The problem of minimizing the non-detection probability of the target, that is, minimizing $g(x, i)$ subject to Eq. (4), can be viewed as a single moving target problem. An efficient method for finding the optimal search plan for a single moving target problem has been provided by Brown [2].
The objective function $f(X)$ of the given problem is a linear combination of the functions $g_j(x, i)$ for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$. Minimizing each $g_j(x, i)$ can be viewed as the single moving target problem for target $j$. However, an optimal solution to any one of these single moving target problems may not be optimal for the proposed problem, because each target has its own initial location distribution, transition probabilities, and priority, which differ from those of the others. Thus, it may not be appropriate to decompose the given problem into several single moving target problems, each being solved separately. Several solution properties should be analyzed to derive an efficient algorithm that guarantees optimality for the given problem. Firstly, $g_j(x, i)$ was proved to be convex in a single moving target problem by Brown [2]. Moreover, $f(X)$ is a linear combination of the functions $g_j(x, i)$, for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$, which are non-negatively weighted by $\Delta V_{ij}$. This implies that the given problem is a convex programming problem, since $f(X)$ is convex and all constraints are linear.
Consider a special case where $n = 1$ and $m = 1$, which is a single stationary target problem. This problem minimizes the non-detection probability $\sum_{k \in C} p_k \exp(-\alpha_k x_k)$, where $p_k$ is the probability that a target is located in cell $k$. In a general case where $n \neq 1$ and $m \neq 1$, consider the movement of a target in a particular time interval. Each target occupies exactly one cell in each of the $n$ time intervals; that is, no target moves to another cell within a single time interval. Then, the targets in a particular time interval can be considered as stationary targets. Thus, the stationary target problem is investigated for the given problem.
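A single stationary target problem with an exponential detection function can be solved from its Karush-Kuhn-Tucker conditions. The sketch below is one possible way to do this and is not necessarily the procedure used in the cited works: it assumes the problem is to minimize $\sum_k \beta_k \exp(-\alpha_k x_k)$ subject to $\sum_k x_k \le R$ and $x_k \ge 0$, where the non-negative weights `beta` are a hypothetical name for the cell probabilities. Optimal allocations have the form $x_k = \max\{0, \ln(\alpha_k \beta_k / \lambda) / \alpha_k\}$, and the multiplier $\lambda$ is located by bisection.

```python
import math


def allocate_stationary(beta, alpha, budget, iters=200):
    """Minimise sum_k beta[k]*exp(-alpha[k]*x[k]), sum(x) <= budget, x >= 0."""
    top = max(a * b for a, b in zip(alpha, beta))
    if budget <= 0 or top <= 0:
        return [0.0] * len(beta)

    def plan(lam):
        # KKT form: a cell receives effort only if alpha*beta exceeds lam.
        return [math.log(a * b / lam) / a if a * b > lam else 0.0
                for a, b in zip(alpha, beta)]

    lo = hi = top                       # at lam = top nothing is allocated
    while sum(plan(lo)) < budget:       # shrink lam until the budget is used
        lo /= 2.0
    for _ in range(iters):              # bisection on a geometric scale
        mid = math.sqrt(lo * hi)
        if sum(plan(mid)) > budget:
            lo = mid
        else:
            hi = mid
    return plan(hi)                     # hi never overspends the budget
```

For example, with `beta = [0.5, 0.3, 0.2]`, unit `alpha` values, and a budget of 1, the third cell receives no effort, and the remaining two cells share the budget so that their marginal returns are equal.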
Review the single moving target case. For a particular time interval $t$ with $1 \le t \le i$, the overall non-detection probability is the probability that the target is not detected before time interval $t$, is not detected at time interval $t$, and is not detected after time interval $t$ up to time interval $i$. Therefore, the function $g(x, i)$ can be derived as follows:

$$g(x, i) = \sum_{k \in C} p_k(x, t) \exp(-\alpha_{tk} x_{tk})\, q_k(x, t, i), \tag{5}$$

where $p_k(x, t)$ is the probability that a target arrives at cell $k$ in the $t$th time interval without being detected before the $t$th time interval, and $q_k(x, t, i)$ is the probability that a target in cell $k$ in the $t$th time interval will not be detected after the $t$th time interval up to the $i$th time interval, given no detection in the time interval $[1, t]$.
The function in Eq. (5) has the form required for a stationary target problem, and the stationary target problem can be solved if $p_k(x, t)$ and $q_k(x, t, i)$ are known. The given problem can likewise be reduced to a stationary target problem when the search plan $X$ is fixed except for one particular time interval. The reason is as follows. Let $p_{jk}(X, t)$ be the probability that target $j$ is in cell $k$ in the $t$th time interval and has not been detected by the search plan $X$ before the $t$th time interval, and let $q_{jk}(X, t, i)$ be the probability that target $j$, being in cell $k$ in the $t$th time interval, will not be detected by the search plan $X$ after the $t$th time interval up to the $i$th time interval, for $1 \le t < i$. Then, $p_{jk}(X, t)$ and $q_{jk}(X, t, i)$ can be expressed recursively as

$$p_{jk}(X, t) = \sum_{s \in C} t_{jsk} \exp(-\alpha_{t-1,s} x_{t-1,s})\, p_{js}(X, t-1), \tag{6}$$

where $p_{jk}(X, 1)$ is the probability that target $j$ is initially located in cell $k$, and

$$q_{jk}(X, t, i) = \sum_{s \in C} t_{jks} \exp(-\alpha_{t+1,s} x_{t+1,s})\, q_{js}(X, t+1, i), \tag{7}$$

where $q_{jk}(X, i, i) = 1$ for all $j$ and $k$. Eq. (6) and (7) are put together to give $g_j(X, i)$ as

$$g_j(X, i) = \sum_{k \in C} p_{jk}(X, t) \exp(-\alpha_{tk} x_{tk})\, q_{jk}(X, t, i), \qquad 1 \le t \le i. \tag{8}$$

Thus, for a chosen time interval $t$, $f(X)$ can be expressed as follows:

$$f(X) = \sum_{j=1}^{m} \sum_{i=1}^{t-1} \Delta V_{ij}\, g_j(X, i) + \sum_{k \in C} \Big[ \sum_{j=1}^{m} \sum_{i=t}^{n} \Delta V_{ij}\, p_{jk}(X, t)\, q_{jk}(X, t, i) \Big] \exp(-\alpha_{tk} x_{tk}). \tag{9}$$

Since all $x_{ik}$ are given except $x_{tk}$ for $k = 1, 2, \ldots, c$, it is possible to compute each $p_{jk}(X, t)$ and $q_{jk}(X, t, i)$, and the first sum in Eq. (9) does not depend on $x_{tk}$. Therefore, Eq. (9) represents the formula for a single stationary target search problem. Finally, the problem reduced to a stationary target problem for the $t$th time interval can be viewed as the problem of choosing $x_{tk}$ in a single stage search problem. After solving the single stage search problem, given any feasible search plan, the new plan includes all the elements of the prior plan, $x_{ik}$ for $i \neq t$, together with the updated values of $x_{tk}$, which give the allocation of the search resource for the $t$th time interval for $k = 1, 2, \ldots, c$. The new search plan obtained from the single stage search problem attains an objective function value that is no worse than that of the prior plan. Given any feasible solution $X$ that satisfies Eq. (4), let $X'$ be another feasible solution that is obtained after solving the single stage search problem.
Then $f(X) \ge f(X')$ for every $t$. The reason is as follows. As shown above, $f(X)$ can be expressed as Eq. (9). After solving the single stage search problem, $X'$ consists of $x_{ik} \in X$ for $i \neq t$ together with $x'_{tk}$ satisfying Eq. (4). By the definition of the single stage search problem, $x'_{tk}$ minimizes $f$ over all allocations for the $t$th time interval that satisfy Eq. (4), with the allocations for the other time intervals held fixed, and the original allocation $x_{tk}$ is one such allocation. Therefore, $f(X')$ is always less than or equal to $f(X)$. According to the solution properties mentioned above, a new search plan generated by the single stage search problem updates the current solution so as to improve the objective function value. Therefore, these solution properties can be used to establish an effective algorithm to find the optimal search plan.
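The forward and backward probabilities used in this decomposition can be computed without enumerating paths. The Python sketch below handles a single target and is an illustration only: the function names, the 1-based interval arguments, and the convention that `trans[k][s]` is the probability of moving from cell `k` to cell `s` are assumptions made for the example. It evaluates the non-detection probability through interval $i$ by splitting the paths at an arbitrary interval $t \le i$.

```python
import math


def forward(init, trans, alpha, x, t):
    """p_k(X, t): in cell k at interval t (1-based), undetected before t."""
    c = len(init)
    p = list(init)
    for u in range(1, t):  # move the distribution from interval u to u + 1
        survived = [p[s] * math.exp(-alpha[u - 1][s] * x[u - 1][s])
                    for s in range(c)]
        p = [sum(survived[s] * trans[s][k] for s in range(c))
             for k in range(c)]
    return p


def backward(trans, alpha, x, t, i):
    """q_k(X, t, i): given cell k at interval t, undetected in t+1..i."""
    c = len(trans)
    q = [1.0] * c  # q_k(X, i, i) = 1
    for u in range(i, t, -1):  # fold interval u into interval u - 1
        q = [sum(trans[k][s] * math.exp(-alpha[u - 1][s] * x[u - 1][s]) * q[s]
                 for s in range(c)) for k in range(c)]
    return q


def g_from_split(init, trans, alpha, x, t, i):
    """Non-detection probability through interval i, split at interval t."""
    p = forward(init, trans, alpha, x, t)
    q = backward(trans, alpha, x, t, i)
    return sum(p[k] * math.exp(-alpha[t - 1][k] * x[t - 1][k]) * q[k]
               for k in range(len(init)))
```

For a fixed $i$, `g_from_split` returns the same value for every choice of $t$ between 1 and $i$, which is what allows any single time interval to be isolated while the rest of the plan is held fixed.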

Algorithm
The basic idea of the algorithm is to transform the given problem into a sequence of single stage search problems. This iterative procedure guarantees optimality. The first step of the algorithm is to find an initial search plan $X^0$, which is a feasible solution that satisfies the constraints in Eq. (4). The next step is to reduce the problem to a single stage search problem. To do so, one particular time interval is selected from all time intervals, and the initial solution is fixed except for the chosen time interval. Before the single stage search problem can be solved, the functions $p_{jk}(X^0, t)$ and $q_{jk}(X^0, t, i)$ must be calculated for all $i$, $j$ and $k$. If $\Omega$ is small enough to enumerate its elements in practice, then $p_{jk}(X^0, t)$ and $q_{jk}(X^0, t, i)$ may be calculated directly. In most cases, however, it is impractical to enumerate $\Omega$, and the recursive formulas in Eq. (6) and (7) permit an efficient calculation instead.
After computing $p_{jk}(X^0, t)$ and $q_{jk}(X^0, t, i)$, the single stage search problem becomes a typical stationary target problem. This problem can be solved easily by Stone's [12] algorithm. He proved the existence of a unique solution to the stationary search problem with an exponential detection function and proposed an efficient method. A new search plan is produced after solving the single stage search problem. This plan consists of the solution of the single stage search problem for the chosen time interval and the allocations of the previous search plan for all other time intervals; that is, the elements at the chosen time interval are replaced by the new ones from the solution of the single stage search problem. As shown in the third solution property, the updated search plan improves the objective function value. Consequently, the new search plan reallocates the search resource to cells so as to increase the total detection reward. The next step is to choose another time interval. The search plan is fixed again except for that time interval, and the single stage search problem is solved once more. These steps are repeated from the 1st time interval to the $n$th time interval. At the end of the first pass, the resource has been allocated to cells for all time intervals. However, this plan may not be the best one, because it may still be improved. The above procedure therefore continues in the next iterations.
Let $Y_t(X)$ be the new search plan obtained by solving the single stage search problem for the $t$th time interval, given any feasible solution $X$, and set $X_t = Y_t(X_{t-1})$, starting from the initial solution. Forming $X_1, X_2, \ldots, X_n$ then corresponds to allocating the search resource for the successive time intervals so as to obtain the greatest decrease in the objective function value at the current time interval. In line with the third solution property, Washburn [15] showed a necessary and sufficient condition for optimality in moving target problems, which leads to successive improvement of the search plan. In addition, it has been proved that the associated strategy of renewing the search plan at a given time interval has a limit point, using a target location distribution. A strategy that cannot be improved by the above technique is called critical. Criticality is a necessary and sufficient condition for optimality in the case of Markovian target motion in discrete time and space. This condition implies that the proposed algorithm guarantees optimality. Accordingly, the optimization algorithm proceeds as follows:
Step 1: Let $X^{0}_{t}$, which satisfies Eq. (3) and (4), be the initial solution.
Step 2: Let $\varepsilon$ be a small positive number.
Step 5: Solve a single stage search problem.
Step 6: Update the current solution by setting $X^{r}_{t} = Y_t(X^{r}_{t-1})$.
Step 7: If $t = n$, then go to Step 8. Otherwise, increase $t$ by 1 and go to Step 5.
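The overall procedure can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. Steps 3, 4 and 8 do not appear in the listing above, so the counter handling and the stopping rule used here (stop when a full sweep over the time intervals improves $f$ by less than $\varepsilon$) are assumptions. The starting plan is a uniform allocation, each single stage problem is solved by bisection on a Lagrange multiplier, and the sum over $i$ of $\Delta V_{ij}\, q_{jk}(X, t, i)$ is accumulated in one backward pass. Each target is a hypothetical tuple `(init, trans, dv)`, where `dv[i]` holds the non-negative reward decrement for interval `i` and `trans[k][s]` is the probability of moving from cell `k` to cell `s`.

```python
import math


def allocate(beta, alpha, budget, iters=100):
    """Minimise sum_k beta[k]*exp(-alpha[k]*x[k]), sum(x) <= budget, x >= 0."""
    top = max(a * b for a, b in zip(alpha, beta))
    if budget <= 0 or top <= 0:
        return [0.0] * len(beta)

    def plan(lam):
        return [math.log(a * b / lam) / a if a * b > lam else 0.0
                for a, b in zip(alpha, beta)]

    lo = hi = top
    while sum(plan(lo)) < budget:
        lo /= 2.0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if sum(plan(mid)) > budget:
            lo = mid
        else:
            hi = mid
    return plan(hi)


def heads(init, trans, alpha, x):
    """p[t][k]: in cell k at interval t (0-based), undetected before t."""
    c = len(init)
    p = [list(init)]
    for t in range(len(x) - 1):
        surv = [p[t][s] * math.exp(-alpha[t][s] * x[t][s]) for s in range(c)]
        p.append([sum(surv[s] * trans[s][k] for s in range(c))
                  for k in range(c)])
    return p


def tails(trans, dv, alpha, x):
    """w[t][k]: sum over i >= t of dv[i] * q_k(X, t, i), one backward pass."""
    n, c = len(x), len(trans)
    w = [None] * n
    w[n - 1] = [dv[n - 1]] * c
    for t in range(n - 2, -1, -1):
        nxt = [math.exp(-alpha[t + 1][s] * x[t + 1][s]) * w[t + 1][s]
               for s in range(c)]
        w[t] = [dv[t] + sum(trans[k][s] * nxt[s] for s in range(c))
                for k in range(c)]
    return w


def objective(targets, alpha, x):
    """f(X): reward-decrement-weighted non-detection probabilities."""
    total = 0.0
    for init, trans, dv in targets:
        w0 = tails(trans, dv, alpha, x)[0]
        total += sum(init[k] * math.exp(-alpha[0][k] * x[0][k]) * w0[k]
                     for k in range(len(init)))
    return total


def search_plan(targets, alpha, budgets, eps=0.01, max_sweeps=100):
    """Sweep the intervals, re-solving one single stage problem at a time."""
    n, c = len(budgets), len(alpha[0])
    x = [[budgets[t] / c] * c for t in range(n)]  # assumed starting plan
    value = objective(targets, alpha, x)
    for _ in range(max_sweeps):
        for t in range(n):
            beta = [0.0] * c  # coefficient of exp(-alpha_tk * x_tk) in f
            for init, trans, dv in targets:
                p = heads(init, trans, alpha, x)[t]
                w = tails(trans, dv, alpha, x)[t]
                for k in range(c):
                    beta[k] += p[k] * w[k]
            x[t] = allocate(beta, alpha[t], budgets[t])
        new_value = objective(targets, alpha, x)
        if value - new_value < eps:  # assumed stopping rule
            return x, new_value
        value = new_value
    return x, value
```

Because every single stage solve can only lower $f$ or leave it unchanged, the value returned by `search_plan` is never worse than that of the uniform starting plan.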

Computational results
This section consists of two parts. The first part presents the numerical results of the myopic search and the proposed search algorithm. Specifically, the differences in search resource allocation and in the detection probabilities of targets between the two plans are compared. In the second part, the optimal search plans for multiple moving targets with identical and/or different priorities are compared. In a search plan for targets with different priorities, the search resource is allocated preferentially depending on the priorities of the targets.
We consider the case where 5 targets move among 9 cells in 8 time intervals. The probability distribution of the initial location is given in Table 1, and the transition probabilities of the targets are generated randomly. The detection rewards of the targets and the search resource $R_i$ for each time interval are shown in Table 2, and $\varepsilon$ is set to 0.01.
Figure 1 illustrates the detection probabilities achieved by the myopic plan and the proposed algorithm. The detection probability of target 4, which has higher detection rewards than the others throughout almost all time intervals, is higher than those of the others. This indicates that the search plan for multiple targets properly adjusts the resource allocation to detect the targets with higher priorities.
As shown in Figure 2, the objective function value decreases rapidly over successive iterations of the algorithm. The objective function value converges to its limit by the 4th or 5th iteration, and no further improvement is observed after the 5th iteration.

Comparison of the optimal search for targets with identical and/or different priorities
The resource allocation of the optimal search plan for the targets with identical priorities is quite different from that of the search plan for the targets with different priorities. Tables 4 and 5 show the difference between these two cases. In the case of the targets with identical priorities, the search plan distributes the resource among the cells or the paths that are more likely to contain a target and that have better search conditions. However, the optimal search plan for the targets with different priorities allocates the resource preferentially to the cells where the targets with higher priorities are located.
The detection probabilities in the search plan for the targets with identical priorities are dispersed between 0.82 and 0.89, as shown in Figure 3. However, the detection probabilities in the search plan for the targets with different priorities differ depending on the targets' priorities. This implies that the search resource should be allocated to detect the targets with higher priorities.

Conclusion
This article deals with a search problem for multiple moving targets with search priorities incorporated. Though most of the search problems in the literature have considered a single moving target to maximize the detection probability, this article considers multiple targets that move according to Markov processes in discrete time over a given space. The worth of the targets is evaluated based on priorities and does not increase in time, to reflect real conditions in intelligence, surveillance, and reconnaissance operations. The objective is to determine the optimal search plan, which allocates the limited search resource over the search areas in each time interval so as to maximize the total detection reward. It is shown that the given problem can be decomposed into interval-wise individual search problems, each of which is treated as a single stage search problem for a given time interval. Therefore, the problem can be solved by working on the sequence of single stage search problems iteratively. At each iteration, the search resource is reallocated to improve the objective function value so as to maximize the total detection reward. The proposed iterative procedure shows that the objective function value converges to the limit point, implying that optimality is guaranteed. The computational results show that the optimal search plan of the proposed algorithm provides a better objective function value and higher detection probabilities of targets than the myopic search plan. As a further study, other search models considering different detection functions, dynamic search cost, and other constraints on the UAV's path can be considered to extend the problem.