Chapter 9 PSO Variations
The standard PSO analyzed in the previous chapter is capable of solving a wide range of problems, but often gets stuck in local minima. In this chapter, different variants of the standard PSO are analyzed using a problem from the financial domain. The first variant is the PSO with function stretching, which is designed to allow the PSO to escape local minima once they are discovered. The second variant is the local PSO, which is designed to reduce the probability of getting stuck in local minima by limiting the spread of information in the swarm. The third variant, the PSO with feasibility preservation, tries to optimize within the feasible space and therefore provides only feasible solutions. The last variant is the PSO with self-adaptive velocity, which adjusts the control parameters according to certain rules and randomness.
9.1 Test Problem: Discrete ITP-MSTE
All variants are tested on a discrete ITP-MSTE to replicate the SP500TR with a tracking portfolio consisting of the top 50 assets in the S&P 500 derived from the example with discarding in section 7.4. The daily data used to solve the ITP ranges from 2018-01-01 to 2019-12-31, and the assets must be in the SP500TR at the end of the time frame and have no missing values. The tracking portfolio is discrete and has a net asset value of 20,000 USD. It is discretized using the closing prices on 2019-12-31, and returns are calculated as simple returns based on the adjusted closing prices. The maximum weight for each asset is 10% to reduce the dimension of the search space. Additional constraints are long only, and the portfolio weights \(w\) must satisfy \(0.99 \leq \textstyle\sum w_i \leq 1\). All variants are run 100 times and compared to 100 runs of the standard PSO created in the previous chapter. The swarm size for the PSO and all variants is 50, the number of iterations is set to 400 with coefficients \(c_p=c_g=0.5\), and the inertia weight starts at \(1.2\) and decreases to zero. All PSOs start with the zero vector as the initial particle position to test the ability to find the feasible space.
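The discretization step maps continuous weights to whole-share positions at the given closing prices and back to weights. A minimal sketch of this mapping (the helper name `discretize` as well as the prices and weights below are purely illustrative):

```r
# Map weights to whole-share counts at the given prices, then back to weights.
# This is the rounding used for the discrete tracking portfolio.
discretize <- function(w, prices, nav){
  as.vector(round(w * nav / prices) * prices / nav)
}

prices <- c(150, 80, 42.5)      # hypothetical closing prices in USD
w      <- c(0.05, 0.031, 0.02)  # continuous weights
discretize(w, prices, nav = 20000)
```

The resulting weights correspond to integer share counts, so they generally deviate slightly from the continuous weights; this deviation is exactly what the discrete ITP-MSTE has to cope with.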
The next plot analyzes the behavior of the 100 standard PSO runs by plotting the median of the best fitness achieved in each iteration. The confidence bands for the 5% and 95% quantiles of the best fitness values are plotted in the same color as the median, but with greater transparency:
nav  <- 20000
from <- "2018-01-01"
to   <- "2019-12-31"

spx_composition <- buffer(
  get_spx_composition(),
  "AS_spx_composition"
)
pool_data <- buffer(
  get_yf(
    tickers = spx_composition %>%
      filter(Date <= to) %>%
      filter(Date == max(Date)) %>%
      pull(Ticker),
    from = from,
    to = to
  ), "AS_sp500_asset_data"
)
load("data/assets_pool_50.rdata")
pool_data$returns <- pool_data$returns[, assets_pool_50]
pool_data$prices  <- pool_data$prices[, assets_pool_50]

bm_returns <- buffer(
  get_yf(tickers = "^SP500TR", from = from, to = to)$returns,
  "AS_sp500tr"
) %>% setNames(., "SP500TR")
pool_returns <- pool_data$returns
mat <- list(
  Dmat = t(pool_returns) %*% pool_returns,
  dvec = t(pool_returns) %*% bm_returns,
  Amat = t(rbind(
    -rep(1, ncol(pool_returns)),          # sum w <= 1
    rep(1, ncol(pool_returns)),           # sum w >= 0.99
    diag(1,
         nrow = ncol(pool_returns),
         ncol = ncol(pool_returns))       # long only
  )),
  bvec = c(
    -1,                                   # sum w <= 1
    0.99,                                 # sum w >= 0.99
    rep(0, ncol(pool_returns))            # long only
  ),
  meq = 0
)
prices <- last(pool_data$prices)

calc_fit <- function(x){
  as.numeric(0.5 * t(x) %*% mat$Dmat %*% x - t(mat$dvec) %*% x)
}
calc_const <- function(x){
  const <- t(mat$Amat) %*% x - mat$bvec
  sum(pmin(0, const)^2)
}
set.seed(0)
SPSO_feasable_solutions <- NULL
df_SPSO <- NULL
for(i in 1:100){
  res_SPSO_time <- system.time({
    res_SPSO <- pso(
      par = rep(0, ncol(pool_data$returns)),
      fn = function(x){
        x <- as.vector(round(x * nav / prices) * prices / nav)
        fitness <- calc_fit(x)
        constraints <- calc_const(x)
        return(fitness + 100 * constraints)
      },
      lower = 0,
      upper = 0.1,
      control = list(
        s = 50,         # swarm size
        c.p = 0.5,      # personal best coefficient
        c.g = 0.5,      # global best coefficient
        maxiter = 400,  # iterations
        w0 = 1.2,       # starting inertia weight
        wN = 0,         # ending inertia weight
        save_fit = T    # save more information
      )
    )
  })
  res_SPSO$solution <- as.vector(round(res_SPSO$solution * nav / prices) * prices / nav)

  df_SPSO <- rbind(df_SPSO,
    data.frame(
      "run" = i,
      suppressWarnings(rbind(data.frame(
        "type" = "PSO",
        "time" = res_SPSO_time[3],
        "const_break" = calc_const(res_SPSO$solution),
        res_SPSO$fit_data %>% select(iter, "mean_fit" = mean, "best_fit" = best)
      )))
    )
  )
  if(calc_const(res_SPSO$solution) == 0){
    SPSO_feasable_solutions <- cbind(SPSO_feasable_solutions, res_SPSO$solution)
  }
}
df_res <- df_SPSO %>%
  group_by(iter, type) %>%
  summarise(
    time_mean        = mean(time),
    const_break_mean = mean(const_break),
    best_fit_q1      = quantile(best_fit, 0.05),
    best_fit_q3      = quantile(best_fit, 0.95),
    best_fit_mean    = mean(best_fit),
    best_fit_median  = quantile(best_fit, 0.5)
  ) %>%
  ungroup()
save(df_res, df_SPSO, nav, from, to, prices, calc_fit, calc_const, mat,
     pool_returns, bm_returns, SPSO_feasable_solutions, pool_data,
     file = "data/save_variant1.rdata")
load("data/save_variant1.rdata")
plot_ly() %>%
add_trace(data = df_res, x=~iter, y=~best_fit_median, name = "PSO", mode="lines", type = 'scatter', line = list(color="rgba(255, 51, 0, 1)")) %>%
add_trace(data = df_res, x=~iter, y=~best_fit_q1, name = "PSO_q1", mode="lines", type = 'scatter', line = list(color="rgba(255, 51, 0, 0.4)"), showlegend=F) %>%
add_trace(data = df_res, x=~iter, y=~best_fit_q3, name = "PSO_q3", mode="lines", type = 'scatter', fill="tonexty", line = list(color="rgba(255, 51, 0, 0.4)"), fillcolor = "rgba(255, 51, 0, 0.4)", showlegend=F) %>%
layout(yaxis=list(range=c(min(df_res[df_res$iter>30,]$best_fit_q1)-0.0002, max(df_res[df_res$iter>30,]$best_fit_q3)+0.0005), title="best fitness")) %>%
html_save(., vheight=345, vwidth = 850, expand=c(-15,0,-3,0))
Figure 9.1: Fitness of the standard PSO for comparison with later variants
The aggregate statistics of the last iteration of all 100 runs are shown in the table below:
Figure 9.2: Statistics of the standard PSO for comparison with later variants
9.2 Function Stretching
PSO often gets stuck in a local minimum, i.e. the current global best position is a local minimum surrounded by a region containing only higher fitness values. In such situations it is difficult for the PSO to escape and find the global minimum. Function stretching attempts to allow the PSO to escape such local minima by transforming the fitness function as described in (Parsopoulos & Vrahatis, 2002). After a local minimum has been found, a two-stage transformation proposed by Vrahatis in 1996 can be used to stretch the original function such that the discovered local minimum becomes a maximum, while every position with lower fitness remains unchanged. The two stages of the transformation for a discovered local minimum \(\bar{x}\) are:
\[\begin{equation} G(x) = f(x) + \gamma_1 \cdot \| x-\bar{x} \| \cdot (\text{sign}(f(x)-f(\bar{x}))+1) \tag{9.1} \end{equation}\] and \[\begin{equation} H(x) = G(x) + \gamma_2 \cdot \frac{\text{sign}\biggl(f(x)-f(\bar{x})\biggr)+1}{\text{tanh}\biggl( \mu \cdot (G(x)-G(\bar{x})) \biggr)}. \tag{9.2} \end{equation}\]
The value \(G(\bar{x})\) can be simplified to \(f(\bar{x})\), the norm used in \(G(x)\) is the Manhattan norm and the \(\text{sign}()\) function is defined as follows:
\[ \text{sign}(x) = \begin{cases} 1, & \text{if}\ \ x > 0\\ 0, & \text{if}\ \ x = 0\\ -1, & \text{if}\ \ x < 0. \end{cases} \]
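This piecewise definition matches the behavior of base R's built-in sign() function:

```r
# Base R's sign() returns -1, 0 or 1 depending on the sign of its argument,
# matching the definition above elementwise.
sign(c(-3.2, 0, 0.7))   # returns -1 0 1
```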
The source suggests the following parameter values as defaults:
\[\begin{align*} \gamma_1 &= 5000 \\ \gamma_2 &= 0.5 \\ \mu &= 10^{-10}. \end{align*}\]
It is difficult to interpret both transformations precisely, especially in higher dimensions, but some concepts become apparent when looking only at the most important parts. The first transformation \(G(x)\) lifts all values greater than or equal to the local minimum, with the lift increasing with the distance from the local minimum. The second transformation \(H(x)\) likewise leaves all values below the local minimum unchanged; otherwise it focuses on the values near the local minimum, stretching them towards infinity so that the function drops steeply around \(\bar{x}\) and repels particles.
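This behavior can be checked numerically. The following sketch implements equations (9.1) and (9.2) with the default parameters and the Manhattan norm; the helper name `stretch` and the evaluation points are chosen purely for illustration:

```r
# Two-stage function stretching, equations (9.1) and (9.2).
# s is 0 below the local minimum and 2 above it, so values below f(x_bar)
# pass through both stages unchanged.
stretch <- function(x, x_bar, f, g1 = 5000, g2 = 0.5, mu = 1e-10){
  s <- sign(f(x) - f(x_bar)) + 1
  G <- f(x) + g1 * sum(abs(x - x_bar)) * s          # stage 1: distance-based lift
  H <- G + g2 * s / tanh(mu * (G - f(x_bar)))       # stage 2: repel near x_bar
  H
}

f     <- function(x) cos(x) + x / 10
x_bar <- pi - asin(1/10)                 # a local minimum of f

# A point in the next, deeper valley: fitness below f(x_bar), so unchanged.
stretch(x_bar - 2 * pi, x_bar, f) == f(x_bar - 2 * pi)   # TRUE
```

A point with fitness above \(f(\bar{x})\), e.g. \(x = \bar{x} + 0.5\), is lifted by several orders of magnitude instead, which is exactly the repelling effect visible in the plots below.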
To better understand the transformation, it is used to stretch a simple function in \(\mathbb{R}^1\) defined as follows:
\[ f(x) = \cos(x)+\frac{1}{10}\cdot x \]
and translated to the objective function:
fn <- function(pos){
  cos(pos) + 1/10 * pos
}
The domain of definition is chosen as \(x \in [-20, 20]\). Suppose the PSO gets stuck in the local minimum at \(\bar{x} = \pi - \arcsin(\frac{1}{10}) \approx 3.04\). The original function fn and the transformed function fn_stretched, which corresponds to \(H(x)\) in equation (9.2), are shown in the following graph:
fn1 <- function(pos, pos_best, pos_best_fit){
  res <- fn(pos)
  G <- res + 5000 * sqrt(sum((pos - pos_best)^2))/length(pos) * (sign(res - pos_best_fit) + 1)
  H <- G + 0.5 * (sign(res - pos_best_fit) + 1)/(tanh(10^(-10) * (G - pos_best_fit)))
  return(H)
}
X <- seq(-20, 20, 0.001)
x_best <- pi - asin(1/10)
x_best_fit <- fn(x_best)
p1 <- plot_ly(x=X, y=sapply(X, fn), type="scatter", mode="lines", name="fn") %>%
  add_trace(x=X, y=rep(fn(pi-asin(1/10)), length(X)), type="scatter", mode="lines",
            name="local_minima", line=list(dash="dot", color="grey"))
p2 <- plot_ly(x=X, y=sapply(X, fn1, pos_best=x_best, pos_best_fit=x_best_fit),
              type="scatter", mode="lines", name="fn_stretched", line=list(color="orange")) %>%
  layout(yaxis=list(range=c(-2, 20)*10^5))
subplot(p1, p2, shareY=T, nrows=2, margin=0.05) %>%
  html_save(., vheight=480, vwidth=800, expand = c(-10,0,-15,0))
Figure 9.3: The effect of function stretching in a 1D example
It can be seen that the fitness is stretched upward around the local minimum \(\bar{x}\), making it much easier for the PSO to move downhill and fall into new minima with lower fitness. On the right-hand side, where the original function contained only higher fitness values, the transformation has created a new local minimum surrounded by a large region of higher fitness values. In the worst case, this can produce a new local minimum that is even harder to escape. All regions with lower fitness remain unchanged, as the zoomed version of the bottom diagram from above shows: