Chapter 9 PSO Variations
The standard PSO analyzed in the previous chapter is capable of solving a wide range of problems, but often gets stuck in local minima. In this chapter, different variants of the standard PSO are analyzed using a problem from the financial domain. The first variant is the PSO with function stretching, which is designed to allow the PSO to escape local minima once they are discovered. The second variant is the local PSO, which is designed to reduce the probability of getting stuck in local minima by limiting the spread of information in the swarm. The third variant, the PSO with feasibility preservation, tries to optimize within the feasible space and therefore provides only feasible solutions. The last variant is the PSO with self-adaptive velocity, which adjusts the control parameters according to certain rules and randomness.
9.1 Test Problem: Discrete ITP-MSTE
All variants are tested on a discrete ITP-MSTE to replicate the SP500TR with a tracking portfolio consisting of the top 50 assets in the S&P 500 derived from the example with discarding in section 7.4. The daily data used to solve the ITP ranges from 2018-01-01 to 2019-12-31, and the assets must be in the SP500TR at the end of the time frame and have no missing values. The tracking portfolio is discrete and has a net asset value of 20,000 USD. It is discretized using the closing prices on 2019-12-31, and returns are calculated as simple returns based on the adjusted closing prices. The maximum weight for each asset is 10% to reduce the dimension of the search space. Additional constraints are long only, and the portfolio weights \(w\) must satisfy \(0.99 \leq \textstyle\sum w_i \leq 1\). All variants are run 100 times and compared to 100 runs of the standard PSO created in the previous chapter. The swarm size for the PSO and all variants is 50, the number of iterations is set to 400 with coefficients \(c_p=c_g=0.5\), and the inertia weight starts at \(1.2\) and decreases to zero. All PSOs start with the zero vector as the initial particle position to test the ability to find the feasible space.
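The discretization step maps continuous weights to whole-share positions at the given closing prices and back to weights. A minimal sketch of this mapping (the helper name `discretize` as well as the prices and weights below are purely illustrative):

```r
# Map weights to whole-share counts at the given prices, then back to weights.
# This is the rounding used for the discrete tracking portfolio.
discretize <- function(w, prices, nav){
  as.vector(round(w * nav / prices) * prices / nav)
}

prices <- c(150, 80, 42.5)      # hypothetical closing prices in USD
w      <- c(0.05, 0.031, 0.02)  # continuous weights
discretize(w, prices, nav = 20000)
```

The resulting weights correspond to integer share counts, so they generally deviate slightly from the continuous weights; this deviation is exactly what the discrete ITP-MSTE has to cope with.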
The next plot analyzes the behavior of the 100 standard PSO runs by plotting the median of the best fitness achieved in each iteration. The confidence bands for the 5% and 95% quantiles of the best fitness values are plotted in the same color as the median, but with greater transparency:
nav  <- 20000
from <- "2018-01-01"
to   <- "2019-12-31"

spx_composition <- buffer(
  get_spx_composition(),
  "AS_spx_composition"
)
pool_data <- buffer(
  get_yf(
    tickers = spx_composition %>%
      filter(Date <= to) %>%
      filter(Date == max(Date)) %>%
      pull(Ticker),
    from = from,
    to = to
  ), "AS_sp500_asset_data"
)
load("data/assets_pool_50.rdata")
pool_data$returns <- pool_data$returns[, assets_pool_50]
pool_data$prices  <- pool_data$prices[, assets_pool_50]

bm_returns <- buffer(
  get_yf(tickers = "^SP500TR", from = from, to = to)$returns,
  "AS_sp500tr"
) %>% setNames(., "SP500TR")
pool_returns <- pool_data$returns
mat <- list(
  Dmat = t(pool_returns) %*% pool_returns,
  dvec = t(pool_returns) %*% bm_returns,
  Amat = t(rbind(
    -rep(1, ncol(pool_returns)),          # sum w <= 1
    rep(1, ncol(pool_returns)),           # sum w >= 0.99
    diag(1,
         nrow = ncol(pool_returns),
         ncol = ncol(pool_returns))       # long only
  )),
  bvec = c(
    -1,                                   # sum w <= 1
    0.99,                                 # sum w >= 0.99
    rep(0, ncol(pool_returns))            # long only
  ),
  meq = 0
)
prices <- last(pool_data$prices)

calc_fit <- function(x){
  as.numeric(0.5 * t(x) %*% mat$Dmat %*% x - t(mat$dvec) %*% x)
}
calc_const <- function(x){
  const <- t(mat$Amat) %*% x - mat$bvec
  sum(pmin(0, const)^2)
}
set.seed(0)
SPSO_feasable_solutions <- NULL
df_SPSO <- NULL
for(i in 1:100){
  res_SPSO_time <- system.time({
    res_SPSO <- pso(
      par = rep(0, ncol(pool_data$returns)),
      fn = function(x){
        x <- as.vector(round(x * nav / prices) * prices / nav)
        fitness <- calc_fit(x)
        constraints <- calc_const(x)
        return(fitness + 100 * constraints)
      },
      lower = 0,
      upper = 0.1,
      control = list(
        s = 50,         # swarm size
        c.p = 0.5,      # personal best coefficient
        c.g = 0.5,      # global best coefficient
        maxiter = 400,  # iterations
        w0 = 1.2,       # starting inertia weight
        wN = 0,         # ending inertia weight
        save_fit = T    # save more information
      )
    )
  })
  res_SPSO$solution <- as.vector(round(res_SPSO$solution * nav / prices) * prices / nav)

  df_SPSO <- rbind(df_SPSO,
    data.frame(
      "run" = i,
      suppressWarnings(rbind(data.frame(
        "type" = "PSO",
        "time" = res_SPSO_time[3],
        "const_break" = calc_const(res_SPSO$solution),
        res_SPSO$fit_data %>% select(iter, "mean_fit" = mean, "best_fit" = best)
      )))
    )
  )
  if(calc_const(res_SPSO$solution) == 0){
    SPSO_feasable_solutions <- cbind(SPSO_feasable_solutions, res_SPSO$solution)
  }
}
df_res <- df_SPSO %>%
  group_by(iter, type) %>%
  summarise(
    time_mean        = mean(time),
    const_break_mean = mean(const_break),
    best_fit_q1      = quantile(best_fit, 0.05),
    best_fit_q3      = quantile(best_fit, 0.95),
    best_fit_mean    = mean(best_fit),
    best_fit_median  = quantile(best_fit, 0.5)
  ) %>%
  ungroup()
save(df_res, df_SPSO, nav, from, to, prices, calc_fit, calc_const, mat,
     pool_returns, bm_returns, SPSO_feasable_solutions, pool_data,
     file = "data/save_variant1.rdata")
load("data/save_variant1.rdata")
plot_ly() %>%
add_trace(data = df_res, x=~iter, y=~best_fit_median, name = "PSO", mode="lines", type = 'scatter', line = list(color="rgba(255, 51, 0, 1)")) %>%
add_trace(data = df_res, x=~iter, y=~best_fit_q1, name = "PSO_q1", mode="lines", type = 'scatter', line = list(color="rgba(255, 51, 0, 0.4)"), showlegend=F) %>%
add_trace(data = df_res, x=~iter, y=~best_fit_q3, name = "PSO_q3", mode="lines", type = 'scatter', fill="tonexty", line = list(color="rgba(255, 51, 0, 0.4)"), fillcolor = "rgba(255, 51, 0, 0.4)", showlegend=F) %>%
layout(yaxis=list(range=c(min(df_res[df_res$iter>30,]$best_fit_q1)-0.0002, max(df_res[df_res$iter>30,]$best_fit_q3)+0.0005), title="best fitness")) %>%
html_save(., vheight=345, vwidth = 850, expand=c(-15,0,-3,0))
Figure 9.1: Fitness of the standard PSO for comparison with later variants
The aggregate statistics of the last iteration of all 100 runs are shown in the table below:
Figure 9.2: Statistics of the standard PSO for comparison with later variants
9.2 Function Stretching
PSO often gets stuck in a local minimum, i.e. the current global best position is a local minimum surrounded by a region containing only higher fitness values. In such situations it is difficult for the PSO to escape and find the global minimum. Function stretching attempts to allow the PSO to escape such local minima by transforming the fitness function as described in (Parsopoulos & Vrahatis, 2002). After a local minimum has been found, a two-stage transformation proposed by Vrahatis in 1996 can be used to stretch the original function such that the discovered local minimum becomes a maximum, while every position with lower fitness remains unchanged. The two stages of the transformation for a discovered local minimum \(\bar{x}\) are:
\[\begin{equation} G(x) = f(x) + \gamma_1 \cdot \| x-\bar{x} \| \cdot (\text{sign}(f(x)-f(\bar{x}))+1) \tag{9.1} \end{equation}\] and \[\begin{equation} H(x) = G(x) + \gamma_2 \cdot \frac{\text{sign}\biggl(f(x)-f(\bar{x})\biggr)+1}{\text{tanh}\biggl( \mu \cdot (G(x)-G(\bar{x})) \biggr)}. \tag{9.2} \end{equation}\]
The value \(G(\bar{x})\) can be simplified to \(f(\bar{x})\), the norm used in \(G(x)\) is the Manhattan norm and the \(\text{sign}()\) function is defined as follows:
\[ \text{sign}(x) = \begin{cases} 1, & \text{if}\ \ x > 0\\ 0, & \text{if}\ \ x = 0\\ -1, & \text{if}\ \ x < 0. \end{cases} \]
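This piecewise definition matches the behavior of base R's built-in sign() function:

```r
# Base R's sign() returns -1, 0 or 1 depending on the sign of its argument,
# matching the definition above elementwise.
sign(c(-3.2, 0, 0.7))   # returns -1 0 1
```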
The source suggests the following parameter values as defaults:
\[\begin{align*} \gamma_1 &= 5000 \\ \gamma_2 &= 0.5 \\ \mu &= 10^{-10}. \end{align*}\]
It is difficult to interpret both transformations precisely, especially in higher dimensions, but some concepts become apparent when looking only at the most important parts. The first transformation \(G(x)\) lifts all values greater than or equal to the local minimum, with the lift increasing with the distance from the local minimum. The second transformation \(H(x)\) likewise leaves all values below the local minimum unchanged; otherwise it focuses on the values near the local minimum, stretching them towards infinity so that the function drops steeply around \(\bar{x}\) and repels particles.
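This behavior can be checked numerically. The following sketch implements equations (9.1) and (9.2) with the default parameters and the Manhattan norm; the helper name `stretch` and the evaluation points are chosen purely for illustration:

```r
# Two-stage function stretching, equations (9.1) and (9.2).
# s is 0 below the local minimum and 2 above it, so values below f(x_bar)
# pass through both stages unchanged.
stretch <- function(x, x_bar, f, g1 = 5000, g2 = 0.5, mu = 1e-10){
  s <- sign(f(x) - f(x_bar)) + 1
  G <- f(x) + g1 * sum(abs(x - x_bar)) * s          # stage 1: distance-based lift
  H <- G + g2 * s / tanh(mu * (G - f(x_bar)))       # stage 2: repel near x_bar
  H
}

f     <- function(x) cos(x) + x / 10
x_bar <- pi - asin(1/10)                 # a local minimum of f

# A point in the next, deeper valley: fitness below f(x_bar), so unchanged.
stretch(x_bar - 2 * pi, x_bar, f) == f(x_bar - 2 * pi)   # TRUE
```

A point with fitness above \(f(\bar{x})\), e.g. \(x = \bar{x} + 0.5\), is lifted by several orders of magnitude instead, which is exactly the repelling effect visible in the plots below.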
To better understand the transformation, it is used to stretch a simple function in \(\mathbb{R}^1\) defined as follows:
\[ f(x) = \cos(x)+\frac{1}{10}\cdot x \]
and translated to the objective function:
fn <- function(pos){
  cos(pos) + 1/10 * pos
}
The domain of definition is chosen as \(x \in [-20, 20]\). Suppose the PSO gets stuck in the local minimum at \(\bar{x} = \pi - \arcsin(\frac{1}{10}) \approx 3.04\). The original function fn and the transformed function fn_stretched, which corresponds to \(H(x)\) in equation (9.2), are shown in the following graph:
fn1 <- function(pos, pos_best, pos_best_fit){
  res <- fn(pos)
  G <- res + 5000 * sqrt(sum((pos - pos_best)^2))/length(pos) * (sign(res - pos_best_fit) + 1)
  H <- G + 0.5 * (sign(res - pos_best_fit) + 1)/(tanh(10^(-10) * (G - pos_best_fit)))
  return(H)
}
X <- seq(-20, 20, 0.001)
x_best <- pi - asin(1/10)
x_best_fit <- fn(x_best)
p1 <- plot_ly(x=X, y=sapply(X, fn), type="scatter", mode="lines", name="fn") %>%
  add_trace(x=X, y=rep(fn(pi-asin(1/10)), length(X)), type="scatter", mode="lines",
            name="local_minima", line=list(dash="dot", color="grey"))
p2 <- plot_ly(x=X, y=sapply(X, fn1, pos_best=x_best, pos_best_fit=x_best_fit),
              type="scatter", mode="lines", name="fn_stretched", line=list(color="orange")) %>%
  layout(yaxis=list(range=c(-2, 20)*10^5))
subplot(p1, p2, shareY=T, nrows=2, margin=0.05) %>%
  html_save(., vheight=480, vwidth=800, expand = c(-10,0,-15,0))
Figure 9.3: The effect of function stretching in a 1D example
It can be seen that the fitness is stretched upward around the local minimum \(\bar{x}\), making it much easier for the PSO to move downhill and fall into new minima with lower fitness. On the right-hand side, where the original function contained only higher fitness values, the transformation has created a new local minimum surrounded by a large region of higher fitness values. In the worst case, this can produce a new local minimum that is even harder to escape. All regions with lower fitness remain unchanged, as the zoomed version of the bottom diagram from above shows: