General Efficient Frontier

The mean-variance optimization methods described previously can be used whenever you have a vector of expected returns and a covariance matrix. The objective and constraints will be some combination of the portfolio return and portfolio volatility.

However, you may want to construct the efficient frontier for an entirely different type of risk model (one that doesn’t depend on covariance matrices), or optimize an objective unrelated to portfolio return (e.g. tracking error). PyPortfolioOpt comes with several popular alternatives and provides support for custom optimization problems.

Efficient Semivariance

Instead of penalising volatility, mean-semivariance optimization seeks to only penalise downside volatility, since upside volatility may be desirable.

There are two approaches to the mean-semivariance optimization problem. The first is to use a heuristic (i.e. “quick and dirty”) solution: pretending that the semicovariance matrix (implemented in risk_models) is a typical covariance matrix and doing standard mean-variance optimization. It can be shown that this does not yield a portfolio that is efficient in mean-semivariance space (though it might be a good-enough approximation).

Fortunately, it is possible to write mean-semivariance optimization as a convex problem (albeit one with many variables), that can be solved to give an “exact” solution. For example, to maximise return for a target semivariance \(s^*\) (long-only), we would solve the following problem:

\[\begin{split}\begin{equation*} \begin{aligned} & \underset{w}{\text{maximise}} & & w^T \mu \\ & \text{subject to} & & n^T n \leq s^* \\ &&& B w - p + n = 0 \\ &&& w^T \mathbf{1} = 1 \\ &&& n \geq 0 \\ &&& p \geq 0. \\ \end{aligned} \end{equation*}\end{split}\]

Here, B is the \(T \times N\) (scaled) matrix of excess returns: B = (returns - benchmark) / sqrt(T). Additional linear equality constraints and convex inequality constraints can be added.
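As a sanity check on this formulation, the following sketch (NumPy only, using simulated returns and an arbitrary weight vector, both of which are assumptions made for illustration) builds B, splits Bw into its nonnegative parts p and n, and confirms that n^T n matches the sample semivariance of the portfolio below the benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 500, 4
returns = rng.normal(0.0005, 0.01, size=(T, N))  # simulated daily returns
benchmark = 0.0                                   # threshold below which returns count as downside
w = np.full(N, 1 / N)                             # arbitrary long-only weights summing to one

# Scaled excess returns: B = (returns - benchmark) / sqrt(T)
B = (returns - benchmark) / np.sqrt(T)

# Split B w into nonnegative parts so that B w - p + n = 0
Bw = B @ w
p = np.maximum(Bw, 0)
n = np.maximum(-Bw, 0)

semivariance_from_n = n @ n

# Direct definition: mean squared shortfall of the portfolio below the benchmark
shortfall = np.minimum(returns @ w - benchmark, 0)
semivariance_direct = np.mean(shortfall ** 2)

print(semivariance_from_n, semivariance_direct)
```

In the optimization itself p and n are free variables rather than computed quantities, which is where the extra 2T variables come from.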

PyPortfolioOpt allows users to optimize along the efficient semivariance frontier via the EfficientSemivariance class. EfficientSemivariance inherits from EfficientFrontier, so it has the same utility methods (e.g. add_constraint(), portfolio_performance()), but finds portfolios on the mean-semivariance frontier. Note that some of the parent methods, like max_sharpe() and min_volatility(), are not applicable to mean-semivariance portfolios, so calling them raises NotImplementedError.

EfficientSemivariance has a slightly different API to EfficientFrontier. Instead of passing in a covariance matrix, you should pass in a dataframe of historical/simulated returns (this can be constructed from your price dataframe using the helper method expected_returns.returns_from_prices()). Here is a full example, in which we seek the portfolio that minimises the semivariance for a target annual return of 20%:

from pypfopt import expected_returns, EfficientSemivariance

df = ... # your dataframe of prices
mu = expected_returns.mean_historical_return(df)
historical_returns = expected_returns.returns_from_prices(df)

es = EfficientSemivariance(mu, historical_returns)
es.efficient_return(0.20)

# We can use the same helper methods as before
weights = es.clean_weights()
print(weights)
es.portfolio_performance(verbose=True)

The portfolio_performance method outputs the expected portfolio return, semideviation, and the Sortino ratio (like the Sharpe ratio, but for downside deviation).
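To make the three reported quantities concrete, here is a minimal NumPy sketch of how they relate. It assumes simulated daily returns, simple (non-compounded) annualisation over 252 trading days, a daily benchmark of zero, and a 2% annual risk-free rate; it illustrates the definitions and is not a copy of the library's internals.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 750, 3
frequency = 252            # assumed trading days per year
benchmark = 0.0            # daily return below which outcomes count as downside
risk_free_rate = 0.02      # annual, to match the annualised expected return

returns = rng.normal(0.0006, 0.012, size=(T, N))  # simulated daily returns
w = np.array([0.5, 0.3, 0.2])

portfolio_returns = returns @ w
expected_return = portfolio_returns.mean() * frequency  # simple annualisation

# Annualised semideviation: root mean squared shortfall below the benchmark
drops = np.minimum(portfolio_returns - benchmark, 0)
semivariance = np.mean(drops ** 2) * frequency
semideviation = np.sqrt(semivariance)

# Sortino ratio: excess return per unit of downside deviation
sortino = (expected_return - risk_free_rate) / semideviation
print(f"return={expected_return:.3f}, semideviation={semideviation:.3f}, sortino={sortino:.2f}")
```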

Interested readers should refer to Estrada (2007) [1] for more details. I’d like to thank Philipp Schiele for authoring the bulk of the efficient semivariance functionality and documentation (all errors are my own). The implementation is based on Markowitz et al (2019) [2].

Caution

Finding portfolios on the mean-semivariance frontier is computationally harder than standard mean-variance optimization: our implementation uses 2T + N optimization variables, meaning that for 50 assets and 3 years of data, there are about 1500 variables. While EfficientSemivariance allows for additional constraints/objectives in principle, you are much more likely to run into solver errors. I suggest that you keep EfficientSemivariance problems small and minimally constrained.
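The arithmetic behind that estimate can be written out directly. The helper name below is hypothetical, and 252 trading days per year is an assumption about the data frequency.

```python
def semivariance_problem_size(n_assets: int, n_periods: int) -> int:
    """Count decision variables: N weights plus T entries each for p and n."""
    return 2 * n_periods + n_assets


trading_days_per_year = 252  # assumed daily data
n_vars = semivariance_problem_size(n_assets=50, n_periods=3 * trading_days_per_year)
print(n_vars)  # 1562
```

Because the count grows with 2T, adding more history enlarges the problem much faster than adding more assets.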

class pypfopt.efficient_frontier.EfficientSemivariance(expected_returns, returns, frequency=252, benchmark=0, weight_bounds=(0, 1), solver=None, verbose=False, solver_options=None)

EfficientSemivariance objects allow for optimization along the mean-semivariance frontier. This may be relevant for users who are more concerned about downside deviation.

Instance variables:

Inputs:
- n_assets - int
- tickers - str list
- bounds - float tuple OR (float tuple) list
- returns - pd.DataFrame
- expected_returns - np.ndarray
- solver - str
- solver_options - {str: str} dict
Output: weights - np.ndarray

Public methods:

min_semivariance() minimises the portfolio semivariance (downside deviation)
max_quadratic_utility() maximises the “downside quadratic utility”, given some risk aversion.
efficient_risk() maximises return for a given target semideviation
efficient_return() minimises semideviation for a given target return
add_objective() adds a (convex) objective to the optimization problem
add_constraint() adds a constraint to the optimization problem
convex_objective() solves for a generic convex objective with linear constraints
portfolio_performance() calculates the expected return, semideviation and Sortino ratio for the optimized portfolio.
set_weights() creates self.weights (np.ndarray) from a weights dict
clean_weights() rounds the weights and clips near-zeros.
save_weights_to_file() saves the weights to csv, json, or txt.

efficient_return(target_return, market_neutral=False)

Minimise semideviation for a given target return.

Parameters:
- target_return (float) – the desired return of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Raises:
- ValueError – if `target_return` is not a positive float
- ValueError – if no portfolio can be found with return equal to `target_return`
Returns:	asset weights for the optimal portfolio
Return type:	OrderedDict

efficient_risk(target_semideviation, market_neutral=False)

Maximise return for a target semideviation (downside standard deviation). The resulting portfolio will have a semideviation no greater than the target (though not necessarily equal to it).

Parameters:
- target_semideviation (float) – the desired maximum semideviation of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the efficient risk portfolio
Return type:	OrderedDict

max_quadratic_utility(risk_aversion=1, market_neutral=False)

Maximise the given quadratic utility, using portfolio semivariance instead of variance.

Parameters:
- risk_aversion (positive float) – risk aversion parameter (must be greater than 0), defaults to 1
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the maximum-utility portfolio
Return type:	OrderedDict

min_semivariance(market_neutral=False)

Minimise portfolio semivariance (see docs for further explanation).

Parameters:
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the semivariance-minimising portfolio
Return type:	OrderedDict

portfolio_performance(verbose=False, risk_free_rate=0.02)

After optimising, calculate (and optionally print) the performance of the optimal portfolio, specifically: expected return, semideviation, Sortino ratio.

Parameters:
- verbose (bool, optional) – whether performance should be printed, defaults to False
- risk_free_rate (float, optional) – risk-free rate of borrowing/lending, defaults to 0.02. The period of the risk-free rate should correspond to the frequency of expected returns.
Raises:	ValueError – if weights have not been calculated yet
Returns:	expected return, semideviation, Sortino ratio.
Return type:	(float, float, float)

Efficient CVaR

The conditional value-at-risk (a.k.a. expected shortfall) is a popular measure of tail risk. The CVaR can be thought of as the average of losses that occur on “very bad days”, where “very bad” is quantified by the parameter \(\beta\).

For example, if we calculate the CVaR to be 10% for \(\beta = 0.95\), we can be 95% confident that the worst-case average daily loss will be 10%. Put differently, the CVaR is the average of all losses so severe that they occur only a fraction \(1-\beta\) of the time (5% of days when \(\beta = 0.95\)).
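A small NumPy sketch makes the definition concrete. The returns are simulated and the sample size is chosen so that the tail holds a whole number of days; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
beta = 0.95
T = 1000
portfolio_returns = rng.normal(0.0005, 0.01, size=T)  # simulated daily returns
losses = -portfolio_returns                           # a loss is a negated return

# Keep the worst (1 - beta) share of days; T is chosen so this is a whole number
n_tail = int(round((1 - beta) * T))                   # 50 worst days
worst = np.sort(losses)[-n_tail:]

var = worst[0]        # smallest loss inside the tail: the empirical VaR
cvar = worst.mean()   # average loss across the tail: the empirical CVaR
print(f"VaR={var:.4f}, CVaR={cvar:.4f}")
```

By construction the CVaR is never smaller than the VaR, since it averages only losses at or beyond that threshold.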

While CVaR is quite an intuitive concept, a lot of new notation is required to formulate it mathematically (see the wiki page for more details). We will adopt the following notation:

w for the vector of portfolio weights
r for a vector of asset returns (daily), with probability distribution \(p(r)\).
\(L(w, r) = - w^T r\) for the loss of the portfolio
\(\alpha\) for the portfolio value-at-risk (VaR) with confidence \(\beta\).

The CVaR can then be written as:

\[CVaR(w, \beta) = \frac{1}{1-\beta} \int_{L(w, r) \geq \alpha (w)} L(w, r) p(r)dr.\]

This is a nasty expression to optimize because we are essentially integrating over VaR values. The key insight of Rockafellar and Uryasev (2001) [3] is that we can equivalently optimize the following convex function:

\[F_\beta (w, \alpha) = \alpha + \frac{1}{1-\beta} \int [-w^T r - \alpha]^+ p(r) dr,\]

where \([x]^+ = \max(x, 0)\). The authors prove that minimising \(F_\beta(w, \alpha)\) over all \(w, \alpha\) minimises the CVaR. Suppose we have a sample of T daily returns (these can either be historical or simulated). The integral in the expression becomes a sum, so the CVaR optimization problem reduces to a linear program:

\[\begin{split}\begin{equation*} \begin{aligned} & \underset{w, \alpha}{\text{minimise}} & & \alpha + \frac{1}{1-\beta} \frac 1 T \sum_{i=1}^T u_i \\ & \text{subject to} & & u_i \geq 0 \\ &&& u_i \geq -w^T r_i - \alpha. \\ \end{aligned} \end{equation*}\end{split}\]
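The claim that minimising the sample version of \(F_\beta\) over \(\alpha\) recovers the CVaR can be checked numerically for a fixed weight vector. The sketch below uses simulated returns and arbitrary weights (both assumptions), and searches over the sample losses rather than calling an LP solver.

```python
import numpy as np


def f_beta(alpha, losses, beta):
    """Sample version of F_beta: alpha + mean(max(loss - alpha, 0)) / (1 - beta)."""
    excess = np.maximum(losses - alpha, 0)
    return alpha + excess.mean() / (1 - beta)


rng = np.random.default_rng(7)
beta, T, N = 0.95, 1000, 3
returns = rng.normal(0.0005, 0.01, size=(T, N))  # simulated daily returns
w = np.array([0.4, 0.4, 0.2])                    # fixed weights, for illustration only
losses = -(returns @ w)

# F_beta is piecewise linear and convex in alpha, so its minimum sits at a sample loss
candidates = np.array([f_beta(a, losses, beta) for a in losses])
min_f = candidates.min()

# Direct estimate: mean of the worst (1 - beta) * T = 50 losses
tail_mean = np.sort(losses)[-50:].mean()
print(min_f, tail_mean)
```

In the linear program the auxiliary variables \(u_i\) play the role of the `excess` array, and the weights are optimized jointly with \(\alpha\) instead of being held fixed.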

This formulation introduces a new variable for each datapoint (similar to Efficient Semivariance), so you may run into performance issues for long returns dataframes. At the same time, you should aim to provide a sample of data that is large enough to include tail events.

I am grateful to Nicolas Knudde for the initial draft (all errors are my own). The implementation is based on Rockafellar and Uryasev (2001) [3].

class pypfopt.efficient_frontier.EfficientCVaR(expected_returns, returns, beta=0.95, weight_bounds=(0, 1), solver=None, verbose=False, solver_options=None)

The EfficientCVaR class allows for optimization along the mean-CVaR frontier, using the formulation of Rockafellar and Uryasev (2001).

Instance variables:

Inputs:
- n_assets - int
- tickers - str list
- bounds - float tuple OR (float tuple) list
- returns - pd.DataFrame
- expected_returns - np.ndarray
- solver - str
- solver_options - {str: str} dict
Output: weights - np.ndarray

Public methods:

min_cvar() minimises the CVaR
efficient_risk() maximises return for a given CVaR
efficient_return() minimises CVaR for a given target return
add_objective() adds a (convex) objective to the optimization problem
add_constraint() adds a constraint to the optimization problem
portfolio_performance() calculates the expected return and CVaR of the portfolio
set_weights() creates self.weights (np.ndarray) from a weights dict
clean_weights() rounds the weights and clips near-zeros.
save_weights_to_file() saves the weights to csv, json, or txt.

efficient_return(target_return, market_neutral=False)

Minimise CVaR for a given target return.

Parameters:
- target_return (float) – the desired return of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Raises:
- ValueError – if `target_return` is not a positive float
- ValueError – if no portfolio can be found with return equal to `target_return`
Returns:	asset weights for the optimal portfolio
Return type:	OrderedDict

efficient_risk(target_cvar, market_neutral=False)

Maximise return for a target CVaR. The resulting portfolio will have a CVaR no greater than the target (though not necessarily equal to it).

Parameters:
- target_cvar (float) – the desired conditional value at risk of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the efficient risk portfolio
Return type:	OrderedDict

min_cvar(market_neutral=False)

Minimise portfolio CVaR (see docs for further explanation).

Parameters:
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the CVaR-minimising portfolio
Return type:	OrderedDict

portfolio_performance(verbose=False)

After optimising, calculate (and optionally print) the performance of the optimal portfolio, specifically: expected return, CVaR

Parameters:	verbose (bool, optional) – whether performance should be printed, defaults to False
Raises:	ValueError – if weights have not been calculated yet
Returns:	expected return, CVaR.
Return type:	(float, float)

set_weights(input_weights)

Utility function to set weights attribute (np.array) from user input

Parameters:	input_weights (dict) – {ticker: weight} dict

Efficient CDaR

The conditional drawdown at risk (CDaR) is a more exotic measure of tail risk. It tries to alleviate the problems with Efficient Semivariance and Efficient CVaR in that it accounts for the timespan of material decreases in value. The CDaR can be thought of as the average of losses that occur in “very bad periods”, where “very bad” is quantified by the parameter \(\beta\). The drawdown is defined as the difference in non-compounded return to the previous peak.

Put differently, the CDaR is the average of all drawdowns so severe that they occur only a fraction \(1-\beta\) of the time. When \(\beta = 1\), CDaR is simply the maximum drawdown.

While drawdown is quite an intuitive concept, a lot of new notation is required to formulate it mathematically (see the wiki page for more details). We will adopt the following notation:

w for the vector of portfolio weights
r for a vector of cumulative asset returns (daily), with probability distribution \(p(r(t))\).
\(D(w, r, t) = \max_{\tau<t}(w^T r(\tau))-w^T r(t)\) for the drawdown of the portfolio
\(\alpha\) for the portfolio drawdown-at-risk (DaR) with confidence \(\beta\).

The CDaR can then be written as:

\[CDaR(w, \beta) = \frac{1}{1-\beta} \int_{D(w, r, t) \geq \alpha (w)} D(w, r, t) p(r(t))dr(t).\]

This is a nasty expression to optimise because we are essentially integrating over DaR values. The key insight of Chekhlov, Uryasev and Zabarankin (2005) [4] is that we can equivalently optimise a convex function, which can be transformed to a linear problem (in the same manner as for CVaR).
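For a fixed weight vector, the drawdown path and its tail average can be computed directly. The sketch below uses simulated returns and arbitrary weights, and takes the running peak up to and including the current day so that drawdowns are never negative; these are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
beta, T, N = 0.95, 1000, 3
returns = rng.normal(0.0004, 0.01, size=(T, N))  # simulated daily returns
w = np.array([0.5, 0.25, 0.25])

# Non-compounded cumulative portfolio return
cumulative = np.cumsum(returns @ w)

# Drawdown: distance below the running peak (peak taken up to and including t)
running_peak = np.maximum.accumulate(cumulative)
drawdown = running_peak - cumulative

# Average of the worst (1 - beta) share of drawdowns
n_tail = int(round((1 - beta) * T))
worst = np.sort(drawdown)[-n_tail:]
dar = worst[0]                  # empirical drawdown-at-risk
cdar = worst.mean()             # empirical CDaR
max_drawdown = drawdown.max()
print(f"DaR={dar:.4f}, CDaR={cdar:.4f}, max drawdown={max_drawdown:.4f}")
```

Shrinking the tail to a single observation returns the maximum drawdown, which is the limiting case described above.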

class pypfopt.efficient_frontier.EfficientCDaR(expected_returns, returns, beta=0.95, weight_bounds=(0, 1), solver=None, verbose=False, solver_options=None)

The EfficientCDaR class allows for optimisation along the mean-CDaR frontier, using the formulation of Chekhlov, Uryasev and Zabarankin (2005).

Instance variables:

Inputs:
- n_assets - int
- tickers - str list
- bounds - float tuple OR (float tuple) list
- returns - pd.DataFrame
- expected_returns - np.ndarray
- solver - str
- solver_options - {str: str} dict
Output: weights - np.ndarray

Public methods:

min_cdar() minimises the CDaR
efficient_risk() maximises return for a given CDaR
efficient_return() minimises CDaR for a given target return
add_objective() adds a (convex) objective to the optimisation problem
add_constraint() adds a (linear) constraint to the optimisation problem
portfolio_performance() calculates the expected return and CDaR of the portfolio
set_weights() creates self.weights (np.ndarray) from a weights dict
clean_weights() rounds the weights and clips near-zeros.
save_weights_to_file() saves the weights to csv, json, or txt.

efficient_return(target_return, market_neutral=False)

Minimise CDaR for a given target return.

Parameters:
- target_return (float) – the desired return of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Raises:
- ValueError – if `target_return` is not a positive float
- ValueError – if no portfolio can be found with return equal to `target_return`
Returns:	asset weights for the optimal portfolio
Return type:	OrderedDict

efficient_risk(target_cdar, market_neutral=False)

Maximise return for a target CDaR. The resulting portfolio will have a CDaR no greater than the target (though not necessarily equal to it).

Parameters:
- target_cdar (float) – the desired maximum CDaR of the resulting portfolio.
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the efficient risk portfolio
Return type:	OrderedDict

min_cdar(market_neutral=False)

Minimise portfolio CDaR (see docs for further explanation).

Parameters:
- market_neutral (bool, optional) – whether the portfolio should be market neutral (weights sum to zero), defaults to False. Requires negative lower weight bound.
Returns:	asset weights for the CDaR-minimising portfolio
Return type:	OrderedDict

portfolio_performance(verbose=False)

After optimising, calculate (and optionally print) the performance of the optimal portfolio, specifically: expected return, CDaR

Parameters:	verbose (bool, optional) – whether performance should be printed, defaults to False
Raises:	ValueError – if weights have not been calculated yet
Returns:	expected return, CDaR.
Return type:	(float, float)

set_weights(input_weights)

Utility function to set weights attribute (np.array) from user input

Parameters:	input_weights (dict) – {ticker: weight} dict

I am grateful to Nicolas Knudde for implementing this feature.

Custom optimization problems

We have seen previously that it is easy to add constraints to EfficientFrontier objects (and by extension, other general efficient frontier objects like EfficientSemivariance). However, what if you aren’t interested in anything related to max_sharpe(), min_volatility(), efficient_risk(), etc., and want to set up a completely new problem to optimize for some custom objective?

For example, perhaps our objective is to construct a basket of assets that best replicates a particular index, in other words, to minimise the tracking error. This does not fit within a mean-variance optimization paradigm, but we can still implement it in PyPortfolioOpt:

from pypfopt.base_optimizer import BaseConvexOptimizer
from pypfopt.objective_functions import ex_post_tracking_error

historic_rets = ... # dataframe of historic asset returns
benchmark_rets = ... # pd.Series of historic benchmark returns (same index as historic_rets)

opt = BaseConvexOptimizer(
    n_assets=len(historic_rets.columns),
    tickers=historic_rets.columns,
    weight_bounds=(0, 1)
)
opt.convex_objective(
    ex_post_tracking_error,
    historic_returns=historic_rets,
    benchmark_returns=benchmark_rets,
)
weights = opt.clean_weights()

The EfficientFrontier class inherits from BaseConvexOptimizer. It may be more convenient to call convex_objective from an EfficientFrontier instance than from BaseConvexOptimizer, particularly if your objective depends on the mean returns or covariance matrix.

You can either optimize some generic convex_objective (which must be built using cvxpy atomic functions – see here) or a nonconvex_objective, which uses scipy.optimize as the backend and thus has a completely different API. For more examples, check out this cookbook recipe.
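For intuition about the tracking-error objective used in the example, here is a plain NumPy sketch. It assumes tracking error is measured as the sum of squared deviations of the active return (portfolio minus benchmark) from its mean; that is my reading of the idea, not a statement of how ex_post_tracking_error is defined internally, and the benchmark basket is hypothetical.

```python
import numpy as np


def tracking_error(w, asset_returns, benchmark_returns):
    """Sum of squared deviations of the active return from its mean."""
    active = asset_returns @ w - benchmark_returns
    return np.sum((active - active.mean()) ** 2)


rng = np.random.default_rng(5)
T, N = 250, 4
asset_returns = rng.normal(0.0005, 0.01, size=(T, N))  # simulated daily returns

# Hypothetical benchmark: a fixed 40/30/20/10 basket of the same four assets
true_weights = np.array([0.4, 0.3, 0.2, 0.1])
benchmark_returns = asset_returns @ true_weights

equal_weights = np.full(N, 1 / N)
te_exact = tracking_error(true_weights, asset_returns, benchmark_returns)
te_equal = tracking_error(equal_weights, asset_returns, benchmark_returns)
print(te_exact, te_equal)
```

A portfolio that replicates the benchmark exactly has zero tracking error, so a minimiser should recover the benchmark weights when they are feasible. A version passed to convex_objective would need the same expression written with cvxpy atoms instead of NumPy functions.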

class pypfopt.base_optimizer.BaseConvexOptimizer
BaseConvexOptimizer.convex_objective(custom_objective, weights_sum_to_one=True, **kwargs)
Optimize a custom convex objective function. Constraints should be added with ef.add_constraint(). Optimizer arguments must be passed as keyword-args. Example:
import cvxpy as cp

# Could define as a lambda function instead
def logarithmic_barrier(w, cov_matrix, k=0.1):
    # 60 Years of Portfolio Optimization, Kolm et al (2014)
    return cp.quad_form(w, cov_matrix) - k * cp.sum(cp.log(w))

# ef is an existing EfficientFrontier instance
w = ef.convex_objective(logarithmic_barrier, cov_matrix=ef.cov_matrix)
Parameters:

custom_objective (function with signature (cp.Variable, **kwargs) -> cp.Expression) – an objective function to be MINIMISED. This should be written using cvxpy atoms and should map (w, **kwargs) -> float.

weights_sum_to_one (bool, optional) – whether to add the default constraint that the weights sum to one, defaults to True

Raises:
OptimizationError – if the objective is nonconvex or constraints nonlinear.

Returns:
asset weights that optimize the custom objective

Return type:
OrderedDict
BaseConvexOptimizer.nonconvex_objective(custom_objective, objective_args=None, weights_sum_to_one=True, constraints=None, solver='SLSQP', initial_guess=None)
Optimize some objective function using the scipy backend. This can support nonconvex objectives and nonlinear constraints, but may get stuck at local minima. Example:
import numpy as np

# Market-neutral efficient risk (ef is an existing EfficientFrontier, target_risk a float)
constraints = [
    {"type": "eq", "fun": lambda w: np.sum(w)},  # weights sum to zero
    {
        "type": "eq",
        "fun": lambda w: target_risk ** 2 - np.dot(w.T, np.dot(ef.cov_matrix, w)),
    },  # risk = target_risk
]
ef.nonconvex_objective(
    lambda w, mu: -w.T.dot(mu),  # min negative return (i.e. maximise return)
    objective_args=(ef.expected_returns,),
    weights_sum_to_one=False,
    constraints=constraints,
)
Parameters:

custom_objective (function with signature (np.ndarray, args) -> float) – an objective function to be MINIMISED. This function should map (weight, args) -> cost

objective_args (tuple of np.ndarrays) – arguments for the objective function (excluding weight)

weights_sum_to_one (bool, optional) – whether to add the default constraint that the weights sum to one, defaults to True

constraints (dict list) – list of constraints in the scipy format (i.e dicts)

solver (string) – which SCIPY solver to use, e.g. “SLSQP”, “COBYLA”, “BFGS”. User beware: different optimizers require different inputs.

initial_guess (np.ndarray) – the initial guess for the weights, shape (n,) or (n, 1)

Returns:
asset weights that optimize the custom objective

Return type:
OrderedDict
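In the scipy constraint format used in the nonconvex_objective example, an "eq" constraint counts as satisfied when its fun returns zero. The sketch below evaluates the two market-neutral constraints by hand for a known solution instead of calling a solver; the two-asset covariance matrix and the target risk are hypothetical values chosen so the answer can be worked out analytically.

```python
import numpy as np

target_risk = 0.10
cov_matrix = np.eye(2) * 0.04  # hypothetical: two uncorrelated assets, 20% volatility each

constraints = [
    # weights sum to zero
    {"type": "eq", "fun": lambda w: np.sum(w)},
    # portfolio variance equals target_risk squared
    {"type": "eq", "fun": lambda w: target_risk ** 2 - w @ cov_matrix @ w},
]

# Go long the first asset and short the second by the same amount a.
# Portfolio variance is 2 * 0.04 * a**2, so a = target_risk / sqrt(0.08).
a = target_risk / np.sqrt(0.08)
w = np.array([a, -a])

residuals = [c["fun"](w) for c in constraints]
print(residuals)
```

Checking residuals like this before handing constraints to the solver is a cheap way to catch sign or scaling mistakes, for example constraining the variance to equal target_risk rather than its square.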

References

[1]	Estrada, J. (2007). Mean-Semivariance Optimization: A Heuristic Approach.

[2]	Markowitz, H.; Starer, D.; Fram, H.; Gerber, S. (2019). Avoiding the Downside.

[3]	Rockafellar, R.; Uryasev, S. (2001). Optimization of conditional value-at-risk.

[4]	Chekhlov, A.; Uryasev, S.; Zabarankin, M. (2005). Drawdown measure in portfolio optimization.