I think there are basically two approaches. One is to look at "fundamental" factors: does the model invest in bonds, in commodities, or around macro events? The other is to approach the question from a quantitative standpoint and look at certain metrics.
A little bit of research is needed to compare the various models, but the numbers are all there, at least for the short term.
A first step could be to look at performance: which models performed well over the last 365 days? It is easy to sort all models by this criterion. Here is a chart of the returns of the 10 best-performing models that replicated relatively well (Timothy Sykes' model doesn't replicate well, so I didn't use it in the following charts):
As you can see, the top 10 models outperformed the S&P 500 over the last year. (For the record: yellow is the S&P 500, green marks the top 3 performers, and red is my model.)
Performance is one story, risk is another: did the models outperform because their managers were taking on too much risk? Covestor offers various risk metrics. One of them is the Value at Risk (VaR, 95%, 1 week), which basically quantifies the amount of money you can expect to lose, or more, in 5% of the weeks. In other words: 5% of the weeks is about one week in twenty, so within half a year there will probably be one week in which you lose at least this amount. The following chart plots model performance against VaR for the models of the first chart:
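To make the VaR figure above a bit more concrete, here is a minimal sketch of a historical-simulation estimate. I don't know how Covestor actually computes its number, so the method, the function name `historical_var` and the sample returns are all my own assumptions, for illustration only.

```python
import math

def historical_var(weekly_returns, confidence=0.95):
    """Historical-simulation VaR as a positive fraction of portfolio value.

    Assumption: this is NOT necessarily Covestor's methodology. It simply
    takes the loss at the boundary of the worst (1 - confidence) share of
    the observed weeks.
    """
    if not weekly_returns:
        raise ValueError("need at least one weekly return")
    ordered = sorted(weekly_returns)              # worst week first
    tail_share = round(1.0 - confidence, 9)       # 0.05 for a 95% VaR
    # Number of weeks that fall into the tail (at least one).
    tail_weeks = max(int(math.floor(tail_share * len(ordered))), 1)
    cutoff = ordered[tail_weeks - 1]              # mildest return inside the tail
    return max(-cutoff, 0.0)                      # report the loss as a positive number

# Hypothetical weekly returns for 20 weeks (made-up numbers).
sample_weeks = [
    0.012, -0.008, 0.020, 0.005, -0.015, 0.030, -0.040, 0.010, 0.002, -0.003,
    0.018, -0.020, 0.007, 0.011, -0.006, 0.004, 0.025, -0.010, 0.009, 0.001,
]
var_95 = historical_var(sample_weeks)
print(f"95% one-week VaR: {var_95:.1%} of portfolio value")
```

With only 20 weeks of data the 5% tail is a single week, so the estimate is just the worst week in the sample. That is one more reason to treat short histories with caution.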
Another factor can be correlation: it is a plus when someone is able to generate positive risk-adjusted returns independently of market returns, or in other words, with a beta close to zero. The following chart plots the Sortino Ratio against portfolio beta. The Sortino Ratio is a measure of risk-adjusted returns and has some advantages over the popular Sharpe Ratio, chiefly that it penalizes only downside volatility:
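For readers who want to reproduce the two quantities mentioned above from a return series, here is a rough sketch. The target return of zero, the lack of annualization, the function names and the sample data are my assumptions. Covestor may well use different conventions.

```python
import math

def sortino_ratio(returns, target=0.0):
    """Mean excess return over the target, divided by downside deviation.

    Only returns below the target count as risk. Not annualized (assumption).
    """
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    downside_dev = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside_dev == 0.0:
        return math.inf  # no period fell below the target
    return mean_excess / downside_dev

def beta(model_returns, market_returns):
    """Covariance of model and market returns divided by market variance."""
    if len(model_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    n = len(market_returns)
    mean_model = sum(model_returns) / n
    mean_market = sum(market_returns) / n
    cov = sum((a - mean_model) * (b - mean_market)
              for a, b in zip(model_returns, market_returns)) / n
    var = sum((b - mean_market) ** 2 for b in market_returns) / n
    return cov / var

# Made-up periodic returns, purely for illustration.
market = [0.01, -0.02, 0.03, 0.00]
leveraged_model = [2 * r for r in market]       # moves twice as much as the market
other_model = [0.02, -0.01, 0.03, -0.02]

print("beta of leveraged model:", round(beta(leveraged_model, market), 3))
print("Sortino of other model:", round(sortino_ratio(other_model), 3))
```

A model with a high Sortino Ratio and a beta near zero would sit in the most attractive corner of such a chart: good returns per unit of downside risk, earned largely independently of the market.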
So what is the best model? As can be seen, the answer is not that easy and depends on factors that only the individual investor can weigh. Also keep in mind that 365 days is not a long period. If the same models showed up in these charts over a 3-year history, the question of performance persistence could be answered on the basis of a better data set.











