In a set of bucket returns, every bucket should have the same number of entries, but they do not because of P123's method of handling ties. (I considered posting this in Errors rather than Feature Requests.)

This issue has arisen in the Factors group's attempts to define the overall "goodness" of the set of bucket returns for a factor/formula. A specific example will make the issue clear.

Suppose there are 50 stocks in a universe, and we are conducting a 10-bucket backtest. P123 sorts the returns for the 50 stocks in descending order, then assigns them to buckets. Suppose the sorted returns are as follows, shown in groups of 5:

Group 10: 29,27,26,26,25

Group 9: 24,22,21,20,20

Group 8: 20,20,19,19,19

Group 7: 19,19,19,19,19

Group 6: 19,18,18,18,18

Group 5: 17,15, etc.

You might expect that the buckets correspond to groups, but that is not the case because of the way P123 handles ties, dumping them into the next higher bucket:

• Bucket 10 gets group 10, as expected

• Bucket 9 gets group 9 (24,22,21,20,20) and also the first two in group 8 (20,20)

• Bucket 8 gets the last three from group 8 (19,19,19), plus all of group 7 (19,19,19,19,19) and even one from group 6 (19)

• Bucket 7 is empty!

• Bucket 6 gets the remaining four from group 6 (18,18,18,18)

• And so forth
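To make the arithmetic concrete, here is a minimal Python sketch that reproduces the bucket counts in the example. It uses only the 25 returns listed for groups 10 through 6. The tie rule it applies (every stock with a given return goes to the bucket where that return first appears in the descending sort) is my assumption, inferred from the example; it is not taken from P123's documentation, and the function names are hypothetical.

```python
from collections import Counter

# The 25 sorted returns listed for groups 10 down to 6 in the example.
sorted_returns = [
    29, 27, 26, 26, 25,   # group 10
    24, 22, 21, 20, 20,   # group 9
    20, 20, 19, 19, 19,   # group 8
    19, 19, 19, 19, 19,   # group 7
    19, 18, 18, 18, 18,   # group 6
]

GROUP_SIZE = 5    # 50 stocks / 10 buckets
TOP_BUCKET = 10   # the best-ranked group goes to bucket 10


def equal_count_buckets(values, group_size=GROUP_SIZE, top_bucket=TOP_BUCKET):
    """One-to-one mapping: bucket depends only on position in the sort."""
    return [top_bucket - i // group_size for i in range(len(values))]


def tie_aware_buckets(values, group_size=GROUP_SIZE, top_bucket=TOP_BUCKET):
    """Assumed tie rule: all equal values share the bucket of the
    first (highest) position at which that value appears."""
    first_pos = {}
    for i, v in enumerate(values):
        first_pos.setdefault(v, i)  # keep only the first position seen
    return [top_bucket - first_pos[v] // group_size for v in values]


equal_counts = Counter(equal_count_buckets(sorted_returns))
tie_counts = Counter(tie_aware_buckets(sorted_returns))

print("bucket  equal-count  tie-aware")
for bucket in range(TOP_BUCKET, 5, -1):
    print(f"{bucket:>6}  {equal_counts[bucket]:>11}  {tie_counts[bucket]:>9}")
```

Under that assumed rule, the tie-aware counts for buckets 10 through 6 come out as 5, 7, 9, 0, and 4, against 5 in every bucket for the one-to-one mapping.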

Compared to the simple one-to-one correspondence between groups and buckets, P123's methodology suffers from these deficiencies:

• We are left with buckets whose entry counts vary widely. Worse yet, we do not know these counts. It is not possible to construct a meaningful metric such as slope, standard deviation, correlation, t-statistic, smoothness, or whatever from such a mess.

• The method is arbitrary, in that an entirely different set of buckets would be obtained if the results had been sorted in ascending order, but there is no mathematical reason to prefer one direction over the other.

• The method is non-intuitive, with results that are almost impossible to understand or explain.

Because of its impact on the Factors group, I consider this to be a crucial need.

Results: Total score: **17**, # of Votes: **5**, Average: **3.4**

Scores are calculated as (importance) × (# of votes), where importance ranges from 1 to 4.

Final Comments

Request was based on an erroneous understanding of the process.