Course:CPSC532:StaRAI:2017:David

Here is a place for David to describe the methods he investigated on the training and test sets described in the paper. This can contain notes, but should eventually converge to descriptions suitable for inclusion in the paper. It will also contain results that will not appear in the paper (e.g., trying some methods that do not work well).

CPSC 532 David's Results

Method                                    ml-60k ASE    ml-60k Log loss
Predict 0.5                               0.25          1
Training average                          0.2159        0.9004
MF + seed feature as gender (opt-test)    0.20159       0.8598
MF + nearest neighbours                   0.2078        0.8707
MF + logistic regression (1 feat)         0.2067        0.8683
MF + logistic regression (5 feat)         0.1993        0.8440
MF + forcing + logistic regression        0.2050        0.8662

Note that opt-test means that the meta-parameters were optimized on the test set. Such results are not of publication quality, but they are useful because they give an upper bound on what the method can achieve.
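
For concreteness, here is how the two columns can be computed. This is a minimal sketch under my reading of the table (log loss appears to be base 2, since "Predict 0.5" scores exactly 1); the function names are mine.

 import math

 def ase(ps, ys):
     """Average squared error between predicted P(female) and 0/1 targets."""
     return sum((p - y) ** 2 for p, y in zip(ps, ys)) / len(ys)

 def logloss(ps, ys):
     """Mean negative log likelihood, base 2 (so "Predict 0.5" scores 1)."""
     return -sum(math.log2(p if y == 1 else 1 - p)
                 for p, y in zip(ps, ys)) / len(ys)

 # Sanity check against the first row of the table:
 assert ase([0.5] * 4, [0, 1, 0, 1]) == 0.25
 assert logloss([0.5] * 4, [0, 1, 0, 1]) == 1.0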

Matrix Factorization Methods

The general idea of all the methods I investigated was to run the standard matrix factorization algorithm and then try to extract gender from the learned user features. The variants below are in no particular order.
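
As a baseline for the variants, here is a minimal sketch of the kind of matrix factorization assumed throughout: gradient descent on squared rating error with L2 regularization. The parameter names step_size and reglz echo the settings listed at the end of this page; everything else (initialization, epoch count, where the regularizer is applied) is my assumption, not the exact code used.

 import random

 def matrix_factorization(ratings, n_users, n_items, n_feats=1,
                          step_size=0.01, reglz=1.0, epochs=100):
     """ratings: list of (user, item, rating) triples.
     Learns user_feat and item_feat by gradient descent on squared error.
     Regularization is applied per rating for simplicity."""
     user_feat = [[random.gauss(0, 0.1) for _ in range(n_feats)]
                  for _ in range(n_users)]
     item_feat = [[random.gauss(0, 0.1) for _ in range(n_feats)]
                  for _ in range(n_items)]
     for _ in range(epochs):
         for u, i, r in ratings:
             err = sum(user_feat[u][f] * item_feat[i][f]
                       for f in range(n_feats)) - r
             for f in range(n_feats):
                 uf, itf = user_feat[u][f], item_feat[i][f]
                 user_feat[u][f] -= step_size * (err * itf + reglz * uf)
                 item_feat[i][f] -= step_size * (err * uf + reglz * itf)
     return user_feat, item_feat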

Seed a user feature as gender

In this method, I seeded one user feature to be 2 for females and -2 for males. (I tried seeds of 1, 2, and 4; 2 worked best, so I stuck with 2 rather than optimizing it further.) The reported prediction, using one feature, is sigmoid(user_feat[u][0] + pc), where pc is a regularizing offset that reflects the prior.
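
A sketch of the seeding and of the reported prediction; the value of pc is not given in the text, so it is left as a parameter here:

 import math

 def sigmoid(x):
     return 1 / (1 + math.exp(-x))

 SEED = 2  # +2 for females, -2 for males (1 and 4 were also tried)

 def seed_gender_feature(user_feat, genders):
     """genders: dict mapping training users to 'F' or 'M'.
     Initializes feature 0 of each training user from their gender."""
     for u, g in genders.items():
         user_feat[u][0] = SEED if g == 'F' else -SEED

 def predict_female(user_feat, u, pc):
     """Reported prediction: sigmoid of the learned feature plus pc,
     the offset that reflects the prior."""
     return sigmoid(user_feat[u][0] + pc)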

Matrix factorization + nearest neighbours

After learning a user feature from the matrix factorization (seeded as before), the gender of a user is predicted using its k = 40 nearest neighbours in user-feature space. Each prediction is regularized by a pseudo-count of 12.
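
A minimal sketch of this predictor, assuming squared Euclidean distance in user-feature space; the text does not say what the pseudo-count regularizes toward, so the prior argument (defaulting to 0.5) is my assumption:

 def knn_predict_female(user_feat, u, train_users, genders,
                        k=40, pseudo_count=12, prior=0.5):
     """Predict P(female) for user u from the genders of the k nearest
     training users in user-feature space."""
     by_dist = sorted(train_users,
                      key=lambda v: sum((a - b) ** 2
                                        for a, b in zip(user_feat[u],
                                                        user_feat[v])))
     neighbours = by_dist[:k]  # exclude u first if u is a training user
     n_female = sum(1 for v in neighbours if genders[v] == 'F')
     return (n_female + pseudo_count * prior) / (k + pseudo_count)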

Matrix factorization + logistic regression

After learning user features from matrix factorization (seeding one feature as before), I trained a logistic regression model to predict gender from the user features.
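
A sketch of this step using scikit-learn's logistic regression (the library choice is mine; any L2-regularized logistic fit would do). The (1 feat) and (5 feat) rows in the table differ only in how many features the matrix factorization learned:

 from sklearn.linear_model import LogisticRegression

 def fit_gender_model(user_feat, train_users, genders):
     """Fit P(female | user features) on the training users."""
     X = [user_feat[u] for u in train_users]
     y = [1 if genders[u] == 'F' else 0 for u in train_users]
     return LogisticRegression().fit(X, y)

 def predict_female_lr(model, user_feat, u):
     return model.predict_proba([user_feat[u]])[0][1]  # P(female)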

Force a feature to be gender

In this method, with one feature, I forced the user feature to be gender (+2 for females, -2 for males) and learned the item feature (and the user feature for users whose gender was unobserved). Then, with the item feature fixed, I trained the user feature on all users. Combined with nearest neighbours, this worked worse than predicting the training average (LL = 0.9458) for few neighbours (the minimum was at 144 nearest neighbours, and it approaches the training average as the number of neighbours and/or the pseudo-count goes to infinity).
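
A sketch of the two training phases, reusing the gradient step from the earlier matrix factorization sketch (one feature; regularization omitted and other details assumed):

 def train_forced(ratings, user_feat, item_feat, genders,
                  step_size=0.01, epochs=100):
     """Phase 1: clamp the user feature to +2 (F) / -2 (M) where gender
     is known and learn the item feature; users with unobserved gender
     are still learned. Phase 2: freeze the item feature and relearn
     the user feature for all users."""
     for u, g in genders.items():
         user_feat[u][0] = 2 if g == 'F' else -2
     for _ in range(epochs):                      # phase 1
         for u, i, r in ratings:
             uf, itf = user_feat[u][0], item_feat[i][0]
             err = uf * itf - r
             item_feat[i][0] -= step_size * err * uf
             if u not in genders:                 # gender unobserved
                 user_feat[u][0] -= step_size * err * itf
     for _ in range(epochs):                      # phase 2: items fixed
         for u, i, r in ratings:
             err = user_feat[u][0] * item_feat[i][0] - r
             user_feat[u][0] -= step_size * err * item_feat[i][0]
     return user_feat, item_feat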

The table reports the log loss for forcing + logistic regression. So forcing does not have much effect, and it is more complicated.

Settings (1 feature); each line gives ASE, log loss:

step_size = 0.01, g_step_size = 0.1, reglz = 1.0: 0.21099, 0.88594
step_size = 0.01, g_step_size = 0.01, reglz = 1.0: 0.21204, 0.8863
step_size = 0.01, g_step_size = 0.5, reglz = 1.0: 0.20534, 0.8693
step_size = 0.01, g_step_size = 0.5, reglz = 1.0 (reg to mean): 0.21188, 0.90219
step_size = 0.01, g_step_size = 0.5, reglz = 1.0 (reg to mean and to 0): 0.20529, 0.8680