Course:CPSC522/December2023

CPSC 522 Final Assignment

The final assignment is to test a hypothesis relevant to the course. Choose one of the following:

Relational Learning

Predict (self-assessed) gender from movie preferences. (This was chosen because there is a good -- and simple -- dataset available, and gender is the simplest property to predict.) There is a starter file at starter.py. Choose one method to test. You can compare its performance to the MovieLens 60K results reported in the paper at [1]. If you are interested, I can look for other code we implemented.

Reinforcement Learning

AIPython has lots of reinforcement learning code, and for some of it I don't understand why it behaves the way it does. Choose one (or more) of the following to investigate -- it is better to do a more in-depth analysis of one than a superficial analysis of several:

  • I could not get alpha_k=1/k to work. Does it converge in theory and/or practice? I tried to find an example that satisfies the convergence theorem, and I found that alpha_k=10/(9+k) works much better in practice. Suggest an alternative step-size schedule. Does it converge in theory and/or practice?
  • What happens with a fixed alpha and lottery-like rewards (a very large or very small payoff that occurs with low probability)? What happens before and after the payoff? Can you find a tradeoff between alpha and the payoff probability?
  • RL with function approximation doesn't seem as good, even with features that should cover the whole state space. Why? Is it the theory, my hyperparameters, or something else? (The linear function approximator doesn't converge as reliably as the others, even with a fully-expressive feature space.)
  • What about a fixed step size in RL with generalization? Does it get messed up by lottery-like rewards? (Think about combining the previous two points.)
  • For the multi-agent learning, think of an alternative to using the Dirichlet, e.g., weighting more recent experiences more heavily, or gradient descent on the probabilities. Does it work better than the Dirichlet implemented now?
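
As a starting point for the step-size questions above, here is a minimal sketch. It is not AIPython code. It assumes a single value estimate updated by v <- v + alpha_k * (r_k - v) on a stationary, lottery-like reward stream, which is much simpler than Q-learning, where the targets themselves change over time. The names lottery_reward, run_estimate and schedules, and the payoff parameters (probability 0.01, payoff 100), are hypothetical and chosen only for illustration.

```python
import random


def lottery_reward(rng, p=0.01, payoff=100.0, base=0.0):
    """Hypothetical reward model: `payoff` with probability p, else `base`."""
    return payoff if rng.random() < p else base


def run_estimate(step_size, rewards):
    """Apply v <- v + alpha_k * (r_k - v) for k = 1, 2, ... and
    return the list of estimates after each update."""
    v = 0.0
    trajectory = []
    for k, r in enumerate(rewards, start=1):
        v += step_size(k) * (r - v)
        trajectory.append(v)
    return trajectory


# Step-size schedules to compare (k starts at 1).
schedules = {
    "1/k": lambda k: 1.0 / k,                # running sample mean
    "10/(9+k)": lambda k: 10.0 / (9.0 + k),  # decays more slowly than 1/k
    "fixed 0.1": lambda k: 0.1,              # never stops reacting to new rewards
}

rng = random.Random(0)  # fixed seed so runs are repeatable
rewards = [lottery_reward(rng) for _ in range(20000)]
sample_mean = sum(rewards) / len(rewards)

results = {name: run_estimate(f, rewards) for name, f in schedules.items()}

if __name__ == "__main__":
    print(f"sample mean of rewards: {sample_mean:.4f}")
    for name, traj in results.items():
        print(f"{name:>10}: final estimate {traj[-1]:.4f}, "
              f"largest estimate seen {max(traj):.4f}")
```

In this stationary setting, alpha_k=1/k reproduces the sample mean exactly, so any trouble with it in the RL code presumably comes from something this sketch leaves out, such as bootstrapped targets that start out wrong. With a fixed alpha, each payoff moves the estimate by alpha times the error and the estimate then decays geometrically, which is one way to look at "before and after the payoff".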

Other

  • Test another hypothesis relevant to the course. It should be something you can explain and test in the timeframe available. Talk to me about ideas you may have.

Timetable:

  • Dec 14 draft ready for feedback
  • Dec 15 Presentations
  • Dec 19 feedback to authors
  • Dec 21 final assignment due