Course:CPSC522/Elicitation of Factored Utilities

From UBC Wiki
Jump to: navigation, search

Elicitation of Factored Utilities

Elicitation of Factored Utilities is referred to techniques where a system actively queries a user to glean relevant preferences. These preferences are then used in the task of automated decision support in tasks where the ability to make reasonable decisions on behalf of a user depends on that user’s preferences over outcomes in the domain in question.

Principal Author: Ali MohammadMehr

Abstract

In elicitation of factored utilities, there exists a system which wants to learn the preferences of the user in a preference problem and recommend the best decision to them based on the learned preferences. In such a problem, by limiting the feasible set of utility functions to the general additive utility functions, we can query the user to elicit their preferences. The queries can be local or global. Trying to represent the system's uncertainty over user preferences with probabilities introduces us to the Bayesian elicitation where we have a set of feasible utility functions with a prior distribution over those functions. Using the Bayesian elicitation, we can both make decisions using our prior distribution over utility functions and decide on the best query to be asked from the user - the query that helps us refine the best expected utility as much as possible. Elicitation with feasible utility sets is also another model that tries to represent the system's uncertainty over user preferences by trying to do without probabilities over the utility functions in the feasible utility set.

Builds on

Basic understanding of probabilities,

Related Pages

The main structure and ordering of this page is built using the paper Braziunas & Boutilier, C. 2008. Elicitation of factored utilities.[1] The flight decision example is also taken from the same paper.

Motivation

Preference elicitation is difficult for two main reasons. First, many decision problems have exponentially sized outcome spaces, defined by the possible values of outcome attributes. As an example, consider the sophisticated problem of flight selection: possible outcomes are defined by attributes such as trip cost, return time, airline, flight class, number of connections, flight length, baggage weight limit, departure time, (the possibility of) lost luggage, flight delays, and other stochastic outcomes. An ideal decision support system would be able to make use of, for example, precise flight delay statistics and incorporate a user’s relative tolerance for delays in making the recommendations that suit the user. Modeling and eliciting preferences for all outcomes in a case like this is infeasible given the size of the outcome space. The second difficulty is due to the fact that quantitative strength of preferences, or utility, is needed to trade off, for instance, the odds of flight delays with other attributes. Without quantifying the priority of users on different attributes of the outcomes, making this trade-off is impossible. Unfortunately, people are notoriously inept at quantifying their preferences with any degree of precision, adding to the challenges facing automated utility elicitation.[1]

Problem Definition

Suppose a user is faced the following decision problem: A decision must be chosen from a set of possible decisions. Each decision includes a distribution over a finite set of possible outcomes defined over the attributes that the user is interested in. Let attributes each have finite domains; the set of all outcomes is the Cartesian product of the attribute domains. Therefore, the relationship between a decision with is as follows: There exists an outcome function that given a decision , outputs the distributions over the attributes in . For example, in the flights decision example, the decision of a first class ticket on flight 338 will result in having only 1 connection (the distribution on number of connections for this decision is ) and the flight delay will have the following distribution: . Both of these distributions resulted from making the decision to buy the first class ticket on flight 338. In this page we assume that the distribution over the attributes given any decision is known as a prior knowledge.

In the mentioned example, is the set of all the possible flights between a departure point and destination point. The attributes, in this examples, will be flight Class (first class, business class, economic class), number of Connections (0, 1, 2, ...), total flight duration in hours (3, 4, 5, 6, 7+), and flight Delay in hours (0, 0–1, 1–2, 2–3, 3–5). The values for the first three attributes are deterministically defined by the choice , but the forth attribute will be known stochastically with the probability of each of the outcomes being determined by the choice d.

Utility Functions

A utility function is a quantitative representation of strengths of preferences. It is a function that captures user's preferences.

The use of quantitative utility functions allows for having a trade-off among the relative desirability of flights with all the different attributes. It also allows us to do estimation of preferences since the exact preferences of a person are neither known nor assessable.

Allowing the domain of the utility functions to contain any real-valued arbitrary function of attributes will make learning the utility function of the user intractable. By limiting the domain of utility function of the user to a specific types of functions will allow us to be able to estimate the free parameters in the utility function.

General Additive Utility

In the general additive utility (GAI) function assumption, the utility function can be written as the sum of factors of utilities:

The factors have to have the following property: the user should be indifferent between any two attributes with the same marginals on each factor. As an example, marginal increase of 0.1 on should have the same impact on user's preference compared to the same amount of marginal increase on .[2]

We can conveniently decompose the sub-utility function into local value functions (LVFs) and scaling global constants , so that . Then,

.

Without loss of generality, we can set and , where are the best and worst levels of attribute i.

Assessment of Scaling Constants

To assess the scaling constants, first we define a reference outcome. The reference outcome can be any full outcome. It is common to choose as the worst outcome or any salient outcome for user (e.g. the flight that is most commonly taken). We now define as the full outcome where the i-th attribute is set to its best level while other attributes are fixed at their reference levels; we denote its utility as . are defined similarly. Now, we should have .[2] We can see that the values for can be learned using the utility values for 2n full outcomes.

Elicitation Queries

Local Queries

local queries allow the user to only focus on a small subset of attributes and how they affect the utility function. In these kinds of queries the user is asked to assume that the attributes that are not in our interest are fixed at their reference levels.

A local comparison query simply asks the user to compare two local outcomes that are different in only a few attributes. The user does not have to consider the values for the other attributes and the decisions must be made by only taking into account the attributes in the query. For example, in our flight example, the query to compare (1 stop, business class) and (2 stops, first class) is a local comparison query.

Global Queries

Global queries are needed to calibrate LVFs by assessing the values of global parameters . These queries require consideration of all attributes and should be minimum in the number asked from user. Global queries may or may not be used based on the query elicitation scheme. In theory we could ask the user to directly identify the value of a full outcome - especially the values of , but this is usually avoided in query elicitation schemes. A simple type of global query is a total ranking query in which the user is asked to rank the full outcomes in the order of their preference. For example, in the flight example, assuming the reference outcome is (4hrs duation, business class, 2 stop, 30min delay), we could ask the user to rank the outcomes (3hrs, first, 2, 30min), (4hrs, business, 1, 30min), (4hrs, business, 2, 10min). The algorithm used to utilize the result of this query is determined based on the query elicitation scheme and decision making scheme.

Bayesian Elicitation

In Bayesian elicitation, we try to represent the system's uncertainty over user preferences by defining the set of feasible utility functions for a user using constraints on user's utility function parameters and learning the probability of the users actual utility function matching any of the functions in the feasible set. Therefore, utility functions are modeled as random variables drawn from a prior distribution. Now, the system stores a probability distribution over the user's utility function parameters, which is updated using Bayes's rule. The value of any decision can be estimated by taking the expectation of the value of the utility function for that decision. Assuming the prior distribution over the utility function parameters: [3]

Where is the probability that outcome x is realized under decision d.

Making Decisions

The best decision, which will be recommended to the user, with respect to the prior over the utility function parameters, which is learned by the system so far, is computed as follows:

Where is the value of the decision d defined the previous equation.

As an example, assume that we have narrowed down our feasible set of utility functions to the two utility functions in Table 1. If our system believed that with a probability of 0.6 the user's utility function matches with and with probability 0.4 the user's utility function matches with , the best decision would be to choose Flight C with expected value compared to 380 for both flight A and flight B.

Table 1. An example with only two utility functions in the feasible set. Each of the utility functions evaluate different amounts of utility values for each of the flights. Each flight corresponds to a decision d. The values in the table correspond to using the respective .
Flight A Flight B Flight C
400 200 300
350 650 600

Elicitation Strategies

If the system's uncertainty over utility functions is probabilistically quantified, we can compute how valuable each query is by computing the expected value of information (EVOI) for the query. The EVOI of a query is the amount by which the expected value of the post-query best decision improves compared to the expected value of the pre-query best decision. For example, in the example described above based on Table 1, the expected value of the best decision was 420. Imagine a yes/no query (named ) that we can ask from the user. The query will be answered "yes" with probability 0.8 if the user's actual utility function is . The response to will be answered "yes" with probability 0.1 if the user's actual utility function is :

With the system's belief about probabilities of each of the utilities (), if the user answers "yes" (which will happen with probability 0.52) the beliefs regarding will become (0.92,0.08), and the best decision will change to flight A with value 396. On the other hand, if the user answers "no" to , the probabilities regarding will become (0.25,0.75) and the best decision will be flight B with expected value 537. On the whole, the expected value for post-query will be , and the EVOI of will be .

The act of considering all possible future questions and answers with taking into account the cost of each query is named sequential EVOI. Computing a sequential elicitation policy amounts to solving a partially observable Markov decision process (POMDP) with a continuous state space. [4] Solving any POMDP is generally computationally difficult. Instead, at each step, we could greedily select the query that has the maximum EVOI without considering future queries. [5]

Elicitation with Feasible Utility Sets

Compared to Bayesian elicitation where we hold a prior distribution over the feasible utility set, in this section, we will only hold a feasible utility set, without defining any prior over the utility functions in the set. This is referred to as strict uncertainty. [6] The feasible set gets updated only by reducing its size through shrinking the state space of the parameters of the utility function. Again, we will not have any distribution over utility functions . These feasible sets are usually described using linear constraints on utility function parameters.

Making Decisions

Unlike Bayesian elicitation where we made decisions by choosing the decision that had the highest expected value, we use some kind of criteria to make decisions in strict uncertainty. Looking back at the example we discussed in the Bayesian elicitation case with the two utility functions in Table 1, we will explore some of the criteria in the literature used to make a decision in strict uncertainty:

Undominated Outcomes

To use the Undominated Outcomes criterion[7],which is one of the simplest criteria in the strict uncertainty domain, we just have to reduce the feasible utility set to the set that only contains undominated utility functions. We say one utility dominates another if its utility is greater for any . We will then present the final set which contains all the undominated utility functions to the user, so that he/she can decide on one of them. For example, in the flight example of Table 1, we should only check if dominates or vice versa. If none of them dominates the other, we need to present both of these utility functions to the user so that they can decide which one fits them.

Maximin Return

In this criterion, the best decision is the decision whose worst-case return is highest:

To make this possible, we need to assume that each decision leads to a single outcome , which means the space of decisions and outcomes are equivalent. For example, in the flight example with the feasible set defined in Table 1, the maximin decision is to choose flight A so that the worst-case return is 350.

Uniform Distribution

We could assume that there is a uniform distribution on the feasible set and choose the best decision to be the one that has the maximum mean return. In the example of Table 1, the best decision with this criterion is flight C with a mean utility of 450.

Minimax Regret

Similar to maximin return, we will assume equivalence in decision and outcome spaces and choose the best decision using the following formula:

Where we define:

and

This will minimize the regret of making a decision if the best decision was supposed be any decision other than the decision we made.

For example, in the example defined in Table 1, the max regret of choosing flight A is 300 because if we recommend flight A to the user, but in fact the user's utility function was then the amount of regret would be 300 because A is 300 worse than B, the optimal flight in . With the same analogy, the max regret for flight B is 200, and the max regret for flight C is 100, which means the best flight with the minimax regret criterion is flight C.

Conclusion

We have presented two different schemes used by a system trying to elicit factored preferences to learn a user's utility function and use the learned representation of user's utility function to recommend a decision to the user. The Bayesian elicitation scheme tries to learn a probability distribution over the parameters of the utility function by using queries each of which has an expected value of information. In the elicitation with feasible utility sets scheme, the system tries to achieve the same goals without learning any prior probability distribution over the feasible utility functions.

Current approaches in preference elicitation are mostly myopic, focusing on the single next best query to ask and choosing the best query based on that. hybrid preference assessment techniques could combine elicitation specific to a particular user with collaborative filtering methods. There are methods for active collaborative filtering in which hybrid preference assessment is used as their core process. [8] Careful experimental studies of query costs for different modes of interactions are certainly influential in the future design of elicitation schemes.

Annotated Bibliography

  1. 1.0 1.1 Braziunas, D., & Boutilier, C. 2008. Elicitation of factored utilities. Artificial Intelligence Magazine, 29(4), 79–92.
  2. 2.0 2.1 Fishburn, P. C. 1964. Decision and Value Theory. New York: Wiley.
  3. Chajewska, U., and Koller, D. 2000. Utilities as Random Variables: Density Estimation and Structure Discovery. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI-00), 63–71. San Francisco: Morgan Kaufmann Publishers.
  4. Boutilier, C. 2002. A POMDP Formulation of Preference Elicitation Problems. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI-02), 239–246. Menlo Park, CA: AAAI Press.
  5. Braziunas, D., and Boutilier, C. 2005. Local Utility Elicitation in GAI Models. In Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence (UAI-05), 42–49. San Francisco: Morgan Kaufmann Publishers.
  6. French, S. 1986. Decision Theory. New York: Halsted Press.
  7. Anandalingam, G., and White, C. C. 1993. A Penalty Function Approach to Alternative Pairwise Comparisons in ISMAUT. IEEE Transactions on Systems, Man, and Cybernetics 23(1): 330–333.
  8. Boutilier, C.; Zemel, R. S.; and Marlin, B. 2003. Active Collaborative Filtering. In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI-03), 98–106. San Francisco: Morgan Kaufmann Publishers.

To Add

Put links and content here to be added. This does not need to be organized, and will not be graded as part of the page. If you find something that might be useful for a page, feel free to put it here.