Est time to complete: 45 mins

Admin: Replit

1. Designing Features

The second half of the previous tutorial showed that a class of functions like polynomials has quite limited applications. Polynomials are always continuous, and high-degree polynomials are prone to overfitting.

Since using a broad class of continuous functions (e.g. polynomials) is often ineffective at imitating the shapes of value functions, why not handcraft the features? This allows a designer to pick out the key aspects of the state that correspond to increases or decreases in value for a specific MDP. It also allows us to capture non-linearities and discontinuities in the value function, by designing features which incorporate these aspects (e.g. a feature which is either 0 or 1).

Can this still overcome the key issues we had with lookup tables? Largely, yes. Extracting the useful information in the state into information-dense features reduces the number of parameters needed. Also, by comparing states based on their features, generalisation to unseen states is straightforward. However, this relies on the designer having a strong understanding of the MDP and having the ability to extract useful features from the state.

This is the core idea of this tutorial - that by designing features, we can overcome the problems associated with applying lookup tables to large state spaces.

It also leads smoothly into next week's topic: neural networks. Rather than relying on handcrafted features, neural networks learn their own, which simplifies the designer’s life and enables learning superhuman behaviour in complex environments (e.g. the game of Go).

2. Linear Combination of Features

A linear combination of features can be used to approximate a function. Each feature is multiplied by a learned parameter value (a weight), and the products are summed to give the estimate. The features (usually grouped in a feature vector) are hand-crafted by a designer, who writes a deterministic function to extract salient features from the current state. It's up to the designer (you) to choose which features to use. You should try to pick features that correspond with changes in the value of a state.
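As a rough illustration of the weighted sum described above, here is a minimal Python sketch. The function name `linear_value` and the example feature and weight values are assumptions made for illustration; they are not taken from the tutorial's codebase, and in practice the weights would be learned rather than written by hand.

```python
from typing import Sequence


def linear_value(features: Sequence[float], weights: Sequence[float]) -> float:
    """Estimate a state's value as the weighted sum of its features."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical feature vector for one state (e.g. three 0-or-1 features).
example_features = [1.0, 0.0, 1.0]
# Hypothetical weights; in practice these are the learned parameters.
example_weights = [0.5, -2.0, 0.25]

example_value = linear_value(example_features, example_weights)
```

With these made-up numbers only the first and third features are active, so the estimate is the sum of their two weights.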

For example, in the game of chess, one such feature could be whether you’ve won the game. Another could be if your queen has been taken by the opponent. Yet another could be if you have your opponent in check. As you can imagine, designing a comprehensive set of features for chess would be a Herculean task!

Alternatively, in the Mountain Car example, one feature could be whether the car is to the left or right of the centre of the valley, and another could be whether the car is moving to the left or right.
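The two Mountain Car features could be extracted along the following lines. This is only a sketch: it assumes the state is a `(position, velocity)` pair and that the valley centre sits at a position of about `-0.5`. Both are assumptions that should be checked against the environment you are actually using, and the function name `mountain_car_features` is hypothetical.

```python
from typing import List, Tuple

# Assumed position of the valley centre; adjust for your environment.
VALLEY_CENTRE = -0.5


def mountain_car_features(
    state: Tuple[float, float], valley_centre: float = VALLEY_CENTRE
) -> List[float]:
    """Turn a (position, velocity) state into two binary features."""
    position, velocity = state
    # 1.0 if the car is to the right of the valley centre, else 0.0.
    right_of_centre = 1.0 if position > valley_centre else 0.0
    # 1.0 if the car is moving to the right, else 0.0.
    moving_right = 1.0 if velocity > 0.0 else 0.0
    return [right_of_centre, moving_right]
```

Each feature is either 0 or 1, so a value function built on them can change abruptly as the car crosses the valley centre or reverses direction, which a low-degree polynomial cannot do.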