SchedulingExampleVariant2.lidr 5.85 KB
Newer Older
Nuria Brede's avatar
Nuria Brede committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
> module SchedulingExampleVariant2

> import Data.Fin
> import Data.Vect

> import Equality

> import Framework
> import Id

> %default total
> %auto_implicits off
> %access public export


The following example is the deterministic SDP from [Bertsekas1995, Ch.1]
described in Section 2.2 of [1].

For this example we make use of the possible dependencies 
for state and control spaces in the BJI-framework.
The implementation leverages on the fact that controls represent 
possible actions and states admissible sequences of actions.

We first define a type of actions 

> data Act = A | B | C | D 

> Eq Act where
>   (==) A A    = True
>   (==) B B    = True
>   (==) C C    = True
>   (==) D D    = True
>   (==) _x _y  = False

To capture the order in which actions can be applied, we first define the 
dependencies of each action, i.e. |A| and |C| have no dependencies while
|B| and |D| can only be performed when |A| or |C|, respectively, have 
already been done.

> dependencies  :  Act -> List Act
> dependencies A  = []
> dependencies B  = [A]
> dependencies C  = []
> dependencies D  = [C]

To check whether all depdencies of a certain action have been fulfilled, we
use a more generic test which checks whether all occurrences from a list 
occur in some vector:

> hasAll  :  {A : Type} -> Eq A => List A -> {n : Nat} -> Vect n A -> Bool
> hasAll [] _l        = True
> hasAll _l []        = False
> hasAll (a :: as) l  = elem a l && hasAll as l


Whether some vector represents an admissible state is captured by a type family
that evaluates to the one element type |Unit| if the vector is admissible and
to |Void| if not. 

> isAdmissible  :  {t : Nat} -> Vect t Act -> Bool
> isAdmissible []  =  True
> isAdmissible {t=S (S (S (S t)))} _ = False
> isAdmissible (a :: as)             = hasAll (dependencies a) as  && 
>                                      not (elem a as)             &&
>                                      isAdmissible as 

We can lift Boolean predicates to type valued predicates

> toType  :  {A : Type} -> (A -> Bool) -> A -> Type
> toType P a  =  P a = True

and define
 
> AdmissibleState  :  {t : Nat} -> Vect t Act -> Type
> AdmissibleState  =  toType isAdmissible  

Similarly we could use the type |So| from the standard library to lift 
Boolean predicates. We choose to use lifted Boolean-valued predicates 
instead of proof-relevant type-valued predicates as we do not need
computational content, but just a decision whether a state is admissible
or not.
 
Now a state is simply a pair of a vector of actions and a proof that 
it represents an admissible state.
                             
> State : (t : Nat) -> Type
> State t = (as : Vect t Act ** AdmissibleState as)

Admissible controls for a given state are actions that lead to an 
admissible next state:

> AdmissibleControl : {t : Nat} -> State t -> Act -> Type
> AdmissibleControl (as ** prf) a = AdmissibleState (a :: as) 

And controls are dpendent pairs of an action and a proof that it is 
admissible.

> Control : (t : Nat) -> State t -> Type
> Control t x = (a : Act ** AdmissibleControl x a)

We instantiate the framework components accordingly:

> Framework.X t = State t
> Framework.Y t x = Control t x
  

The SDP is deterministic, thus

< Framework.M = Id

As values we use natural numbers with their familiar structure

> Framework.Val = Nat
> Framework.zero = 0
> Framework.(<+>) = (+)
 
> Framework.meas = id 
 
The transition function following Fig. 1b can then simply be defined by:

> Framework.next t  (as ** _pf)  (act ** pf)  = MkDPair (act :: as) pf

The reward (or cost) function uses a helper function that defines the setup
costs for certain combinations of actions. Here we omit restricting to the 
possible combinations of actions via type dependencies and just use a 
dummy value for combinations that will not occur.  

> setupCost : Act -> Act -> Nat
> setupCost A B  = 2
> setupCost A C  = 3
> setupCost A D  = 4
> setupCost B C  = 3
> setupCost C A  = 4
> setupCost C B  = 4
> setupCost C D  = 6
> setupCost D A  = 3
> setupCost _ _  = 100 
 
> Framework.reward Z     _x           (A ** _pf) _x'  = 5
> Framework.reward Z     _x           (C ** _pf) _x'  = 3
> Framework.reward Z     ([] ** Refl) (B ** pf)  _x'  = absurd pf
> Framework.reward Z     ([] ** Refl) (D ** pf)  _x'  = absurd pf
> Framework.reward (S t) x             y         _x'  = setupCost (head (fst x)) (fst y)


Now we can define some example policies. 
Note that ideally Idris' totality checker would know the absurd cases by itself
so that they could be omitted. However this is not yet the case - thus we have to 
construct contradictions explicitly using |absurd|.

> p0A : Framework.Policy 0
> p0A ([] ** Refl) = MkDPair A Refl 

> p0C : Framework.Policy 0
> p0C ([] ** Refl) = MkDPair C Refl  

> p1 : Framework.Policy 1
> p1 ([A] ** Refl)  =  (B ** Refl)
> p1 ([C] ** Refl)  =  (A ** Refl)
> p1 ([B] ** pf) = absurd pf 
> p1 ([D] ** pf) = absurd pf 
 
> p2 : Framework.Policy 2
> p2 ([B, A] ** Refl) = (C ** Refl) 
> p2 ([A, C] ** Refl) = (B ** Refl) 
> p2 ([C, A] ** Refl) = (B ** Refl) 
> p2 ([D, C] ** Refl) = (A ** Refl)
> p2 ([A, A] ** pf) = absurd pf 
> p2 ([A, B] ** pf) = absurd pf
> p2 ([A, D] ** pf) = absurd pf
> p2 ([B, B] ** pf) = absurd pf
> p2 ([B, C] ** pf) = absurd pf
> p2 ([B, D] ** pf) = absurd pf
> p2 ([C, B] ** pf) = absurd pf
> p2 ([C, C] ** pf) = absurd pf
> p2 ([C, D] ** pf) = absurd pf
> p2 ([D, A] ** pf) = absurd pf
> p2 ([D, B] ** pf) = absurd pf
> p2 ([D, D] ** pf) = absurd pf


.. and policy sequences

> psA : Framework.PolicySeq 0 3
> psA = (p0A :: (p1 :: (p2 :: Nil)))

> psC : Framework.PolicySeq 0 3
> psC = (p0C :: (p1 :: (p2 :: Nil)))

The initial state is

> initial : State 0
> initial = ([] ** Refl)

And we can compare the behaviour of |val| and |val'| for 
our example policy sequences:

> testA : Type
> testA = val psA initial = val' psA initial  -- 10 = 10

> testC : Type
> testC = val psC initial = val' psC initial  -- 9 = 9