Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
Nicola Botta
HaskellLibs
Commits
dff6bf7a
Commit
dff6bf7a
authored
Dec 21, 2016
by
Nicola Botta
Browse files
Initial revision.
parents
Changes
4
Expand all
Hide whitespace changes
Inline
Side-by-side
SequentialDecisionProblems/CoreTheory.lhs
0 → 100644
View file @
dff6bf7a
This diff is collapsed.
Click to expand it.
SequentialDecisionProblems/FullTheory.lhs
0 → 100644
View file @
dff6bf7a
> module SequentialDecisionProblems.FullTheory
>
import
SequentialDecisionProblems.CoreTheory
>
import
Sigma.Sigma
>
import
Sigma.Operations
>
%
default
total
>
%
access
public
export
>
%
auto_implicits
off
>
-- %hide Prelude.Nat.LTE
* Preliminaries
In order to prove the correctness of |backwardsInduction| (see
|CoreTheory|), we need a number of additional assumptions, which we
introduce here. These result in proof obligations for the user of the
framework. As in |CoreTheory|, they are represented as meta-variables
(or ``holes''). Once the user has discharged these obligations, type
checking this file will prove the optimality of |backwardsInduction|
(with some caveats related to the current Idris implementation).
* |LTE|
The binary relation introduced in |CoreTheory| for comparing values of
sequences of policies has to be a preorder:
>
reflexiveLTE
:
(
a
:
Val
)
->
a
`
LTE
`
a
>
transitiveLTE
:
(
a
:
Val
)
->
(
b
:
Val
)
->
(
c
:
Val
)
->
a
`
LTE
`
b
->
b
`
LTE
`
c
->
a
`
LTE
`
c
Additionally, |plus| is required to be monotonic with respect to |LTE|:
>
monotonePlusLTE
:
{
a
,
b
,
c
,
d
:
Val
}
->
a
`
LTE
`
b
->
c
`
LTE
`
d
->
(
a
`
plus
`
c
)
`
LTE
`
(
b
`
plus
`
d
)
* |meas|
The function |meas| introduced in |CoreTheory| to describe how a
decision maker measures the possible rewards entailed by (lists,
probability distributions, etc. of) possible next states is required to
fulfill a monotonicity condition:
>
measMon
:
{
A
:
Type
}
->
>
(
f
:
A
->
Val
)
->
(
g
:
A
->
Val
)
->
>
((
a
:
A
)
->
(
f
a
)
`
LTE
`
(
g
a
))
->
>
(
ma
:
M
A
)
->
meas
(
fmap
f
ma
)
`
LTE
`
meas
(
fmap
g
ma
)
In a nutshell, |measMon| says that, if |ma| and |mb| are similar
|M|-structures and |ma| is smaller or equal to |mb|, than it cannot be
the case that the measure of |ma| is greater than the measure of |mb|.
This conditions was originally formalized by Ionescu in [2] to give a
consistent meaning to harm measures in vulnerability studies.
* |cvalargmax|
The function |cvalargmax| introduced in |CoreTheory| must deliver
optimal controls. We need the function |cvalmax|, returning the value
of those optimal controls, in order to fully specify |cvalargmax|:
>
cvalmax
:
{
t
,
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
(
v
:
Viable
(
S
n
)
x
)
->
>
(
ps
:
PolicySeq
(
S
t
)
n
)
->
Val
>
cvalargmaxSpec
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
>
(
v
:
Viable
(
S
n
)
x
)
->
(
ps
:
PolicySeq
(
S
t
)
n
)
->
>
cvalmax
x
r
v
ps
=
cval
x
r
v
ps
(
cvalargmax
x
r
v
ps
)
>
cvalmaxSpec
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
>
(
v
:
Viable
(
S
n
)
x
)
->
(
ps
:
PolicySeq
(
S
t
)
n
)
->
>
(
y
:
GoodCtrl
t
x
n
)
->
>
(
cval
x
r
v
ps
y
)
`
LTE
`
(
cvalmax
x
r
v
ps
)
The reason for using these very specific functions, instead of more
general |argmax|, |max|, |argmaxSpec| and |maxSpec| like for instance
< argmax : {t : Nat} -> {n : Nat} ->
< (x : State t) -> (Viable (S n) x) ->
< (f : GoodCtrl t x n -> Val) -> GoodCtrl t x n
< max : {t : Nat} -> {n : Nat} ->
< (x : State t) -> (Viable (S n) x) ->
< (f : GoodCtrl t x n -> Val) -> Val
< argmaxSpec : {t : Nat} -> {n : Nat} ->
< (x : State t) -> (v : Viable (S n) x) ->
< (f : GoodCtrl t x n -> Val) ->
< max x v f = f (argmax x v f)
< maxSpec : {t : Nat} -> {n : Nat} ->
< (x : State t) -> (v : Viable (S n) x) ->
< (f : GoodCtrl t x n -> Val) -> (y : GoodCtrl t x n) ->
< (f y) `LTE` (max x v f)
is that optimisation is, in most case, not computable. The assumptions
on |cvalmax| and |cvalargmax| are the minimal requirements for the
computability of optimal extensions. Anything more general risks being
non-implementable.
* The proof of correctness of |backwardsInduction|:
** Policy sequences of length zero are optimal
>
|||
>
nilOptPolicySeq
:
OptPolicySeq
Nil
>
nilOptPolicySeq
ps'
x
r
v
=
reflexiveLTE
zero
** Bellman's principle of optimality:
>
|||
>
Bellman
:
{
t
,
m
:
Nat
}
->
>
(
ps
:
PolicySeq
(
S
t
)
m
)
->
OptPolicySeq
ps
->
>
(
p
:
Policy
t
(
S
m
))
->
OptExt
ps
p
->
>
OptPolicySeq
(
p
::
ps
)
>
Bellman
{
t
}
{
m
}
ps
ops
p
oep
=
opps
where
>
opps
:
OptPolicySeq
(
p
::
ps
)
>
opps
(
p'
::
ps'
)
x
r
v
=
transitiveLTE
(
val
x
r
v
(
p'
::
ps'
))
>
(
val
x
r
v
(
p'
::
ps
))
>
(
val
x
r
v
(
p
::
ps
))
>
s4
s5
where
>
gy'
:
GoodCtrl
t
x
m
>
gy'
=
p'
x
r
v
>
y'
:
Ctrl
t
x
>
y'
=
ctrl
gy'
>
mx'
:
M
(
State
(
S
t
))
>
mx'
=
nexts
t
x
y'
>
av'
:
All
(
Viable
m
)
mx'
>
av'
=
allViable
gy'
>
f'
:
PossibleNextState
x
(
ctrl
gy'
)
->
Val
>
f'
=
sval
x
r
v
gy'
ps'
>
f
:
PossibleNextState
x
(
ctrl
gy'
)
->
Val
>
f
=
sval
x
r
v
gy'
ps
>
s1
:
(
x'
:
State
(
S
t
))
->
(
r'
:
Reachable
x'
)
->
(
v'
:
Viable
m
x'
)
->
>
val
x'
r'
v'
ps'
`
LTE
`
val
x'
r'
v'
ps
>
s1
x'
r'
v'
=
ops
ps'
x'
r'
v'
>
s2
:
(
z
:
PossibleNextState
x
(
ctrl
gy'
))
->
(
f'
z
)
`
LTE
`
(
f
z
)
>
s2
(
MkSigma
x'
x'emx'
)
=
>
monotonePlusLTE
(
reflexiveLTE
(
reward
t
x
y'
x'
))
(
s1
x'
r'
v'
)
where
>
ar'
:
All
Reachable
mx'
>
ar'
=
reachableSpec1
x
r
y'
>
r'
:
Reachable
x'
>
r'
=
allElemSpec0
x'
mx'
ar'
x'emx'
>
v'
:
Viable
m
x'
>
v'
=
allElemSpec0
x'
mx'
av'
x'emx'
>
s3
:
meas
(
fmap
f'
(
tagElem
mx'
))
`
LTE
`
meas
(
fmap
f
(
tagElem
mx'
))
>
s3
=
measMon
f'
f
s2
(
tagElem
mx'
)
>
s4
:
val
x
r
v
(
p'
::
ps'
)
`
LTE
`
val
x
r
v
(
p'
::
ps
)
>
s4
=
s3
>
s5
:
val
x
r
v
(
p'
::
ps
)
`
LTE
`
val
x
r
v
(
p
::
ps
)
>
s5
=
oep
p'
x
r
v
>
|||
>
optExtLemma
:
{
t
,
n
:
Nat
}
->
>
(
ps
:
PolicySeq
(
S
t
)
n
)
->
OptExt
ps
(
optExt
ps
)
>
optExtLemma
{
t
}
{
n
}
ps
p'
x
r
v
=
s6
where
>
p
:
Policy
t
(
S
n
)
>
p
=
optExt
ps
>
gy
:
GoodCtrl
t
x
n
>
gy
=
p
x
r
v
>
y
:
Ctrl
t
x
>
y
=
ctrl
gy
>
av
:
All
(
Viable
n
)
(
nexts
t
x
y
)
>
av
=
allViable
gy
>
gy'
:
GoodCtrl
t
x
n
>
gy'
=
p'
x
r
v
>
y'
:
Ctrl
t
x
>
y'
=
ctrl
gy'
>
av'
:
All
(
Viable
n
)
(
nexts
t
x
y'
)
>
av'
=
allViable
gy'
>
g
:
GoodCtrl
t
x
n
->
Val
>
g
=
cval
x
r
v
ps
>
f
:
PossibleNextState
x
(
ctrl
gy
)
->
Val
>
f
=
sval
x
r
v
gy
ps
>
f'
:
PossibleNextState
x
(
ctrl
gy'
)
->
Val
>
f'
=
sval
x
r
v
gy'
ps
>
s1
:
g
gy'
`
LTE
`
cvalmax
x
r
v
ps
>
s1
=
cvalmaxSpec
x
r
v
ps
gy'
>
s2
:
g
gy'
`
LTE
`
g
(
cvalargmax
x
r
v
ps
)
>
s2
=
replace
{
P
=
\
z
=>
(
g
gy'
`
LTE
`
z
)}
(
cvalargmaxSpec
x
r
v
ps
)
s1
>
-- the rest of the steps are for the (sort of) human reader
>
s3
:
g
gy'
`
LTE
`
g
gy
>
s3
=
s2
>
s4
:
cval
x
r
v
ps
gy'
`
LTE
`
cval
x
r
v
ps
gy
>
s4
=
s3
>
s5
:
meas
(
fmap
f'
(
tagElem
(
nexts
t
x
y'
)))
`
LTE
`
meas
(
fmap
f
(
tagElem
(
nexts
t
x
y
)))
>
s5
=
s4
>
s6
:
val
x
r
v
(
p'
::
ps
)
`
LTE
`
val
x
r
v
(
p
::
ps
)
>
s6
=
s5
** Correctness of backwards induction
>
backwardsInductionLemma
:
(
t
:
Nat
)
->
(
n
:
Nat
)
->
OptPolicySeq
(
backwardsInduction
t
n
)
>
backwardsInductionLemma
t
Z
=
nilOptPolicySeq
>
backwardsInductionLemma
t
(
S
n
)
=
Bellman
ps
ops
p
oep
where
>
ps
:
PolicySeq
(
S
t
)
n
>
ps
=
backwardsInduction
(
S
t
)
n
>
ops
:
OptPolicySeq
ps
>
ops
=
backwardsInductionLemma
(
S
t
)
n
>
p
:
Policy
t
(
S
n
)
>
p
=
optExt
ps
>
oep
:
OptExt
ps
p
>
oep
=
optExtLemma
ps
Thus, we can compute provably optimal sequences of policies for
arbitrary SDPs and number of decision steps.
>
{-
>
---}
[1] Bellman, Richard; "Dynamic Programming", Princeton University Press,
1957
[2] Ionescu, Cezar; "Vulnerability Modelling and Monadic Dynamical
Systems", Freie Universitaet Berlin, 2009
SequentialDecisionProblems/Helpers.lhs
0 → 100644
View file @
dff6bf7a
> module SequentialDecisionProblems.Helpers
>
import
Data.Fin
>
import
Control.Isomorphism
>
import
SequentialDecisionProblems.CoreTheory
>
import
Finite.Predicates
>
import
Finite.Operations
>
import
Finite.Properties
>
import
Sigma.Sigma
>
import
Sigma.Operations
>
import
Sigma.Properties
>
%
default
total
>
%
access
public
export
>
%
auto_implicits
off
* Finiteness notions
>
|||
>
FiniteViable
:
Type
>
FiniteViable
=
{
t
:
Nat
}
->
>
(
n
:
Nat
)
->
(
x
:
State
t
)
->
Finite
(
Viable
{
t
}
n
x
)
>
|||
>
FiniteAll
:
Type
>
FiniteAll
=
{
A
:
Type
}
->
{
P
:
A
->
Type
}
->
>
Finite1
P
->
(
ma
:
M
A
)
->
Finite
(
All
P
ma
)
>
|||
>
FiniteAllViable
:
Type
>
FiniteAllViable
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
All
(
Viable
{
t
=
S
t
}
n
)
(
nexts
t
x
y
))
>
|||
>
FiniteNotEmpty
:
Type
>
FiniteNotEmpty
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
NotEmpty
(
nexts
t
x
y
))
>
|||
>
FiniteGood
:
Type
>
FiniteGood
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
Good
t
x
n
y
)
>
|||
>
FiniteCtrl
:
Type
>
FiniteCtrl
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
Finite
(
Ctrl
t
x
)
>
|||
>
FiniteGoodCtrl
:
Type
>
FiniteGoodCtrl
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
Viable
{
t
=
t
}
(
S
n
)
x
->
>
Finite
(
GoodCtrl
t
x
n
)
* Standard deduction patterns in the implementation of specific SDPs
We would like to provide users with a function
< finiteAllViable : FiniteAll -> FiniteViable -> FiniteAllViable
but, as it turns out, implementing this function is not trivial (see
issues 3135 and 3136). Thus, for the time being, we introduce 2
additional assumptions in the global context
>
|||
If
users
can
prove
that
All
is
finite
...
>
finiteAll
:
{
A
:
Type
}
->
{
P
:
A
->
Type
}
->
>
Finite1
P
->
(
ma
:
M
A
)
->
Finite
(
All
P
ma
)
>
|||
...
and
that
Viable
is
finite
,
>
finiteViable
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
Finite
(
Viable
{
t
}
n
x
)
and apply them to deduce finiteness of |All Viable|.
>
|||
we
can
deduce
that
All
Viable
is
finite
>
finiteAllViable
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
All
(
Viable
{
t
=
S
t
}
n
)
(
nexts
t
x
y
))
>
finiteAllViable
{
t
}
{
n
}
x
y
=
finiteAll
(
finiteViable
{
t
=
S
t
}
{
n
})
(
nexts
t
x
y
)
Similarly, we assume that users are able to prove that |NotEmpty| is finite
>
finiteNotEmpty
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
NotEmpty
(
nexts
t
x
y
))
Finiteness of |NotEmpty| and |All Viable| directly implies finiteness of
|Good|
>
|||
>
finiteGood
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
Good
t
x
n
y
)
>
finiteGood
{
n
}
x
y
=
finiteProduct
(
finiteNotEmpty
{
n
}
x
y
)
(
finiteAllViable
x
y
)
and, assuming finiteness of controls, finiteness of |GoodCtrl|:
>
|||
>
finiteCtrl
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
(
x
:
State
t
)
->
Finite
(
Ctrl
t
x
)
>
|||
>
finiteGoodCtrl
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
>
Finite
(
GoodCtrl
t
x
n
)
>
finiteGoodCtrl
{
t
}
{
n
}
x
=
finiteSigmaLemma0
(
finiteCtrl
{
t
}
{
n
}
x
)
(
finiteGood
{
t
}
{
n
}
x
)
Finally, we can show
>
|||
>
cardNotZGoodCtrl
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
v
:
Viable
{
t
=
t
}
(
S
n
)
x
)
->
>
CardNotZ
(
finiteGoodCtrl
{
t
}
{
n
}
x
)
>
cardNotZGoodCtrl
x
v
=
cardNotZLemma
(
finiteGoodCtrl
x
)
(
viableSpec1
x
v
)
>
{-
>
---}
SequentialDecisionProblems/Utils.lhs
0 → 100644
View file @
dff6bf7a
> module SequentialDecisionProblems.Utils
>
import
Data.Fin
>
import
Control.Isomorphism
>
import
SequentialDecisionProblems.CoreTheory
>
import
Sigma.Sigma
>
import
Sigma.Operations
>
import
Finite.Predicates
>
import
Fun.Operations
>
%
default
total
>
%
access
public
export
>
%
auto_implicits
off
* ...
>
|||
>
FiniteViable
:
Type
>
FiniteViable
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
(
x
:
State
t
)
->
Finite
(
Viable
{
t
}
n
x
)
>
|||
>
FiniteAll
:
Type
>
FiniteAll
=
{
A
:
Type
}
->
{
P
:
A
->
Type
}
->
Finite1
P
->
(
ma
:
M
A
)
->
Finite
(
All
P
ma
)
>
|||
>
FiniteAllViable
:
Type
>
FiniteAllViable
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
All
(
Viable
{
t
=
S
t
}
n
)
(
nexts
t
x
y
))
>
|||
>
FiniteNonEmpty
:
Type
>
FiniteNonEmpty
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
y
:
Ctrl
t
x
)
->
>
Finite
(
NotEmpty
(
nexts
t
x
y
))
>
|||
>
FiniteGoodCtrl
:
Type
>
FiniteGoodCtrl
=
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
Viable
{
t
=
t
}
(
S
n
)
x
->
>
Finite
(
GoodCtrl
t
x
n
)
>
{-
>
finiteAllViable : FiniteAll -> FiniteViable -> FiniteAllViable
>
finiteAllViable fAll fViable = fAllViable where
>
fAll' : {A : Type} -> {P : A -> Type} -> Finite1 P -> (ma : M A) -> Finite (All P ma)
>
fAll' = fAll
>
fViable' : {t : Nat} -> {n : Nat} -> (x : State t) -> Finite (Viable {t} n x)
>
fViable' = fViable
>
fAllViable : {t : Nat} -> {n : Nat} ->
>
(x : State t) -> (y : Ctrl t x) ->
>
Finite (All (Viable {t = S t} n) (nexts t x y))
>
fAllViable {t} {n} x y = fAll' (fViable' {t = S t} {n}) (nexts t x y)
>
-}
* Show states and controls
>
|||
>
showState
:
{
t
:
Nat
}
->
State
t
->
String
>
|||
>
showCtrl
:
{
t
:
Nat
}
->
{
x
:
State
t
}
->
Ctrl
t
x
->
String
>
|||
>
showStateCtrl
:
{
t
:
Nat
}
->
Sigma
(
State
t
)
(
Ctrl
t
)
->
String
>
showStateCtrl
{
t
}
(
MkSigma
x
y
)
=
"("
++
showState
{
t
}
x
++
" ** "
++
showCtrl
{
t
}
{
x
}
y
++
")"
* Sequences of state-control pairs
>
|||
>
data
StateCtrlSeq
:
(
t
:
Nat
)
->
(
n
:
Nat
)
->
Type
where
>
Nil
:
{
t
:
Nat
}
->
(
x
:
State
t
)
->
StateCtrlSeq
t
Z
>
(
::
)
:
{
t
,
n
:
Nat
}
->
Sigma
(
State
t
)
(
Ctrl
t
)
->
StateCtrlSeq
(
S
t
)
n
->
StateCtrlSeq
t
(
S
n
)
>
using
(
t
:
Nat
,
n
:
Nat
)
>
implementation
Show
(
StateCtrlSeq
t
n
)
where
>
show
=
show'
where
>
show'
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
StateCtrlSeq
t
n
->
String
>
show'
xys
=
"["
++
show''
""
xys
++
"]"
where
>
show''
:
{
t'
:
Nat
}
->
{
n'
:
Nat
}
->
String
->
StateCtrlSeq
t'
n'
->
String
>
show''
{
t'
}
{
n'
=
Z
}
acc
(
Nil
x
)
=
>
acc
++
"("
++
showState
x
++
" ** "
++
" "
++
")"
>
show''
{
t'
}
{
n'
=
S
m'
}
acc
(
xy
::
xys
)
=
>
show''
{
t'
=
S
t'
}
{
n'
=
m'
}
(
acc
++
showStateCtrl
xy
++
", "
)
xys
>
|||
>
valStateCtrlSeq
:
(
t
:
Nat
)
->
(
n
:
Nat
)
->
StateCtrlSeq
t
n
->
Val
>
valStateCtrlSeq
t
Z
(
Nil
x
)
=
>
zero
>
valStateCtrlSeq
t
(
S
Z
)
((
MkSigma
x
y
)
::
(
Nil
x'
))
=
>
reward
t
x
y
x'
>
valStateCtrlSeq
t
(
S
(
S
m
))
((
MkSigma
x
y
)
::
(
MkSigma
x'
y'
)
::
xys
)
=
>
reward
t
x
y
x'
`
plus
`
valStateCtrlSeq
(
S
t
)
(
S
m
)
((
MkSigma
x'
y'
)
::
xys
)
* Trajectories
>
|||
The
monadic
operations
>
ret
:
{
A
:
Type
}
->
A
->
M
A
>
bind
:
{
A
,
B
:
Type
}
->
M
A
->
(
A
->
M
B
)
->
M
B
>
postulate
monadSpec1
:
{
A
,
B
:
Type
}
->
{
f
:
A
->
B
}
->
>
(
fmap
f
)
.
ret
=
ret
.
f
>
postulate
monadSpec21
:
{
A
,
B
:
Type
}
->
{
f
:
A
->
M
B
}
->
{
a
:
A
}
->
>
bind
(
ret
a
)
f
=
f
a
>
postulate
monadSpec22
:
{
A
:
Type
}
->
{
ma
:
M
A
}
->
>
bind
ma
ret
=
ma
>
postulate
monadSpec23
:
{
A
,
B
,
C
:
Type
}
->
{
f
:
A
->
M
B
}
->
{
g
:
B
->
M
C
}
->
{
ma
:
M
A
}
->
>
bind
(
bind
ma
f
)
g
=
bind
ma
(
\
a
=>
bind
(
f
a
)
g
)
>
|||
>
possibleStateCtrlSeqs
:
{
t
,
n
:
Nat
}
->
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
(
v
:
Viable
n
x
)
->
>
(
ps
:
PolicySeq
t
n
)
->
M
(
StateCtrlSeq
t
n
)
>
possibleStateCtrlSeqs
{
t
}
{
n
=
Z
}
x
r
v
Nil
=
ret
(
Nil
x
)
>
possibleStateCtrlSeqs
{
t
}
{
n
=
S
m
}
x
r
v
(
p
::
ps'
)
=
>
fmap
g
(
bind
(
tagElem
mx'
)
f
)
where
>
y
:
Ctrl
t
x
>
y
=
ctrl
(
p
x
r
v
)
>
mx'
:
M
(
State
(
S
t
))
>
mx'
=
nexts
t
x
y
>
av
:
All
(
Viable
m
)
mx'
>
av
=
allViable
(
p
x
r
v
)
>
g
:
StateCtrlSeq
(
S
t
)
m
->
StateCtrlSeq
t
(
S
m
)
>
g
=
((
MkSigma
x
y
)
::
)
>
f
:
Sigma
(
State
(
S
t
))
(
\
x'
=>
x'
`
Elem
`
mx'
)
->
M
(
StateCtrlSeq
(
S
t
)
m
)
>
f
(
MkSigma
x'
x'emx'
)
=
possibleStateCtrlSeqs
{
n
=
m
}
x'
r'
v'
ps'
where
>
ar
:
All
Reachable
mx'
>
ar
=
reachableSpec1
x
r
y
>
r'
:
Reachable
x'
>
r'
=
allElemSpec0
x'
mx'
ar
x'emx'
>
v'
:
Viable
m
x'
>
v'
=
allElemSpec0
x'
mx'
av
x'emx'
>
|||
>
morePossibleStateCtrlSeqs
:
{
t
,
n
:
Nat
}
->
(
mx
:
M
(
State
t
))
->
>
All
Reachable
mx
->
All
(
Viable
n
)
mx
->
>
(
ps
:
PolicySeq
t
n
)
->
M
(
StateCtrlSeq
t
n
)
>
morePossibleStateCtrlSeqs
{
t
}
{
n
}
mx
ar
av
ps
=
bind
(
tagElem
mx
)
f
where
>
f
:
Sigma
(
State
t
)
(
\
x
=>
x
`
Elem
`
mx
)
->
M
(
StateCtrlSeq
t
n
)
>
f
(
MkSigma
x
xemx
)
=
possibleStateCtrlSeqs
x
r
v
ps
where
>
r
:
Reachable
x
>
r
=
allElemSpec0
x
mx
ar
xemx
>
v
:
Viable
n
x
>
v
=
allElemSpec0
x
mx
av
xemx
>
|||
>
possibleStateCtrlSeqsRewards
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
(
v
:
Viable
n
x
)
->
>
(
ps
:
PolicySeq
t
n
)
->
M
(
StateCtrlSeq
t
n
,
Val
)
>
possibleStateCtrlSeqsRewards
{
t
}
{
n
}
x
r
v
ps
=
>
fmap
(
pair
(
id
,
valStateCtrlSeq
t
n
))
(
possibleStateCtrlSeqs
{
t
}
{
n
}
x
r
v
ps
)
>
|||
>
possibleStateCtrlSeqsRewards'
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
M
(
StateCtrlSeq
t
n
)
->
M
(
StateCtrlSeq
t
n
,
Val
)
>
possibleStateCtrlSeqsRewards'
{
t
}
{
n
}
xys
=
fmap
(
pair
(
id
,
valStateCtrlSeq
t
n
))
xys
>
|||
>
possibleRewards
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
>
(
x
:
State
t
)
->
(
r
:
Reachable
x
)
->
(
v
:
Viable
n
x
)
->
>
(
ps
:
PolicySeq
t
n
)
->
M
Val
>
possibleRewards
{
t
}
{
n
}
x
r
v
ps
=
>
fmap
(
valStateCtrlSeq
t
n
)
(
possibleStateCtrlSeqs
{
t
}
{
n
}
x
r
v
ps
)
>
|||
>
possibleRewards'
:
{
t
:
Nat
}
->
{
n
:
Nat
}
->
M
(
StateCtrlSeq
t
n
)
->
M
Val
>
possibleRewards'
{
t
}
{
n
}
xys
=
fmap
(
valStateCtrlSeq
t
n
)
xys
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment