EFFICIENT APPROXIMATE POLICY ITERATION METHODS FOR SEQUENTIAL DECISION MAKING IN REINFORCEMENT LEARNING