Blog and News

ETF Bot adjustment


Manual update of the ETF bot's decision chain. The adjustment was necessary because of a 1:25 split. The portfolio needed to be reduced by a factor of 25.

Did the stock bot fail?


We have restarted our stock bot. Over a couple of months the stock bot made a number of decisions and lost more than 90% of the virtual value it was given at the beginning of the year. We have started to investigate what happened.

The goal is to create a decision support system for slow trading with a small budget. The biggest cost factor in small-budget trading is transaction fees. Every transaction costs at least 10 units (USD). We use 10 units because we take the view that if you trade, you should also be the owner of the share. If you invest in index financial products instead, you need to trust your provider; in the worst case, an index financial product could be a pyramid scheme. The transaction cost is also paid not once but twice: once for buying the share and a second time for selling it.

What does this mean for the decisions? The decision bot needs to pick stocks with a positive trend above 2%. The bot starts with 1000 units, and if it buys and then sells a stock it pays 20 units in transaction costs: 20/1000 = 0.02, or 2%. After some bad trading decisions, the transaction cost as a percentage of the available virtual cash gets higher. Every bad decision therefore increases the relative transaction cost.
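The arithmetic above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the fee is a flat 10 units per transaction, the whole cash balance goes into a single position, and the fee is compared with the cash balance exactly as in the 20/1000 calculation. The function name `break_even_return` is made up for this example and is not part of the bot.

```python
def break_even_return(cash, fee_per_trade=10.0):
    """Fractional price gain needed to pay for one buy and one sell.

    Assumption: flat fee per transaction, all cash in one position,
    fee compared against the cash balance (as in 20/1000 = 0.02).
    """
    round_trip_fee = 2 * fee_per_trade  # one fee for buying, one for selling
    return round_trip_fee / cash


# The less virtual cash is left, the higher the gain a trade must reach.
for cash in (1000, 500, 100):
    print(f"cash = {cash:>4} units, break-even gain = {break_even_return(cash):.1%}")
```

With 1000 units the required gain is the 2% named above; with 500 units left it is already 4%, and with 100 units it is 20%.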

In the first analysis, we see that not all buy and hold decisions were bad.

We analyzed the period from 19 February 2019 to 16 August 2019. In that time the bot made 117 decisions, and its portfolio included 40 different stocks.

Did the bot identify positive trends for the chosen shares? Mostly yes: the predicted trend was correct for 85% of the stocks.

On average, the holding period of the shares was only 177/40 = 4.425 days. But if we look at the predicted share prices, the bot would have needed to hold the shares for at least 18 days on average to be successful.

For the new round we restrict the decisions so that the bot holds the shares longer.


Reinforcement Learning Examples Environment


In the previous post we gave an introduction to reinforcement learning. In this post we give an example of how reinforcement learning works. Every learning agent in reinforcement learning learns to make decisions by itself. For that, the agent needs an environment to interact with. What is an environment? It is a model of the problem that needs to be solved. For Artificial Intelligence frameworks like TensorFlow there are some predefined models. To demonstrate how reinforcement learning works, we take the frozen lake scenario.

The frozen lake scenario is based on a grid with four different kinds of area: safe (S), frozen (F), hole (H), and goal (G). The agent starts in safe (S) and needs to find a way to the goal (G). Depending on the size of the lake environment, the agent needs a number of trials to reach the goal. The rules of the game are simple: the agent starts from the safe point (S) and moves across the lake to the goal. If the agent steps into a hole (H), it falls into the lake and the game starts over in safe (S). To reach the goal, the agent needs to learn where the frozen areas are and use them.


For the example we take a 3x3 grid. If the agent reaches the goal (G), it receives 100 points as a reward. As punishment, if the agent falls into a hole (H), it gets -100 points. Our lake is visualized as a table. In the next post, we will explain how the agent interacts with that environment.
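A minimal sketch of such an environment in plain Python could look like the following. The grid layout, the move names, the reward of 0 for ordinary steps, and the rule that a move off the grid leaves the agent in place are all assumptions made for illustration; only the 3x3 size, the cell types, the rewards of 100 and -100, and the restart in S come from the description above. This is not the predefined model of any framework.

```python
# Hypothetical 3x3 layout: S = safe start, F = frozen, H = hole, G = goal.
LAKE = [
    "SFF",
    "FHF",
    "FFG",
]
GOAL_REWARD = 100
HOLE_REWARD = -100
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find_start(lake):
    """Return the (row, column) of the safe start cell S."""
    for r, row in enumerate(lake):
        c = row.find("S")
        if c != -1:
            return (r, c)
    raise ValueError("lake has no start cell")


def step(lake, position, action):
    """Apply one move and return (new_position, reward, episode_over)."""
    dr, dc = MOVES[action]
    # Assumption: a move off the grid leaves the agent where it is.
    row = min(max(position[0] + dr, 0), len(lake) - 1)
    col = min(max(position[1] + dc, 0), len(lake[0]) - 1)
    cell = lake[row][col]
    if cell == "G":
        return (row, col), GOAL_REWARD, True
    if cell == "H":
        # Falling into a hole ends the episode and puts the agent back on S.
        return find_start(lake), HOLE_REWARD, True
    return (row, col), 0, False  # assumption: ordinary steps give no reward


# One path around the hole: right, right, down, down.
pos = find_start(LAKE)
for action in ("right", "right", "down", "down"):
    pos, reward, done = step(LAKE, pos, action)
print(pos, reward, done)
```

In this layout the path shown ends on the goal cell with the reward of 100, while stepping down from the middle of the top row ends the episode with -100.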


Reinforcement Learning


According to Sutton & Barto (2018), reinforcement learning is learning "what to do--how to map situations to actions--so as to maximize a numerical reward signal" (p. 1). It is a form of learning in which the learner, called the learning agent, explores the environment and makes its own decisions about actions. The main aspect of reinforcement learning is that in complex environments actions have consequences, and these consequences affect not only the immediate reward but also the final reward. Its two most important distinguishing features are:

  • Trial and error
  • Delayed reward

Reinforcement learning is based on dynamical systems theory, especially Markov decision processes.
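To make the idea of delayed reward a little more concrete, here is a small sketch of the discounted return that is commonly used with Markov decision processes. The discount factor `gamma = 0.9` and the reward sequence are assumptions chosen for illustration; they are not settings of our bot.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards of one episode, weighting a reward that arrives
    k steps in the future by gamma**k (gamma is an assumed value)."""
    total = 0.0
    # Work backwards so each earlier step discounts everything after it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: three steps without reward, then 100 at the end.
print(discounted_return([0, 0, 0, 100]))
```

Although the first three actions earn nothing by themselves, they still carry value, because they lead to the reward at the end of the episode.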

Compared to other learning paradigms such as supervised and unsupervised learning, reinforcement learning forms a third paradigm that sits between the two. The learning is not supervised, because the learning agent makes its own decisions, and it is not unsupervised, because the learning agent has a reward function and does not search for hidden structure (Sutton & Barto 2018, p. 1-2). To search for a solution in the financial market, we decided to create a bot that does not try to find patterns. The bot searches for the solution that maximizes the reward.
(Sutton, Richard S. & Barto, Andrew G. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, USA, 2nd edition, 2018)


First post


To give a better understanding of the Artificial Intelligence (AI) that we are using, we will post interpretations and suggestions in this blog. Our AI uses reinforcement learning; the next post will describe what reinforcement learning is and why it is great to use. Besides machine learning topics, this blog will also present popular finance strategies.