Analysis of magnitude of changes

I’ve been told to look into the magnitude of changes in the market. I’ve done that by sorting the changes into two categories based on the accuracy of the directional prediction. The following box-plots show the results.

You will see that there are outliers in both groups, but the wrong-prediction group has noticeably more of them. This might be why a straight enter-and-exit strategy will not work; the losing trades lose a lot. I should look into placing a hard stop-loss at the 0.3 mark.

To clarify, the magnitudes shown in these box-plots are not the number of pips gained or lost. They are normalized values, where 1 corresponds to the greatest movement seen in the training set of data. Maybe I should look at them in terms of actual pips, but that shouldn’t change the shape of the chart.

On second thought, I should look at the pip-values of the change, if only to be thorough. Better safe than sorry. I’ll put that up in a later blog-post.

Edit: The 0.3 mark corresponds to a 0.0187 change in the forex-rate, approximately 187 pips. So I should place a hard stop-loss 187 pips away from my entry point. I will look into the performance of a trading system that uses this.
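For reference, the conversion between the normalized magnitudes and pips is just a rescaling. Here is a minimal Common Lisp sketch of it; the maximum training-set move (roughly 0.0623, implied by 0.3 mapping to 0.0187) and the pip size are assumptions for illustration, not values read from the actual data.

;; Hedged sketch: convert a normalized magnitude back into pips.
;; *max-training-move* and *pip-size* are assumed values implied by
;; 0.3 (normalized) ~ 0.0187 (rate change) ~ 187 pips.
(defparameter *max-training-move* 0.0623)
(defparameter *pip-size* 0.0001)

(defun normalized->pips (normalized)
  (/ (* normalized *max-training-move*) *pip-size*))

;; (normalized->pips 0.3) => approximately 187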

Posted in trading | Leave a comment

bulk-editing files

Over the last 3 days I have been trying to learn how to do bulk edits of files. I’ve decided to change the license of my programs to the Affero GPL v3, and wanted to change the notices in my files to reflect that. As far as I could tell, the way to do this in CL is to read each file into a string, run cl-ppcre over that string, and then write the string back out to the file.
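Something like this untested sketch is what I had in mind, assuming cl-ppcre and uiop are loaded; the wildcard pathname and the regex are guesses rather than code I actually ran.

;; Rough CL version of the bulk edit: read each file into a string, run
;; cl-ppcre over it, write the result back.  Untested sketch; the "**/*.lisp"
;; wildcard and the regex are guesses.
(defun bulk-relicense (root)
  (dolist (file (directory (merge-pathnames "**/*.lisp" root)))
    (let* ((old (uiop:read-file-string file))
           (new (cl-ppcre:regex-replace
                 "(?im)^;; Copyright.*\\n;; Distribu.*$"
                 old
                 ";; Copyright 2011-2012 Ravi Desai rd7190@gmail.com
;; Distributed under the terms of the GNU Affero GPL version 3 or any later version.")))
      (with-open-file (out file :direction :output :if-exists :supersede)
        (write-string new out)))))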

That was proving to be a headache for me, and I didn’t want to risk polluting my files. So I decided to go with a more tried and tested system. I found that sed is a useful tool for bulk edits, and that perl is used a lot for this too. The link that I found to be the most useful is here. It gave me the basic template that I used in the end. But first, let’s have a look at the options I considered.

Sed

Very nice system. Seems to be used a lot, and liked a lot. The problem I had with it was that it doesn’t handle multi-line substitutions easily. I looked at the reference manual and this helpful page. Lots of good stuff there, but I wanted something that would work across lines without much fuss.

In the process, I also came across this page, which I think would be quite useful to know. Good amount of information on this site.

Emacs

Emacs is the text editor that I use. It’s amazing. And it has a regex-replace function. Awesome. That’s just what the doctor ordered. So let’s fire it up in batch mode (which I’ve never done before, and was looking forward to learning) and get this done. No need to learn anything else apart from some Emacs Lisp, and that doesn’t seem so difficult.

One problem, though: I couldn’t find any reference on how to run regex-replace non-interactively. So batch mode doesn’t seem to be the way for me to do this. Too bad; I was looking forward to learning how to use it. Now, I’m not saying it’s not possible, just that I couldn’t find any guide on the internet that showed me how to do it. I’m fully expecting someone to comment on this and show me how it’s done in Emacs Lisp in a few lines of code. I hope that happens. It would be good to learn something.

Perl

And that brings me to perl. I looked at the link that I gave at the top of the post, and it seemed to have the basic information I wanted/needed. I looked up the Perl regular expressions page, and it’s got a lot of great information. Perl’s regex system seems to be one of the most featureful out there. Enough people use it that it certainly satisfies my ‘tried and tested’ requirement.

I don’t know any perl; this has been my introduction to it, and I have found it to be very useful. It took a few tries to get it working just the way I wanted. The find utility is amazing too; I have gotten to learn more about it in the past few days, and it rocks.

Anyway, the final code I came up with to do my bulk-editing is

find -name '*.lisp' -execdir perl -0777 -pi.bak -e 's/^;; Copyright.*\n;; Distribu.*$/;; Copyright 2011-2012 Ravi Desai rd7190@gmail.com\n;; Distributed under the terms of the GNU Affero GPL version 3 or any later version./mi' '{}' '+'

The find utility spits out a list of all files whose names end in ‘.lisp’ and feeds it to perl. The -0777 flag tells perl to just ‘slurp’ in the entire file (as opposed to reading only one line at a time, which is the default). ‘Slurp’ing the file is important since I’m doing a multiline substitution. The ‘-i’ flag says that the files should be edited in-place; the ‘.bak’ part is optional, and says that each original file should first be copied to a backup with a ‘.bak’ extension. This serves as a backup. Very good idea. The in-place modification flag is very nice. I like it. It certainly saved me a lot of work compared to doing the read-into-string, change-string, output-string-back-into-file thing that I would have had to do in CL. The ‘-p’ flag wraps the one-liner in an implicit read-and-print loop over the input; combined with -0777, that loop runs once over each whole file. This information is available here. I have to hand it to the perl people; they certainly know how to keep all the help files in one organized place. Kudos.

Wrap up

In conclusion, I think the perl system rocks (In this domain, at least). I don’t even know where to begin with having something this simple (1 line of bash code) in CL. If there is a way, please let me know. Thanks.

Posted in programming | Leave a comment

New performance measures for pleasance

This post is a recap of the pleasance system. I’ve just finished checking it on the EURUSD and GBPJPY pairs, in addition to the GBPUSD that I usually work with. I had found a logical error in the nn-for-ga.lisp file, and it has now been fixed; thankfully, it made no difference to the performance of the system. The directional-accuracy figures below are for predictions 5 days into the future. All the results are from using the top 3 technical indicators selected by the pleasance system. Note that the top 3 indicators are different for each currency pair; I have no plans to look into the reasons for that right now.

Currency pair Directional accuracy
GBPUSD 70%
EURUSD 69%
GBPJPY 73%

I’ve spoken to someone I know and have been told that with this performance, I should start looking at my risk-management strategy to make a more profitable system. The 3*ATR(20) trailing-stop system is working, but it’s not great. I would love to have something better than that, but I need to put in some study time before I get something useful.
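For reference, that trailing-stop rule amounts to something like the following sketch for a long position; the function name and the assumption that the stop only ever ratchets upward are mine, not code from pleasance.

;; Sketch of a 3*ATR(20) trailing stop for a long trade: the stop follows
;; the close upward at a distance of 3 ATRs and never moves back down.
(defun update-trailing-stop (current-stop close atr-20 &key (multiple 3))
  (max current-stop (- close (* multiple atr-20))))

;; Example: (update-trailing-stop 1.5400 1.5600 0.0050) => 1.545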

Posted in trading | 1 Comment

Project amethyst

Amethyst is a project I’m starting that will help me get better at programming in the financial trading and analysis domain. It’s a single repository that will hold multiple independent trade simulators. I’m guessing that I’ll do a few of them and then get to abstracting out the common parts. This, if done well, will help with creating a general framework for trade simulation in Common Lisp. The code for this project is at this page.

In addition to working on this particular problem domain, I’m thinking that this project will also help me improve my programming skills. I’m looking forward to the challenges I’m sure to stumble upon as I work on this project. I’m guessing one of the first ones will be to have multiple trades going on at once. That should be fun to solve.

Posted in programming, trading | Leave a comment

Lucifer is out

The ‘lucifer’ project is out. It’s available at its Github page. It should be a good basis for most of the trading simulations that I have planned.

The next step is to create a separate project with all the trading-system simulators in it. That should be fun. I’ll find enough ideas on the forums.

Posted in programming, trading | Leave a comment

Git filter-branch

The Pleasance project currently has everything it needs within its own package. That should change. The reason is that the data-mgmt and indicators files provide functionality that isn’t specific to this implementation of a predictor.

So the solution is to put those 2 files and their functionality into a package of their own. Then their functionality can be used by any program that requires it. I didn’t want to just copy and paste them into their own github repository, since that would mean they lose their history. I did a search for this and found that git handles this quite well. Instead of making a copy of the repository and then just deleting the other files (but still having them in the history of this smaller project), there is a way to delete the other files and directories from the entire history of the project! Very nice, git, very nice.

Git filter-branch

This is a nifty utility that does exactly what I’m looking for. The link to its guide (in the form of an example) is right here on Github. The man-page, as always, is very useful too.
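For my own notes, the general shape of the command is something like the index-filter example in the man-page; the path below is a placeholder, not the actual file I removed.

# Rewrite every commit, dropping one (hypothetical) path from the history.
git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch src/unwanted-file.lisp' \
  --prune-empty -- --all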

Thank you Git.

Posted in programming | Leave a comment

Musings – 20120106

In the last few weeks I’ve been trying to think about how to make a system that can find robust trade parameters. I recently got a pen and paper and started thinking about it freely and noting everything down. Perhaps it’s time I break out the ‘Thinkertoys’ book again. This post will not be well organized or flow or anything like that. It’s just me and my thinking. I’ve tried to put it in some order, but it doesn’t make a lot of sense. I’m mostly doing this so that it doesn’t come off as one run-on paragraph.

As if that is not bad enough, I also began writing down ideas and notes while creating this post. And I’ve not bothered to put any of this in thought-chronological order. Enjoy reading; I hope you don’t get a headache.

What is the basic system like?

A basic system could look like the following.

  1. Enter under ‘x’ conditions.
  2. Exit under ‘y’ conditions.
  3. Else maintain current status.

It’s all about changing the status of the trade or maintaining it. This is a very basic system. There is nothing more to it.
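To make the skeleton concrete, here is a minimal sketch of that loop; enter-p and exit-p are hypothetical predicates standing in for the ‘x’ and ‘y’ conditions.

;; Minimal enter/exit/hold skeleton.  enter-p and exit-p are hypothetical
;; stand-ins for the 'x' and 'y' conditions above.
(defun next-position (position bar)
  (cond ((and (null position) (enter-p bar)) :long) ; 1. enter under 'x'
        ((and position (exit-p bar)) nil)           ; 2. exit under 'y'
        (t position)))                              ; 3. else maintain status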

How do we go about it?

Set stop-loss to some function of present values of indicators.

Can’t do that since we don’t know what the correct values are. We have the ‘x’ (the inputs), but don’t have the ‘y’ (the targets). This means that we certainly can’t use a supervised NN for this.

Can we find out what ‘y’ should be by running a GA on the training data? To prevent curve-fitting, we could apply each candidate to a validation-set and use that performance as its fitness score.

There is a problem with that. Doesn’t feel right. Feels off. How do we know that this is any good? We aren’t taking anything from each iteration (training – validation pair). All we can do is just run a new curve-fit on new (Live) data. Nothing is learned.

Got carried away. Started thinking of just using GAs throughout. Wrong. Original idea was to run GAs on some data to find out what ‘y’ should be. Then train a NN using the ‘y’ and known information (Indicators) as ‘x’. And then what? This doesn’t make sense.

New idea. Example. Take a 5-month period (Hypothetical). Take a time-period. Find out what the SL (Stop-loss) should have been at the end of every time period (I’m using daily data). This can be the ‘y’. Don’t use GA to find out ‘y’. The analysis should take into account (At each point in time) the true range of prices (Highest high and lowest low) for the next ‘n’ periods. And then it should spit out a number that tells us what the SL should be. This might get complicated to create. Thinking of how to do that right now in different markets. Can’t think of how to do that. It would involve a system that looks at the big picture. Doesn’t make sense. Too difficult. Perhaps I should just focus on something that closes in the next 30 periods (30 days)? How about that. Would make things simpler. Would have to cap the time limit at some point, right?
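One simple way to label that ‘y’ would be to take, at each day, the lowest low over the next ‘n’ periods and measure how far below that day’s close it sits. A throwaway sketch of that idea, with a hypothetical 30-period cap:

;; Sketch only: the stop-loss 'label' for a long trade opened at day i,
;; taken as the distance from that day's close down to the lowest low over
;; the next n periods.  closes and lows are assumed to be vectors, and at
;; least one future bar is assumed to exist.
(defun sl-label (closes lows i &key (n 30))
  (let ((end (min (+ i n) (1- (length lows)))))
    (- (aref closes i)
       (loop for j from (1+ i) to end
             minimize (aref lows j)))))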

Other ideas.

How would this be any different from a system that looks at the data for any day and then tries to predict the range the data is going to have over the next ‘n’ periods? It isn’t any different from that. That would be a small change in the current Pleasance system. And this does not help me with the problem that I’m trying to solve right now.

Another prediction system that I could make is to predict when a big move in the markets is going to happen. This shouldn’t be difficult. Just flag the ‘n’ time periods before a ‘big move’. The ‘big move’ could be defined as a change in price of some percentage of the ATR or something. Not complicated. Could be a basic binary output NN.
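The labelling step for that binary target could be as simple as the sketch below; the lookahead and the ATR multiple are arbitrary numbers picked for illustration.

;; Sketch: mark day i with 1 if the close moves by more than `multiple`
;; ATRs at some point within the next `lookahead` days, else 0.
;; closes and atrs are assumed to be equal-length vectors.
(defun big-move-labels (closes atrs &key (lookahead 5) (multiple 2.0))
  (loop for i from 0 below (- (length closes) lookahead)
        collect (if (loop for j from (1+ i) to (+ i lookahead)
                          thereis (> (abs (- (aref closes j) (aref closes i)))
                                     (* multiple (aref atrs i))))
                    1
                    0)))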

What is the objective?

To learn from the past. To learn what the stop-loss and take-profit should be set at. To learn a function that tells us what the SL&TP values should be.

Learn how to apply that function in active trading. Then take that & apply a NN to it to learn what the function itself should be.

Things to do? Perhaps?

Look into the other AI techniques out there to see if there is a better NN-form for this or a better system than NNs.

Posted in trading | Leave a comment

End-of-year

It’s the end of the year. I’m not where I wanted to be, but I’m in a better place than I was at the beginning. Progress in small steps.

State of the program.

I’ve put the program up on Github.com under a GPLv3 license. This makes it easier for people to follow the progress of this project. I feel that this is a great way to get more people involved in it, and to get help on it from others. I’ve also found that it’s easier to stay on top of things when I know that the commit-log is publicly viewable. It gives me something to shoot for.

I named the program ‘Pleasance’, after the home planet of my favourite sci-fi character. The program takes a bunch of technical indicators, feeds them into a fully-connected back-propagation neural network, and uses a genetic algorithm to find out which indicators are useful. By changing the code slightly you can have the GA optimize either directional accuracy or the mean error (which takes magnitude into account).
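For the record, those two fitness measures amount to something like the sketch below; the function names are mine, and I’m assuming predictions and actuals are simple sequences of 5-day-ahead changes.

;; Sketches of the two GA fitness measures mentioned above.  Names and
;; data layout are illustrative, not the actual pleasance code.
(defun directional-accuracy (predictions actuals)
  "Fraction of predictions whose sign matches the actual change."
  (/ (count t (map 'list (lambda (p a) (eq (plusp p) (plusp a)))
                   predictions actuals))
     (length predictions)))

(defun mean-error (predictions actuals)
  "Mean absolute error, which also penalizes wrong magnitudes."
  (/ (reduce #'+ (map 'list (lambda (p a) (abs (- p a))) predictions actuals))
     (length predictions)))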

Recent changes.

I’ve made a few changes to the system that don’t change the program’s functionality a great deal. These changes are meant to make the program easier for other people to use. I’ve added a macro and a function that make it easier to recalculate everything and put all the data into a user-named vector, which removes the need to re-load the entire project and change source code when you add a new technical indicator. Small stuff to make it more user-friendly.

Next steps.

I want to get a working model out there ASAP. So I think I will not keep adding functionality from the Yu, Wang & Lai book into the matrix version of the program. Instead I will mash together my trade-simulator function and a neural-net that focuses only on the 3 indicators that show promise with directional accuracy. I’ll then turn the whole thing into a function, wrap a GA around it, and have the GA focus on profitability.

I will leave the Pleasance project alone, and put all this new functionality into a separate project. I feel the Pleasance project is good at what it does and I should leave it at that.

Posted in neural nets, programming, trading | Leave a comment

Success, failure, and giving up.

So I hit an interesting place a little more than a month ago. My program worked. But it didn’t give results that were terribly helpful. Talk about an anti-climax.

First, the good part.

The good part is that the GA+NN combination works wonders. I had it crunching through 50 different chromosomes for about 30 generations using 12 indicators. It ran through all of that in about 30 minutes on my cutesy laptop. Amazing. I’m pleasantly surprised by that performance. I’m still interested in parallel programming and seeing if I can get this baby on the cloud and playing about with it there, but don’t need to do it right now.

Now keep in mind that I have all the advanced NN stuff only in the matrix form of the program, and I have no clue how to turn that into a form usable by the GA. But using the NN-function as it has been since the last post, and improving the GA function, I was able to get up to 70% directional accuracy in the forecast 5 days into the future. WOOOO!!

That’s amazing. That’s fantastic. I made that program, I trust it, and I stand by those results, which are in line with what the authors of the book report. I’d have done cartwheels if I didn’t think that I’d probably hurt myself. This is amazing because with 3 indicators I can say with approximately 65% accuracy which way the market is going to move 5 days in the future.

And now the other part.

This is all well and good, but forecasting doesn’t make money; trading does. I selected those 3 indicators, ran them through an array version of the NN for an arbitrary 2000 epochs (runs through the data), and then tested the result on data immediately following the training data in the timeline. The results showed losses. I changed the trade management from fixed take-profits and stop-losses to trailing stop-losses and take-profits, and finally to no take-profits at all, to see if the results would improve if I cut my losses short and let my profits run.

The significant difference was that I went from making small losses to making small profits. Good, but no dice. The problem was that the R-expectancy was lower than what I had with the random-entry system! What a waste of my time. Those were not good days.

And how I dealt with it.

Not as effectively as I could have. That’s the short version.

The long version is as follows. I got off the computer and didn’t touch the programs for a bit, and tried to figure out some way to get beyond this issue. After about 2 weeks of doing this unsuccessfully, I just stopped thinking about it at all. Then I lost myself doing fun stuff that wasn’t productive at all. Then I made a few phone calls and spoke to some people about career stuff. I tried to get a volunteer position with some of the business improvement associations in the neighbourhood, but didn’t get any response that led to actual work (just the usual we’ll-call-you-sometime-in-the-undefined-future). Then one of those contacts asked me if I could make a user-friendly simulator for them.

So that got me thinking. I could still put my programs to good use. There is definitely something promising here, and it would be good to figure out some way of sharing it. I might even get a job out of it someday. So let’s do this. Let’s try to have a better interface to my system than just a text interface. Let’s try to make it user-friendly.

And I also figured out what was going on with the system that I had made. I was using the GA to find something that would have better directional accuracy. It delivered exactly that. I was not asking it to help me with something that would be profitable, though.

This might herald the next evolution of my system. Let’s have a look at the history. I had the technical indicators. I had the trade simulator. I ran those manually. Then I made more code that wrapped around all of that in the form of an NN, so that it would find out what weight each technical indicator should have, as opposed to weighting them myself in an arbitrary manner. I delegated that task to the NN. Then I wondered if, perhaps, the NN was just wasting its time with some indicators. So I built the GA around the NN to find out which indicators consistently show up at the top of each generation. I started with a few tools and have since been building wrappers around them, with each layer building on top of the previous one.

Right now the system has identified the indicators that matter. How about I make another system with those indicators fixed, that searches for a robust set of trading parameters that are profitable? I’m going to rephrase that just in case I’ve not been clear (it’s late, I should have gone to bed about an hour ago, but I want to get this done first). Right now the system I have searches the problem-space for the most useful indicators (useful in the directional-accuracy sense). Now that I have the answer to that, I could start searching the problem-space for the best (as in robust) trading parameters (like stop-loss levels and take-profit targets and the like). Which makes me wonder why I didn’t just do that in the first place using the random-entry method. D’uh. There have been moments during this project when I’ve felt like a genius of diminutive proportions. This is one of those moments. Not my brightest hour, true, but it’s better to find out than to stay in the dark. Back to topic. I could also have trading parameters for entry points as some function of the NN’s output. Things are looking better, now that I know (or at the very least, have an idea of) what actions to take next. Good stuff.

Side note(s).

Only 3 indicators were consistently in the chromosomes representing the NNs with the highest directional accuracy. Crazy, right? Especially out of 12 possible indicators? Let’s get rid of the suspense. The 3 indicators were the ATR; the close divided by the average of the last 10 closes; and moving-average(5) minus moving-average(10), all divided by moving-average(5) (moving averages of closing prices over the last ‘n’ days). That’s it! Those 3 consistently made the list, generation after generation through the GA. Weird. Or not weird, but definitely counter-intuitive.
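In code, the two moving-average-based indicators are just the sketch below (the ATR needs highs and lows as well, so I’ve left it out); this is illustrative, not the pleasance implementation.

;; Illustrative only: the two moving-average-based indicators named above,
;; computed at index i over a vector of closing prices (assumes i >= n-1).
(defun sma (closes i n)
  "Simple moving average of the N closes ending at index I."
  (/ (loop for j from (- i (1- n)) to i sum (aref closes j)) n))

(defun close-over-sma-10 (closes i)
  (/ (aref closes i) (sma closes i 10)))

(defun sma-5-minus-sma-10-over-sma-5 (closes i)
  (let ((sma-5 (sma closes i 5)))
    (/ (- sma-5 (sma closes i 10)) sma-5)))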

Posted in neural nets, programming, trading | Leave a comment

Big updates to the system

After reading the Yu, Wang & Lai book (Foreign-Exchange-Rate Forecasting with Artificial Neural Networks), I decided to implement a Genetic Algorithm (GA) to figure out which indicators are actually useful for predicting the future price. This involved further refactoring of the code.

Genetic algorithms

A GA works by sampling and scoring different possible solutions to a problem, and then mixing and matching parts of the best solutions. This is done iteratively, and eventually a good solution is found (though it’s not necessarily the best possible solution out there).

My GA function’s logic is derived directly from what I’ve read about GAs on the internet. A very useful resource, the internet. I can’t think of any improvements to the logic for now. I give the GA function the number of generations I want, the length of each chromosome, and the population size, and it creates random chromosomes, feeds them to the NN-function (see below), ranks the results, and performs crossover. I keep the better half of the population and breed the next generation from them. It’s a very computationally intensive process, but it seems to work. I’m happy to have learned how to do this. I have decided not to use PCA or other statistical measures to find out which indicators are helpful.
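The overall shape of that loop is roughly the sketch below; fitness-fn plays the role of the NN-function, and random-chromosome and crossover are assumed helpers rather than the actual pleasance code.

;; Rough skeleton of the GA loop described above.  Higher fitness is
;; treated as better; random-chromosome and crossover are assumed helpers.
(defun run-ga (fitness-fn generations chromosome-length population-size)
  (let ((population (loop repeat population-size
                          collect (random-chromosome chromosome-length))))
    (dotimes (generation generations (first population))
      ;; Score each chromosome once, rank, keep the better half, then
      ;; refill the population by crossover between random parents.
      (let* ((scored (sort (mapcar (lambda (c) (cons (funcall fitness-fn c) c))
                                   population)
                           #'> :key #'car))
             (parents (mapcar #'cdr (subseq scored 0 (floor population-size 2)))))
        (setf population
              (append parents
                      (loop repeat (- population-size (length parents))
                            collect (crossover (elt parents (random (length parents)))
                                               (elt parents (random (length parents)))))))))))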

I’ve created more indicators based on what the authors of the forecasting book use. I’ve not implemented the full list of indicators that they evaluate, but I have a total of 9 right now (including the ones I had earlier). This isn’t a big list, but I haven’t had the motivation to program more of them in. I chose instead to forge ahead and work on creating the GA and functionalizing the NN.

Functionalizing the code

The main requirement from the GA was that different NNs compete and be evaluated against each other, and that just wasn’t possible with the program structured the way it was. I had to encapsulate the entire NN creation and training so that I could run it repeatedly with different (randomly selected) inputs. This complicated things even further, since I found it cumbersome to use macros for define-hidden-nodes and define-input-nodes. I think this is the 4th or 5th iteration of the entire NN-training functionality.

To do all this, I learned about the “labels” form, which allows one to define functions within functions. I used this a lot. I also used more closures than before. Functional programming isn’t so mind-bending anymore; I like that.
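For anyone who hasn’t seen it, labels looks like this; a toy example, nothing to do with the actual NN code.

;; Toy example of labels: two local functions defined inside a function,
;; one of which calls the other.
(defun normalise (numbers)
  (labels ((total () (reduce #'+ numbers))
           (scale (x) (/ x (total))))
    (mapcar #'scale numbers)))

;; (normalise '(1 2 5)) => (1/8 1/4 5/8)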

Now, everything is inside one function that is called with a “chromosome” as its only argument. This chromosome is a GA concept. This huge NN-function uses the chromosome to determine which indicators are going to be used for this particular NN. After training is done, the NN-function evaluates the aggregate error on the training and validation sets and returns them, as part of a list, to the GA.

Limitations

Currently, there is a lot that could be improved about how the NNs are trained. I don’t know if all of it is necessary, though. One of the major things to consider is a better way to end training. Presently I’m using a hard limit on the number of training iterations. I tried to make it work using the validation error (stop training when the validation error starts climbing), but that wasn’t stopping quickly enough for me.
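The validation-based stop I was aiming for is basically the sketch below; train-epoch and validation-error are hypothetical stand-ins for the real functions, and the patience parameter is an arbitrary choice.

;; Sketch of validation-based early stopping: train until the validation
;; error has failed to improve for `patience` consecutive epochs, or until
;; the hard epoch limit is hit.  train-epoch and validation-error are
;; assumed helpers, not part of the actual code.
(defun train-with-early-stopping (net &key (max-epochs 2000) (patience 5))
  (loop with best-error = most-positive-double-float
        with bad-streak = 0
        for epoch from 1 to max-epochs
        do (train-epoch net)
           (let ((err (validation-error net)))
             (if (< err best-error)
                 (setf best-error err
                       bad-streak 0)
                 (incf bad-streak)))
        until (>= bad-streak patience)
        finally (return net)))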

Other improvements to consider would be more adaptable parameters. Currently I don’t have a momentum term, and my learning rate is static and arbitrary. The Yu, Wang, and Lai book has a whole list of things that can be done to improve the performance of NNs. From my reading of the book, I find only the suggestion to use GAs, and this list of possible tweaks, to be useful. Perhaps I will find more useful things after some more experience in this problem domain.

Code

Mapping the chromosome to indices for specific indicators

Each chromosome is a list of 1s and 0s. This function takes the chromosome and, using the positions of the 1s, creates a new list giving the indices of the indicators that will be used by this NN. The length of this new list (which is the same as the number of 1s in the chromosome) automatically tells us how many indicators are being used as inputs for this NN. The output of this function is the input for the define-input-nodes function.


;; Collects the (1-based, via (1+ i)) indices of the 1s in the chromosome.
(chromosome-to-input-vectors (chromosome)
  (let ((a ()))
    (loop for i below (length chromosome)
          do (if (= 1 (elt chromosome i))
                 (push (1+ i) a)))
    a))

Define-hidden-nodes

This is the latest iteration of the define-hidden-nodes function. It does away with macros completely. I haven’t noticed a performance loss.


(define-hidden-nodes ()
  (loop for i below numberofhiddennodes
        do (let ((hidden-node-index i) ;Because we are closing over this variable for each hidden-node.
                 (number-of-input-nodes numberofinputnodes))
             (setf (aref hidden-node hidden-node-index)
                   (lambda (dataset input-index)
                     (tanh
                      (loop for input-node-index below number-of-input-nodes
                            sum (* (aref weights-1 input-node-index hidden-node-index)
                                   (funcall (aref input-node input-node-index)
                                            dataset input-index)))))))))
Posted in neural nets, programming, trading | Leave a comment