strategy. A momentum strategy sorts stocks by their momentum (past return),
divides them into a few groups, forms a portfolio for each group, and holds
these portfolios for a certain period of time. It is essentially performance
chasing.
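The sorting step described above can be sketched in a few lines of Python. This is only an illustration: the function name `form_momentum_portfolios`, the number of groups, and the toy returns are all assumptions made for the example and are not taken from the CRSP data.

```python
def form_momentum_portfolios(past_returns, n_groups=3):
    """Sort tickers by past return and split them into n_groups portfolios.

    past_returns maps ticker -> past-period return.
    Returns a list of ticker lists, ordered from lowest to highest momentum.
    """
    ranked = sorted(past_returns, key=past_returns.get)  # ascending momentum
    n = len(ranked)
    # Integer boundaries, so group sizes differ by at most one stock.
    bounds = [i * n // n_groups for i in range(n_groups + 1)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(n_groups)]


# Toy past-year returns (made-up numbers, for illustration only).
past_returns = {"AAA": 0.30, "BBB": -0.10, "CCC": 0.05,
                "DDD": 0.12, "EEE": -0.25, "FFF": 0.40}

portfolios = form_momentum_portfolios(past_returns, n_groups=3)
losers, winners = portfolios[0], portfolios[-1]
print("lowest momentum:", losers)
print("highest momentum:", winners)
```

Here the groups are returned in ascending order of momentum, so the first group holds the past losers and the last group holds the past winners.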
I use data from CRSP covering all American stocks, about 7000 in total. I wrote
everything in R and Python and produced some results. Due to time constraints,
I was only able to compute the case of holding the portfolios for one year,
using the past year’s return as momentum. The result is that the hedge
portfolio loses money: the return from going long the top portfolio and short
the bottom portfolio turns out to be negative. Specifically, one would lose
34.625 percent of one's capital by investing in this momentum strategy.
However, this is only one year, so there is no way to attach a significance
statistic to this negative value.
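As a rough sketch of how such a hedge return is computed, the snippet below takes the holding-period returns of the stocks in the long and short portfolios and subtracts one average from the other. Equal weighting within each portfolio is an assumption made for this example (the report does not state the weighting), and the function name `hedge_return` and all numbers are hypothetical toy values, not the CRSP result.

```python
from statistics import mean


def hedge_return(holding_returns, long_tickers, short_tickers):
    """Long-minus-short return over the holding period.

    Assumes equal weights within each leg (an assumption of this sketch).
    """
    long_leg = mean(holding_returns[t] for t in long_tickers)
    short_leg = mean(holding_returns[t] for t in short_tickers)
    return long_leg - short_leg


# Toy holding-period returns (made-up numbers, for illustration only).
holding_returns = {"AAA": -0.10, "FFF": 0.00, "EEE": 0.20, "BBB": 0.10}

spread = hedge_return(holding_returns,
                      long_tickers=["AAA", "FFF"],
                      short_tickers=["EEE", "BBB"])
print(f"hedge portfolio return: {spread:.2%}")
```

A negative value means the portfolio held long underperformed the portfolio held short over the holding period.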
A major issue I faced when cleaning the data was missing observations. Many
stocks have missing data in January and/or December. The real problem is that
those missing observations are not even flagged in the dataset. In other words,
if we were to represent the monthly prices of each stock by a vector, the
vectors would not all have the same length. Another issue is that the set of
stocks that have prices in both January and December changes over time.
Together these two create a matching and merging problem that stopped me from
progressing. When I started this project, I was not particularly good at Python
or R. I chose R and was unable to solve this problem. Towards the end of this
project,
incidentally, I began to form a clear picture of how to use Python to
manipulate data. There is certainly some kind of procedure to handle this, as
there is in SAS. A solution using basic Python operations is also fairly
simple. However, due to time constraints, I was unable to actually implement
it in Python. I therefore outline the solution here: use a dictionary. I first
create a dictionary where the keys are the tickers and the values hold all
the corresponding prices. While loading prices from the dataset, which is
presumably in a dataframe, I would insert an arbitrary sentinel number, say
-10000, as the price wherever an observation is missing. This way, I obtain a
set of price vectors that all have equal length. Next, I delete all the tickers
that have an observation of -10000. I do this for every year and form a
corresponding dictionary for each. Lastly, I take the intersection of the
ticker sets of those dictionaries. This common set of tickers represents all
the stocks that legitimately have prices in both January and December in every
year. Smoothing prices is another way to deal with the missing observations.
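The dictionary procedure outlined above was not implemented in the project, so the following is only a minimal sketch of how it might look. It assumes the data for each year can be read as (ticker, month, price) rows; the function names, the input layout, and the toy prices are hypothetical and chosen for illustration.

```python
MISSING = -10000  # sentinel price for a missing observation, as in the text


def price_vectors(rows, months=(1, 12)):
    """Build {ticker: [price for each month]} for one year.

    rows is a list of (ticker, month, price) tuples. Any month with no
    observation is filled with MISSING, so all vectors have equal length.
    """
    observed = {(ticker, month): price for ticker, month, price in rows}
    tickers = {ticker for ticker, _ in observed}
    return {t: [observed.get((t, m), MISSING) for m in months]
            for t in tickers}


def complete_tickers(vectors):
    """Tickers whose price vector contains no MISSING sentinel."""
    return {t for t, prices in vectors.items() if MISSING not in prices}


def common_tickers(rows_by_year):
    """Intersection of the complete-ticker sets across all years."""
    yearly = [complete_tickers(price_vectors(rows))
              for rows in rows_by_year.values()]
    return set.intersection(*yearly) if yearly else set()


# Toy data (made-up tickers and prices, for illustration only).
rows_by_year = {
    2000: [("AAA", 1, 10.0), ("AAA", 12, 12.0),
           ("BBB", 1, 20.0),                      # December missing
           ("CCC", 1, 5.0), ("CCC", 12, 6.0)],
    2001: [("AAA", 1, 12.5), ("AAA", 12, 11.0),
           ("BBB", 1, 18.0), ("BBB", 12, 19.0),
           ("CCC", 12, 7.0)],                     # January missing
}

common = common_tickers(rows_by_year)
print(common)
```

In this toy example only one ticker has both a January and a December price in every year, so it is the only one that survives the intersection.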
Next, I export this dataframe to an Excel spreadsheet because I cannot proceed
in R and therefore turn to Python.
Note that the order is reversed, so the top portfolio is the one that holds the
stocks with the lowest momentum.
In conclusion, this momentum strategy is unlikely to work within a very short
period of time. The only way to establish its significance is to run the
computation in a rolling fashion, which is left to future work. The real
challenge, as it turns out, is not so much coding the momentum strategy itself
as cleaning the data, specifically the matching and merging. In realistic
empirical research, one almost always has to deal with matching and merging.
I therefore need to master matching and merging in at least R and Python.
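The rolling computation left to future work could look roughly like the sketch below. It pairs each year (formation) with the following year (holding), computes one long-minus-short return per pair, and then forms a simple t-statistic for the mean. Several choices here are assumptions of the sketch and not part of the project: equal weighting, splitting the stocks into two halves, treating consecutive keys as adjacent years, using a plain t-statistic as the significance measure, and all function names and toy returns.

```python
from math import sqrt
from statistics import mean, stdev


def long_short_return(formation, holding, fraction=0.5):
    """Winners-minus-losers return for one formation/holding pair.

    formation and holding map ticker -> return. Only tickers present in
    both periods are used; each leg is equal-weighted (an assumption).
    """
    tickers = sorted(set(formation) & set(holding), key=formation.get)
    k = max(1, int(len(tickers) * fraction))
    losers, winners = tickers[:k], tickers[-k:]
    return (mean(holding[t] for t in winners)
            - mean(holding[t] for t in losers))


def rolling_hedge_returns(returns_by_year):
    """One hedge return per consecutive (formation, holding) year pair."""
    years = sorted(returns_by_year)
    return [long_short_return(returns_by_year[a], returns_by_year[b])
            for a, b in zip(years, years[1:])]


def t_statistic(sample):
    """t-statistic for the null hypothesis that the mean is zero."""
    return mean(sample) / (stdev(sample) / sqrt(len(sample)))


# Toy yearly returns (made-up numbers, for illustration only).
returns_by_year = {
    2000: {"AAA": 0.30, "BBB": -0.10, "CCC": 0.05, "DDD": -0.20},
    2001: {"AAA": 0.10, "BBB": 0.00, "CCC": -0.05, "DDD": 0.20},
    2002: {"AAA": -0.10, "BBB": 0.15, "CCC": 0.05, "DDD": 0.10},
    2003: {"AAA": 0.20, "BBB": 0.05, "CCC": 0.00, "DDD": -0.05},
}

spreads = rolling_hedge_returns(returns_by_year)
print("hedge returns:", spreads)
print("t-statistic:", t_statistic(spreads))
```

With many formation/holding pairs instead of a single year, the sample of hedge returns is large enough for a test statistic to be meaningful, which is the point of running the computation in a rolling fashion.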