Stock Price Prediction using Social News Sentiment Analysis
Stock market prediction using Sentiment Analysis of News
To improve the performance of Stock Market Prediction models, the first data to be added to the price data is news published by newspapers or online blogs.
This is based on the fact that one or more published news items can influence the market trend. For example, news that a listed company has laid off staff might suggest that the company is in financial difficulty, tempting investors to sell its shares and thus lowering the market price. Conversely, positive news would encourage investors to buy a certain stock. The main problems with this approach, once a particular asset has been chosen, are therefore where to get the news and how to assess it.
A first article that can be analysed to understand how to carry out this type of process is 'Empirical evaluation of an automated intraday stock recommendation system incorporating both market data and textual news' by T. Geva and J. Zahavi from 2014. [34]
The authors use a dataset that combines historical stock market indicators with textual news to predict market movements. The market data cover 72 assets taken from the 500 that make up the S&P 500 [footnote - The S&P 500 Index is a stock market index that measures the stock performance of the 500 largest publicly traded companies in the USA. The S&P 500 is one of the best representations of the US stock market]. The news data consist of breaking-news headlines taken from the Reuters website and from the sources that appear in the Reuters featured news feed, such as Business Wire, PR Newswire, EDGAR Online, and MarketWatch. All of this information was collected between 15 September 2006 and 31 August 2007.
To train and validate the classification models, the authors used 1,500 news items randomly selected from the first three months of the available data and manually tagged with the categories to which they belonged. They first cleaned the data, removing irrelevant news and news related to previous market activity, and then split the resulting dataset into training and test sets in order to train a classifier that assigns each news item its corresponding category. Once classified, each news item is given a sentiment: a label stating whether the news is positive or negative, followed by a numerical score. These values were first assigned manually; the labelled data were then used to train a supervised linear regression algorithm that determines the score of unlabelled news items.
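The scoring step described above can be illustrated with a minimal sketch: a handful of manually scored headlines is used to fit a linear regression, which then scores an unlabelled headline. The word lists, the single polarity feature, and the example headlines and scores are all hypothetical; the source does not state which features the authors fed to their regression.

```python
# Hypothetical word lists (assumption): the paper's real features are not given.
POSITIVE = {"growth", "profit", "record", "upgrade"}
NEGATIVE = {"layoffs", "loss", "lawsuit", "downgrade"}


def polarity(headline):
    """Count of positive words minus count of negative words.

    Words are split on whitespace only, so punctuation is not stripped.
    """
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up headlines with manually assigned scores in [-1, 1] (assumption).
labelled = [
    ("Company announces record profit", 0.8),
    ("Analysts issue upgrade after growth", 0.7),
    ("Company confirms layoffs", -0.6),
    ("Lawsuit leads to quarterly loss", -0.9),
    ("Company schedules annual meeting", 0.0),
]

slope, intercept = fit_line(
    [polarity(text) for text, _ in labelled],
    [score for _, score in labelled],
)


def sentiment_score(headline):
    """Score an unlabelled headline with the fitted line."""
    return slope * polarity(headline) + intercept
```

With this toy data, sentiment_score("Regulator files lawsuit") comes out lower than sentiment_score("Firm posts record growth"). A realistic system would replace the single polarity feature with a richer text representation.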
The paper tests both a Deep Learning approach, i.e. Neural Networks, and a supervised approach based on Stepwise Logistic Regression (SLR). The results show that integrating textual news data with market data improves the performance of the model.
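One simple way to picture the integration of news and market data is to join both sources on a shared time window, so that each model input row carries market indicators and news features side by side. The sketch below assumes hypothetical field names (ret, volume) and window keys, and treats windows without news as neutral; none of these choices come from the paper.

```python
def build_rows(market, sentiment, neutral=0.0):
    """Join market indicators with the mean news sentiment of the same window.

    market:    {window: {indicator_name: value}}
    sentiment: {window: [score, ...]} for the news published in that window
    Windows without news get `neutral` as sentiment (an assumption).
    """
    rows = []
    for window, indicators in sorted(market.items()):
        scores = sentiment.get(window, [])
        mean_score = sum(scores) / len(scores) if scores else neutral
        rows.append({
            "window": window,
            **indicators,
            "news_count": len(scores),
            "news_sentiment": mean_score,
        })
    return rows


# Made-up intraday windows and values, for illustration only.
market = {
    "2007-01-03 10:00": {"ret": 0.004, "volume": 12000},
    "2007-01-03 10:30": {"ret": -0.002, "volume": 9500},
}
sentiment = {"2007-01-03 10:00": [0.5, -0.25]}

rows = build_rows(market, sentiment)
```

Each resulting row can then be passed to any classifier, which makes it easy to compare a model trained on market columns alone with one trained on the combined columns.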
In 2019, R. Ren, D. D. Wu, and T. Liu [47] applied a process similar to that of [34], training a Support Vector Machine to predict stock price movement (up or down) for the SSE 50 Index, a primarily blue-chip stock index on the Shanghai stock market, using 8 features based on sentiment analysis and historical asset data. Unlike the previous article, the dataset combined two sources. The first was conventional time series data for the 50 assets that make up the SSE 50 Index, with an additional label of 1 if the trend is positive, i.e. the closing price of the asset is higher than the closing price of the previous day, and -1 otherwise. The second, and more important, was the data obtained by a web crawler that extracted the news of 51 shares of Sina stock on the Eastmoney stock forum between 7 June 2014 and 7 June 2016, saving not only the date and title of each news item but also its content. [...]
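The trend label described for [47] can be sketched in a few lines: each day is labelled 1 when its closing price is higher than the previous day's close and -1 otherwise. The function name is hypothetical, and labelling an unchanged close as -1 is an assumption, since the text does not say how ties are handled.

```python
def trend_labels(closes):
    """Label each day from the second one onwards.

    +1 if the close is higher than the previous day's close, else -1.
    Unchanged closes are labelled -1 here (an assumption).
    The first day has no previous close, so it gets no label.
    """
    return [1 if today > prev else -1 for prev, today in zip(closes, closes[1:])]


# Made-up closing prices, for illustration only.
labels = trend_labels([10.0, 10.5, 10.2, 10.2, 11.0])
```

The list of labels is one element shorter than the list of prices, so it has to be aligned with the features of the second day onwards before training a classifier.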
This excerpt is taken from the thesis:
Stock Price Prediction using Social News Sentiment Analysis
Author: Anna Lamboglia
Type: Second-cycle degree (laurea magistrale or specialistica)
Year: 2021-22
University: Università degli Studi di Napoli - Federico II
Faculty: Law
Degree course: Computer Engineering
Supervisor: Vincenzo Moscato
Language: English
Pages: 111
