Twitter is one of the most popular social networks for sentiment analysis. This dataset of tweets is related to the stock market. We collected 943,672 tweets between April 9 and July 16, 2020, using the S&P 500 tag (#SPX500), references to the top 25 companies in the S&P 500 index, and the Bloomberg tag (#stocks). Of the 943,672 tweets, 1,300 were manually annotated as positive, neutral, or negative. A second independent annotator reviewed the manually annotated tweets.

Instructions: 

Raw Twitter data was downloaded through the Twitter REST API search using the Tweepy (version 3.8.0) Python package, which was created to make interaction between developers and the REST API easier. The Twitter REST API only retrieves data from the past seven days and allows tweets to be filtered by language. Only English-language (en) tweets were retrieved. Data collection was performed from April 9 to July 16, 2020, using the following Twitter tags as search parameters: #SPX500, #SP500, SPX500, SP500, $SPX, #stocks, $MSFT, $AAPL, $AMZN, $FB, $BBRK.B, $GOOG, $JNJ, $JPM, $V, $PG, $MA, $INTC, $UNH, $BAC, $T, $HD, $XOM, $DIS, $VZ, $KO, $MRK, $CMCSA, $CVX, $PEP, $PFE. Because of the large volume of data retrieved in the raw files, only each tweet's content and creation date were stored.

 

The file tweets_labelled_09042020_16072020.csv consists of 5,000 tweets selected by random sampling from the 943,672 collected. Of those 5,000 tweets, 1,300 were manually annotated and reviewed by a second independent annotator. The file tweets_remaining_09042020_16072020.csv contains the remaining 938,672 tweets.
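
The split between the two files can be illustrated with a minimal sketch of sampling without replacement. This is not the authors' script: the function name `split_indices` and the seed are assumptions made for illustration, and only the two counts come from the description above.

```python
import random

TOTAL_TWEETS = 943_672  # tweets collected (from the description)
SAMPLE_SIZE = 5_000     # tweets placed in the labelled file (from the description)


def split_indices(total, sample_size, seed=0):
    """Return (sampled, remaining) row indices, drawn without replacement.

    The seed is an arbitrary choice so that the sketch is repeatable.
    """
    rng = random.Random(seed)
    sampled = set(rng.sample(range(total), sample_size))
    remaining = [i for i in range(total) if i not in sampled]
    return sorted(sampled), remaining


sampled, remaining = split_indices(TOTAL_TWEETS, SAMPLE_SIZE)
print(len(sampled), len(remaining))  # 5000 938672
```

The two index lists are disjoint and together cover every collected tweet, which is why the remaining file holds 943,672 - 5,000 = 938,672 rows.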


This data is for the portfolio


This dataset includes the time series of daily returns for the main stock indices of G20 countries including Argentina, Australia, Brazil, Canada, China, the European Union, France, Germany, India, Indonesia, Italy, Japan, Mexico, Russia, Saudi Arabia, South Africa, South Korea, Turkey, the United Kingdom, and the United States from Jan 1, 2010 to Jan 1, 2020.


Data for the study has been retrieved from a publicly available data set of a leading European P2P lending platform, Bondora (https://www.bondora.com/en). The retrieved data is a pool of both defaulted and non-defaulted loans from the time period between 1st March 2009 and 27th January 2020. The data comprises demographic and financial information of borrowers and loan transactions. In P2P lending, loans are typically uncollateralized and lenders seek higher returns as compensation for the financial risk they take.

Instructions: 

The dataset also includes a data preprocessing Jupyter notebook that helps with working with the data and performing basic preprocessing. The zip file of the dataset contains both the preprocessed dataset and the raw dataset extracted directly from the Bondora website https://www.bondora.com/en.

Disclaimer:
In the attached notebook, I relied on my own intuition and assumptions when preprocessing the data.


Ropsten is an Ethereum testnet that runs the same protocol as Ethereum.

It contains the following fields:

-Transaction Hash Code
-Block Number
-Unix Timestamp
-Date and Time
-Address From
-Address To
-Contract Address
-Value In (in Ether)
-Value Out (in Ether)
-CurrentValue
-Transfer Fee (in Ether)

Instructions: 

EXPERIMENT RESULTS, EVALUATION AND DISCUSSION

The experimental prototype of the proposed remittance model was evaluated over two days, May 22 and May 23, 2019, each day from 9 am to 4 am GMT+1. In total, 160 remittance transactions were performed. All transaction details are publicly available on the Ropsten public log at: https://ropsten.etherscan.io/address/0x904248FE328a186CE76666ee9e2548Ab2.... At the time of the transactions, the value of one Ether was $230.17.

 

The proposed model was evaluated against two criteria. The first was whether the entire process behaved in a trustworthy manner with respect to the intention of the expeditor. The second was the financial cost of using the system compared to existing MTO services.

 

To test the trustworthiness of the model, different scenarios were tested to ensure that the logic embedded in the application prototype behaved as expected. These test scenarios were simulated using Selenium, a web automation framework. Selenium simulates human interaction with a web-based application. With Selenium, one can track every keystroke a user enters in the system and check the internal state of the system at each step. Testing with Selenium not only allows us to test the quality of the code, but also tests the overall behavior of the system as seen by the end user, especially page responsiveness.

 

The prototype was tested for 80 different scenarios. For each scenario, the test result was either pass or fail. The tests included security scenarios where the most common attacks were simulated, and the normal functioning of the application with scenarios simulating participants not sending confirmation on time, or sending wrong signatures, or participants trying to cheat or hack the system. The tests targeted the application layer, which interacts with the end users, as well as the smart contract logic layer.

 

After the correction of all the coding defects, the second version of the prototype was submitted to all the tests and behaved according to our proposed definition of a trustworthy remittance system. However, an important caveat should be noted. Scenarios not considered in this first research activity could exist that would show the model may not perform as intended, especially in the logic layer dealing with the smart contract reliability. Testing smart contract reliability is an important area of research. 

 

To determine how costly the use of the proposed trusted remittance application would be compared to a current MTO, the prototype was tested using Ropsten, the public Ethereum test network. Ropsten is the closest system to the real Ethereum network. Using Ropsten allows us to have a close approximation of how the system would behave in real life.  

 

When we discuss the cost of using the proposed model, we refer to the gas cost, which is the cost charged by the Ethereum network to perform a transaction. With Ethereum there are no free operations; every operation (command) performed in the system costs Ether, and this cost is called a gas fee. The gas fee can be understood as a constraint that obliges the developer of a smart contract to use the least resource-intensive operations possible. The gas fees are distributed to miners of the Ethereum network as compensation for maintaining security.
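
As a rough illustration of how a gas fee translates into dollars, the sketch below multiplies the gas consumed by a gas price and converts the result at the Ether price quoted above ($230.17). The 10 gwei gas price is a hypothetical value chosen for the example, not a figure from the experiment; 21,000 gas is the cost of a plain Ether transfer, and smart contract calls such as those in the prototype would consume more.

```python
GWEI_PER_ETHER = 10**9  # 1 Ether = 1,000,000,000 gwei
ETHER_USD = 230.17      # Ether price at the time of the experiment (from the text)


def gas_fee_usd(gas_used, gas_price_gwei, ether_usd=ETHER_USD):
    """Convert a gas amount and a gas price (in gwei) into a fee in USD."""
    fee_ether = gas_used * gas_price_gwei / GWEI_PER_ETHER
    return fee_ether * ether_usd


# Hypothetical example: a plain transfer (21,000 gas) at an assumed 10 gwei gas price.
fee = gas_fee_usd(gas_used=21_000, gas_price_gwei=10)
print(f"{fee:.4f} USD")
```

Because the fee depends only on the gas consumed and the gas price, it does not grow with the amount of money being transferred.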

 

In a commercial setting, in addition to the gas fees collected by the network, the operator of the smart contract would also set its own fees. For this experiment, the remittance fee was set to zero and only the cost of using Ethereum network resources was studied.

 

160 transactions were performed on the Ropsten network. Each transaction comprised the three stages of a remittance of service: the expeditor sending money through the system, confirmation of the transaction by the service provider and the beneficiary, and the transfer of the money to the service provider, or back to the expeditor in the case of a failed transaction.

Table 4 shows that the average cost for each step of the transaction was between 3.08 Ether and 0.3 Ether. The total average cost of a remittance transaction is $1.29, irrespective of the amount remitted. This is well below the $15 that is usually paid for a remittance of $200 sent from Canada to the DRC using Western Union. Since this amount is only the cost of using the system and does not include any fee for the (RS) operator, adding a small overhead charge for the (RS) operator would still leave the system commercially competitive compared to an MTO.
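
The total and the comparison with the Western Union fee can be checked with a few lines of arithmetic. The sketch below uses the per-step USD averages reported in Table 4 and the $15 fee on a $200 remittance quoted above; the variable names are illustrative only.

```python
from decimal import Decimal

# Average cost per step in USD, as reported in Table 4.
step_costs = {
    ("Expeditor", "Transferring money"): Decimal("0.6608527"),
    ("Beneficiary", "Signing confirmation"): Decimal("0.21955631"),
    ("Beneficiary", "Choosing a service provider"): Decimal("0.20657076"),
    ("Service provider", "Signing confirmation"): Decimal("0.20657076"),
}

total = sum(step_costs.values())

remitted = Decimal("200")  # remittance amount used in the comparison
mto_fee = Decimal("15")    # fee quoted for Western Union

prototype_share = total / remitted * 100  # cost as a percentage of the amount sent
mto_share = mto_fee / remitted * 100

print(total)                                     # 1.29355053
print(round(prototype_share, 2), mto_share)      # 0.65 7.500
```

On a $200 remittance the network cost is therefore about 0.65% of the amount sent, against 7.5% for the quoted MTO fee.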

 

Table 4: Cost of using the application

Actor              Operation                     Average cost in USD
Expeditor          Transferring money            $0.6608527
Beneficiary        Signing confirmation          $0.21955631
                   Choosing a service provider   $0.20657076
Service provider   Signing confirmation          $0.20657076
Total cost of the transaction                    $1.29355053

The results show that the amount remitted has no influence on the cost of the transfer, that is, the gas cost charged by the Ethereum network. There does, however, seem to be a slight dependence between the time at which a transaction is completed and the gas cost. This could be linked to the load on the system at certain times of the day.

 

The time to complete a command in the Ethereum network, whether sending money or confirming a transaction, ranged from 5 seconds to 1 minute during the testing. This time was observed to be a function of the network load and unrelated to the amount or type of the command.

 

It was also noted that when using Ethereum, the cost of the transaction is spread among the participants. In the current MTO model, the expeditor bears all the costs of the transaction. In the proposed remittance of service model using Ethereum, each participant has to pay a fee. 

 

This is unavoidable because of the way Ethereum works. With Ethereum there are no free transactions; every command executed in the Ethereum network costs a transaction fee. This could be addressed in a future iteration of the system model with a function that computes the expected gas cost to be paid by each participant and incorporates it into the expeditor's fees. In this way, the system would automatically reimburse gas fees to the other participants.


This dataset was created for research on blockchain anomaly and fraud detection and donated to the IEEE DataPort online community.

https://github.com/epicprojects/blockchain-anomaly-detection

 

Files: 

bitcoin_hacks_2010_2013.csv: Contains known hashes of bitcoin theft/malicious transactions from 2010-2013

malicious_tx_in.csv: Contains hashes of input transactions flowing into malicious transactions.

Instructions: 

The dataset contains transaction hashes of all bitcoin heists, thefts, hacks, scams, and losses from 2010-2014. These datasets were constructed from information on the Bitcoin forum (https://bitcointalk.org/index.php?topic=576337.0) and Blockchain.com.

 References:

  1. https://arxiv.org/abs/1611.03942

  2. https://arxiv.org/abs/1611.03941


Information:

This dataset was created for research on blockchain anomaly and fraud detection and donated to the IEEE DataPort online community.

Research experiments for this dataset can be found at https://github.com/epicprojects/blockchain-anomaly-detection

 

 

Instructions: 

 

This dataset was created by parsing raw bitcoin .BLK files. Using this dataset, one can create a directed acyclic graph (DAG) of the bitcoin transaction network, as described in the references.

 

DIMENSIONS:

  • tx_hash_from: Input transaction hash
  • tx_hash_to: Output transaction hash
  • datetime: Represents the date and time of the transaction
  • amount_bitcoins: The amount of bitcoins transferred.
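
A minimal sketch of how such a graph could be assembled from these four columns is shown below. The sample rows and shortened hashes are hypothetical, and the helper names `build_graph` and `topological_order` are illustrative; each row is treated as one directed edge from tx_hash_from to tx_hash_to, and a topological sort (Kahn's algorithm) confirms that the result is acyclic.

```python
import csv
import io
from collections import deque

# Hypothetical rows in the dataset's column layout (hashes shortened for readability).
SAMPLE = """tx_hash_from,tx_hash_to,datetime,amount_bitcoins
a,b,2011-01-01 00:00:00,5.0
a,c,2011-01-01 00:00:00,2.5
b,d,2011-01-02 00:00:00,4.0
c,d,2011-01-02 00:00:00,2.0
"""


def build_graph(rows):
    """Map each transaction hash to a list of (successor hash, amount) edges."""
    graph = {}
    for row in rows:
        src, dst = row["tx_hash_from"], row["tx_hash_to"]
        graph.setdefault(src, []).append((dst, float(row["amount_bitcoins"])))
        graph.setdefault(dst, [])  # make sure sink nodes are present too
    return graph


def topological_order(graph):
    """Kahn's algorithm; raises ValueError if the graph has a cycle."""
    indegree = {node: 0 for node in graph}
    for edges in graph.values():
        for target, _ in edges:
            indegree[target] += 1
    queue = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for target, _ in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order


graph = build_graph(csv.DictReader(io.StringIO(SAMPLE)))
order = topological_order(graph)
print(order)  # ['a', 'b', 'c', 'd']
```

For the real file, the same functions could be fed from `csv.DictReader(open(path))`, although a graph of this size may call for a dedicated graph library rather than plain dictionaries.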

 

 

REFERENCES:

  1. https://arxiv.org/abs/1611.03942
  2. https://arxiv.org/abs/1611.03941
  3. https://arxiv.org/abs/1107.4524
  4. http://anonymity-in-bitcoin.blogspot.com/2011/09/code-datasets-and-spsn1...
  5. http://snap.stanford.edu/class/cs224w-2013/projects2013/cs224w-030-final...

 

 


Information:

This dataset was created for research on blockchain anomaly and fraud detection and donated to the IEEE DataPort online community.

https://github.com/epicprojects/blockchain-anomaly-detection

 

 

 

Instructions: 

A directed-acyclic graph is created from the bitcoin transaction data and metadata is extracted to create this dataset. 

 

DIMENSIONS:

  • tx_hash: Hash of the bitcoin transaction.
  • indegree: Number of transactions that are inputs of tx_hash
  • outdegree: Number of transactions that are outputs of tx_hash.
  • in_btc: Number of bitcoins on each incoming edge to tx_hash.
  • out_btc: Number of bitcoins on each outgoing edge from tx_hash.
  • total_btc: Net number of bitcoins flowing in and out from tx_hash.
  • mean_in_btc: Average number of bitcoins flowing in for tx_hash.
  • mean_out_btc: Average number of bitcoins flowing out for tx_hash.
  • in-malicious: Will be 1 if the tx_hash is an input of a malicious transaction.
  • out-malicious: Will be 1 if the tx_hash is an output of a malicious transaction.
  • is-malicious: Will be 1 if the tx_hash is a malicious transaction.
  • out_and_tx_malicious: Will be 1 if the tx_hash is a malicious transaction or an output of a malicious transaction.
  • all_malicious: Will be 1 if the tx_hash is a malicious transaction or an output of a malicious transaction or input of a malicious transaction.
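
The sketch below shows one way these columns could be derived from a list of (from, to, amount) edges and a set of known malicious hashes. It is an interpretation, not the authors' extraction code: the edge list and the malicious set are hypothetical, in_btc and out_btc are assumed to be totals over the incoming and outgoing edges, and total_btc is left out because its exact definition is not spelled out here.

```python
from collections import defaultdict

# Hypothetical edges (tx_hash_from, tx_hash_to, amount_bitcoins) and malicious set.
edges = [("a", "b", 5.0), ("a", "c", 2.5), ("b", "d", 4.0), ("c", "d", 2.0)]
malicious = {"b"}


def node_features(edges, malicious):
    """Compute per-transaction degree, amount, and malicious-flag features."""
    incoming, outgoing = defaultdict(list), defaultdict(list)
    parents, children = defaultdict(set), defaultdict(set)
    for src, dst, btc in edges:
        outgoing[src].append(btc)
        incoming[dst].append(btc)
        children[src].add(dst)
        parents[dst].add(src)

    features = {}
    for tx in sorted(set(incoming) | set(outgoing)):
        in_vals = incoming.get(tx, [])
        out_vals = outgoing.get(tx, [])
        is_mal = int(tx in malicious)
        # tx is an input of a malicious transaction: it has an edge into one.
        in_mal = int(any(c in malicious for c in children.get(tx, ())))
        # tx is an output of a malicious transaction: it has an edge from one.
        out_mal = int(any(p in malicious for p in parents.get(tx, ())))
        features[tx] = {
            "indegree": len(in_vals),
            "outdegree": len(out_vals),
            "in_btc": sum(in_vals),    # assumed: total over incoming edges
            "out_btc": sum(out_vals),  # assumed: total over outgoing edges
            "mean_in_btc": sum(in_vals) / len(in_vals) if in_vals else 0.0,
            "mean_out_btc": sum(out_vals) / len(out_vals) if out_vals else 0.0,
            "in-malicious": in_mal,
            "out-malicious": out_mal,
            "is-malicious": is_mal,
            "out_and_tx_malicious": int(is_mal or out_mal),
            "all_malicious": int(is_mal or out_mal or in_mal),
        }
    return features


features = node_features(edges, malicious)
print(features["d"])
```

In this toy graph, transaction a feeds the malicious transaction b and so gets in-malicious = 1, while d receives from b and gets out-malicious = 1; c touches neither side and keeps all flags at 0.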

 

REFERENCES:

  1. https://arxiv.org/abs/1611.03942
  2. https://arxiv.org/abs/1611.03941
  3. https://arxiv.org/abs/1107.4524
  4. http://anonymity-in-bitcoin.blogspot.com/2011/09/code-datasets-and-spsn1...
  5. http://snap.stanford.edu/class/cs224w-2013/projects2013/cs224w-030-final...

 

 


Consumer complaints are added to this public database after the company has responded to the complaint, confirming a commercial relationship with the consumer, or after they've had the complaint for 15 calendar days, whichever comes first. We don’t verify all the facts alleged in complaints, but we do give companies the opportunity to publicly respond to complaints by selecting responses from a pre-populated list. Company-level information should be considered in the context of company size and/or market share.

