Using Vsync to build an online stock trading platform.

In this project, your job is to use Vsync as the basis for a stock trading system. The basic idea in its simplest form is as follows. The platform maintains a database of stocks that are available for trading, consisting of stock ticker symbols that index into a kind of table. At each moment in time, each stock has a list of bids and offers. For example, perhaps Sally Smith is interested in buying 10,000 shares of IBM stock if the price drops below $180/share, and would sell 25,000 IBM shares at prices above $225/share. Jim Jones isn't interested in buying IBM, but has 40,000 shares for sale at $227/share.

In general, bids and offers are a kind of template that can be fairly elaborate. Your first task is to decide what you want to support in your bid and offer "sheets", which should be usable by both human traders and computer trading platforms. Options to consider include:
  • Instead of just one bid or one offer, there could be a list (all for the same stock, but at various prices)
  • There might be a minimum block size: Sally may not want to buy a single share at a time; perhaps she would only be interested in blocks of 5,000 shares or more, etc.
  • The buyer or seller may or may not be willing to break up a trade into smaller sub-trades. If not, the stock system must satisfy the entire request at one time.
  • The buyer or seller might specify a limit on the residual (untraded) portion: Sally might be open to selling fewer than 25,000 shares, but perhaps only if she can sell at least 20,000 at the desired price. Here the residual would be 5,000 shares.
  • The price could be expressed as a range, or as a specific number. Above we used a single number, but there can be cases in which a range of pricing gives more flexibility to the exchange.
  • There will probably be a timeout after which the offer is no longer valid, and an ID that can be used to cancel the order if conditions change. You might allow a new order to replace an old order, to reduce overhead if a trader keeps adjusting his or her bid/offered pricing.
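
As a starting point, one possible shape for such a sheet is sketched below in Python. Every name here (Order, Side, min_block, max_residual and so on) is an illustrative assumption rather than part of Vsync or of any real exchange API, and you may well decide on different fields. It supports a single price rather than a range or a list.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Side(Enum):
    BUY = "buy"
    SELL = "sell"


_next_id = itertools.count(1)  # hypothetical order-ID generator


@dataclass
class Order:
    """One bid or offer sheet (all field names are assumptions)."""
    trader: str
    symbol: str
    side: Side
    quantity: int                       # total shares wanted or offered
    limit_price: float                  # buy at or below / sell at or above
    min_block: int = 1                  # smallest fill the trader accepts
    all_or_nothing: bool = False        # True: never split into sub-trades
    max_residual: Optional[int] = None  # largest untraded remainder allowed
    expires_at: Optional[float] = None  # absolute time in seconds; None = no timeout
    order_id: int = field(default_factory=lambda: next(_next_id))

    def accepts_fill(self, shares: int) -> bool:
        """Would the trader accept a trade of exactly `shares` shares?"""
        if shares <= 0 or shares > self.quantity:
            return False
        if self.all_or_nothing:
            return shares == self.quantity
        if shares < self.min_block:
            return False
        if self.max_residual is not None and self.quantity - shares > self.max_residual:
            return False
        return True

    def is_live(self, now: float) -> bool:
        """False once the timeout has passed."""
        return self.expires_at is None or now < self.expires_at


# Sally sells 25,000 IBM at $225, but only if at least 20,000 shares trade.
sally_sell = Order("Sally Smith", "IBM", Side.SELL, 25_000, 225.0, max_residual=5_000)
# Sally buys 10,000 IBM at $180, in blocks of 5,000 shares or more.
sally_buy = Order("Sally Smith", "IBM", Side.BUY, 10_000, 180.0, min_block=5_000)
```

The order_id is what a cancel or replace request would refer to; how IDs are generated in a replicated service is a separate design question.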

The basic role of the exchange is to receive a new bid or offered record, and then either carry out a trade if this new record enables one, or update its bid/offered database. The exchange also continuously streams reports on the bid/offered pricing and on completed trades, in a publish/subscribe model where clients subscribe to symbols that interest them. (A real exchange charges for this basic information service as a function of how much volume it experiences on the total set of subscriptions for a particular client, and also charges a fee for each transaction that it completes, normally in pennies per share or as a percentage of the value of the entire trade. Exchanges also have various mechanisms to ensure the liquidity of the traders they support, and to wire transfer funds and shares for consummated trades.)
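
The "trade or update the book" step can be sketched as follows for a single symbol. This is a minimal illustration under assumptions of our own: the OrderBook class and its method names are hypothetical, orders are matched by best price and then by arrival order, a trade executes at the price of the order that was already resting in the book, and the richer sheet options (minimum blocks, residual limits, timeouts) are ignored.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass
class BookEntry:
    price: float
    quantity: int
    trader: str


class OrderBook:
    """Bids and offers for one symbol (hypothetical sketch, not a Vsync API)."""

    def __init__(self):
        self._seq = itertools.count()  # arrival order, used as a tie-breaker
        self._bids = []                # heap keyed on -price: highest bid first
        self._offers = []              # heap keyed on price: lowest offer first

    def best_bid(self):
        return self._bids[0][2].price if self._bids else None

    def best_offer(self):
        return self._offers[0][2].price if self._offers else None

    def submit(self, side, trader, price, quantity):
        """Trade against the book where possible, then rest any remainder.

        Returns a list of (buyer, seller, price, shares) trades.
        """
        trades = []
        if side == "buy":
            opposite, crosses = self._offers, (lambda best: best <= price)
        else:
            opposite, crosses = self._bids, (lambda best: best >= price)
        while quantity > 0 and opposite:
            resting = opposite[0][2]
            if not crosses(resting.price):
                break  # best opposing price is not good enough
            filled = min(quantity, resting.quantity)
            if side == "buy":
                trades.append((trader, resting.trader, resting.price, filled))
            else:
                trades.append((resting.trader, trader, resting.price, filled))
            quantity -= filled
            resting.quantity -= filled
            if resting.quantity == 0:
                heapq.heappop(opposite)
        if quantity > 0:  # no (further) trade possible: update the book
            own = self._bids if side == "buy" else self._offers
            key = -price if side == "buy" else price
            heapq.heappush(own, (key, next(self._seq), BookEntry(price, quantity, trader)))
        return trades


book = OrderBook()
book.submit("sell", "Jim Jones", 227.0, 40_000)
book.submit("sell", "Sally Smith", 225.0, 25_000)
trades = book.submit("buy", "Ann", 226.0, 30_000)  # "Ann" is a made-up trader
```

Here the buy order takes all of Sally's shares at $225, and the remaining 5,000 shares rest in the book as a bid at $226 because Jim's $227 offer is too expensive. Each returned trade, and each change to the best bid or offer, is what the exchange would publish to subscribers of the symbol.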

Even fancier options exist. Some exchanges are starting to experiment with basket-trading interfaces, in which traders who work with index funds such as the Dow Jones Industrial Average or the S&P 500 can trade "shares" in the underlying index. One share of the S&P 500 really consists of a basket containing shares of the underlying 500 stocks. Thus the single sale really involves selling 500 stocks. One could imagine solutions that will match up a basket bid or offer against mixtures of other kinds of baskets, shares of the underlying equities, etc. On the other hand, optimal versions of stock matching algorithms quickly become NP-hard problems. Still, these are the kinds of features that could make your stock exchange very popular with customers. Moreover, some NP-hard problems can be approximated using simpler greedy solutions that often come close to the optimum, but that run in time closer to that of a sorting algorithm.
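
To give a flavor of the greedy approach, the sketch below prices a basket purchase by filling each underlying stock from the cheapest available offers first. The function name and the data layout are assumptions made for this example. It deliberately ignores minimum-block, all-or-nothing and residual constraints; those constraints are exactly what make the full problem hard, so treat this as a baseline and not as an optimal matcher.

```python
def greedy_basket_cost(basket, units, offers):
    """Cost of buying `units` of a basket, cheapest offers first (hypothetical helper).

    basket: {symbol: shares of that symbol per basket unit}
    offers: {symbol: [(price, shares_available), ...]}
    Returns the total cost, or None if some stock cannot be fully supplied.
    """
    total = 0.0
    for symbol, per_unit in basket.items():
        needed = per_unit * units
        # Sorting the offers by price dominates the running time.
        for price, available in sorted(offers.get(symbol, [])):
            take = min(needed, available)
            total += take * price
            needed -= take
            if needed == 0:
                break
        if needed > 0:
            return None  # not enough shares on offer for this stock
    return total


# A toy two-stock "index"; symbols and prices are made up.
toy_basket = {"AAA": 2, "BBB": 1}
toy_offers = {
    "AAA": [(11.0, 5), (10.0, 15), (12.0, 100)],
    "BBB": [(50.0, 10)],
}
cost = greedy_basket_cost(toy_basket, 10, toy_offers)
```

A basket bid could then be accepted when the returned cost is at or below the bidder's limit for the whole basket.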

Last, make a block diagram illustrating a client filling in the form, talking to a stock exchange server (just one) that runs these matching algorithms.

Now that you've designed the basic trading "mechanism", implement a single non-replicated server that supports your API, and then a client platform that uses it. Employ a RESTful or Windows Communication Foundation (WCF) version of client/server remote method invocation. You'll find lots of tools for implementing RESTful client/server systems on Linux. WCF is considered to be the first choice on Windows systems and is easy to use from Visual Studio, where you can push a button to create the template of your server or to import the server API into a client (build the server first).

Now test your single-server system out with a few concurrent clients. In what ways does the matching problem limit performance? How does the solution scale (slow down!) with growing numbers of clients and growing numbers of stock symbols?

The next step is the one where you'll introduce Vsync. Back at the design board, you'll need to brainstorm about how spreading the service over a group of servers can enhance performance and also provide fault-tolerance (for this particular service you may need to adjust the Vsync failure detection timers!). You will also need a policy for mapping clients to servers: should they connect at random, or should certain clients favor certain servers? The latter makes sense if data is somehow sharded so that specific information is more likely to be found at specific server-group members.
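
One simple sharding policy, sketched here as an assumption and not as something Vsync prescribes, is to hash each ticker symbol onto a server rank. The function names are hypothetical, and num_servers stands for the number of members in the current server group.

```python
import hashlib


def shard_for(symbol, num_servers):
    """Map a ticker symbol to a server rank in [0, num_servers).

    Uses SHA-256 instead of Python's built-in hash(), because hash() of a
    string changes from one process to the next, and every client and
    server must compute the same mapping.
    """
    digest = hashlib.sha256(symbol.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def servers_for_client(symbols, num_servers):
    """The set of server ranks a client should contact for its symbols."""
    return {shard_for(s, num_servers) for s in symbols}


ranks = servers_for_client(["IBM", "MSFT", "AAPL"], 4)
```

Note that a plain modulo remaps most symbols whenever the number of servers changes, so after a failure or a restart a good deal of state would have to move. Whether that cost is acceptable, or whether something like consistent hashing is a better fit, is part of the design question.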

In your design, think about the following behaviors that the solution needs to support. First, you'll want to leverage parallelism: with more servers the stock exchange should be able to handle more trades. But you also need to guarantee correctness: the distributed solution still needs to have a behavior that a single non-distributed server might have had. Next, think about the total capacity of the service in terms of load it can support, numbers of equities it can handle, and perhaps about its ability to deal with complex trading services such as basket trading. Finally, design a fault-tolerance strategy that will guarantee correct recovery when a server fails, or when a failed server restarts.
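
The correctness requirement is easiest to see in a toy simulation. The sketch below does not use Vsync at all; the Replica class is hypothetical and stands in for one server copy. It illustrates why replicas that apply the same commands in the same order behave like a single server, and why replicas that see a different order can disagree about who got the shares.

```python
class Replica:
    """One copy of a (very) simplified exchange holding shares on offer."""

    def __init__(self, shares_on_offer):
        self.remaining = shares_on_offer
        self.fills = []

    def apply(self, command):
        """Deterministically apply one buy request: (trader, shares wanted)."""
        trader, wanted = command
        filled = min(wanted, self.remaining)
        self.remaining -= filled
        self.fills.append((trader, filled))


def run(commands, shares_on_offer=100):
    replica = Replica(shares_on_offer)
    for command in commands:
        replica.apply(command)
    return replica.fills


log = [("A", 80), ("B", 50)]          # two buyers compete for 100 shares

same_order_1 = run(log)               # replica 1
same_order_2 = run(log)               # replica 2, same delivery order
other_order = run(list(reversed(log)))  # a replica that saw B's request first
```

With the shared order, both replicas give A 80 shares and B the remaining 20. The replica that saw the requests in the opposite order gives B 50 and A 50, which no single server processing the log would have done. A totally ordered multicast among the group members is one way to obtain the shared order.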

With this plan, go ahead and interface your existing server to Vsync, running multiple copies on your personal laptop or workstation to test the solution. We recommend a very incremental, step by step approach: add one feature at a time, then retest, making sure that you understand what Vsync is doing at each step. The manner in which your code handles its own crashes can have a big impact on fault-tolerance and self-repair speeds for the copies of the service that remain healthy, so you may want to make sure you've read about those aspects of Vsync and have the solution tuned for peak self-repair speeds!

Test the resulting solution. Does it indeed have the scalability and performance properties you had hoped to see? What about scalability in reporting to the client: do different clients see the same event reports at the same time? If not, what can be said about the skew in time between when client A and client B learn about the identical event X? This might represent an advantage to some clients (e.g. if A consistently learns about events earlier than B, then A is trading using fresher data than B). How can you ensure fairness in your system?
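
One way to quantify the skew, assuming each client logs the time at which it receives every event report and that the client clocks are synchronized closely enough to compare, is sketched below. The function names and the log format are assumptions made for this example.

```python
def delivery_skew(deliveries):
    """Per-event skew: latest minus earliest receive time across clients.

    deliveries: {event_id: {client_name: receive_time_in_seconds}}
    """
    return {
        event: max(times.values()) - min(times.values())
        for event, times in deliveries.items()
        if times
    }


def mean_lead(deliveries, a, b):
    """Average of (time b learned) - (time a learned) over shared events.

    A positive result means client a tends to hear about events first.
    Returns None if the two clients have no events in common.
    """
    gaps = [t[b] - t[a] for t in deliveries.values() if a in t and b in t]
    return sum(gaps) / len(gaps) if gaps else None


# Made-up receive times for two events, X and Y.
observed = {
    "X": {"A": 10.0, "B": 10.5},
    "Y": {"A": 20.0, "B": 20.25, "C": 20.125},
}
skew = delivery_skew(observed)
lead = mean_lead(observed, "A", "B")
```

A mean lead that stays well away from zero for some pair of clients is a sign that one of them is systematically favored.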

Develop a nice graphical user interface to demonstrate your solution to a professor or TA. It should be able to show bid and offered price ranges and actual transactions as time passes for any set of stocks that the user might tell it to track. You could also build other kinds of operator consoles to help the exchange owners track the behavior of the exchange servers and detect problems. Could your system include special features to help the SEC sense evidence of fraud (e.g. if some client seems to have unfair early notice about trades and constantly scoops up the free shares just before a big order comes in, then resells them at a small profit)? Such behaviors are a big topic in the news these days. You can read about this kind of high-speed arbitrage in the cover story of the Sunday New York Times magazine for 6th of April, 2014.

Other features to consider:
  • Carefully profile the performance of your solution and tune it as much as you can. What limits the speed of your stock exchange system? Are there ways Vsync needs to improve in order for your performance to improve? If so, report your findings on this web site (below, or in the discussions board).
  • We made no use of the Vsync out of band (OOB) data movement layer, but that actually offers big speedups for moving big objects from place to place inside the system. What would be the best ways to leverage the OOB features in your solution? Build and demonstrate them.
  • Many stock exchanges allow their clients to maintain read-only copies of the exchange, often with complete historical records, so that they can do fairly elaborate computations (used to arrive at the bid and offered pricing they will use). Build a client-side "database" that can be used this way. In order to make sure a bid or offered price remains valid, add a field so that a client can submit an order that has an if statement built in: "If the status of IBM is still xxxx, then place this order....". You would design a condition validation policy allowing the client to basically say "this order is only valid if the key stock pricing on which it was based hasn't changed since I computed the order parameters." That way, if by the time the order reaches the trading floor IBM stock has shifted (or perhaps the IBM order might be tied to the trades in some basket of tech-sector stocks, or the price of oil, or whatever), the order would be automatically withdrawn.
  • Stock exchanges often have multiple locations at which trading is permitted these days. In the past, we thought of the New York Stock Exchange as being located right in New York. But today there are some exchanges that have locations in NYC, Chicago, London, Tokyo, etc. How could your solution be expanded to offer that sort of geographic spread, with data centers in each of several cities?
  • A fair geographically distributed stock exchange is one with locations in several cities, in which each order is presented at every location at approximately the same instant in real time. This gives high-speed traders less chance to arbitrage by noticing that a big order was placed in NYC, way larger than NYC can handle on its own, and then rushing to place orders in Chicago or London with the goal of driving up prices for whatever the big buyer or seller was trying to do, and thus earning a windfall profit. Think about ways to solve this problem while still maintaining the smallest possible delays between when a request is submitted and when the order is executed. You may assume very fast communication links that are dedicated to your exchange system, ultra-high-quality time synchronization via stratum one NTP time servers on each machine, or other features that could help.
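
Two of the ideas in this list are small enough to sketch directly. The first function is one possible condition validation policy for the "if the status of IBM is still xxxx" orders: the client attaches the prices it observed, and the order is withdrawn if any of them has moved by more than a tolerance. The second is one possible approach to simultaneous presentation: every location holds a new order until a common release time, chosen so that even the slowest link has delivered it. All names are hypothetical, and the second function assumes known one-way delays and synchronized clocks.

```python
def still_valid(condition, current_quotes, tolerance=0.0):
    """True if every price the client relied on is unchanged (within tolerance).

    condition:      {symbol: price the client saw when computing the order}
    current_quotes: {symbol: price at the exchange right now}
    """
    for symbol, seen in condition.items():
        now = current_quotes.get(symbol)
        if now is None or abs(now - seen) > tolerance:
            return False  # the order should be withdrawn automatically
    return True


def release_time(submit_time, one_way_delay):
    """Earliest common time at which all locations can present the order.

    one_way_delay: {location: seconds from the entry point to that location}
    """
    return submit_time + max(one_way_delay.values())


# Illustrative numbers only; the delays are not measurements.
ok = still_valid({"IBM": 180.0}, {"IBM": 180.0})
stale = still_valid({"IBM": 180.0}, {"IBM": 181.0})
go_live = release_time(100.0, {"NYC": 0.0, "Chicago": 0.008, "London": 0.035})
```

The common release time costs every order the worst-case link delay, so there is a direct tradeoff between fairness and the submit-to-execute latency that the bullet asks you to minimize.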

If you get to this line, and have done everything listed above, you should seriously consider trying to launch an actual exchange. There is big money in operating stock exchanges, and a solution with all of these features (especially basket trading) could be a very valuable commodity!

Last edited Nov 19, 2015 at 8:21 PM by birman, version 2