Interview Question: Processing File

Sample Question #245 (programming – general)
 
Imagine you have a text (ASCII) data file that contains one stock-date record per line. Each line has the following information: stock ticker, trading date, close price, day’s volume, closing call price, and closing put price. (Assume that each stock in our universe can have at most one call and one put.) Note that the closing call price and put price are either both present or both missing (i.e., a stock never has only a call or only a put); when they are missing, the last two fields are left blank.
 
In your favorite programming language, read this input file and create an output file that contains one stock per line with the following information per line:
 
     stock ticker, average close price over all days, total volume for all days, average call price, average put price
 
Don’t worry about little details like how to read the file or which delimiter to use. I want to know how you process the information and create the output lines as requested.

2 Responses to Interview Question: Processing File

  1. Brett says:

    HINT
     
    Here are two possible ways to solve this:
     
    1) If you use a language like Perl, which is great with hash tables, then you should use a hash table to store the input stock data, keyed by the ticker.
     
    2) If you want to use a loop, be sure to keep track of which stock you’re storing into an appropriate data structure. (What data structure should you use?)
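
As an illustration of the hash-table approach in 1), here is a minimal sketch in Python rather than Perl. It assumes the lines have already been read and split into tuples of (ticker, date, close, volume, call, put), with the two option prices set to None when blank. The function name `summarize` and the layout of the running totals are hypothetical choices made for this example, not part of the original question.

```python
from collections import defaultdict


def summarize(rows):
    """Aggregate per-ticker statistics from parsed input rows.

    rows: iterable of (ticker, date, close, volume, call, put) tuples.
    call and put are either both numbers or both None (assumption from the question).
    Returns a list of (ticker, avg_close, total_volume, avg_call, avg_put).
    """
    # Running totals per ticker:
    # [close_sum, day_count, volume_sum, call_sum, put_sum, option_day_count]
    totals = defaultdict(lambda: [0.0, 0, 0, 0.0, 0.0, 0])

    for ticker, _date, close, volume, call, put in rows:
        t = totals[ticker]
        t[0] += close
        t[1] += 1
        t[2] += volume
        # Only count days on which option prices were actually quoted.
        if call is not None and put is not None:
            t[3] += call
            t[4] += put
            t[5] += 1

    output = []
    for ticker in sorted(totals):
        close_sum, days, volume_sum, call_sum, put_sum, option_days = totals[ticker]
        avg_call = call_sum / option_days if option_days else None
        avg_put = put_sum / option_days if option_days else None
        output.append((ticker, close_sum / days, volume_sum, avg_call, avg_put))
    return output
```

Keeping only running sums and counts means the file can be processed in a single pass without storing every line, and the option averages are divided by the number of days that had option data rather than by the number of trading days.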
     
    This question (a real interview question) serves as an excellent example of the kind of text data processing quants have to do on a daily basis. Of course, many quant shops use relational databases, in which case one has the luxury of working with SQL.
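
To show what the SQL route might look like, here is a rough sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `prices`, its column names, and the function name `summarize_with_sql` are assumptions made up for this example. Missing option prices are stored as NULL, which SQL's `AVG` skips on its own.

```python
import sqlite3


def summarize_with_sql(rows):
    """Load parsed rows into an in-memory table and aggregate with GROUP BY.

    rows: iterable of (ticker, date, close, volume, call, put) tuples,
    with None standing in for blank option prices.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices ("
        "ticker TEXT, trade_date TEXT, close REAL, "
        "volume INTEGER, call_price REAL, put_price REAL)"
    )
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?, ?, ?)", rows)

    # AVG and SUM ignore NULLs, so days without option data
    # do not drag the call/put averages toward zero.
    result = conn.execute(
        "SELECT ticker, AVG(close), SUM(volume), "
        "AVG(call_price), AVG(put_price) "
        "FROM prices GROUP BY ticker ORDER BY ticker"
    ).fetchall()
    conn.close()
    return result
```

A ticker with no option data at all comes back with NULL (None in Python) for both option averages.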
     

  2. Brett says:

    Another hint: you should mention how you take care of missing data in any of the fields (you may safely assume that each line in the input file has at least a ticker, i.e., the first field on each input line is always valid). For example, how do you denote a missing variable? How do you calculate the average and the total if a subset of the trading days don’t have data?
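
One possible answer to those questions, sketched in Python: denote a missing value with None, compute averages and totals over only the values that are present, and write a blank field when nothing is available. The helper names (`parse_number`, `mean`, `total`, `format_field`) and the two-decimal output format are hypothetical choices for this example; other conventions, such as a sentinel string like "NA", would work just as well.

```python
import math


def parse_number(text):
    """Convert a raw field to float, or None if it is blank or not a usable number."""
    text = text.strip()
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # Treat a literal "nan" in the file as missing too.
    return None if math.isnan(value) else value


def mean(values):
    """Average of the non-missing values, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def total(values):
    """Sum of the non-missing values, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


def format_field(value):
    """Render a missing result as a blank field, mirroring the input convention."""
    return "" if value is None else f"{value:.2f}"
```

The key design choice is that the denominator of each average is the count of days with data for that field, not the count of all trading days, so a stock with option quotes on only some days still gets a meaningful average call and put price.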
     
