I've been playing around with Rust and Python lately. I've also been playing around with OpenAI's API. I thought it would be fun to combine all three and create a custom company stock report generator. I'm not a financial advisor, so don't take any of this as financial advice. I'm just having fun with some code.

Generative models are all the rage these days, and OpenAI's API is a great way to play around with them. I've been using it to generate text and images, and I thought it would be fun to use it to generate stock reports. Generative AI (GAI) is good at generating text from scratch, but it works even better when you hand it a pile of data and commentary on a subject and ask for a report on that topic. For now I won't be sharing the full code for this project, only the results and a few snippets. The code is an unholy mess, probably because I haven't written software professionally in nearly five years.

Check out the reports!

The architecture is something like this:

  • An AWS Lambda function written in Python that does the heavy lifting. This function is triggered by an AWS SQS queue.
  • An AWS SQS queue that is populated by an AWS Lambda function written in Rust.
  • The Rust Lambda function is exposed as a URL that is mapped to a custom slash command in Slack.
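
To make the flow concrete, here is a minimal sketch of how an SQS-triggered Python handler might pull symbols off the incoming event. The `Records` list with a `body` per message is the standard shape Lambda uses for SQS events. Everything else is an assumption for illustration: that the message body is just the bare ticker symbol, and the hypothetical `generate_report` stand-in for the real pipeline.

```python
def symbols_from_sqs_event(event):
    """Return the stock symbols carried by an SQS-triggered Lambda event."""
    symbols = []
    for record in event.get("Records", []):
        # Assumption: the message body is just the ticker symbol, e.g. "X".
        symbol = record["body"].strip().upper()
        if symbol:
            symbols.append(symbol)
    return symbols


def generate_report(symbol):
    """Hypothetical stand-in for the real report pipeline."""
    return f"report for {symbol}"


def handler(event, context):
    # Lambda may batch several SQS messages into a single invocation.
    return [generate_report(symbol) for symbol in symbols_from_sqs_event(event)]
```

Looping over every record matters because one invocation can carry more than one queued symbol.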

The Python Lambda function does the following:

  • A company stock symbol is passed to it via the SQS queue.
  • It then calls Polygon.io's APIs to get the company's name and a list of recent news articles about the company.
  • Each news article is pulled down and the page contents are extracted using BeautifulSoup4. The text is then passed to OpenAI's API to generate a summary of the article.
  • The Python Lambda function also uses the Python module yfinance to pull down the company's stock price history.
  • The Python Lambda function then uses the Python module matplotlib to generate a graph of the company's stock price history.
  • Technical analysis is performed on the company's stock price history using the python module ta.
  • The technical analysis is then passed to OpenAI's API to generate a summary of the technical analysis.
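
As a rough illustration of the last two steps, the sketch below computes two common indicators in plain Python and renders them as text that could be dropped into a prompt. This is not the real code: the actual function uses the ta module, the helper names here are hypothetical, and this RSI uses plain averages of gains and losses, which may differ from the smoothing ta applies.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough data points")
    return sum(values[-window:]) / window


def rsi(values, window=14):
    """Relative Strength Index using plain averages of gains and losses."""
    if len(values) < window + 1:
        raise ValueError("not enough data points")
    recent = values[-(window + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / window
    avg_loss = sum(-c for c in changes if c < 0) / window
    if avg_loss == 0:
        return 100.0  # no losing days in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def indicators_as_text(symbol, closes):
    """Render a few indicators as plain text suitable for a prompt."""
    return (
        f"{symbol}: last close {closes[-1]:.2f}, "
        f"5-day SMA {sma(closes, 5):.2f}, "
        f"RSI(5) {rsi(closes, 5):.1f}"
    )
```

Handing the model a short, already-computed summary like this keeps the prompt small and leaves the arithmetic to ordinary code.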

The Rust Lambda function does the following:

  • It receives a company stock symbol via an HTTP POST request.
  • It submits the symbol to an AWS API Gateway endpoint, which inserts the symbol into the AWS SQS queue.
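
For a sense of what that POST request looks like: Slack sends slash commands as a form-encoded body that includes a `text` field (whatever was typed after the command) and a `response_url` field (where to send a reply). The real function is written in Rust and shown further down. The sketch below uses Python only for brevity, and the function name is hypothetical.

```python
from urllib.parse import parse_qs


def parse_slash_command(body):
    """Extract the ticker and reply URL from a Slack slash-command body."""
    # parse_qs splits on '&' and '=' first, then percent-decodes each value.
    fields = parse_qs(body)
    text = fields.get("text", [""])[0]
    # Accept "$nue" as well as "NUE".
    symbol = text.strip().replace("$", "").upper()
    response_url = fields.get("response_url", [""])[0]
    return symbol, response_url
```

Splitting into key/value pairs before decoding means an encoded `&` or `=` inside a value cannot be mistaken for a separator.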

The Python Lambda function posts the report's progress to a Slack channel as it runs. When the report is complete, it posts the report to the channel and publishes it to a web page. The entire site is hosted on AWS S3.
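
Posting progress can be as simple as sending a small JSON body to Slack. The sketch below assumes a Slack incoming webhook, which accepts a JSON object with a `text` field. Whether the real function uses a webhook or another Slack API is not something I'm showing here, and the function names and message format are hypothetical.

```python
import json
import urllib.request


def progress_payload(run_id, symbol, step):
    """Build the JSON body for a progress message, tagged with the run ID."""
    return {"text": f"[{run_id}] {symbol}: {step}"}


def post_progress(webhook_url, run_id, symbol, step):
    """POST a progress message to a Slack incoming webhook (assumed setup)."""
    data = json.dumps(progress_payload(run_id, symbol, step)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Prefixing each message with a run ID makes it easy to tell interleaved reports apart in a busy channel.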

One thing I ran into was wanting a list of competitors, or of other companies in the same industry or sector as the subject of a report. Finding a data source that provided this turned out to be quite difficult. I wanted, for example, a list of all the companies in the same sector as US Steel. I ended up turning to OpenAI's API to generate the list, using the following prompt:

"return a json structure representing competitor companies to US Steel; include ticker symbol, company name and industry;  json should be in the format [{'company': 'Company Name Goes Here', 'symbol': 'SYMBOL', 'industry': 'Actual Industry Name Goes Here'}]; only output json do not wrap it in markdown; use double quotes for quoting keys and values"

Even a year ago, verbosely describing what you wanted to an API, let alone an AI API, would have been a pipe dream. I got the following output from OpenAI's API:

[
    {
        "company": "ArcelorMittal",
        "symbol": "MT",
        "industry": "Steel"
    },
    {
        "company": "Cleveland-Cliffs Inc.",
        "symbol": "CLF",
        "industry": "Steel"
    },
    {
        "company": "Commercial Metals Company",
        "symbol": "CMC",
        "industry": "Steel"
    },
    {
        "company": "Nucor Corporation",
        "symbol": "NUE",
        "industry": "Steel"
    },
    {
        "company": "Reliance Steel & Aluminum Co.",
        "symbol": "RS",
        "industry": "Steel"
    },
    {
        "company": "Steel Dynamics, Inc.",
        "symbol": "STLD",
        "industry": "Steel"
    },
    {
        "company": "Ternium S.A.",
        "symbol": "TX",
        "industry": "Steel"
    }
]
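
Model output is not guaranteed to be valid JSON or to follow the requested shape, so it is worth validating before use. Here is a minimal sketch of how a reply like this could be parsed defensively. The function name and the decision to skip incomplete entries are assumptions for illustration, not the code behind the reports.

```python
import json

REQUIRED_KEYS = ("company", "symbol", "industry")
FENCE = "`" * 3  # markdown code fence marker


def parse_competitors(raw):
    """Parse the model's reply into a list of competitor dicts."""
    text = raw.strip()
    # Models sometimes wrap JSON in a markdown fence despite instructions.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    competitors = []
    for entry in data:
        if not isinstance(entry, dict):
            continue
        if not all(key in entry for key in REQUIRED_KEYS):
            continue  # skip entries missing a required field
        competitors.append({key: str(entry[key]).strip() for key in REQUIRED_KEYS})
    return competitors
```

Failing loudly on malformed JSON makes it easy to retry the request instead of carrying bad data into the report.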

The report application (the Python Lambda function) is backed by a DynamoDB table. Each item written to the table has the following shape:

{
    "symbol":       symbol,
    "date_":        end_date.strftime("%Y-%m-%d %H:%M:%S"),
    "fundamentals": stock_fundamentals.to_json(orient='records'),
    "financials":   ticker.financials.to_json(orient='records'),
    "report":       complete_text,
    "data":         last_day_summary.to_json(orient='records'),
    "cost":         Decimal(str(cost)),
    "news":         news_summary,
    "url":          report_url,
    "run_id":       run_id,
}

The fields are:

  • symbol: the company's stock symbol.
  • date_: the date the report was generated.
  • fundamentals: a JSON representation of the company's fundamentals.
  • financials: a JSON representation of the company's financials.
  • report: the report itself.
  • data: a JSON representation of the company's stock price history.
  • cost: the cost of generating the report, derived from published OpenAI model costs.
  • news: a summary of the news articles about the company.
  • url: the URL of the report.
  • run_id: an ID generated by sqids that is used to identify the report. It is particularly useful when debugging and viewing progress in Slack.
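
The `Decimal(str(cost))` conversion deserves a note. If the table is written with boto3 (an assumption here), Python floats are rejected and numbers must be passed as `Decimal`. Going through `str` keeps the short decimal form rather than the float's full binary expansion. Below is a minimal sketch of a cost estimate and that conversion. The per-token prices are hypothetical placeholders, not OpenAI's actual prices, and the function names are mine.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens; real prices vary by model.
PRICE_PER_1K_INPUT = 0.0005
PRICE_PER_1K_OUTPUT = 0.0015


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the cost of one API call from its token counts."""
    return (
        prompt_tokens / 1000 * PRICE_PER_1K_INPUT
        + completion_tokens / 1000 * PRICE_PER_1K_OUTPUT
    )


def cost_for_dynamodb(cost):
    """Convert a float cost to Decimal via str, matching the item above."""
    return Decimal(str(cost))
```

Constructing a `Decimal` directly from a float such as 0.1 captures the float's exact binary value, a long string of digits, which is why the conversion goes through `str` first.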

Here is the gist of the code used by the Rust Lambda function:

use lambda_http::{service_fn, Body, IntoResponse, Request, RequestExt};
use std::str;
use percent_encoding::percent_decode;
use regex::Regex;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), lambda_http::Error> {
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        // disable printing the name of the module in every log line.
        .with_target(false)
        // disabling time is handy because CloudWatch will add the ingestion time.
        .without_time()
        .init();

    lambda_http::run(service_fn(report)).await?;
    Ok(())
}

fn convert_binary_body_to_text(request: &Request) -> Result<String, &'static str> {
    match request.body() {
        Body::Binary(binary_data) => {
            // Attempt to convert the binary data to a UTF-8 encoded string
            str::from_utf8(binary_data)
                .map(|s| s.to_string())
                .map_err(|_| "Failed to convert binary data to UTF-8 string")
        }
        _ => Err("Request body is not binary"),
    }
}

async fn report(
    request: Request
) -> Result<impl IntoResponse, std::convert::Infallible> {
    let _context = request.lambda_context_ref();

    match convert_binary_body_to_text(&request) {
        Ok(text) => {
            // Successfully converted binary data to text

            let client = reqwest::Client::new();
            // Slack sends the slash command as a form-encoded body:
            // key=value pairs joined by '&'.
            let re = Regex::new(r"[&]").unwrap();
            let re2 = Regex::new(r"^text=").unwrap();
            let re3 = Regex::new(r"[=]").unwrap();
            let re4 = Regex::new(r"^response_url=").unwrap();

            // Split into pairs before percent-decoding, so that an encoded '&'
            // or '=' inside a value is not mistaken for a separator.
            let parts: Vec<&str> = re.split(&text).collect();

            let mut response_url = String::new();
            let mut name = String::new();
            let mut symbol = String::new();

            for part in &parts {
                // Split on the first '=' only, then decode just the value.
                // Invalid UTF-8 sequences are replaced with � (REPLACEMENT CHARACTER).
                let p2: Vec<&str> = re3.splitn(part, 2).collect();
                let value = percent_decode(p2.get(1).copied().unwrap_or("").as_bytes())
                    .decode_utf8_lossy()
                    .to_string();

                if re2.is_match(part) {
                    // Accept "$NUE" as well as "nue".
                    symbol = value.replace('$', "").to_uppercase();

                    // Queue the report request.
                    let mut url = format!("https://submit-company-to-sqs?symbol={}", symbol);

                    // Note: unwrap() panics on a network or JSON error.
                    let _ = client.get(&url)
                        .send()
                        .await
                        .unwrap()
                        .json::<serde_json::Value>()
                        .await
                        .unwrap();

                    // Look up the company name for the acknowledgement message.
                    url = format!("https://api.polygon.io/v3/reference/tickers/{}?apiKey=APIKEYGOESHERE", symbol);

                    let resp = client.get(&url)
                        .send()
                        .await
                        .unwrap()
                        .json::<serde_json::Value>()
                        .await
                        .unwrap();

                    name = extract_info(&resp, "name");
                }
                else if re4.is_match(part) {
                    response_url = value;
                }
            }

            let _ = client.post(response_url)
                .json(&json!({
                    "response_type": "in_channel",
                    "text": format!("Request for a report for *{}* (<https://finance.yahoo.com/quote/{}|{}>) submitted.", name, symbol, symbol)
                }))
                .send()
                .await;

            // Reply with an empty body; the acknowledgement was already
            // posted to response_url above.
            Ok(String::new())
        }
        Err(error) => {
            // Handle the error (e.g., log it, return an error response, etc.)
            Ok(format!("Error: {}", error))
        }
    }

}

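// Pull a single field out of the "results" object of a Polygon.io response.
fn extract_info(resp: &serde_json::Value, value: &str) -> String {
    if let Some(results) = resp["results"].as_object() {
        if let Some(name_value) = results.get(value) {
            // Strings are returned without their JSON quotes; any other
            // type falls back to its JSON text.
            match name_value.as_str() {
                Some(s) => s.to_string(),
                None => name_value.to_string(),
            }
        } else {
            // "results" has no such field.
            "Error1".to_string()
        }
    } else {
        // The response has no "results" object.
        "Error2".to_string()
    }
}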