finance.src package
Submodules
finance.src.file_generator module
finance.src.postgres_interface module
Module to interact with postgres databases
It contains generic methods to interact with postgres databases regardless of the data they contain
- class finance.src.postgres_interface.PostgresInterface[source]
Bases:
object
Class to interact with postgres databases
- create_engine() Engine [source]
function that creates engines to connect to postgres databases
- Returns:
dictionary with the engines to connect to the databases
- Return type:
dict
- create_table_object(table_name: str, engine: Engine, schema: str = 'stocks') Table [source]
Method to create a table object
- Parameters:
table_name (str) – name of the table to create the object for
engine (sqlalchemy.engine.Engine) – engine to connect to the database
schema (str) – schema of the table (default: stocks)
- Returns:
table object
- Return type:
sqlalchemy.Table
- insert_batch(table: Table, batch: list, conn: Connection) None [source]
Method to insert a batch of data into a table
- Parameters:
table (sqlalchemy.Table) – table to insert data into
batch (list) – list of tuples with the data to insert into the table
conn (sqlalchemy.engine.Connection) – connection used to run the insert
- Return type:
None
- migrate_dbs(batch_size: int = 5000, tap_cloud_provider: str = 'NEON', target_cloud_provider: str = 'AVN') None [source]
Method to migrate a database to another one. It is intended to be used only once, to migrate data to a target database.
- Parameters:
batch_size (int) – number of rows to insert in each batch (default: 5000)
tap_cloud_provider (str) – cloud provider of the tap (source) database (default: NEON)
target_cloud_provider (str) – cloud provider of the target database (default: AVN)
- Return type:
None
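The batching idea behind migrate_dbs can be illustrated with a minimal sketch. This is not the package's implementation: the function name migrate_table is hypothetical, and two in-memory SQLite databases stand in for the tap and target Postgres databases so that the example runs with the standard library alone.

```python
import sqlite3


def migrate_table(source, target, table, batch_size=5000):
    """Copy every row of `table` from `source` to `target` in batches.

    Hypothetical helper for illustration; returns the number of rows copied.
    """
    # `table` is interpolated into the SQL text, so it must come from
    # trusted code, never from user input.
    cursor = source.execute(f"SELECT * FROM {table}")
    placeholders = ", ".join("?" * len(cursor.description))
    insert_sql = f"INSERT INTO {table} VALUES ({placeholders})"

    copied = 0
    while True:
        batch = cursor.fetchmany(batch_size)  # at most batch_size rows
        if not batch:
            break
        target.executemany(insert_sql, batch)
        target.commit()  # commit once per batch
        copied += len(batch)
    return copied


# Two in-memory SQLite databases stand in for the tap and target databases.
tap = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (tap, target):
    db.execute("CREATE TABLE prices (ticker TEXT, close REAL)")
tap.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [(f"T{i}", float(i)) for i in range(12)],
)
tap.commit()

rows_copied = migrate_table(tap, target, "prices", batch_size=5)
```

Committing after each batch keeps memory use bounded by batch_size and means an interrupted run loses at most one batch of work.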
finance.src.s3_interface module
Module to interact with s3
finance.src.schedule_jobs module
Module that schedules the jobs in the CI/CD pipeline that are used to update the database
This module contains methods and classes that can be reused across all the different jobs in the CI/CD pipeline
- class finance.src.schedule_jobs.ScheduleJobs(provider: str, table_name: str, frequency: str = 'annual', batch_size: int = 500)[source]
Bases:
object
Class that schedules the jobs in the CI/CD pipeline that are used for updating the database and extracting data from yfinance
- get_tickers_batch_yf_object(tickers_list: list) list[yfinance.ticker.Ticker] [source]
Method to get a batch of yfinance tickers from a list of tickers
- Parameters:
tickers_list (list) – list of ticker symbols to create the yfinance tickers from
- run_pipeline()[source]
Main method that each of the jobs in the CI/CD pipeline will run. It includes steps like:
getting a batch of tickers to update from valid_tickers table
extracting data from yfinance for each ticker
inserting the data into the database
updating the validity of the tickers in valid_tickers table
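The four steps above can be sketched as a small orchestration function. This is an illustration under stated assumptions, not the code of run_pipeline: the name run_pipeline_sketch and its four callable arguments are hypothetical, and the sketch assumes that a ticker for which no data comes back is the one whose validity gets updated.

```python
def run_pipeline_sketch(get_batch, extract, insert, mark_unavailable):
    """Run the four steps once; every argument is a caller-supplied callable.

    Hypothetical helper: returns (number of records inserted, tickers
    for which no data was found).
    """
    tickers = get_batch()                # step 1: batch of tickers to update
    no_data = []
    inserted = 0
    for ticker in tickers:
        records = extract(ticker)        # step 2: extract data for the ticker
        if records:
            insert(records)              # step 3: insert into the database
            inserted += len(records)
        else:
            no_data.append(ticker)
    if no_data:
        mark_unavailable(no_data)        # step 4: update the validity flags
    return inserted, no_data


# Fake collaborators so the sketch runs without a database or network access.
fake_source = {
    "AAA": [("AAA", 2023, 1.0), ("AAA", 2022, 0.9)],
    "BBB": [],  # no data available for this ticker
}
stored, flagged = [], []
result = run_pipeline_sketch(
    get_batch=lambda: list(fake_source),
    extract=lambda ticker: fake_source[ticker],
    insert=stored.extend,
    mark_unavailable=flagged.extend,
)
```

Passing the steps in as callables keeps the control flow testable in isolation; in the real class these steps presumably map onto methods such as tickers_to_query and update_validity_status.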
- tickers_to_query(table_name: str, engine: Engine, frequency: str = 'annual') list [source]
TODO: improve this documentation.
TODO: improve the query.
Method to get a batch of tickers from valid_tickers table that have not been inserted into other main tables (financials, balance_sheet, cashflow, etc.).
- Parameters:
table_name (str) – name of the table to get the tickers from
engine (sqlalchemy.engine.Engine) – engine to connect to the database; determines whether the database is local or Neon
frequency (str) – frequency of the data (either annual or quarterly)
The query is equivalent to the following (for the cashflow table):
select valid_tickers.ticker, valid_tickers.cashflow_annual_available, subquery.max_insert_date
from stocks.valid_tickers
left join (select cashflow.ticker, max(cashflow.insert_date) as max_insert_date from stocks.cashflow group by cashflow.ticker) as subquery
on valid_tickers.ticker = subquery.ticker
where valid_tickers.cashflow_annual_available
order by subquery.max_insert_date asc
- Returns:
list of tickers
- Return type:
list
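To see what the equivalent query returns, the sketch below runs it against an in-memory SQLite database. The sample rows and the two-column table layouts are invented for illustration, and SQLite is only a stand-in for Postgres: the two engines order NULL values differently, as noted in the comments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so that the `stocks.` prefix resolves.
conn.execute("ATTACH DATABASE ':memory:' AS stocks")
conn.execute(
    "CREATE TABLE stocks.valid_tickers "
    "(ticker TEXT, cashflow_annual_available INTEGER)"
)
conn.execute("CREATE TABLE stocks.cashflow (ticker TEXT, insert_date TEXT)")

# Invented sample data: CCC is flagged unavailable, BBB was never inserted.
conn.executemany(
    "INSERT INTO stocks.valid_tickers VALUES (?, ?)",
    [("AAA", 1), ("BBB", 1), ("CCC", 0), ("DDD", 1)],
)
conn.executemany(
    "INSERT INTO stocks.cashflow VALUES (?, ?)",
    [("AAA", "2024-01-01"), ("AAA", "2023-01-01"), ("DDD", "2023-06-01")],
)

QUERY = """
select valid_tickers.ticker,
       valid_tickers.cashflow_annual_available,
       subquery.max_insert_date
from stocks.valid_tickers
left join (
    select cashflow.ticker, max(cashflow.insert_date) as max_insert_date
    from stocks.cashflow
    group by cashflow.ticker
) as subquery on valid_tickers.ticker = subquery.ticker
where valid_tickers.cashflow_annual_available
order by subquery.max_insert_date asc
"""

rows = conn.execute(QUERY).fetchall()
# SQLite sorts NULLs first in ascending order, so the never-inserted ticker
# BBB leads here. PostgreSQL sorts NULLs last by default for ASC, so the
# same query would place BBB at the end there.
tickers = [row[0] for row in rows]
```

The left join keeps tickers that have no rows in the cashflow table; their max_insert_date is NULL, which is how never-inserted tickers can be told apart from stale ones.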
finance.src.utils module
This module contains utility functions for the finance package.
finance.src.yahoo_ticker module
Module to perform operations on the yahoo finance API data (tickers)
- class finance.src.yahoo_ticker.Ticker(countries: str | List[str] = None, chunksize: int = 20, frequency: Literal['annual', 'quarterly'] = 'annual', schema: str = 'stocks')[source]
Bases:
object
- extract_tickers_data(ticker: Ticker, table_name: str, table_columns: list) DataFrame [source]
Method that gets the data from the yfinance API and transforms it into a list of tuples
- Parameters:
ticker (yf.Ticker) – The ticker or stock to update
table_name (str) – The name of the table that the ticker data is extracted for
table_columns (list) – The columns of the table
- Return type:
pd.DataFrame
- flush_records(table_name: str, records: list)[source]
Method to flush records to a table
- Parameters:
table_name (str) – The name of the table to flush the records to
records (list) – The records to flush to the table
- get_currency_code(ticker: str) str [source]
Method that gets the currency code of a ticker from valid_tickers table in the database
- Parameters:
ticker (str) – The ticker symbol
- Returns:
The currency code of the ticker
- Return type:
str
- get_data_df(table_name: str, frequency: str, ticker: Ticker)[source]
Method that returns a df based on the name of the table and frequency
- Parameters:
table_name (str) – The name of the table that is going to be filled
frequency (str) – The frequency of the data to be extracted, either annual or quarterly
ticker (yf.Ticker) – The ticker to extract the data for
- Returns:
The dataframe with the data
- Return type:
pd.DataFrame
- load_valid_tickers(sink_table: str) List[str] [source]
Method to load the valid tickers from the database based on the validity status of the tickers in the valid_tickers table
- Parameters:
sink_table (str) – The name of the table to load the tickers from
- Returns:
A list of the valid tickers
- Return type:
List[str]
- update_tickers_list_table()[source]
Method to update the tickers_list table in postgres. It reads all the data in the Excel file in the data directory (all available tickers) and inserts it into the database
- update_validity_status(table_name: str, tickers: list[str], availability: bool = False)[source]
Method that takes a list of tickers and updates their validity status for a specific criterion (e.g. balance_sheet_annual_available) in the valid_tickers table. For example, if a ticker has no balance sheet data for the quarterly frequency, the balance_sheet_quarterly_available column in the valid_tickers table is updated to False
- Parameters:
table_name (str) – The name of the table for which the tickers were supposed to be updated
tickers (list[str]) – The tickers that were supposed to be updated
availability (bool) – The validity status of the tickers for the specific criterion (default: False)
- Return type:
None
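The column names quoted above (balance_sheet_quarterly_available, cashflow_annual_available) suggest a table_frequency_available naming pattern. The sketch below builds such a column name and flips the flag for a list of tickers. It is an assumption-laden illustration: availability_column, set_availability, the explicit frequency argument and the allow-lists are hypothetical, and SQLite stands in for Postgres.

```python
import sqlite3

# Hypothetical allow-lists: the column name is interpolated into SQL text,
# so only known values are accepted.
ALLOWED_TABLES = {"financials", "balance_sheet", "cashflow"}
ALLOWED_FREQUENCIES = {"annual", "quarterly"}


def availability_column(table_name, frequency):
    """Build the flag column name, e.g. balance_sheet_quarterly_available."""
    if table_name not in ALLOWED_TABLES or frequency not in ALLOWED_FREQUENCIES:
        raise ValueError(f"unexpected table/frequency: {table_name}/{frequency}")
    return f"{table_name}_{frequency}_available"


def set_availability(conn, table_name, frequency, tickers, availability=False):
    """Set the availability flag for every ticker in `tickers`."""
    column = availability_column(table_name, frequency)
    conn.executemany(
        f"UPDATE valid_tickers SET {column} = ? WHERE ticker = ?",
        [(int(availability), ticker) for ticker in tickers],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE valid_tickers "
    "(ticker TEXT, balance_sheet_quarterly_available INTEGER)"
)
conn.executemany("INSERT INTO valid_tickers VALUES (?, 1)", [("AAA",), ("BBB",)])

# BBB returned no quarterly balance sheet data, so its flag is set to False.
set_availability(conn, "balance_sheet", "quarterly", ["BBB"])
flags = dict(
    conn.execute(
        "SELECT ticker, balance_sheet_quarterly_available FROM valid_tickers"
    )
)
```

Ticker values go through bound parameters, while the column name is validated against the allow-lists before it is placed in the statement, since identifiers cannot be bound as parameters.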