Kucoin Data Pipeline

kucoincli.pipe.pipeline(tickers: str, engine: sqlalchemy.engine, end: str, start: str | None = None, interval: str = '1day', loop_range: int | None = None, loop_increment: int = 1500, chunk_size: int = 500, schema: str | None = None, if_exists: str = 'append', progress_bar: bool = True) → None

Data acquisition pipeline from KuCoin OHLCV API call -> SQL database.

Leverages pandas, sqlalchemy, and kucoincli.client to obtain, format, and catalogue OHLCV data in a permanent database.

Notes

  • Be aware that KuCoin servers run on UTC time. If this is not accounted for, the returned data may be inaccurate. Results may also be unexpected when naive datetime objects, rather than strings, are passed for the start or end arguments.

  • This pipeline is a wrapper around the kucoincli.client function ohlcv. For details on the underlying data acquisition, see that function's docstring.
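One way to sidestep the naive-datetime pitfall in the note above is to convert a timezone-aware datetime to a UTC string before passing it as start or end. The following is a minimal sketch using only the standard library. The helper name to_utc_string is hypothetical and is not part of kucoincli, and the fixed UTC-5 offset is only an illustrative stand-in for a local timezone.

```python
from datetime import datetime, timedelta, timezone


def to_utc_string(dt: datetime) -> str:
    """Format an aware datetime as 'YYYY-MM-DD HH:MM:SS' in UTC.

    Hypothetical helper (not part of kucoincli). Naive datetimes are
    rejected so the ambiguity is surfaced instead of silently passed on.
    """
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("naive datetime: attach a timezone first")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# Midnight on 2022-01-01 in a UTC-5 zone (illustrative offset only).
local_midnight = datetime(2022, 1, 1, 0, 0, 0, tzinfo=timezone(timedelta(hours=-5)))

end_utc = to_utc_string(local_midnight)
print(end_utc)  # 2022-01-01 05:00:00
```

The resulting string uses the YYYY-MM-DD HH:MM:SS form that the end parameter is documented to accept.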

Parameters:
  • tickers (str or list) – Ticker or list of tickers for which to request OHLCV data from the KuCoin API (e.g., BTC-USDT).

  • engine (sqlalchemy.engine) – SQLAlchemy engine. For further information about engine objects review the sqlalchemy.create_engine documentation.

  • end (datetime.datetime or str) – Latest date in range. Date input may be either string (e.g., YYYY-MM-DD or YYYY-MM-DD HH:MM:SS) or a datetime object. See kucoincli.client.ohlcv for further formatting details.

  • start (datetime.datetime or str) – (Optional) Earliest date in range. User must input either start or loop_range. See end for formatting details.

  • interval (str) – (Optional) OHLCV interval frequency. Default=1day. Options: 1min, 3min, 5min, 15min, 30min, 1hour, 2hour, 4hour, 6hour, 8hour, 12hour, 1day, 1week

  • loop_range (int) – (Optional) Rather than specifying an earliest date, users may specify a latest date and obtain data in loop_increment chunks for loop_range API calls. For example, if we specify end="2022-01-01", loop_increment=100, and loop_range=10, the pipeline will request 1000 bars (10 calls of 100 bars each) at the chosen interval, starting at 2022-01-01 and walking backwards. Note: loop_range will be ignored unless start=None.

  • loop_increment (int) – (Optional) Controls the maximum number of bars of OHLCV data retrieved per call. The maximum is 1500 bars per call. Default=1500.

  • schema (str) – (Optional) SQL schema in which to store acquired OHLCV data. Default=None. SQLite databases may not use the schema argument. MySQL or PostgreSQL databases will use the default schema if schema=None. For further information review pandas.to_sql.

  • chunk_size (int) – (Optional) Chunksize passed to pandas.to_sql, i.e. the number of rows written to the database per batch. Chunksize may be tuned for better write performance to the SQL database. Default=500. See pandas.to_sql documentation for further details.

  • progress_bar (bool) – (Optional) Displays a loading bar and timer for each asset queried. Default=True

  • if_exists (str) – (Optional) Controls the pipeline's behavior if a table already exists in the defined database/schema. Default=append. Options: fail, replace, append. For further details see pandas.to_sql docs.

      • fail: Raise a ValueError.

      • replace: Drop the table before inserting new values.

      • append: Insert new values to the existing table.
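The interaction between end, interval, loop_range, and loop_increment described above can be sketched as simple date arithmetic. This is an illustration only: the helper loop_windows and the INTERVAL_LENGTH table are hypothetical, and the assumption that each call covers a contiguous window of exactly loop_increment bars is mine, not a statement about how kucoincli splits its requests internally.

```python
from datetime import datetime, timedelta

# Bar length for each documented interval string.
INTERVAL_LENGTH = {
    "1min": timedelta(minutes=1),
    "3min": timedelta(minutes=3),
    "5min": timedelta(minutes=5),
    "15min": timedelta(minutes=15),
    "30min": timedelta(minutes=30),
    "1hour": timedelta(hours=1),
    "2hour": timedelta(hours=2),
    "4hour": timedelta(hours=4),
    "6hour": timedelta(hours=6),
    "8hour": timedelta(hours=8),
    "12hour": timedelta(hours=12),
    "1day": timedelta(days=1),
    "1week": timedelta(weeks=1),
}


def loop_windows(end, interval="1day", loop_range=1, loop_increment=1500):
    """Return (window_start, window_end) pairs, newest first.

    Hypothetical helper: assumes each of the loop_range calls covers a
    contiguous window of loop_increment bars, walking back from `end`.
    """
    if not 1 <= loop_increment <= 1500:
        raise ValueError("loop_increment must be between 1 and 1500")
    span = INTERVAL_LENGTH[interval] * loop_increment
    windows = []
    for _ in range(loop_range):
        windows.append((end - span, end))
        end -= span  # the next call ends where this one started
    return windows


# The example from the loop_range description: 10 calls of 100 daily bars.
windows = loop_windows(datetime(2022, 1, 1), "1day", loop_range=10, loop_increment=100)
total_bars = len(windows) * 100
earliest = windows[-1][0]
print(total_bars)  # 1000
print(earliest)
```

Under these assumptions, the earliest timestamp reached is end minus loop_range × loop_increment bars, which gives a quick estimate of how far back a given loop_range setting will go.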