diogenes.read package
Submodules
diogenes.read.read module
This module provides functions that convert data in external formats (SQL databases and CSV files) to NumPy structured arrays.
class diogenes.read.read.SQLConnection(conn_str, allow_caching=False, tmp_dir='.', parse_datetimes=[], allow_pgres_copy_optimization=True)
Bases: object
Connection to a SQL database that returns NumPy structured arrays. Intended to loosely implement DBAPI 2.
Parameters:
- conn_str (str) – SQLAlchemy connection string (http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html).
- allow_caching (bool) – If True, diogenes caches the result of each query and returns the cached result when the same query is performed again. If False, every query is sent to the database.
- tmp_dir (str) – Directory in which cached results are stored when allow_caching is True. It is also where CSV files are stored for PostgreSQL servers.
- parse_datetimes (list of col names) – Columns that should be interpreted as datetimes.
execute(exec_str, invalidate_cache=False)
Executes a query.
Parameters:
- exec_str (str) – SQL query to execute.
- invalidate_cache (bool) – If this SQLConnection object was initialized with allow_caching=True, identical queries always return the same result. If invalidate_cache is True, this behavior is overridden and the query is re-executed.
Returns: Results of the query as a NumPy structured array.
Return type: numpy.ndarray
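The interaction between allow_caching and invalidate_cache can be illustrated with a minimal sketch. It is not the diogenes implementation: CachingConnection is a hypothetical class, the cache is an in-memory dict rather than files in tmp_dir, and the standard-library sqlite3 module stands in for SQLAlchemy. It also assumes every query returns at least one row.

```python
import sqlite3

import numpy as np


class CachingConnection:
    """Hypothetical stand-in that mimics the caching rules described above."""

    def __init__(self, db_conn, allow_caching=False):
        self._conn = db_conn
        self._allow_caching = allow_caching
        self._cache = {}  # query string -> structured array

    def execute(self, exec_str, invalidate_cache=False):
        use_cache = self._allow_caching and not invalidate_cache
        if use_cache and exec_str in self._cache:
            return self._cache[exec_str]
        cursor = self._conn.execute(exec_str)
        names = [col[0] for col in cursor.description]
        rows = cursor.fetchall()  # assumed non-empty in this sketch
        # Let NumPy infer one dtype per column from the fetched values.
        columns = [np.array(col) for col in zip(*rows)]
        dtype = [(name, col.dtype) for name, col in zip(names, columns)]
        result = np.array(rows, dtype=dtype)
        if self._allow_caching:
            self._cache[exec_str] = result
        return result


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (id INTEGER, name TEXT)")
db.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

query = "SELECT id, name FROM t ORDER BY id"
conn = CachingConnection(db, allow_caching=True)
first = conn.execute(query)
db.execute("INSERT INTO t VALUES (3, 'c')")
cached = conn.execute(query)                        # served from the cache
fresh = conn.execute(query, invalidate_cache=True)  # sent to the database again
```

In this sketch the second call does not see the newly inserted row, while the call with invalidate_cache=True does and also refreshes the cached entry.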
exception diogenes.read.read.SQLError
Bases: exceptions.Exception
diogenes.read.read.connect_sql(con_str, allow_caching=False, tmp_dir='.', parse_datetimes=[], allow_pgres_copy_optimization=True)
Provides an SQLConnection object, which makes structured arrays from SQL.
Parameters:
- con_str (str) – SQLAlchemy connection string (http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html).
- allow_caching (bool) – If True, diogenes caches the result of each query and returns the cached result when the same query is performed again. If False, every query is sent to the database.
- tmp_dir (str) – Directory in which cached results are stored when allow_caching is True. It is also where CSV files are stored for PostgreSQL servers.
- parse_datetimes (list of col names) – Columns that should be interpreted as datetimes.
Returns: Object that executes SQL queries and returns NumPy structured arrays.
Return type: SQLConnection
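To make the effect of parse_datetimes concrete, the following sketch converts named text columns of a structured array to datetime64. The helper parse_datetime_columns is hypothetical and is not part of diogenes. How diogenes performs the conversion internally is not documented here, so the one-second resolution and the ISO 8601 input format are assumptions.

```python
import numpy as np


def parse_datetime_columns(array, parse_datetimes):
    """Return a copy of `array` with the named columns cast to datetime64[s].

    Hypothetical helper; assumes the named columns hold ISO 8601 strings.
    """
    new_dtype = [
        (name, "datetime64[s]" if name in parse_datetimes else array.dtype[name])
        for name in array.dtype.names
    ]
    result = np.empty(array.shape, dtype=new_dtype)
    for name in array.dtype.names:
        if name in parse_datetimes:
            # NumPy parses ISO 8601 strings when casting to datetime64.
            result[name] = array[name].astype("datetime64[s]")
        else:
            result[name] = array[name]
    return result


raw = np.array(
    [(1, "2015-03-01"), (2, "2015-03-02T12:30:00")],
    dtype=[("id", "i8"), ("visited", "U19")],
)
parsed = parse_datetime_columns(raw, ["visited"])
```

Columns not listed in parse_datetimes keep their original dtype, so only the named columns change.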
diogenes.read.read.open_csv(path, delimiter=', ', header=True, col_names=None, parse_datetimes=[])
Creates a structured array from a local .csv file.
Parameters:
- path (str) – Path of the CSV file.
- delimiter (str) – Character used to delimit CSV fields.
- header (bool) – If True, assumes the first line of the CSV has column names.
- col_names (list of str or None) – If header is False, this list will be used for column names.
- parse_datetimes (list of col names) – Columns that should be interpreted as datetimes.
Returns: Structured array corresponding to the CSV. If header is False and col_names is None, diogenes will assign arbitrary column names.
Return type: numpy.ndarray
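The way header and col_names determine the field names can be sketched with the standard library and NumPy. This is an illustration under stated assumptions, not the diogenes code: csv_to_structured is a hypothetical helper that reads from an open text handle instead of a path, the fallback names f0, f1, ... are invented for the example (the names diogenes assigns are only described as arbitrary), and the input is assumed to contain at least one data row.

```python
import csv
import io

import numpy as np


def _convert_column(values):
    """Try int, then float; otherwise keep the column as text."""
    for kind in (int, float):
        try:
            return np.array([kind(v) for v in values])
        except ValueError:
            continue
    return np.array(values)


def csv_to_structured(handle, delimiter=",", header=True, col_names=None):
    """Hypothetical helper: build a structured array from CSV text."""
    rows = list(csv.reader(handle, delimiter=delimiter))
    if header:
        names, rows = rows[0], rows[1:]       # first line holds the names
    elif col_names is not None:
        names = list(col_names)               # caller-supplied names
    else:
        # Assumed placeholder names for this sketch only.
        names = ["f{}".format(i) for i in range(len(rows[0]))]
    columns = [_convert_column(col) for col in zip(*rows)]
    dtype = [(name, col.dtype) for name, col in zip(names, columns)]
    result = np.empty(len(rows), dtype=dtype)
    for name, column in zip(names, columns):
        result[name] = column
    return result


with_header = csv_to_structured(io.StringIO("id,name,score\n1,ann,2.5\n2,bob,4.0\n"))
no_header = csv_to_structured(io.StringIO("1,ann\n2,bob\n"), header=False)
named = csv_to_structured(
    io.StringIO("1,ann\n2,bob\n"), header=False, col_names=["id", "name"]
)
```

Each column gets its own dtype, which is what distinguishes a structured array from a plain two-dimensional array.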
diogenes.read.read.open_csv_url(url, delimiter=', ', header=True, col_names=None, parse_datetimes=[])
Creates a structured array from a URL.
Parameters:
- url (str) – URL of the CSV file.
- delimiter (str) – Character used to delimit CSV fields.
- header (bool) – If True, assumes the first line of the CSV has column names.
- col_names (list of str or None) – If header is False, this list will be used for column names.
- parse_datetimes (list of col names) – Columns that should be interpreted as datetimes.
Returns: Structured array corresponding to the CSV. If header is False and col_names is None, diogenes will assign arbitrary column names.
Return type: numpy.ndarray
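Reading a CSV from a URL can be sketched as a download step followed by the same parsing as for a local file. The helper csv_url_to_structured below is hypothetical and only approximates the idea: it assumes a header line, UTF-8 content, and at least one data row, and it keeps every field as text. A file:// URL is used so the example runs without network access.

```python
import csv
import io
import pathlib
import tempfile
import urllib.request

import numpy as np


def csv_url_to_structured(url, delimiter=","):
    """Hypothetical helper: fetch a CSV by URL and return a structured array."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    names, data = rows[0], rows[1:]  # first line is assumed to be the header
    # Size each text field to the longest value in its column.
    columns = [np.array(col) for col in zip(*data)]
    dtype = [(name, col.dtype) for name, col in zip(names, columns)]
    return np.array([tuple(row) for row in data], dtype=dtype)


with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "sample.csv"
    path.write_text("city,country\nAthens,Greece\nSinope,Turkey\n", encoding="utf-8")
    cities = csv_url_to_structured(path.as_uri())
```

The same function should accept http:// or https:// URLs, since urllib.request.urlopen handles those schemes as well.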