pyaxis
PC-Axis parser module
This module reads data and metadata from a PC-Axis [1] file or URL into a pandas DataFrame and a dictionary, and returns a dictionary containing both structures.
Example
from pyaxis import pyaxis
px = pyaxis.parse(self.base_path + 'px/2184.px', encoding='ISO-8859-2')
[1] https://www.scb.se/en/services/statistical-programs-for-px-files/
Todo:
meta_split: the "NOTE" attribute can occur multiple times, but only the last occurrence is added to the dictionary.
pyaxis.build_dataframe(dimension_names, dimension_members, data_values)
Builds a dataframe from the Cartesian product of the dimension members, plus the series of data values.
Parameters:
- dimension_names (list of string)
- dimension_members (list of string)
- data_values (list of string)
Returns: data (pandas dataframe)
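The Cartesian-product step can be illustrated with the standard library alone. This is a minimal sketch, not the library's implementation: the function name build_rows, the 'DATA' column name, and the assumption that dimension_members holds one list of members per dimension are all hypothetical.

```python
from itertools import product


def build_rows(dimension_names, dimension_members, data_values):
    """Pair every combination of dimension members with one data value."""
    # product() varies the last dimension fastest, so the data values are
    # assumed to be listed in that same order.
    combinations = list(product(*dimension_members))
    if len(combinations) != len(data_values):
        raise ValueError('data length does not match dimension combinations')
    rows = []
    for combination, value in zip(combinations, data_values):
        row = dict(zip(dimension_names, combination))
        row['DATA'] = value  # hypothetical column name for the data series
        rows.append(row)
    return rows


rows = build_rows(
    ['sex', 'year'],
    [['male', 'female'], ['2019', '2020']],
    ['1', '2', '3', '4'],
)
```

A list of dictionaries like this can be handed to pandas.DataFrame(rows) to obtain a tabular structure with one column per dimension.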
pyaxis.get_dimensions(metadata)
Reads STUB and HEADING values from the metadata dictionary.
Parameters: metadata – dictionary of metadata
Returns: dimension_names (list), dimension_members (list)
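As a rough illustration of reading STUB and HEADING, the sketch below collects dimension names and their members from a metadata dictionary. It assumes the members of each dimension are stored under a key of the form 'VALUES(<name>)'; that key layout and the function name dimensions_from_metadata are assumptions, not confirmed details of the library.

```python
def dimensions_from_metadata(metadata):
    """Collect dimension names and members from STUB and HEADING."""
    names = []
    members = []
    # STUB dimensions are read first, then HEADING dimensions.
    for key in ('STUB', 'HEADING'):
        for name in metadata.get(key, []):
            names.append(name)
            # Assumed key layout: 'VALUES(<dimension name>)'
            members.append(metadata['VALUES(' + name + ')'])
    return names, members


metadata = {
    'STUB': ['sex'],
    'HEADING': ['year'],
    'VALUES(sex)': ['male', 'female'],
    'VALUES(year)': ['2019', '2020'],
}
names, members = dimensions_from_metadata(metadata)
```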
pyaxis.metadata_extract(pc_axis)
Extracts metadata and data from PC-Axis file contents.
Parameters: pc_axis (str) – PC-Axis file contents.
Returns:
- metadata_attributes (list of string): each item conforms to an ATTRIBUTE=VALUES pattern
- data (string): data values
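One way to picture the split into metadata attributes and data is shown below. It is a simplified sketch under two stated assumptions: the literal 'DATA=' marks the start of the data section, and no ';' appears inside a quoted value. The function name split_metadata_and_data is hypothetical and the library may handle these cases differently.

```python
def split_metadata_and_data(pc_axis):
    """Split raw PC-Axis text into ATTRIBUTE=VALUES items and a data string."""
    # Flatten line breaks so attributes spanning several lines become one string.
    flat = pc_axis.replace('\r', ' ').replace('\n', ' ')
    # Assumption: 'DATA=' marks the start of the data section.
    metadata_part, data_part = flat.split('DATA=', 1)
    # Assumption: ';' only ever terminates an attribute.
    attributes = [item.strip() for item in metadata_part.split(';') if item.strip()]
    data = data_part.replace(';', '').strip()
    return attributes, data


sample = 'STUB="sex";\nHEADING="year";\nDATA=\n1 2\n3 4;\n'
attributes, data = split_metadata_and_data(sample)
```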
pyaxis.metadata_split_to_dict(metadata_elements)
Splits the list of metadata elements into a dictionary of multi-valued keys.
Parameters: metadata_elements (list of string) – ATTRIBUTE=VALUES pairs
Returns: metadata (dictionary) of the form {'attribute1': ['value1', 'value2', … ], …}
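The mapping from ATTRIBUTE=VALUES strings to multi-valued keys might look roughly like the sketch below. The function name attributes_to_dict and the choice to drop double quotes from attribute names are assumptions made for illustration. Because each attribute name is a plain dictionary key, a repeated attribute such as NOTE overwrites the earlier entry in this sketch too.

```python
import re


def attributes_to_dict(metadata_elements):
    """Turn 'ATTRIBUTE="v1","v2"' strings into {'ATTRIBUTE': ['v1', 'v2']}."""
    metadata = {}
    for element in metadata_elements:
        # Split on the first '=' only; values may themselves contain '='.
        name, values = element.split('=', 1)
        # Assumption: quotes inside the attribute name are dropped.
        name = name.replace('"', '').strip()
        # Every double-quoted chunk becomes one list entry.
        metadata[name] = re.findall(r'"([^"]*)"', values)
    return metadata


elements = ['STUB="sex"', 'VALUES("sex")="male","female"']
metadata = attributes_to_dict(elements)
```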
pyaxis.parse(uri, encoding, timeout=10)
Extracts the metadata and data sections from a PC-Axis file.
Parameters:
- uri (str) – file name or URL
- encoding (str) – charset encoding
- timeout (int) – request timeout in seconds; optional
Returns: pc_axis_dict (dictionary) holding the metadata dictionary and the pandas dataframe:
- METADATA: dictionary of metadata
- DATA: pandas dataframe
pyaxis.read(uri, encoding, timeout=10)
Reads a text file from the file system or a URL.
Parameters:
- uri (str) – file name or URL
- encoding (str) – charset encoding
- timeout (int) – request timeout in seconds; optional
Returns: raw_pcaxis (str), the file contents.
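Reading from either a local path or a URL with an explicit encoding can be sketched with the standard library. This example uses urllib.request as a stand-in; it is not necessarily what the library uses internally, and the function name read_text is hypothetical.

```python
import urllib.request


def read_text(uri, encoding, timeout=10):
    """Return the decoded contents of a local file or an http(s) resource."""
    if uri.startswith(('http://', 'https://')):
        # timeout is given in seconds and only applies to network requests.
        with urllib.request.urlopen(uri, timeout=timeout) as response:
            return response.read().decode(encoding)
    # Local file: decode with the caller-supplied charset.
    with open(uri, encoding=encoding) as file_object:
        return file_object.read()
```

Passing the encoding explicitly matters for PC-Axis files, which are often published in a legacy charset such as the ISO-8859-2 used in the example at the top of this page.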
pyaxis.uri_type(uri)
Determines the type of URI.
Parameters: uri (str) – PC-Axis file name or URL
Returns: uri_type (str), either 'URL' or 'FILE'
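A classification of this kind can be as simple as a prefix check. The sketch below assumes that anything starting with http:// or https:// counts as a URL and everything else as a file; the exact rule the library applies is not stated here, and the function name classify_uri is hypothetical.

```python
import re


def classify_uri(uri):
    """Return 'URL' for http(s) addresses and 'FILE' for anything else."""
    # Assumption: only the http and https schemes are treated as URLs.
    if re.match(r'^https?://', uri):
        return 'URL'
    return 'FILE'
```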