Advanced Functionalities
NOAADataset (pyleotups.utils.NOAADataset)
- class pyleotups.utils.NOAADataset.NOAADataset(study_data)[source]
This class encapsulates study metadata and its related components (e.g. publications, sites) retrieved from the NOAA API.
- study_id
The unique NOAA study identifier.
- Type:
str
- xml_id
The XML identifier of the study.
- Type:
str
- metadata
A dictionary containing basic metadata such as studyName, dataType, and earliestYearBP.
- Type:
dict
- investigators
A comma-separated string of investigator names.
- Type:
str
- publications
A list of Publication objects associated with the study.
- Type:
list of Publication
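Because investigators is stored as a single comma-separated string, downstream code often needs to split it into individual names. The following is a minimal sketch under stated assumptions: it does not use pyleotups, the helper name split_investigators is hypothetical, and the sample string assumes each name contains no internal comma.

```python
from typing import List


def split_investigators(investigators: str) -> List[str]:
    """Split a comma-separated investigator string into trimmed names.

    Hypothetical helper: assumes no individual name contains a comma.
    """
    return [name.strip() for name in investigators.split(",") if name.strip()]


# Illustrative value only, not taken from a real NOAA study.
names = split_investigators("Jane Doe, John Smith , ")
print(names)
```

If a study lists names in "Last, First" form, this split would break them apart, so check the actual string before relying on it.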
PaleoData (pyleotups.utils.PaleoData)
- class pyleotups.utils.PaleoData.PaleoData(paleo_data, study_id, site_id)[source]
Represents paleo data associated with a site, including multiple data files and full variable metadata per file.
- datatable_id
Unique NOAA data table identifier.
- Type:
str
- dataTableName
Name of the data table.
- Type:
str
- timeUnit
Time unit used in the data table.
- Type:
str
- files
List of raw file info dicts.
- Type:
list of dict
- file_variable_map
Maps fileUrl to a dict of variables and their full metadata.
- Type:
dict
- file_url
Shortcut to first file URL (for backward compatibility).
- Type:
str or np.nan
- variables
Shortcut to variable names in first file (for backward compatibility).
- Type:
list of str
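The relationship between file_variable_map and the two backward-compatibility shortcuts can be sketched as follows. This is an illustration only, not the library's code: the function first_file_shortcuts, the example URLs, and the "units" metadata key are assumptions, and math.nan stands in for np.nan so the block needs no third-party import.

```python
import math


def first_file_shortcuts(file_variable_map):
    """Return (file_url, variables) for the first file in the map.

    Falls back to (nan, []) when the map is empty, mirroring the
    documented "str or np.nan" type of file_url.
    """
    if not file_variable_map:
        return math.nan, []
    # Dicts preserve insertion order, so the first key is the first file.
    first_url = next(iter(file_variable_map))
    return first_url, list(file_variable_map[first_url])


# Hypothetical data: file URL -> {variable name -> variable metadata}.
file_variable_map = {
    "https://example.org/data/core-a.txt": {
        "depth_cm": {"units": "cm"},
        "d18O": {"units": "per mil"},
    },
    "https://example.org/data/core-b.txt": {
        "age_calBP": {"units": "cal yr BP"},
    },
}

file_url, variables = first_file_shortcuts(file_variable_map)
print(file_url, variables)
```

When a data table has more than one file, the shortcuts expose only the first one; iterate over file_variable_map to reach the rest.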
Publication (pyleotups.utils.Publication)
- class pyleotups.utils.Publication.Publication(pub_data)[source]
Represents a publication within a study.
- author
The name(s) of the publication's author(s).
- Type:
str
- title
The title of the publication.
- Type:
str
- journal
The journal where the publication appeared.
- Type:
str
- year
The publication year.
- Type:
str
- volume
The volume number (if applicable).
- Type:
str or None
- number
The issue number (if applicable).
- Type:
str or None
- pages
The page numbers (if applicable).
- Type:
str or None
- pub_type
The type of publication.
- Type:
str or None
- doi
The Digital Object Identifier.
- Type:
str or None
- url
URL for the publication.
- Type:
str or None
- study_id
The NOAA study ID to which this publication belongs.
- Type:
str or None
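Several Publication attributes may be None, so any code that builds a citation string has to skip the missing parts. The sketch below shows one way to do that. It uses a stand-in dataclass (PublicationRecord) with a subset of the attribute names listed above rather than the real Publication class, and the citation layout is an assumption, not a format the library prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublicationRecord:
    """Stand-in for Publication with a subset of the documented attributes."""
    author: str
    title: str
    journal: str
    year: str
    volume: Optional[str] = None
    number: Optional[str] = None
    pages: Optional[str] = None
    doi: Optional[str] = None


def format_citation(pub: PublicationRecord) -> str:
    """Build a citation string, omitting any optional field that is None."""
    citation = f"{pub.author} ({pub.year}). {pub.title}. {pub.journal}"
    if pub.volume:
        issue = f"({pub.number})" if pub.number else ""
        citation += f", {pub.volume}{issue}"
    if pub.pages:
        citation += f", {pub.pages}"
    citation += "."
    if pub.doi:
        citation += f" doi:{pub.doi}"
    return citation


# Illustrative record, not a real publication.
example = PublicationRecord(
    author="Doe, J.", title="A title", journal="Journal X",
    year="2020", volume="5", pages="1-10",
)
print(format_citation(example))
```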
Site (pyleotups.utils.Site)
Parsers
NonStandardParser (pyleotups.utils.Parser.NonStandardParser)
- class pyleotups.utils.Parser.NonStandardParser.NonStandardParser(file_path)[source]
Parser for NOAA files that do not follow standard metadata formatting.
- file_path
Path to the file to be parsed.
- Type:
str
- lines
Lines read from the file.
- Type:
list of str
- blocks
Segregated blocks of lines with associated metadata.
- Type:
list of dict
- detect_header_extent(block, delimiter)[source]
Detects how many initial lines qualify as header rows.
- Parameters:
block (dict) – Block of lines.
delimiter (str) – Delimiter used to split lines.
- Returns:
Number of header lines, and index of title line if found.
- Return type:
tuple of (int, Optional[int])
- parse()[source]
Parses the file and extracts tabular data.
- Returns:
List of extracted tables.
- Return type:
list of pandas.DataFrame
- Raises:
ParsingError – If no usable tables are found.
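To make the idea behind detect_header_extent concrete, here is one plausible heuristic for counting header rows. It is a sketch under stated assumptions, not the library's implementation: the function guess_header_extent is hypothetical, it takes a plain list of lines instead of a block dict, it treats any line without a numeric token as a header, and it treats the first single-token header line as the title.

```python
from typing import List, Optional, Tuple


def _is_number(token: str) -> bool:
    """Return True if the token can be read as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def guess_header_extent(lines: List[str], delimiter: str) -> Tuple[int, Optional[int]]:
    """Count leading non-numeric lines and locate a title line, if any.

    Returns (number_of_header_lines, title_index_or_None).
    """
    header_count = 0
    title_index = None
    for i, line in enumerate(lines):
        tokens = [t.strip() for t in line.split(delimiter) if t.strip()]
        # The first line containing a numeric token is taken as data.
        if any(_is_number(t) for t in tokens):
            break
        # A header line with a single token is treated as the title.
        if title_index is None and len(tokens) == 1:
            title_index = i
        header_count += 1
    return header_count, title_index


# Illustrative block of lines, not from a real NOAA file.
sample_lines = ["Core A isotope record", "depth\td18O", "1.0\t-3.2", "2.0\t-3.4"]
extent = guess_header_extent(sample_lines, "\t")
print(extent)
```

A heuristic like this misfires when header cells are themselves numeric (for example, years used as column names), which is one reason a real parser needs additional checks.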
StandardParser (pyleotups.utils.Parser.StandardParser)
- exception pyleotups.utils.Parser.StandardParser.ParsingError[source]
Exception raised when the StandardParser encounters a parsing error.
- class pyleotups.utils.Parser.StandardParser.StandardParser(url=None)[source]
StandardParser parses NOAA .txt data files in the standard format, that is, the NOAA templated layout: metadata on lines beginning with #, variable definitions on lines beginning with ##, and tab-delimited data.
- url
URL of the file to parse.
- Type:
str
- lines
Fetched lines from file.
- Type:
list of str
- meta_start
Index where metadata block starts.
- Type:
int
- meta_end
Index where metadata block ends.
- Type:
int
- variables
Extracted variable names.
- Type:
list of str
- skip_lines
Lines to skip after metadata to reach data.
- Type:
int
- data
Parsed data rows.
- Type:
list of list of str
- df
Final constructed dataframe.
- Type:
pandas.DataFrame
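The templated layout described above (# metadata lines, ## variable lines, tab-delimited data) can be separated with a few lines of standard-library Python. The sketch below is not StandardParser itself: the function parse_templated_text is hypothetical, it works on an in-memory string instead of fetching a URL, it returns plain lists instead of a pandas.DataFrame, and it assumes the variable's short name is the first tab-separated field after the ## prefix.

```python
import csv


def parse_templated_text(text):
    """Split templated text into (variable names, data rows)."""
    variables = []
    data_lines = []
    for line in text.splitlines():
        if line.startswith("##"):
            # Variable line: take the first tab-separated field as the name.
            name = line[2:].split("\t")[0].strip()
            if name:
                variables.append(name)
        elif line.startswith("#"):
            # Plain metadata line: ignored in this sketch.
            continue
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.reader(data_lines, delimiter="\t"))
    # Drop a leading column-header row if it repeats the variable names.
    if rows and rows[0] == variables:
        rows = rows[1:]
    return variables, rows


# Hypothetical file content, written to resemble the templated layout.
sample = "\n".join([
    "# Title: hypothetical example",
    "# Variables",
    "## depth_cm\tdepth,,,centimeter,,,,,N",
    "## d18O\tdelta 18O,,,per mil,,,,,N",
    "# Data:",
    "depth_cm\td18O",
    "1.0\t-3.2",
    "2.0\t-3.4",
])

variables, rows = parse_templated_text(sample)
print(variables)
print(rows)
```

The rows come back as lists of strings, matching the documented type of the data attribute; converting them to numeric columns is left to the caller.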