hdx.scraper.geonode.geonodetohdx

GeoNode Utilities:

Reads from GeoNode servers and creates datasets.

create_dataset_showcase

def create_dataset_showcase(dataset: Dataset, showcase: Showcase,
                            **kwargs: Any) -> None

Create dataset and showcase

Arguments:

  • dataset Dataset - Dataset to create
  • showcase Showcase - Showcase to create
  • **kwargs - Args to pass to dataset create_in_hdx call

Returns:

None
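
Because generate_datasets_and_showcases accepts any callable of this shape through its create_dataset_showcase parameter, a dry-run replacement can be substituted. The following is a minimal sketch under stated assumptions: make_dry_run_recorder is a hypothetical helper, and it relies only on dict-style access to a "name" key, so plain dictionaries stand in for Dataset and Showcase objects.

```python
from typing import Any, Callable, List, Mapping, Tuple


def make_dry_run_recorder() -> Tuple[Callable[..., None], List[Tuple[str, str]]]:
    """Return a callback shaped like create_dataset_showcase, plus its log.

    Hypothetical helper: it records names instead of creating anything in HDX.
    """
    created: List[Tuple[str, str]] = []

    def record(dataset: Mapping, showcase: Mapping, **kwargs: Any) -> None:
        # Assumes both objects expose their name via dict-style access.
        created.append((dataset["name"], showcase["name"]))

    return record, created


# Plain dicts stand in for Dataset and Showcase objects here.
record, created = make_dry_run_recorder()
# example_option is an arbitrary keyword that the recorder simply ignores.
record({"name": "example-dataset"}, {"name": "example-dataset-showcase"}, example_option=True)
print(created)
```

Passing record in place of the default would let a run be inspected before anything is written, assuming the caller forwards the dataset and showcase as positional arguments.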

delete_from_hdx

def delete_from_hdx(dataset: Dataset) -> None

Delete dataset and any associated showcases

Arguments:

  • dataset Dataset - Dataset to delete

Returns:

None

GeoNodeToHDX Objects

class GeoNodeToHDX()

Utilities to bring GeoNode data into HDX. hdx_geonode_config_yaml points to a YAML file that overrides base values and is in this format:

    ignore_data:
      - deprecated

    category_mapping:
      Elevation: 'elevation - topography - altitude'
      'Inland Waters': river

    titleabstract_mapping:
      bridges:
        - bridges
        - transportation
        - 'facilities and infrastructure'
      idp:
        camp:
          - 'displaced persons locations - camps - shelters'
          - 'internally displaced persons - idp'
        else:
          - 'internally displaced persons - idp'

Arguments:

  • geonode_url str - GeoNode server url
  • downloader Download - Download object from HDX Python Utilities
  • hdx_geonode_config_yaml Optional[str] - Configuration file for scraper
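
To illustrate how the nested titleabstract_mapping in the configuration format could be read, here is a small sketch. The matching rules are assumptions rather than the library's confirmed logic: a term matches when it occurs in the lower-cased text, a list value contributes its tags directly, and a dictionary value contributes the tags of any matching sub-term or, failing that, the tags under else. The function tags_for_text is a hypothetical name.

```python
from typing import Dict, List, Union

Mapping = Dict[str, Union[Dict[str, List[str]], List[str]]]

# Same values as the titleabstract_mapping example in the class description.
TITLEABSTRACT_MAPPING: Mapping = {
    "bridges": ["bridges", "transportation", "facilities and infrastructure"],
    "idp": {
        "camp": [
            "displaced persons locations - camps - shelters",
            "internally displaced persons - idp",
        ],
        "else": ["internally displaced persons - idp"],
    },
}


def tags_for_text(text: str, mapping: Mapping = TITLEABSTRACT_MAPPING) -> List[str]:
    """Derive tags from a title or abstract (assumed semantics, see lead-in)."""
    lowered = text.lower()
    tags: List[str] = []
    for term, value in mapping.items():
        if term not in lowered:
            continue
        if isinstance(value, list):
            tags.extend(value)
            continue
        matched = False
        for subterm, subtags in value.items():
            if subterm != "else" and subterm in lowered:
                tags.extend(subtags)
                matched = True
        if not matched:
            # No sub-term matched, so fall back to the "else" tags if present.
            tags.extend(value.get("else", []))
    return tags


print(tags_for_text("IDP camp locations"))
print(tags_for_text("IDP movements"))
print(tags_for_text("Road bridges"))
```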

get_ignore_data

def get_ignore_data() -> List[str]

Get terms in the abstract that mean that the dataset should not be added to HDX

Returns:

  • List[str] - List of terms in the abstract that mean that the dataset should not be added to HDX
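
As an illustration of how such terms might be applied, the sketch below checks an abstract against a list of ignore terms. The helper should_ignore is hypothetical, and case-insensitive substring matching is an assumption of this sketch, not documented behaviour.

```python
from typing import List


def should_ignore(abstract: str, ignore_data: List[str]) -> bool:
    """Hypothetical helper: True if any ignore term occurs in the abstract.

    Case-insensitive matching is an assumption of this sketch.
    """
    lowered = abstract.lower()
    return any(term.lower() in lowered for term in ignore_data)


ignore_terms = ["deprecated"]  # the value from the example configuration
print(should_ignore("Deprecated: superseded by a newer layer", ignore_terms))
print(should_ignore("Health facilities", ignore_terms))
```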

get_category_mapping

def get_category_mapping() -> Dict[str, str]

Get mappings from the category field category__gn_description to HDX metadata tags

Returns:

  • Dict[str, str] - Dictionary of mappings from the category field category__gn_description to HDX metadata tags

get_titleabstract_mapping

def get_titleabstract_mapping() -> Dict[str, Union[Dict, List]]

Get mappings from terms in the title or abstract to HDX metadata tags

Returns:

  • Dict[str, Union[Dict, List]] - Dictionary of mappings from terms in the title or abstract to HDX metadata tags

get_countries

def get_countries(use_count: bool = True) -> List[Dict]

Get countries from GeoNode

Arguments:

  • use_count bool - Whether to use null count metadata to exclude countries. Defaults to True.

Returns:

  • List[Dict] - List of countries in form (iso3 code, name)

get_layers

def get_layers(countryiso: Optional[str] = None) -> List[Dict]

Get layers from GeoNode optionally for a particular country

Arguments:

  • countryiso Optional[str] - ISO 3 code of country from which to get layers. Defaults to None (all countries).

Returns:

  • List[Dict] - List of layers
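
The two methods above can be combined to collect layers country by country. This is a sketch only: layers_by_country is a hypothetical helper, the "iso3" key of each country dictionary is an assumption, and FakeGeoNode is a stand-in with made-up data so that the example runs without a GeoNode server.

```python
from typing import Any, Dict, List, Optional


def layers_by_country(geonode: Any) -> Dict[str, List[Dict]]:
    """Collect layers per country from an object exposing get_countries and get_layers.

    Assumption: each country dictionary carries its ISO 3 code under "iso3".
    """
    result: Dict[str, List[Dict]] = {}
    for country in geonode.get_countries(use_count=True):
        iso3 = country["iso3"]
        result[iso3] = geonode.get_layers(countryiso=iso3)
    return result


class FakeGeoNode:
    """Stand-in with made-up data; it is not the real GeoNodeToHDX class."""

    def get_countries(self, use_count: bool = True) -> List[Dict]:
        return [{"iso3": "AFG", "name": "Afghanistan"}]

    def get_layers(self, countryiso: Optional[str] = None) -> List[Dict]:
        return [{"title": "Example layer", "country": countryiso}]


print(layers_by_country(FakeGeoNode()))
```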

get_orgname

@staticmethod
def get_orgname(metadata: Dict, orgclass: Type = Organization) -> str

Get the organisation name from the metadata dictionary if it is present; otherwise use the orgid in the dictionary to look up the organisation name.

Arguments:

  • metadata Dict - Dictionary containing keys: maintainerid, orgid, updatefreq, subnational
  • orgclass Type - Class to use for look up. Defaults to Organization.

Returns:

  • str - Organisation name
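
The fallback described above can be sketched as follows. Several details are assumptions: that the optional name is stored under the key "orgname", and that the lookup class offers a read_from_hdx(identifier) class method returning a dict-like object with a "name" key. FakeOrganization is a stand-in with made-up data so that no HDX connection is needed.

```python
from typing import Dict, Type


def resolve_orgname(metadata: Dict, orgclass: Type) -> str:
    """Sketch of the described fallback (see the assumptions in the lead-in)."""
    orgname = metadata.get("orgname")
    if orgname:
        return orgname
    # No name supplied, so look the organisation up by its id.
    organisation = orgclass.read_from_hdx(metadata["orgid"])
    return organisation["name"]


class FakeOrganization:
    """Stand-in lookup class with made-up data."""

    @classmethod
    def read_from_hdx(cls, identifier: str) -> Dict:
        return {"id": identifier, "name": "Example Organisation"}


print(resolve_orgname({"orgname": "Given Name", "orgid": "abc"}, FakeOrganization))
print(resolve_orgname({"orgid": "abc"}, FakeOrganization))
```

Injecting the lookup class, as the orgclass parameter does, is what makes this kind of stand-in possible in tests.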

generate_dataset_and_showcase

def generate_dataset_and_showcase(
    countryiso: str,
    layer: Dict,
    metadata: Dict,
    get_date_from_title: bool = False,
    process_dataset_name: Callable[[str], str] = lambda x: x,
    dataset_codlevel_mapping: Dict[str, List] = dict(),
    dataset_tags_mapping: Dict[str, List] = dict()
) -> Tuple[Optional[Dataset], Optional[List], Optional[Showcase]]

Generate dataset and showcase for GeoNode layer

Arguments:

  • countryiso str - ISO 3 code of country
  • layer Dict - Data about layer from GeoNode
  • metadata Dict - Dictionary containing keys: maintainerid, orgid, updatefreq, subnational
  • get_date_from_title bool - Whether to extract dates from the title and remove them from it. Defaults to False.
  • process_dataset_name Callable[[str], str] - Function to change the dataset name. Defaults to lambda x: x.
  • dataset_codlevel_mapping Dict[str, List] - Mapping from dataset name to cod levels. Defaults to empty dictionary.
  • dataset_tags_mapping Dict[str, List] - Mapping from dataset name to additional tags. Defaults to empty dictionary.

Returns:

  • Tuple[Optional[Dataset], Optional[List], Optional[Showcase]] - Dataset, date ranges found in the dataset title, and Showcase, or (None, None, None)
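
The process_dataset_name parameter takes any function from str to str. As a sketch of what such a function might look like, the hypothetical shorten_dataset_name below collapses repeated hyphens and caps the length; the limit of 90 characters is an arbitrary choice for illustration, not an HDX rule taken from this documentation.

```python
import re


def shorten_dataset_name(name: str, max_length: int = 90) -> str:
    """Hypothetical name processor: collapse repeated hyphens and cap the length."""
    collapsed = re.sub(r"-{2,}", "-", name).strip("-")
    # Trim to the limit, then drop any hyphen left dangling at the cut.
    return collapsed[:max_length].rstrip("-")


print(shorten_dataset_name("example--layer---name-"))
```

It could then be supplied as process_dataset_name=shorten_dataset_name.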

generate_datasets_and_showcases

def generate_datasets_and_showcases(
        metadata: Dict,
        create_dataset_showcase: Callable[[Dataset, Showcase, Any],
                                          None] = create_dataset_showcase,
        use_count: bool = True,
        countrydata: Dict[str, Optional[str]] = None,
        get_date_from_title: bool = False,
        process_dataset_name: Callable[[str], str] = lambda x: x,
        dataset_codlevel_mapping: Dict[str, List] = dict(),
        dataset_tags_mapping: Dict[str, List] = dict(),
        **kwargs: Any) -> List[str]

Generate datasets and showcases for all GeoNode layers

Arguments:

  • metadata Dict - Dictionary containing keys: maintainerid, orgid, updatefreq, subnational
  • create_dataset_showcase Callable[[Dataset, Showcase, Any], None] - Function to call to create dataset and showcase
  • use_count bool - Whether to use null count metadata to exclude countries. Defaults to True.
  • countrydata Dict[str, Optional[str]] - Dictionary of countrydata. Defaults to None (read from GeoNode).
  • get_date_from_title bool - Whether to extract dates from the title and remove them from it. Defaults to False.
  • process_dataset_name Callable[[str], str] - Function to change the dataset name. Defaults to lambda x: x.
  • dataset_codlevel_mapping Dict[str, List] - Mapping from dataset name to cod levels. Defaults to empty dictionary.
  • dataset_tags_mapping Dict[str, List] - Mapping from dataset name to additional tags. Defaults to empty dictionary.
  • **kwargs - Args to pass to dataset create_in_hdx call

Returns:

  • List[str] - List of names of datasets added or updated

delete_other_datasets

def delete_other_datasets(
        datasets_to_keep: List[str],
        metadata: Dict,
        delete_from_hdx: Callable[[Dataset], None] = delete_from_hdx) -> None

Delete all GeoNode datasets and associated showcases in HDX where layers have been deleted from the GeoNode server.

Arguments:

  • datasets_to_keep List[str] - List of dataset names that are to be kept (they were added or updated)
  • metadata Dict - Dictionary containing keys: maintainerid, orgid, updatefreq, subnational
  • delete_from_hdx Callable[[Dataset], None] - Function to call to delete dataset

Returns:

None
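
Since generate_datasets_and_showcases returns the names of datasets added or updated and delete_other_datasets takes the names to keep, the two calls chain naturally. The sketch below shows that order of operations; sync_geonode_to_hdx is a hypothetical wrapper, and FakeGeoNodeToHDX is a stand-in that records calls with made-up names rather than the real class.

```python
from typing import Any, Dict, List


def sync_geonode_to_hdx(geonodetohdx: Any, metadata: Dict) -> List[str]:
    """Create or update datasets, then delete those no longer on the server."""
    kept = geonodetohdx.generate_datasets_and_showcases(metadata)
    geonodetohdx.delete_other_datasets(kept, metadata)
    return kept


class FakeGeoNodeToHDX:
    """Stand-in that records calls; the dataset names are made up."""

    def __init__(self) -> None:
        self.kept_on_delete: List[str] = []

    def generate_datasets_and_showcases(self, metadata: Dict, **kwargs: Any) -> List[str]:
        return ["example-dataset-1", "example-dataset-2"]

    def delete_other_datasets(self, datasets_to_keep: List[str], metadata: Dict) -> None:
        self.kept_on_delete = list(datasets_to_keep)


fake = FakeGeoNodeToHDX()
# Placeholder values only; real metadata would hold actual HDX identifiers.
metadata = {
    "maintainerid": "maintainer-id",
    "orgid": "org-id",
    "updatefreq": "update-frequency",
    "subnational": True,
}
print(sync_geonode_to_hdx(fake, metadata))
print(fake.kept_on_delete)
```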