Package `geofetch` Documentation

Package-level data

Class `Finder`

Class for finding GSE accessions in special period of time. Additionally, user can add specific filters for the search, while initialization of the class

def __init__(self, filters: str=None, retmax: int=10000000)

Parameters:

filters (``): filters that have to be added to the query.Filter Patterns can be found here: https://www.ncbi.nlm.nih.gov/books/NBK3837/#EntrezHelp.Using_the_Advanced_Search_Pag
retmax (``): maximum number of retrieved accessions.

def find_differences(old_list: list, new_list: list) -> list

Compare 2 lists and search for elements that are not in old list

Parameters:

old_list (``): old list of elements
new_list (``): new list of elements

Returns:

``: list of elements that are not in old list but are in new_list

def generate_file(self, file_path: str, gse_list: list=None)

Save the list of GSE accessions stored in this Finder object to a given file

Parameters:

file_path (``): root to the file where gse accessions have to be saved
gse_list (``): list of gse accessions

Returns:

``: NoReturn

def get_gse_all(self) -> list

Get list of all gse accession available in GEO

Returns:

``: list of gse accession

def get_gse_by_date(self, start_date: str, end_date: str=None) -> list

Search gse accessions by providing start date and end date. By default, the last date is today.

Parameters:

start_date (``): 'YYYY/MM/DD']
end_date (``): 'YYYY/MM/DD']

Returns:

``: list of gse accessions

def get_gse_by_day_count(self, n_days: int=1) -> list

Get list of gse accessions that were uploaded or updated in last X days

Parameters:

n_days (``): number of days from now [e.g. 5]

Returns:

``: list of gse accession

def get_gse_id_by_query(self, url: str) -> list

Run esearch (ncbi search tool) by specifying URL and retrieve gse list result

Parameters:

url (``): url of the query

Returns:

``: list of gse ids

def get_gse_last_3_month(self) -> list

Get list of gse accession that were uploaded or updated in last 3 month

Returns:

``: list of gse accession

def get_gse_last_week(self) -> list

Get list of gse accession that were uploaded or updated in last week

Returns:

``: list of gse accession

def uid_to_gse(uid: str) -> str

UID to GES accession converter

Parameters:

uid (``): uid string (Unique Identifier Number in GEO)

Returns:

``: GSE id string

Class `Geofetcher`

Class to download or get projects, metadata, data from GEO and SRA

def __init__(self, name: str='', metadata_root: str='', metadata_folder: str='', just_metadata: bool=False, refresh_metadata: bool=False, config_template: str=None, pipeline_samples: str=None, pipeline_project: str=None, skip: int=0, acc_anno: bool=False, use_key_subset: bool=False, processed: bool=False, data_source: str='samples', filter: str=None, filter_size: str=None, geo_folder: str='.', split_experiments: bool=False, bam_folder: str='', fq_folder: str='', sra_folder: str='', bam_conversion: bool=False, picard_path: str='', input: str=None, const_limit_project: int=50, const_limit_discard: int=1000, attr_limit_truncate: int=500, max_soft_size: str='1GB', discard_soft: bool=False, add_dotfile: bool=False, disable_progressbar: bool=False, add_convert_modifier: bool=False, opts=None, max_prefetch_size=None, **kwargs)

Constructor

Parameters:

input (``): GSEnumber or path to the input file
name (``): Specify a project name. Defaults to GSE number or name of accessions file name
metadata_root (``): Specify a parent folder location to store metadata.The project name will be added as a subfolder [Default: $SRAMETA:]
metadata_folder (``): Specify an absolute folder location to store metadata. No subfolder will be added.Overrides value of --metadata-root [Default: Not used (--metadata-root is used by default)]
just_metadata (``): If set, don't actually run downloads, just create metadata
refresh_metadata (``): If set, re-download metadata even if it exists.
config_template (``): Project config yaml file template.
pipeline_samples (``): Specify one or more filepaths to SAMPLES pipeline interface yaml files.These will be added to the project config file to make it immediately compatible with looper. [Default: null]
pipeline_project (``): Specify one or more filepaths to PROJECT pipeline interface yaml files.These will be added to the project config file to make it immediately compatible with looper. [Default: null]
acc_anno (``): Produce annotation sheets for each accession.Project combined PEP for the whole project won't be produced.
discard_soft (``): Create project without downloading soft files on the disc
add_dotfile (``): Add .pep.yaml file that points .yaml PEP file
disable_progressbar (``): Set true to disable progressbar

def fetch_all(self, input: str, name: str=None) -> Union[NoReturn, peppy.project.Project]

Main function driver/workflow Function that search, filters, downloads and save data and metadata from GEO and SRA

Parameters:

input (``): GSE or input file with gse's
name (``): Name of the project

Returns:

``: NoReturn or peppy Project

def fetch_processed_one(self, gse_file_content: list, gsm_file_content: list, gsm_filter_list: dict) -> Tuple

Fetche one processed GSE project and return its metadata

Parameters:

gsm_file_content (``): gse soft file content
gse_file_content (``): gsm soft file content
gsm_filter_list (``): list of gsm that have to be downloaded

Returns:

``: Tuple of project list of gsm samples and gse samples

def get_projects(self, input: str, just_metadata: bool=True, discard_soft: bool=True) -> dict

Function for fetching projects from GEO|SRA and receiving peppy project

Parameters:

input (``): GSE number, or path to file of GSE numbers
just_metadata (``): process only metadata
discard_soft (``): clean run, without downloading soft files

Returns:

``: peppy project or list of project, if acc_anno is set.

Version Information: geofetch v0.12.6, generated by lucidoc v0.4.4

Package geofetch Documentation

Class Finder

Parameters:

Parameters:

Returns:

Parameters:

Returns:

Returns:

Parameters:

Returns:

Parameters:

Returns:

Parameters:

Returns:

Returns:

Returns:

Parameters:

Returns:

Class Geofetcher

Parameters:

Parameters:

Returns:

Parameters:

Returns:

Parameters:

Returns:

Package `geofetch` Documentation

Class `Finder`

Class `Geofetcher`