GSE Finder
is a geofetch class that provides functions to find and retrieve a list of GSE (GEO accession number) by using NCBI searching tool.
The main features of the geofetch Finder are:
- Find GEO accession numbers (GSE) of the project that were uploaded or updated in certain period of time.
- Use the same filter query as GEO DataSets Advanced Search Builder is using
- Save list of the GSEs to file (This file with geo can be used later in geofetch)
- Easier and faster to get GSEs using NCBI filter and certain period of time.
Tutorial
0) Initiale Finder object.
from geofetch import Finder
gse_obj = Finder()
# Optionally: provide filter string and max number of retrieve elements
gse_obj = Finder(filters="((bed) OR narrow peak) AND Homo sapiens[Organism]", retmax=10)
1) Get list of all GSE in GEO
gse_list = gse_obj.get_gse_all()
2) Get list of GSE that were uploaded and updated last week
gse_list = gse_obj.get_gse_last_week()
3) Get list of GSE that were uploaded and updated last 3 month
gse_list = gse_obj.get_gse_last_3_month()
4) Get list of GSE that were uploaded and updated in las number of days
# project that were uploaded in last 5 days:
gse_list = gse_obj.get_gse_by_day_count(5)
5) Get list of GSE that were uploaded in certain period of time
gse_list = gse_obj.get_gse_by_date(start_date="2015/05/05", end_date="2020/05/05")
6) Save last searched list of items to the file
gse_obj.generate_file("path/to/the/file")
# if you want to save different list of files you can provide it to the function
gse_obj.generate_file("path/to/the/file", gse_list=["123", "124"])
7) Compare two lists:
new_gse_list = gse_obj.find_differences(list1, list2)
More information about gse and queries and id: - https://www.ncbi.nlm.nih.gov/geo/info/geo_paccess.html - https://newarkcaptain.com/how-to-retrieve-ncbi-geo-information-using-apis-part1/ - https://www.ncbi.nlm.nih.gov/books/NBK3837/#EntrezHelp.Using_the_Advanced_Search_Pag