{"private": false, "num_tags": 11, "num_resources": 0, "submission_authors": [], "id": "93647c78-d12d-4202-879c-77f0036e1d4c", "metadata_created": "2016-05-23T06:40:18.036961", "metadata_modified": "2016-05-23T06:40:18.036961", "state": "active", "creator_user_id": "76242f93-446d-40d2-9771-789bf7e49aeb", "type": "dataset", "resources": [{"rating": null, "cache_last_updated": null, "tag_string": "", "revision_timestamp": "May 23, 2016, 10:38:51 (EST)", "package_id": "93647c78-d12d-4202-879c-77f0036e1d4c", "file": "", "owner": "admin", "datastore_active": false, "id": "69fa940d-a0d8-4ff5-969e-60600dd85675", "size": null, "state": "active", "pkg_name": "geodeepdive", "last_modified": null, "hash": "", "description": "Synthesizing data from the published literature is critical to addressing a wide range of questions, ranging from the history and future of global biodiversity to the evolution of continental crust. Doing so manually, however, can be prohibitively time consuming and produces a monolithic database that is disconnected from primary sources that are difficult to fully cite.\r\n\r\nWe are building a scalable, dependable cyberinfrastructure to facilitate new approaches to the discovery, acquisition, utilization, and citation of data and knowledge in the published literature.", "format": "HTML", "mimetype_inner": null, "folder_id": "root", "url_type": null, "recycle_removed": false, "intended_use_auth": false, "mimetype": null, "cache_url": null, "name": "GeoDeepDive", "created": "2016-05-18T14:54:02.462818", "url": "https://geodeepdive.org/", "owner_org": null, "license_type": "no-license-restriction", "position": 0, "revision_id": "40503161-8070-984f-7e5d-a2ab4a5650de", "resource_type": "link"}], "tags": [{"vocabulary_id": null, "state": "active", "display_name": "atmosphere", "id": "c4657ece-0f71-4a9e-85a0-af57ae1f93fc", "name": "atmosphere"}, {"vocabulary_id": null, "state": "active", "display_name": "geochemistry", "id": "bc506bb7-2b10-431a-9823-e3d29ea791d2", "name": "geochemistry"}, {"vocabulary_id": null, "state": "active", "display_name": "geology", "id": "1b3d17bf-1c67-49ff-bcaa-dbeb8627179c", "name": "geology"}, {"vocabulary_id": null, "state": "active", "display_name": "geophysics", "id": "0fa41fd5-e0a0-4b78-96ce-288b6dddcf42", "name": "geophysics"}, {"vocabulary_id": null, "state": "active", "display_name": "hydrology", "id": "1b60dcac-13ab-4957-93f6-a2a877d895f3", "name": "hydrology"}, {"vocabulary_id": null, "state": "active", "display_name": "literature", "id": "cddfed25-0e63-414b-bf51-490668d9bcd3", "name": "literature"}, {"vocabulary_id": null, "state": "active", "display_name": "marine geology", "id": "20705953-dcd6-43c9-8fef-83c06d3ede9f", "name": "marine geology"}, {"vocabulary_id": null, "state": "active", "display_name": "paleoclimate", "id": "25ebf1df-6d87-4237-ac5e-36d17f696a6c", "name": "paleoclimate"}, {"vocabulary_id": null, "state": "active", "display_name": "planetary geology", "id": "6749fe93-11f1-4a7b-9ff3-afa89a7fd1ff", "name": "planetary geology"}, {"vocabulary_id": null, "state": "active", "display_name": "publications", "id": "000c1413-bccd-4658-8292-aaf70e4cff31", "name": "publications"}, {"vocabulary_id": null, "state": "active", "display_name": "water resources", "id": "1c2b3417-6e20-4e36-b0ee-0a4c917974a6", "name": "water resources"}], "package_reviewed": true, "name": "geodeepdive", "isopen": false, "notes": "From the about section: Synthesizing data from the published literature is critical to addressing a wide range of questions, ranging from the history and future of global biodiversity to the evolution of continental crust. Doing so manually, however, can be prohibitively time consuming and produces a monolithic database that is disconnected from primary sources that are difficult to fully cite.\r\n\r\nWe are building a scalable, dependable cyberinfrastructure to facilitate new approaches to the discovery, acquisition, utilization, and citation of data and knowledge in the published literature.\r\n\r\nThe primary focus of this U.S. National Science Foundation EarthCube building block project (NSF ICER 1343760) is the construction of a cyberinfrastructure that is capable of supporting end-to-end text and data mining (TDM) and knowledge base creation/augmentation activities in the geosciences and biosciences. The infrastructure includes the following key components:\r\n\r\nAutomated, rate-controlled and authenticated original document fetching\r\nSecure original document storage and bibliographic/source metadata management\r\nAutomated pre-processing of documents by multiple software tools; ability to quickly deploy new tools/versions of tools across all documents\r\nAPI for basic full-text search and discovery capabilities\r\nAbility to pre-index content using external dictionaries (e.g., Macrostrat lithologies)\r\nAbility to generate fully documented, bibliographically complete testing and development datasets based on user-supplied terms\r\nCapacity to support the deployment of user-developed TDM applications across full corpus, with on-demand updates as new relevant documents are acquired", "extras": [{"key": "citation", "value": "Notes on citations from the site: \"Every word and datum that can be derived from our infrastructure is fully traceable back to the original content provided by our partner publishers and organizations. Users must provide full citation and, when relevant, URL links back to all of the original works that contributed data to an application or result. The GeoDeepDive infrastructure can also be cited and we welcome new collaborations, both scientific and informatic.\""}, {"key": "netl_product", "value": "no"}], "title": "GeoDeepDive", "revision_id": "f08f3f1e-688a-4da4-af7f-b2f3c3c95447"}