Download Resource File(s)
=========================

.. important::
   An API Key is required for ALL downloads as of EDX 3.07 (09/03/2020).

   The API Key can be sent in the request header under any of the following names: ``X-CKAN-API-Key``, ``EDX-API-Key``, or ``Authorization``.

.. attention::
   * Add the ``"User-Agent":`` parameter within the ``headers`` of the API request
   * Set ``"User-Agent":`` to the value ``"EDX-USER"``

Example 1: Download using ``wget``
-----------------------------------

**Public Resources:**

To obtain the public URL needed to download a resource, navigate to the public submission (e.g. https://edx.netl.doe.gov/dataset/global-oil-gas-features-database), right-click the blue “Download” button, and select “Copy Link Address” (wording from Chrome). The copied URL from this example will look as follows (08/03/2022):
https://edx.netl.doe.gov/dataset/g27625d9b-4a28-4bdf-bc5c-09834f7a9dfb/resource/34280f73-526f-497b-a672-9f37313acede/download

.. code-block:: bash

   wget --header="EDX-API-Key:" https://edx.netl.doe.gov/dataset//resource//download

If you know the resource ID, you can download the resource without the submission name or ID.

.. code-block:: bash

   wget --header="EDX-API-Key:" https://edx.netl.doe.gov/resource//download

**Private Resources:**

There are two scenarios for obtaining the URL needed to download a private resource:

* Private Submission: Navigate to the private submission (e.g. https://edx.netl.doe.gov/dataset/databook-node-modules-folder-content), right-click the blue “Download” button, and select “Copy Link Address” (wording from Chrome). The copied URL from this example will look as follows (08/03/2022):
  https://edx.netl.doe.gov/dataset/e807c09e-1a63-4963-9002-1e0c3f0f025d/resource/279eec86-fac8-4504-a3cb-c791bfc98dde/download

  .. code-block:: bash

     wget --header="EDX-API-Key:" https://edx.netl.doe.gov/dataset//resource//download

* EDX Drive Resource: Navigate to the EDX Drive of a workspace, right-click a resource, and select the “Copy Download Link” option (08/03/2022) (e.g. https://edx.netl.doe.gov/resource/55bf54eb-0e70-49e7-8f00-e7057f7dced8/download).

  .. code-block:: bash

     wget --header="EDX-API-Key:" https://edx.netl.doe.gov/resource//download

.. attention::
   * Add the ``"User-Agent":`` parameter within the ``headers`` of the API request
   * Set ``"User-Agent":`` to the value ``"EDX-USER"``

.. list-table::
   :header-rows: 1

   * - Parameter Name
     - Description
     - Required Fields
   * - ``resource_id``
     - ID of the resource found in the metadata
     - **Required**

Example 2: Download using Python 3.8
-------------------------------------

.. code-block:: python

   import os
   import sys

   import requests

   headers = {
       "EDX-API-Key": 'YOUR-API-KEY-HERE',
       "User-Agent": 'EDX-USER',
   }

   params = {
       'resource_id': 'RESOURCE-ID-HERE',
   }

   url = 'https://edx.netl.doe.gov/api/3/resource_download'

   # Send a HEAD request first to read the filename and size from the headers
   print("Sending request to resource data...")
   response_head = requests.head(url, headers=headers, params=params)

   if response_head.status_code != 200:
       print(f"Failed to get resource data. Status code: {response_head.status_code}")
       sys.exit(1)

   content_disposition = response_head.headers.get('Content-Disposition')

   # Set the filename from the Content-Disposition header if available
   filename = None
   if content_disposition and 'filename=' in content_disposition:
       filename = content_disposition.split('filename=')[-1].strip('"')
   if not filename:
       # Assumption: fall back to the resource ID when no filename is provided
       filename = params['resource_id']

   # Get the content length from headers and determine resource size
   content_length = response_head.headers.get('Content-Length')
   resource_size = int(content_length) if content_length is not None else None

   print("Resource Name:", filename)
   print(f"Resource Size: {resource_size} bytes")

   # Determine if a partial file exists
   existing_size = 0
   if os.path.exists(filename):
       existing_size = os.path.getsize(filename)
       print(f"File already exists. The current file size is: {existing_size} bytes.")

       if resource_size is not None:
           print(f"Resource file size: {resource_size} bytes")
           if existing_size >= resource_size:
               print("File already fully downloaded in current directory.")
               sys.exit(0)

       # Ask the server for only the bytes that are still missing
       headers['Range'] = f'bytes={existing_size}-'
       print(f"Resuming download from byte: {existing_size}")
   else:
       print(f"Starting download for: {filename}")

   # Begin download stream (the headers are not printed because they contain the API key)
   print(f"Requesting: {url}")
   response = requests.get(url, headers=headers, params=params, stream=True)
   print(f"Download response status code: {response.status_code}")

   if response.status_code in (200, 206):
       # 206 (partial content): append to the existing file.
       # 200: the server sent the whole file, so overwrite and count from zero.
       resuming = response.status_code == 206
       mode = 'ab' if resuming else 'wb'
       total_bytes = existing_size if resuming else 0

       print(f"Saving to: {os.path.abspath(filename)}")

       with open(filename, mode) as f:
           for chunk in response.iter_content(chunk_size=8192):
               if chunk:
                   f.write(chunk)
                   total_bytes += len(chunk)

                   if resource_size:
                       percent = (total_bytes / resource_size) * 100
                       print(f"\rDownloaded: {total_bytes} bytes ({percent:.2f}%)", end='', flush=True)
                   else:
                       # If resource size is unknown, just show bytes downloaded
                       print(f"\rDownloaded: {total_bytes} bytes", end='', flush=True)

       print("\nDownload complete.")
       print(f"Total bytes downloaded: {total_bytes}")
   else:
       print(f"Download Failed. Status code: {response.status_code}")
       try:
           print("Response:", response.json())
       except Exception:
           print("Non-JSON response:", response.text)
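
The filename parsing and resume decision used in Example 2 can also be written as two small functions that are testable without any network access. This is a minimal sketch, not part of the EDX API: the names ``filename_from_content_disposition`` and ``plan_download`` are hypothetical helpers introduced here for illustration, and treating an empty local file as a fresh download is an assumption.

```python
from typing import Optional, Tuple


def filename_from_content_disposition(header: Optional[str], fallback: str) -> str:
    """Return the filename named in a Content-Disposition header, or `fallback`."""
    if header:
        # Header parameters are separated by ';', e.g. 'attachment; filename="a.csv"'
        for part in header.split(';'):
            key, _, value = part.strip().partition('=')
            if key.lower() == 'filename' and value:
                return value.strip().strip('"')
    return fallback


def plan_download(existing_size: int, resource_size: Optional[int]) -> Tuple[str, Optional[str]]:
    """Decide how to proceed and which Range header value (if any) to send.

    Returns (action, range_value), where action is 'fresh', 'skip', or 'resume'.
    """
    if existing_size <= 0:
        # Assumption: a missing or empty local file means a full download
        return 'fresh', None
    if resource_size is not None and existing_size >= resource_size:
        # The local file is already at least as large as the remote resource
        return 'skip', None
    # Request only the bytes after what is already on disk
    return 'resume', f'bytes={existing_size}-'


if __name__ == '__main__':
    name = filename_from_content_disposition('attachment; filename="data.csv"', 'RESOURCE-ID-HERE')
    action, range_value = plan_download(existing_size=40, resource_size=100)
    print(name, action, range_value)
```

When the action is ``'resume'``, the returned value would be placed in ``headers['Range']`` before the ``requests.get`` call. The server may still answer with status 200 instead of 206, in which case the file should be overwritten rather than appended to.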