2. DatabaseManager#
- class DatabaseManager(mode='local')#
Bases: object
WORK IN PROGRESS
Provides methods to connect to a MongoDB database on the LASP Server, load data from that database, and insert and save generated output data into it.
- Parameters:
mode (str, optional) – DESC. Defaults to 'local'.
Attributes
client – Get the MongoClient instance.
db – Get the database instance.
fs – Get the GridFS instance for storing files.
Methods
close_connection() – Close the connection to the MongoDB database.
compute_md5(file_path)
create_iteration_document(process_id) – Initialize an iteration collection if it doesn't exist and create an iteration table for the existing process.
create_process_document(process_name) – Initialize a process collection if it doesn't exist and create a new process document.
delete_data(path[, mode]) – Deletes data from MongoDB and GridFS by file, folder, or full folder tree.
download_data(path[, mode, overwrite]) – Downloads files from MongoDB GridFS and reconstructs the original folder structure.
format_user_name(user_email) – Converts an email (or username) into a 'Firstname Lastname' format.
generate_md5_report([base_dir]) – Compare local files to database entries using MD5 hash values.
get_outlier_flags([process_id, user_id])
get_output_file(output_type[, process_id, ...]) – Retrieve and download a file from MongoDB GridFS based on process and iteration identifiers.
get_pass_measurement(sub_tables, start_time, ...) – Fetch filtered data from MongoDB based on a range of [doy, year, sec] and specified tables from the pass table.
store_iteration_output(file_paths, ...) – Store files into the MongoDB database associated with an iteration.
upload_data(path[, mode, overwrite]) – Uploads data from the local filesystem to MongoDB GridFS.
upload_files_recursively(dir_path, ...[, ...]) – Recursively finds all files in a given directory and uploads them to MongoDB GridFS.
upload_single_file(file_path, collection_name) – Upload a single file to MongoDB GridFS.
- close_connection()#
Close the connection to the MongoDB database.
- compute_md5(file_path)#
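The source gives no description for compute_md5. As a rough illustration of what a helper with this name commonly does, the sketch below assumes it returns the hexadecimal MD5 digest of the file contents. The name compute_md5_sketch and the chunk_size argument are hypothetical and are not part of DatabaseManager.

```python
import hashlib


def compute_md5_sketch(file_path, chunk_size=8192):
    """Return the hex MD5 digest of a file, reading it in chunks (hypothetical helper)."""
    digest = hashlib.md5()
    with open(file_path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat, which matters if the files being hashed are large data products.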
- create_iteration_document(process_id: str)#
Initialize an iteration collection if it doesn’t exist and create an iteration table for the existing process.
- Parameters:
process_id (ObjectId) – ID of the process.
- Returns:
ID of the newly created iteration document.
- Return type:
ObjectId
- create_process_document(process_name: str)#
Initialize a process collection if it doesn’t exist and create a new process document.
- Parameters:
process_name (str) – Name of the process.
- Returns:
ID of the newly created process document.
- Return type:
ObjectId
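The two create_*_document methods can be pictured as plain inserts, since MongoDB creates a collection on the first insert into it. The sketch below is an assumption about the general shape, not the actual implementation. The collection name "processes" and the field names "name", "process_id" and "created_at" are hypothetical, and db is expected to behave like a pymongo Database.

```python
from datetime import datetime, timezone


def create_process_sketch(db, process_name):
    """Insert a process document and return its id (collection and fields are assumed)."""
    result = db["processes"].insert_one(
        {"name": process_name, "created_at": datetime.now(timezone.utc)}
    )
    return result.inserted_id


def create_iteration_sketch(db, process_id):
    """Insert an iteration document that points back at its process (assumed layout)."""
    result = db["iterations"].insert_one(
        {"process_id": process_id, "created_at": datetime.now(timezone.utc)}
    )
    return result.inserted_id
```

With a real pymongo Database, each returned value would be the ObjectId of the newly inserted document.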
- delete_data(path: str, mode: str = 'file')#
Deletes data from MongoDB and GridFS by file, folder, or full folder tree. Prompts the user with a clear warning before deletion.
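A confirmation step of the kind described could look roughly like the sketch below. The wording of the warning, the accepted answer 'yes', and the injectable ask argument are all assumptions made for illustration. The real prompt may differ.

```python
def confirm_deletion_sketch(path, mode="file", ask=input):
    """Return True only if the user types 'yes' after seeing the warning (hypothetical)."""
    answer = ask(
        f"WARNING: this will permanently delete {mode} '{path}'. Type 'yes' to continue: "
    )
    # Anything other than an explicit 'yes' is treated as a refusal.
    return answer.strip().lower() == "yes"
```

Passing the prompt function in as ask makes the check easy to exercise in tests without blocking on keyboard input.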
- download_data(path: str, mode: str = 'file', overwrite=False)#
Downloads files from MongoDB GridFS and reconstructs the original folder structure.
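Reconstructing a folder structure amounts to recreating each file's parent directories before writing it. The sketch below assumes the stored files are available as (relative_path, data) pairs with forward-slash paths, and that overwrite=False leaves existing local files untouched. Both are assumptions, and restore_tree_sketch is a hypothetical name.

```python
import os


def restore_tree_sketch(entries, target_dir, overwrite=False):
    """Write (relative_path, data) pairs under target_dir, recreating subfolders."""
    written = []
    for rel_path, data in entries:
        destination = os.path.join(target_dir, *rel_path.split("/"))
        if os.path.exists(destination) and not overwrite:
            continue  # Leave existing local files untouched.
        parent = os.path.dirname(destination)
        if parent:
            os.makedirs(parent, exist_ok=True)
        with open(destination, "wb") as handle:
            handle.write(data)
        written.append(destination)
    return written
```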
- format_user_name(user_email: str) → str#
Converts an email (or username) into a ‘Firstname Lastname’ format.
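One plausible reading of this conversion is sketched below. It assumes the name parts sit before the '@' and are separated by '.', '_' or '-'. The actual splitting rules are not documented, and format_user_name_sketch is a hypothetical stand-in.

```python
def format_user_name_sketch(user_email):
    """Turn 'jane.doe@example.com' or 'jane_doe' into 'Jane Doe' (hypothetical helper)."""
    # Keep only the part before '@'; a bare username has no '@' and is used whole.
    local_part = user_email.split("@", 1)[0]
    # Assumption: name parts are separated by '.', '_' or '-'.
    for separator in ("_", "-"):
        local_part = local_part.replace(separator, ".")
    parts = [part for part in local_part.split(".") if part]
    return " ".join(part.capitalize() for part in parts)
```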
- generate_md5_report(base_dir='data')#
Compare local files to database entries using MD5 hash values.
- Parameters:
base_dir (str) – Root directory for local files.
- Returns:
A dictionary summarizing the comparison.
- Return type:
dict
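The comparison can be sketched as a walk over the local tree that checks each file's hash against the stored one. The layout of the returned dictionary is not documented, so the four keys used here are hypothetical, as is the db_hashes argument, a {relative_path: md5} mapping that stands in for the database entries.

```python
import hashlib
import os


def md5_report_sketch(base_dir, db_hashes):
    """Compare files under base_dir with a {relative_path: md5} mapping (hypothetical)."""
    report = {"matched": [], "mismatched": [], "missing_in_db": [], "missing_locally": []}
    seen = set()
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            full_path = os.path.join(root, name)
            # Use forward slashes so keys look the same on every platform.
            rel_path = os.path.relpath(full_path, base_dir).replace(os.sep, "/")
            with open(full_path, "rb") as handle:
                local_hash = hashlib.md5(handle.read()).hexdigest()
            seen.add(rel_path)
            if rel_path not in db_hashes:
                report["missing_in_db"].append(rel_path)
            elif db_hashes[rel_path] == local_hash:
                report["matched"].append(rel_path)
            else:
                report["mismatched"].append(rel_path)
    # Entries recorded in the database that have no local counterpart.
    report["missing_locally"] = sorted(set(db_hashes) - seen)
    return report
```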
- get_outlier_flags(process_id=None, user_id=None)#
- get_output_file(output_type: str, process_id='final', iteration_id='final')#
Retrieve and download a file from MongoDB GridFS based on process and iteration identifiers. This method queries the ‘iterations’ collection to find the document with the specified process_id and iteration_id, retrieves the file_id from the specified field_name, and downloads the file from GridFS.
- Parameters:
process_id (optional) – Defaults to “final”, which queries the most recently created process.
iteration_id (optional) – Defaults to “final”, which queries the most recently created iteration.
- Returns:
The path to the downloaded file.
- Raises:
ValueError – If no document is found with the given process_id and iteration_id, or if the field_name is not found in the document.
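The lookup described above can be sketched as follows. This is an illustration under assumptions, not the method's actual code. The query keys "process_id" and "_id" and the out_path argument are hypothetical, and the handling of the “final” default is left out. The iterations argument is expected to behave like a pymongo Collection and fs like a gridfs.GridFS instance.

```python
def get_output_file_sketch(iterations, fs, field_name, process_id, iteration_id, out_path):
    """Look up a file id in an iteration document and write the GridFS file to out_path."""
    # Assumed query shape: the real field names are not given in the documentation.
    document = iterations.find_one({"process_id": process_id, "_id": iteration_id})
    if document is None:
        raise ValueError(f"No iteration {iteration_id!r} found for process {process_id!r}")
    if field_name not in document:
        raise ValueError(f"Field {field_name!r} not found in the iteration document")
    with open(out_path, "wb") as handle:
        # fs.get returns a file-like object whose read() yields the stored bytes.
        handle.write(fs.get(document[field_name]).read())
    return out_path
```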
- get_pass_measurement(sub_tables: list, start_time: list, end_time: list)#
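Fetch filtered data from MongoDB based on a range of [doy, year, sec] and specified tables from the pass table. The results that fall within the specified range are stored in separate JSON files.
- Parameters:
sub_tables (list) – Tables to query from the pass table.
start_time (list) – Start of the range, given as [doy, year, sec].
end_time (list) – End of the range, given as [doy, year, sec].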
- Returns:
A summary of the query and the JSON files.
- Return type:
str, JSON
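Because the time stamps are ordered [doy, year, sec], comparing the lists directly would sort by day of year before year. The sketch below shows one way to test range membership by reordering to (year, doy, sec) first. It assumes both bounds are inclusive and that each record carries its stamp in a hypothetical 'time' field. The real query runs in MongoDB and may be built differently.

```python
def in_time_range(record_time, start_time, end_time):
    """Check whether [doy, year, sec] lies within [start_time, end_time], inclusive."""
    def sort_key(stamp):
        doy, year, sec = stamp
        # Reorder to (year, doy, sec) so tuple comparison is chronological.
        return (year, doy, sec)

    return sort_key(start_time) <= sort_key(record_time) <= sort_key(end_time)


def filter_records(records, start_time, end_time):
    """Keep records whose hypothetical 'time' field falls inside the range."""
    return [r for r in records if in_time_range(r["time"], start_time, end_time)]
```

For a range that crosses a year boundary, such as [365, 2020, 0] to [1, 2021, 100], the reordering is what keeps a record stamped [1, 2021, 10] inside the range.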
- store_iteration_output(file_paths: list, outlier_flags: dict, iteration_id: ObjectId)#
Store files into the MongoDB database associated with an iteration.
- Parameters:
file_paths (list[str]) – List of file paths to be stored.
iteration_id (ObjectId) – ID of the iteration document.
- upload_data(path: str, mode: str = 'file', overwrite=False)#
Uploads data from the local filesystem to MongoDB GridFS.
- upload_files_recursively(dir_path: str, collection_name: str, specific_folders=None, overwrite=False)#
Recursively finds all files in a given directory and uploads them to MongoDB GridFS.
- Parameters:
dir_path (str) – Path to the directory (or file) to search for files.
collection_name (str) – MongoDB collection to store metadata.
specific_folders (list or None) – List of specific folder/file paths (absolute paths). If None, uploads everything.
overwrite (bool) – If True, replaces existing files. If False, skips existing files.
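The file-discovery half of this method can be sketched with os.walk. The example below covers only how dir_path and specific_folders might select files and performs no upload. The function name and the exact matching rule (a file is kept if it is listed itself or lies inside a listed folder) are assumptions.

```python
import os


def find_files_sketch(dir_path, specific_folders=None):
    """List files under dir_path, optionally limited to specific_folders (hypothetical)."""
    if os.path.isfile(dir_path):
        candidates = [os.path.abspath(dir_path)]
    else:
        candidates = [
            os.path.abspath(os.path.join(root, name))
            for root, _dirs, files in os.walk(dir_path)
            for name in files
        ]
    if specific_folders is None:
        return sorted(candidates)
    allowed = [os.path.abspath(p) for p in specific_folders]
    # Keep a file if it is listed itself or lives inside a listed folder.
    return sorted(
        path for path in candidates
        if any(path == a or path.startswith(a + os.sep) for a in allowed)
    )
```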
- upload_single_file(file_path: str, collection_name: str, overwrite=False)#
Upload a single file to MongoDB GridFS.
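A single-file upload with skip-or-replace behaviour might look like the sketch below. It assumes overwrite means the same here as it does for upload_files_recursively, that files are identified by their base name, and that fs behaves like a gridfs.GridFS instance (find_one, delete and put are real GridFS methods). The collection_name argument is omitted because its role is not described.

```python
import os


def upload_single_file_sketch(fs, file_path, overwrite=False):
    """Store one file in a GridFS-like object; return its id, or None if skipped."""
    filename = os.path.basename(file_path)
    existing = fs.find_one({"filename": filename})
    if existing is not None:
        if not overwrite:
            # Assumed behaviour: an existing file is skipped unless overwrite is set.
            return None
        # Remove the old copy so only one version of the file remains.
        fs.delete(existing._id)
    with open(file_path, "rb") as handle:
        return fs.put(handle, filename=filename)
```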
- property client#
Get the MongoClient instance.
- Returns:
MongoClient instance
- Return type:
pymongo.MongoClient
- property db#
Get the database instance.
- Returns:
Database instance
- Return type:
pymongo.database.Database
- property fs#
Get the GridFS instance for storing files.
- Returns:
GridFS instance
- Return type:
gridfs.GridFS