System Components
The system components, with the purpose and responsibilities of each.
SharePoint
Purpose : Serves as the source repository for scanned refugee documents awaiting processing.
Responsibilities :
- Store scanned refugee documents
- Organize documents into configured folders
- Maintain document metadata and processing status
Dependencies :
None
Related Components :
Power Automate
Power Automate
Purpose : Coordinates document ingestion from SharePoint into the Digital Archive processing pipeline.
Responsibilities :
- Monitor configured SharePoint folders
- Process documents in batches
- Upload files to Azure Blob Storage
- Update document processing status in SharePoint
- Publish messages to Azure Queue Storage
Inputs :
- SharePoint folder location
Outputs :
- Document splitting and classification events published to Azure Queue Storage
- Migrated files marked as Processed in SharePoint
- Documents uploaded to Blob Storage
Dependencies :
SharePoint, Azure Blob Storage, Azure Queue Storage
Related Components :
FileSplitterAndClassifierFunction
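The batching and queue-publishing responsibilities listed for Power Automate can be modeled in Python for illustration. Power Automate is a low-code flow, so this is only a sketch of the logic, not the flow definition. The batch size, the message fields (`container`, `blobName`, `sourcePath`), the example SharePoint paths, and both helper names are assumptions. The container name unprocessed-raw-files is the one listed as an input of FileSplitterAndClassifierFunction.

```python
import json
from typing import Iterator, List

# Container that FileSplitterAndClassifierFunction reads from (named in this document).
RAW_CONTAINER = "unprocessed-raw-files"


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_split_message(sharepoint_path: str) -> str:
    """Build the JSON body of one split/classification event (hypothetical schema)."""
    blob_name = sharepoint_path.rsplit("/", 1)[-1]
    return json.dumps({
        "container": RAW_CONTAINER,
        "blobName": blob_name,
        "sourcePath": sharepoint_path,
    })


# Hypothetical SharePoint paths, used only to exercise the helpers.
files = [f"/Scans/Inbox/doc{i}.pdf" for i in range(1, 6)]
for batch in batches(files, size=2):
    messages = [build_split_message(path) for path in batch]
    # A real flow would upload each file to Blob Storage, publish the message,
    # and then mark the SharePoint item as Processed.
    print(len(messages), "message(s) prepared")
```

Processing in small batches limits how much work is repeated if a run fails partway through.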
Azure Blob Storage
Purpose : Provides centralized storage for documents throughout their processing lifecycle.
Responsibilities :
- Store raw uploaded documents
- Store split documents
- Store classified documents
- Maintain document state across processing phases
Inputs :
- Documents from Power Automate
- Split files from Azure Functions
Outputs :
- Documents for downstream processing
Dependencies :
None
Related Components :
Power Automate, FileSplitterAndClassifierFunction, CardExtractionFunction, Power Apps
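One way to picture "document state across processing phases" is a small state model. The sketch below assumes exactly three phases (raw, split, classified), taken from the three kinds of stored documents listed above, and assumes documents only move forward. The enum, the transition table, and the `advance` helper are hypothetical and do not describe how state is actually recorded.

```python
from enum import Enum


class DocumentState(Enum):
    RAW = "raw"
    SPLIT = "split"
    CLASSIFIED = "classified"


# Allowed forward transitions (an assumption, not a documented rule).
ALLOWED = {
    DocumentState.RAW: {DocumentState.SPLIT},
    DocumentState.SPLIT: {DocumentState.CLASSIFIED},
    DocumentState.CLASSIFIED: set(),
}


def advance(current: DocumentState, target: DocumentState) -> DocumentState:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = advance(DocumentState.RAW, DocumentState.SPLIT)
print(state.value)
```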
Azure Queue Storage
Purpose : Enables asynchronous communication between processing stages.
Responsibilities :
- Decouple services
- Trigger Azure Functions
- Handle retries and poison messages
- Enable asynchronous processing
Inputs :
- Messages from producers
Outputs :
- Events consumed by Azure Functions
Dependencies :
None
Related Components :
Power Automate, FileSplitterAndClassifierFunction, CardExtractionFunction, DataCleansingFunction
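The retry and poison-message behavior can be simulated in a few lines. With the Azure Functions queue trigger, a message whose processing keeps failing is moved to a queue named after the original with a `-poison` suffix once the maximum dequeue count (5 by default) is reached. The sketch below models only that rule in memory. The `QueueMessage` class and the `deliver` helper are hypothetical, and the retries are collapsed into a loop for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

MAX_DEQUEUE_COUNT = 5  # default limit for the Azure Functions queue trigger


@dataclass
class QueueMessage:
    body: str
    dequeue_count: int = 0


def deliver(
    message: QueueMessage,
    handler: Callable[[str], None],
    queue_name: str,
    poison_queues: Dict[str, List[QueueMessage]],
) -> bool:
    """Try the handler up to MAX_DEQUEUE_COUNT times; park the message on failure.

    Returns True if the handler succeeded, False if the message was moved
    to the poison queue.
    """
    while message.dequeue_count < MAX_DEQUEUE_COUNT:
        message.dequeue_count += 1
        try:
            handler(message.body)
            return True
        except Exception:
            continue  # the message becomes visible again and is retried
    poison_queues.setdefault(f"{queue_name}-poison", []).append(message)
    return False


def always_fail(body: str) -> None:
    raise RuntimeError("simulated processing failure")


poison: Dict[str, List[QueueMessage]] = {}
deliver(QueueMessage("doc-1"), always_fail, "document-extraction-queue", poison)
print(sorted(poison))
```

In the real service each retry is a separate dequeue by the Functions host rather than an in-process loop, so a message parked in a poison queue needs its own monitoring.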
FileSplitterAndClassifierFunction
Purpose : Processes uploaded documents by splitting, classifying, and routing them for data extraction.
Responsibilities :
- Read uploaded documents
- Split PDF files
- Classify documents
- Route files to appropriate Azure Storage containers
- Persist file metadata
- Publish extraction events
Inputs :
- Messages from document-split-classification-queue
- Documents from the unprocessed-raw-files container
Outputs :
- Split files in Blob Storage
- Classified documents
- Database metadata records
- Messages in document-extraction-queue
Dependencies :
Blob Storage, Azure Queue Storage, Azure Document Intelligence, Azure SQL Database
Related Components :
CardExtractionFunction, Azure Document Intelligence
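The routing and event-publishing steps can be sketched as a lookup from classification result to target container. This sketch reads redcrosscard/redmastercard (named in the CardExtractionFunction inputs) as two container names, which is an assumption. The classifier labels, the fallback container, and the message fields are also hypothetical, and nothing here calls Azure services.

```python
import json

# Container names appear in this document; the labels that map to them are hypothetical.
CONTAINER_BY_LABEL = {
    "red_cross_card": "redcrosscard",
    "red_master_card": "redmastercard",
}
FALLBACK_CONTAINER = "unclassified"  # hypothetical holding area for unknown labels


def route(label: str) -> str:
    """Return the container a classified file should be written to."""
    return CONTAINER_BY_LABEL.get(label, FALLBACK_CONTAINER)


def build_extraction_message(blob_name: str, label: str) -> str:
    """Build the JSON body for document-extraction-queue (hypothetical schema)."""
    return json.dumps({
        "container": route(label),
        "blobName": blob_name,
        "cardType": label,
    })


print(build_extraction_message("doc1_page1.pdf", "red_cross_card"))
```

Routing unknown labels to a separate container, rather than dropping them, keeps every split file available for manual review.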
CardExtractionFunction
Purpose : Extracts structured data from classified documents using AI models.
Responsibilities :
- Retrieve classified documents
- Select extraction workflow based on card type
- Extract structured data
- Persist extracted data to Azure SQL Database
- Publish cleansing events
Inputs :
- Queue messages from document-extraction-queue
- Classified documents from the Blob Storage container redcrosscard/redmastercard
Outputs :
- Documents data persisted in the database
- Messages in document-data-cleansing-queue
Dependencies :
Blob Storage, Azure Queue Storage, Azure Document Intelligence, Azure SQL Database
Related Components :
DataCleansingFunction
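Selecting an extraction workflow by card type is commonly done with a dispatch table. The sketch below is one possible shape. The card-type keys reuse the names redcrosscard and redmastercard from this document, while the field names, the extractor functions, and the dict standing in for an AI model result are all hypothetical.

```python
from typing import Callable, Dict, List

Extractor = Callable[[dict], dict]


def pick(fields: dict, names: List[str]) -> dict:
    """Keep only the named fields; missing ones become None."""
    return {name: fields.get(name) for name in names}


def extract_red_cross_card(fields: dict) -> dict:
    # Hypothetical field list for this card type.
    return pick(fields, ["FullName", "CardNumber"])


def extract_red_master_card(fields: dict) -> dict:
    # Hypothetical field list for this card type.
    return pick(fields, ["FullName", "CardNumber", "IssueDate"])


EXTRACTORS: Dict[str, Extractor] = {
    "redcrosscard": extract_red_cross_card,
    "redmastercard": extract_red_master_card,
}


def extract(card_type: str, analysis: dict) -> dict:
    """Run the workflow registered for `card_type`, or fail loudly."""
    try:
        extractor = EXTRACTORS[card_type]
    except KeyError:
        raise ValueError(f"no extraction workflow for card type {card_type!r}") from None
    return extractor(analysis)


sample = {"FullName": "Jane Doe", "CardNumber": "12345", "Noise": "ignored"}
print(extract("redcrosscard", sample))
```

Failing on an unknown card type lets the queue's retry and poison-message handling surface the problem instead of persisting an empty record.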
DataCleansingFunction
Purpose : Applies business rules to normalize and validate extracted data.
Responsibilities :
- Execute SQL stored procedures
- Normalize extracted values
- Apply business validation rules
- Prepare data for search indexing and human review
Inputs :
- Messages from document-data-cleansing-queue
- Extracted records in Azure SQL Database
Outputs :
- Cleansed database records
- Data ready for indexing
- Data ready for manual review
Dependencies :
Azure SQL Database, Business rules documentation
Related Components :
Power Apps
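According to this section, the cleansing rules run as SQL stored procedures and come from the business rules documentation. The Python sketch below only illustrates the kind of normalization and validation involved. The field names, the accepted date formats, and the review criterion are made up for the example.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical set of date layouts that extracted values might use.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d")


def normalize_name(value: str) -> str:
    """Collapse repeated whitespace, trim, and title-case a name."""
    return re.sub(r"\s+", " ", value).strip().title()


def normalize_date(value: str) -> Optional[str]:
    """Return the date in ISO format, or None if no known layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(record: dict) -> dict:
    """Normalize one extracted record and flag it for manual review if incomplete."""
    cleaned = {
        "full_name": normalize_name(record.get("full_name", "")),
        "birth_date": normalize_date(record.get("birth_date", "")),
    }
    cleaned["needs_review"] = not cleaned["full_name"] or cleaned["birth_date"] is None
    return cleaned


print(cleanse({"full_name": "  jane   DOE ", "birth_date": "03/02/1950"}))
```

Records flagged with `needs_review` correspond to the "data ready for manual review" output, while the remaining records are ready for indexing.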
Power Apps
Purpose : Provides the user interface for document search, review, and correction workflows.
Responsibilities :
- Search processed records
- Review extracted data
- Correct extraction errors
- Display document details
- Consume Azure AI Search results
Outputs :
- Search results
- User corrections
- Manual review actions
Dependencies :
Azure AI Search, Azure SQL Database (TODO: does the app connect directly to the database?), Azure Blob Storage
TODO: add sections for Azure AI Search, Azure Document Intelligence, and Azure SQL Database