NEDSS-DataIngestion

Data Ingestion for Modernization of NEDSS Project by Enquizit

Prerequisites

To build and run the services, Docker is required. To run the full system, you will also need Docker Compose, though it is not required for building.

Install Docker
[Optional] Install Docker Compose

Additionally, building the services locally requires Java 21.

Install Java 21

Setup

Docker is used for building the application both locally and inside a container. If you are using Docker Desktop, no further configuration is needed.

However, if you are running another container engine (such as Podman or Colima), you may need to configure environment variables. Refer to the Testcontainers documentation for details on which variables to set in a local .env file. A custom task has been added to the root build.gradle file to automatically load environment variables declared in the .env file into the JVM environment.

touch .env
$EDITOR .env
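
As an illustration of the kind of entries the .env file might hold, here is a sketch using variable names described in the Testcontainers documentation. The socket paths are assumptions: they depend on your container engine, its version, and your user account, so verify them against your own installation. The source does not say whether the Gradle task accepts comment lines; remove them if it does not.

```shell
# .env (sketch; the paths below are assumptions, adjust them to your machine)

# Colima: point Testcontainers at the Colima Docker socket.
DOCKER_HOST=unix:///Users/your-user/.colima/default/docker.sock
TESTCONTAINERS_DOCKER_SOCKET_OVERRIDE=/var/run/docker.sock

# Podman (rootless, Linux): use the Podman socket instead and disable Ryuk.
# DOCKER_HOST=unix:///run/user/1000/podman/podman.sock
# TESTCONTAINERS_RYUK_DISABLED=true
```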

For running the services, you will need to create a dataingestion.env file with environment variables required by the services. You can copy the provided sample file and update the values as needed.

> cp dataingestion.env.sample dataingestion.env
> $EDITOR dataingestion.env

Getting all of the services up and running in Docker Compose requires some additional steps. Refer to DevSetup.md for details.

Building / Testing

Build the entire project: ./gradlew build
Build a specific service: ./gradlew :data-ingestion-service:build
Test the entire project: ./gradlew test
Test a specific service: ./gradlew :data-processing-service:test
Run all verification checks: ./gradlew check

Running the Application inside Docker

Use docker compose to run the services.

> docker compose up -d

NOTE: If you encounter a Gradle exception, such as a missing wrapper, run the following command:

> gradle wrapper

Building Docker image for EKS (1)

If you are on macOS, use Docker Buildx so that a Linux image can be built. The following command builds the image specified in data-ingestion-service/Dockerfile and pushes it to the repository.

> docker buildx build --platform linux/amd64 -t <DOCKER_REPOS>/<IMAGE_NAME>:<VERSION> -f data-ingestion-service/Dockerfile . --push

Deploy Docker image on EKS (2)

These steps assume that an EKS cluster already exists and is running, and that it is managed with Helm charts. The command below creates a new service if it does not exist; otherwise it updates the existing one.

Variables:

values-dev.yaml: the values file; the Helm chart reads settings such as environment variables from this file.
jdbc.X=VALUE: overrides a value defined in values-dev.yaml; the value is passed to the service as an environment variable.
image.repository='VALUE': the image repository, for use with a registry other than Docker Hub (for example, ECR).

> helm upgrade --install dataingestion-service -f ./dataingestion-service/values-dev.yaml --set jdbc.dbserver='VALUE',jdbc.dbname='VALUE',jdbc.username='VALUE',jdbc.password='VALUE',jdbc.nbs.dbserver='VALUE',jdbc.nbs.dbname='VALUE',jdbc.nbs.username='VALUE',jdbc.nbs.password='VALUE',kafka.cluster='VALUE' dataingestion-service

For Helm chart and EKS configuration, refer to NEDSS-Helm.

Other useful commands
- helm delete <SERVICE-NAME>: delete the service
- kubectl exec -it <POD-ID> -- /bin/bash: open a shell inside the pod
- kubectl get pods: list pods
- kubectl describe pod <POD-ID>: get pod info, useful to inspect configuration and debug
- kubectl logs <POD-ID>: print the pod's logs

Unit Testing and Code Coverage

Requirement:
- Code coverage must be greater than 90%
Progress:
- hl7-parser is greater than 80%.
- data-ingestion-service is greater than 80%.
  - Excluded classes and files:
    - Unused model classes in the Jaxb package
      - Models in this package are generated during the build from the supplied XML definition
      - Unused model classes:
        - AnswerType
        - CaseType
        - ClinicalInformationType
        - CodedType
        - CommonQuestionsType
        - DiseaseSpecificQuestionsType
        - EpidemiologicInformationType
        - HeaderType
        - HierarchicalDesignationType
        - HL7NumericType
        - HL7OBXValueType
        - HL7SNType
        - HL7TMType
        - IdentifiersType
        - IdentifierType
        - InvestigationInformationType
        - LabReportCommmenType
        - LabReportType
        - NameType
        - NoteType
        - NumericType
        - ObjectFactory
        - ObservationsType
        - ObservationType
        - OrganizationParticipantType
        - ParticipantsType
        - PatientType
        - PostalAddressType
        - ProviderNameType
        - ProviderParticipantType
        - ReferenceRangeType
        - ReportingInformationType
        - SectionHeaderType
        - SpecimenType
        - SusceptibilityType
        - TelephoneType
        - TestResultType
        - TestsType
        - UnstructuredType
        - ValuesType
    - Configuration classes
      - DataSourceConfig
      - NbsDataSourceConfig
      - OpenAPIConfig
      - SecurityConfig

SFTP ENV PARAMS

DI_SFTP_ENABLED=value - value should be 'enabled' or 'disabled'
DI_SFTP_HOST=value - SFTP server host name
DI_SFTP_USER=value - SFTP user name
DI_SFTP_PWD=value - SFTP password
DI_SFTP_ELR_FILE_EXTNS=value - Comma-separated list of file extensions (ex: txt,hl7)
DI_PHCR_IMPORTER_VERSION=value - 1 for classic phcrImporter batch job, 2 for RTI
DI_SFTP_FILEPATHS=value - Comma-separated list of file paths (ex: /ELRFiles,/ELRFiles/lab-1,/ELRFiles/lab-2)
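
The parameters above can be sanity-checked before the services start. The function below is a minimal sketch and is not part of the repository: check_sftp_env is a hypothetical helper name, and it assumes that the connection settings only matter when SFTP is enabled and that DI_PHCR_IMPORTER_VERSION may be left unset.

```shell
#!/usr/bin/env bash
# check_sftp_env is a hypothetical helper, not part of the repository.
# It prints one line per problem and returns non-zero if any were found.
check_sftp_env() {
  problems=0

  case "${DI_SFTP_ENABLED:-}" in
    enabled|disabled) ;;
    *) echo "DI_SFTP_ENABLED must be 'enabled' or 'disabled'"; problems=1 ;;
  esac

  # Assumption: the connection settings only matter when SFTP is enabled.
  if [ "${DI_SFTP_ENABLED:-}" = "enabled" ]; then
    for name in DI_SFTP_HOST DI_SFTP_USER DI_SFTP_PWD \
                DI_SFTP_ELR_FILE_EXTNS DI_SFTP_FILEPATHS; do
      # Look up the variable whose name is stored in $name.
      eval "value=\${$name:-}"
      if [ -z "$value" ]; then
        echo "$name is not set"
        problems=1
      fi
    done
  fi

  # Only 1 (classic phcrImporter batch job) and 2 (RTI) are documented.
  # Assumption: leaving the variable unset is tolerated.
  case "${DI_PHCR_IMPORTER_VERSION:-}" in
    ""|1|2) ;;
    *) echo "DI_PHCR_IMPORTER_VERSION must be 1 or 2"; problems=1 ;;
  esac

  return "$problems"
}
```

If dataingestion.env happens to be valid shell syntax (an assumption), it could be loaded with `set -a; . ./dataingestion.env; set +a` before calling check_sftp_env.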

HL7 Bulk Ingestion Script

A bash utility to bulk-upload HL7 lab reports to the NBS Ingestion API. This script automates the process of iterating through thousands of files, handling authentication, capturing API-generated GUIDs, and archiving successfully processed reports.

Features

Location: docs/elr_upload_bulk.sh
Cross-Platform: Runs on macOS and Linux. Runs on Windows via Git Bash.
Named Arguments: Uses standard flags for source, destination, and secrets.
Return Value: Stops execution immediately if a 200 OK is not returned to prevent data mismatches or token expiration issues.
Automatic Archiving: Moves files from the source to the destination only upon a successful API response.
GUID Capturing: Extracts and prints the unique identifier returned by the API for each message.

Usage

Parameters

Flag	Description
-s	Path to the directory containing your .hl7 or .txt files.
-d	Path to the directory where successfully uploaded files will be moved.
-u	The full URL of the API endpoint.
-t	Your current Authorization Bearer Token (JWT).
-c	The clientsecret provided by the API provider.

Example

./elr_upload_bulk.sh \
  -s ./inbox \
  -d ./archive \
  -u "https://data.nbsdemo.com/ingestion/api/elrs" \
  -t "aiouAGGWShioioj..." \
  -c "aiDakl..."
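
For readers who want to see the shape of the behavior listed under Features (stop on any non-200 response, move a file only after success, print the returned identifier), here is a minimal sketch. It is not the contents of docs/elr_upload_bulk.sh. The function names post_file and upload_dir are hypothetical, and the request details (a clientsecret header, a text/plain body, the response body being the GUID) are assumptions rather than a description of the actual API.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the contents of docs/elr_upload_bulk.sh.

# post_file URL TOKEN SECRET FILE RESPONSE_FILE
# Prints the HTTP status code and writes the response body to RESPONSE_FILE.
# The header names and content type are assumptions about the API.
post_file() {
  curl -sS -o "$5" -w '%{http_code}' -X POST "$1" \
    -H "Authorization: Bearer $2" \
    -H "clientsecret: $3" \
    -H "Content-Type: text/plain" \
    --data-binary "@$4"
}

# upload_dir SRC DEST URL TOKEN SECRET
# Uploads every .hl7 and .txt file in SRC and moves it to DEST on success.
# Returns 1 at the first response that is not 200, leaving that file in SRC.
upload_dir() {
  local src="$1" dest="$2" url="$3" token="$4" secret="$5"
  local body status f
  body=$(mktemp)
  for f in "$src"/*.hl7 "$src"/*.txt; do
    [ -f "$f" ] || continue   # skip glob patterns that matched nothing
    status=$(post_file "$url" "$token" "$secret" "$f" "$body") || status=000
    if [ "$status" != "200" ]; then
      echo "FAILED $f (HTTP $status)" >&2
      rm -f "$body"
      return 1
    fi
    # Assumption: the response body is the identifier for the message.
    echo "$(basename "$f") -> $(cat "$body")"
    mv "$f" "$dest"/
  done
  rm -f "$body"
}
```

Keeping the HTTP call in its own function makes it possible to exercise the loop against a stub before pointing it at a real endpoint.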
