Changes

Peng, Yuan · 95dfa1e7
--- a/Incremental-Load.md
+++ b/Incremental-Load.md
 :information_source: Start this ETL Job for incremental load
-The ETL job FHIR-to-OMOP can be executed as incremental load. The incremental load is used to load FHIR resources from FHIR Gateway to OMOP CDM that have been changed (inserts and updates) since the last bulk load or incremental load. To ensure that the ETL job only uses new or changed resources, the resources in the FHIR Gateway are filtered using the `last_updated_at` column. Because filtering is based on the **DATA_BEGINDATE** and **DATA_ENDDATE** parameters in the `sample.env` file, both parameters have to be adjusted before running the ETL job.
+The ETL job FHIR-to-OMOP can be executed as incremental load. The incremental load is used to load FHIR resources from FHIR Gateway to OMOP CDM that have been changed (inserts, updates, deletions) since the last bulk load or incremental load. To ensure that the ETL job only uses the new, changed or deleted resources, the resources in the FHIR Gateway are filtered using the `last_updated_at` column. Because filtering is based on the **DATA_BEGINDATE** and **DATA_ENDDATE** parameters in the `sample.env` file, both parameters have to be adjusted before running the ETL job.
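The date-window filter described above can be sketched as follows. This is a minimal illustration and not the job's actual implementation: the row layout, the `in_load_window` helper, the ISO date format and the inclusive handling of both boundaries are all assumptions.

```python
from datetime import date, datetime


def in_load_window(last_updated_at: datetime, begin: date, end: date) -> bool:
    """Return True if the change timestamp falls within [begin, end], both days inclusive (assumed)."""
    return begin <= last_updated_at.date() <= end


# Hypothetical FHIR Gateway rows; only the last_updated_at column name comes from the text.
rows = [
    {"fhir_id": "p1", "last_updated_at": datetime(2022, 3, 1, 8, 30)},
    {"fhir_id": "p2", "last_updated_at": datetime(2022, 3, 5, 23, 59)},
    {"fhir_id": "p3", "last_updated_at": datetime(2022, 3, 6, 0, 0)},
]

# Stand-ins for DATA_BEGINDATE and DATA_ENDDATE (the date format is an assumption).
begin = date.fromisoformat("2022-03-02")
end = date.fromisoformat("2022-03-05")

selected = [r["fhir_id"] for r in rows if in_load_window(r["last_updated_at"], begin, end)]
```

Only resources whose `last_updated_at` lies inside the configured window are passed on to the transformation.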
-When the job runs as incremental load, each FHIR resource read in must be checked to see if it already exists in OMOP CDM. For this reason, processing FHIR resources is more complex than bulkload. The following sections provide an overview of the data flow and processing of resources.
+When the job runs as incremental load, each FHIR resource that is read in must first be checked to see whether it is marked as 'deleted'. Patient, Encounter (Supply/Administrative Case) and Medication resources are checked for existence, and updates are performed on the existing record in OMOP CDM. If such a resource is marked as deleted, its record and all records that reference it are deleted from OMOP CDM.
+For all other resources, the update is done by deleting the old record in OMOP CDM and writing the read resource to OMOP CDM as a new record. If the resource is marked as deleted, no further transformation takes place after the deletion.
+For this reason, processing FHIR resources during incremental load is more complex than during bulk load.
+A special feature of OMOP CDM is that the primary keys of its tables (e.g. person_id) are generated automatically and independently of the identifiers used in FHIR. This means that after FHIR resources have been transformed to OMOP CDM, the identifying data needed to link them to other FHIR resources is lost. For this reason, it is necessary to store the mapping between the identifiers used in FHIR and the ids used in OMOP CDM. The standard tables in OMOP CDM do not provide a way to store the id or the identifier of a FHIR resource. Therefore, our approach was to add two new columns, `fhir_logical_id` and `fhir_identifier`, to the tables in OMOP CDM. These columns store the id and the identifier of the respective FHIR resource as meta information. In addition, a prefix is added to `fhir_logical_id` and `fhir_identifier` during processing (e.g. `pat-` for Patient resources) to indicate the FHIR resource type from which the record in OMOP CDM originates.
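As a rough illustration of this mapping, the sketch below prefixes a FHIR id and uses it to look up a generated `person_id`. Only the `pat-` prefix and the column name `fhir_logical_id` come from the text; the `enc-` prefix, the helper names and the in-memory dictionary standing in for the PERSON table are assumptions.

```python
from typing import Dict, Optional

RESOURCE_PREFIXES: Dict[str, str] = {
    "Patient": "pat-",    # prefix named in the text
    "Encounter": "enc-",  # hypothetical prefix, for illustration only
}


def prefixed(resource_type: str, value: str) -> str:
    """Build the value stored in fhir_logical_id or fhir_identifier."""
    return RESOURCE_PREFIXES[resource_type] + value


# In-memory stand-in for PERSON: fhir_logical_id -> generated person_id.
person_by_logical_id: Dict[str, int] = {}


def register_person(person_id: int, fhir_id: str) -> None:
    """Remember which OMOP person_id was generated for a FHIR Patient id."""
    person_by_logical_id[prefixed("Patient", fhir_id)] = person_id


def find_person_id(fhir_id: str) -> Optional[int]:
    """Resolve a FHIR Patient id to its person_id, or None if it is unknown."""
    return person_by_logical_id.get(prefixed("Patient", fhir_id))


register_person(1, "abc123")
```

With this in place, `find_person_id("abc123")` resolves to the generated key, while an unknown FHIR id yields `None`.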
+The following sections provide an overview of the data flow and processing of resources during incremental load.
 # Data flow
@@ -12,26 +18,24 @@ When the job runs as incremental load, each FHIR resource read in must be checke
 <summary> Show data flow of Patient resources </summary>
 ---
-When a Patient resource is read in, the job first checks whether this Patient resource already exists in OMOP CDM. This is done in the PERSON table using `fhir_logical_id` or `fhir_identifier`. If the Patient resource already exists in OMOP CDM, an update will take place in OMOP CDM based on the corresponding `person_id`. If the Patient resource does not yet exist in OMOP CDM, it will be written to OMOP CDM as a new resource.
+When a Patient resource is read in, the job first checks whether this Patient resource is marked as 'deleted'. If so, the corresponding Patient record in OMOP CDM and all records that reference it are deleted. If not, the job checks whether this Patient resource already exists in OMOP CDM. This is done in the PERSON table using `fhir_logical_id` or `fhir_identifier`. If the Patient resource already exists in OMOP CDM, it is updated based on the corresponding `person_id`. If the Patient resource does not exist in OMOP CDM, it is written to OMOP CDM as a new resource.
-![inkrementellesLaden-Patient.drawio](uploads/c4f9c9097c2aab97c0061ad7efc11f98/inkrementellesLaden-Patient.drawio.png)
+![inkrementellesLaden-Patient](uploads/77f47e4e66bd89a56a20589273e43ec6/inkrementellesLaden-Patient.png)
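The Patient data flow can be sketched as a small decision function. This is a simplified in-memory model under stated assumptions: the `OmopStore` class, the `deleted` flag on the resource dictionary and the lookup by `fhir_logical_id` alone are hypothetical, and the cascade covers only visits rather than every referencing table.

```python
from typing import Any, Dict, Optional


class OmopStore:
    """Hypothetical in-memory stand-in for the PERSON and VISIT_OCCURRENCE tables."""

    def __init__(self) -> None:
        self.person: Dict[int, Dict[str, Any]] = {}  # person_id -> row
        self.visits: Dict[int, int] = {}             # visit_occurrence_id -> person_id
        self.next_person_id = 1

    def find_person_id(self, logical_id: str) -> Optional[int]:
        for person_id, row in self.person.items():
            if row["fhir_logical_id"] == logical_id:
                return person_id
        return None


def load_patient(store: OmopStore, resource: Dict[str, Any]) -> str:
    """Apply one Patient resource and report which branch was taken."""
    logical_id = "pat-" + resource["id"]
    person_id = store.find_person_id(logical_id)

    if resource.get("deleted"):
        if person_id is not None:
            # Remove the person and every record that references it.
            del store.person[person_id]
            store.visits = {v: p for v, p in store.visits.items() if p != person_id}
        return "deleted"

    if person_id is not None:
        # Existing record: update in place, keeping the generated person_id.
        store.person[person_id]["data"] = resource["data"]
        return "updated"

    store.person[store.next_person_id] = {
        "fhir_logical_id": logical_id,
        "data": resource["data"],
    }
    store.next_person_id += 1
    return "inserted"
```

Because updates reuse the existing `person_id`, records in other tables that reference the person stay valid across incremental loads.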
 </details>
 </p>
-## Versorgungsfall/Verwaltungsfall
+## Supply/Administrative Case
 <p>
 <details>
 <summary> Show data flow of Encounter resources </summary>
 ---
-When an Encounter resource is read in, the job first checks whether the referenced Patient resource already exists in OMOP CDM. This is done in the PERSON table using `fhir_logical_id` or `fhir_identifier` from the referenced FHIR Patient resource. If the referenced Patient resource already exists in OMOP CDM, the existing `person_id` will be used.
+When an Encounter resource is read in, the job first checks whether this Encounter resource is marked as 'deleted'. If so, the corresponding Encounter record in OMOP CDM and all records that reference it are deleted. If not, the job checks whether the referenced Patient resource already exists in OMOP CDM. This is done in the PERSON table using `fhir_logical_id` or `fhir_identifier` from the referenced FHIR Patient resource. If the referenced Patient resource already exists in OMOP CDM, the existing `person_id` is used. Otherwise, the Encounter resource is skipped.
-If the referenced Patient resource is not yet available in OMOP CDM, a dummy for this Patient resource is created and written to OMOP CDM. If the "real" Patient resource is to be written to OMOP CDM during a next incremental loading, only an update of the dummy takes place.
+Next, the job checks if the Encounter resource already exists in OMOP CDM. This is done in the VISIT_OCCURRENCE table using `fhir_logical_id` or `fhir_identifier`. If the Encounter resource does not exist in OMOP CDM, it will be written to OMOP CDM as a new resource. If the Encounter resource already exists in OMOP CDM, an update will take place in OMOP CDM based on the corresponding `visit_occurrence_id`.
-Next, the job checks if the Encounter resource already exists in OMOP CDM. This is done in the VISIT_OCCURRENCE table using `fhir_logical_id` or `fhir_identifier`. If the Encounter resource does not yet exist in OMOP CDM, it will be written to OMOP CDM as a new resource. If the Encounter resource already exists in OMOP CDM, an update will take place in OMOP CDM based on the corresponding `visit_occurrence_id`. 
+![inkrementellesLaden-Encounter](uploads/01a6ed9adef20ab02ef195b0fc05c808/inkrementellesLaden-Encounter.png)
-![inkrementellesLaden-Encounter.drawio](uploads/b07f5cb307bdd0081b0800efb7c75388/inkrementellesLaden-Encounter.drawio.png)
 </details>
 </p>
@@ -46,40 +50,31 @@ Next, the job checks if the Encounter resource already exists in OMOP CDM. This
 When a Medication resource is read in, the job first checks whether this Medication resource already exists in OMOP CDM. This is done in the MEDICATION_ID_MAP table using `fhir_logical_id` or `fhir_identifier`. If the Medication resource does not yet exist in OMOP CDM, it will be written to OMOP CDM as a new resource. If the Medication resource already exists in OMOP CDM, an update will take place in MEDICATION_ID_MAP. 
-### MedicationAdministration/Medication Statement
+### MedicationAdministration/MedicationStatement
+When a MedicationAdministration/MedicationStatement resource is read in, the job first tries to delete any existing records for this resource in OMOP CDM. This is done in the `DRUG_EXPOSURE` table using the `fhir_logical_id` or `fhir_identifier` of the read resource.
-When a MedicationAdministration/MedicationStatement resource is read in, the job first checks whether the referenced Medication resource already exists in OMOP CDM. This is done in the MEDICATION_ID_MAP table using `fhir_logical_id` or `fhir_identifier`. If the referenced Medication resource already exists in OMOP CDM, the ATC-code in from the column `atc` is used for further processing of the MedicationAdministration/MedicationStatement resource.
+Next, it checks whether the referenced Medication resource already exists in OMOP CDM. This is done in the MEDICATION_ID_MAP table using `fhir_logical_id` or `fhir_identifier`. If the referenced Medication resource already exists in OMOP CDM, the ATC-code from the column `atc` is used for further processing of the MedicationAdministration/MedicationStatement resource.
 If the referenced Medication resource is not yet available in OMOP CDM, the Medication reference of the MedicationAdministration/MedicationStatement resource is set as `drug_source_value` in DRUG_EXPOSURE. After all resources have been processed, post-processing takes place. During a next incremental loading, the referenced Medication resource is searched in the MEDICATION_ID_MAP table using the Medication reference in `drug_source_value`. If this Medication resource exists in OMOP CDM, an update in DRUG_EXPOSURE takes place.
-![inkrementellesLaden-Medication.drawio](uploads/e9738753528b7a4de08d982ee706f3d1/inkrementellesLaden-Medication.drawio.png)
+![inkrementellesLaden-Medication](uploads/c09f5ef17bc8c1e280eb641d11cf835c/inkrementellesLaden-Medication.png)
 </details>
 </p>
-## Fachabteilungsfall, Procedure, Observation, MedicationAdministration, MedicationStatement, Condition
+## Department Case, Procedure, Observation, MedicationAdministration, MedicationStatement, Condition
 <p>
 <details>
 <summary> Show data flow of Encounter, Procedure, Observation, MedicationAdministration, MedicationStatement and Condition resources </summary>
 ---
-When an Encounter/Procedure/Observation/MedicationAdministration/MedicationStatement/Condition resource is read in, the job first checks whether the referenced Patient resource and Encounter resource already exists in OMOP CDM. This is done in PERSON and VISIT_OCCURRENCE table using `fhir_logical_id` or `fhir_identifier`. If the referenced Patient resource and Encounter resource already exists in OMOP CDM, the existing `person_id` and `visit_occurrence_id` will be used.
+When an Encounter/Procedure/Observation/MedicationAdministration/MedicationStatement/Condition resource is read in, the job first tries to delete any existing records for this resource in OMOP CDM using `fhir_logical_id` or `fhir_identifier`. If the resource is marked as 'deleted', no further transformation takes place. Otherwise, the read resource is processed as a new resource (analogous to bulk load).
-If the referenced Patient resource and Encounter resource is not yet available in OMOP CDM, a dummy for the Patient resource and a dummy for the Encounter resource is created and written to OMOP CDM. If the "real" Patient resource and Encounter resource is to be written to OMOP CDM during a next incremental loading, only an update of the dummies takes place.
-Next the job checks if the read resource (e.g. Condition) already exists in OMOP CDM. This is done analogously to Patient resources and Encounter resources using `fhir_logical_id` or `fhir_identifier`. For each resource type, the following OMOP CDM tables are used for the check:
-| Resource type | Table |
-| :------: | :------: |
-| Encounter | VISIT_DETAIL |
-| Procedure | PROCEDURE_OCCURRENCE |
-| Observation | OBSERVATION <br> MEASUREMENT |
-| MedicationAdministration, <br> MedicationStatement | DRUG_EXPOSURE |
-| Condition | CONDITION_OCCURRENCE <br> OBSERVATION <br> PROCEDURE_OCCURRENCE <br> MEASUREMENT |
-If the read resource does not yet exist in OMOP CDM, it will be written to OMOP CDM as a new resource. If the read resource already exists in OMOP CDM, an update will take place in OMOP CDM based on the corresponding `visit_detail_id`/`procedure_occurrence_id`/`observation_id`/`measurement_id`/`drug_exposure_id`/`condition_occurrence_id`.
+The job then checks whether the referenced Patient resource already exists in OMOP CDM. This is done in the PERSON table using `fhir_logical_id` or `fhir_identifier`. If the referenced Patient resource is not available in OMOP CDM, the read Encounter/Procedure/Observation/MedicationAdministration/MedicationStatement/Condition resource is skipped. If the referenced Patient resource already exists in OMOP CDM, the existing `person_id` is used. After this step, further processing of the read resource is analogous to bulk load.
-![inkrementellesLaden-Others.drawio](uploads/b3e6a9bb2b8c44b6f8c56a599d516344/inkrementellesLaden-Others.drawio.png)
+![inkrementellesLaden-Others](uploads/ade06d6eddcdbb70c4314382a5ff0526/inkrementellesLaden-Others.png)
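The delete-then-rewrite flow for these resource types can be sketched as below, using a Condition as the example. The `con-` prefix, the `deleted` flag, the dictionary shapes and the function name are assumptions made for illustration; the real job works against the OMOP CDM tables.

```python
from typing import Any, Dict, List


def load_condition(
    condition_rows: List[Dict[str, Any]],
    person_ids: Dict[str, int],
    resource: Dict[str, Any],
) -> str:
    """Apply one Condition resource and report which branch was taken."""
    logical_id = "con-" + resource["id"]  # hypothetical prefix

    # Step 1: always drop any records written for this resource earlier.
    condition_rows[:] = [r for r in condition_rows if r["fhir_logical_id"] != logical_id]

    # Step 2: a resource marked as deleted needs no further transformation.
    if resource.get("deleted"):
        return "deleted"

    # Step 3: skip the resource if the referenced Patient is not in OMOP CDM.
    person_id = person_ids.get("pat-" + resource["patient_id"])
    if person_id is None:
        return "skipped"

    # Step 4: write the resource as a new record, as in bulk load.
    condition_rows.append(
        {"fhir_logical_id": logical_id, "person_id": person_id, "code": resource["code"]}
    )
    return "written"
```

Deleting first keeps the update logic identical to bulk load, at the cost of a new generated primary key for every rewritten record.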
 </details>
 </p>
\ No newline at end of file