Commit eb2bbf4

adding overview + integration with purview
1 parent 85a1886 commit eb2bbf4

1 file changed

Lines changed: 176 additions & 0 deletions

# Integrating Azure Cosmos DB for MongoDB with Microsoft Purview

Costa Rica

[![Microsoft Purview](https://img.shields.io/badge/Microsoft-Purview-blue)](https://learn.microsoft.com/en-us/azure/purview/) [![Azure Cosmos DB for MongoDB](https://img.shields.io/badge/Azure-Cosmos%20DB%20for%20MongoDB-blue)](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/)

Last updated: 2025-06-20

---

> Microsoft Purview provides a unified data governance solution that enables organizations to manage and govern their on-premises, multi-cloud, and SaaS data. Integrating **Azure Cosmos DB for MongoDB** with Purview allows you to discover, classify, and protect sensitive document-based data while enforcing governance standards and compliance across your organization.

<details>
<summary>List of References</summary>

- [Microsoft Purview Documentation](https://learn.microsoft.com/en-us/azure/purview/)
- [Azure Cosmos DB for MongoDB Documentation](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/)
- [Purview MongoDB Integration Guide](https://learn.microsoft.com/en-us/azure/purview/how-to-register-scan-mongodb)
- [Azure Pricing Calculator](https://azure.microsoft.com/en-us/pricing/calculator/)

</details>

<details>
<summary>Table of Contents</summary>

- [How to Integrate Cosmos DB for MongoDB with Purview](#how-to-integrate-cosmos-db-for-mongodb-with-purview)
  - [Registering the Data Source](#registering-the-data-source)
  - [Scan Configuration](#scan-configuration)
  - [Classification and Labeling](#classification-and-labeling)
- [Governance and DLP Setup](#governance-and-dlp-setup)
- [Cost Management](#cost-management)
- [Best Practices](#best-practices)
- [Unity Catalog Integration](#unity-catalog-integration)

</details>

## How to Integrate Cosmos DB for MongoDB with Purview

### Registering the Data Source

- Navigate to the [Microsoft Purview Studio](https://web.purview.azure.com/).
- Under **Data Map**, select **Register** > **Azure Cosmos DB for MongoDB**.
- Provide the account URI, authentication method, database names, and integration runtime details.
- Ensure connectivity using a Managed VNet or a self-hosted integration runtime (SHIR), as needed.
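
Before filling in the registration form, it can help to collect the inputs listed above in one place and check that none are missing. The sketch below is only an illustration: `RegistrationInputs`, `validate`, the runtime labels, and the example URI are all hypothetical names chosen for this example and are not part of any Purview SDK or API.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical labels for the two connectivity options named above.
KNOWN_RUNTIMES = {"managed_vnet", "shir"}


@dataclass
class RegistrationInputs:
    account_uri: str
    auth_method: str
    databases: list
    integration_runtime: str


def validate(inputs: RegistrationInputs) -> list:
    """Return a list of problems; an empty list means nothing obvious is missing."""
    problems = []
    parsed = urlparse(inputs.account_uri)
    if not parsed.scheme or not parsed.hostname:
        problems.append("account_uri must be an absolute URI with a host")
    if not inputs.auth_method:
        problems.append("auth_method is required")
    if not inputs.databases:
        problems.append("at least one database name is required")
    if inputs.integration_runtime not in KNOWN_RUNTIMES:
        problems.append(
            "integration_runtime must be one of: " + ", ".join(sorted(KNOWN_RUNTIMES))
        )
    return problems


inputs = RegistrationInputs(
    account_uri="https://example-account.mongo.cosmos.azure.com:443/",  # placeholder
    auth_method="account_key",  # placeholder value
    databases=["sales"],
    integration_runtime="shir",
)
print(validate(inputs))  # prints []
```

A checklist like this only catches missing values; whether the account is actually reachable still has to be confirmed from the portal.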

### Scan Configuration

- Choose collections to scan or use pattern-based selectors.
- Define scan rulesets to extract metadata (e.g., schema, index fields).
- Schedule full or incremental scans so the catalog stays current without incurring unnecessary scan costs.
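
To make the idea of pattern-based selection concrete, here is a minimal sketch that filters a list of collection names with include and exclude patterns. It assumes shell-style wildcards via Python's standard `fnmatch` module; the pattern syntax Purview itself accepts may differ, and the function and collection names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_collections(collections, include, exclude=()):
    """Keep names that match any include pattern and no exclude pattern."""
    selected = []
    for name in collections:
        included = any(fnmatchcase(name, pattern) for pattern in include)
        excluded = any(fnmatchcase(name, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(name)
    return selected


collections = [
    "user_data",
    "accounts",
    "payments",
    "payments_archive_2019",
    "tmp_import",
]
print(select_collections(collections, include=["user_*", "pay*"], exclude=["*_archive_*"]))
# prints ['user_data', 'payments']
```

Excluding archive and temporary collections up front keeps the scan scope limited to data that is still in use.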

### Classification and Labeling

- Use built-in classifiers or create custom ones for domain-specific fields.
- Apply sensitivity labels (e.g., Confidential, Restricted) from Microsoft Purview Information Protection (formerly Microsoft Information Protection).
- Labels can drive automated policies (e.g., data masking, alerting, role-based access).
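
As a rough picture of what a custom classifier does, the sketch below matches field names against regular expressions and maps each match to a sensitivity label. The rules, classification names, and label assignments are assumptions made for this example; real classifiers are defined in Purview, not in application code.

```python
import re

# Hypothetical rules: (classification name, field-name pattern, sensitivity label).
RULES = [
    ("Payment card", re.compile(r"credit_card|card_number", re.IGNORECASE), "Restricted"),
    ("Contact info", re.compile(r"email|phone|billing_address", re.IGNORECASE), "Confidential"),
]


def classify_fields(field_names):
    """Map each matching field name to (classification, label); first rule wins."""
    result = {}
    for field in field_names:
        for classification, pattern, label in RULES:
            if pattern.search(field):
                result[field] = (classification, label)
                break
    return result


print(classify_fields(["credit_card", "email", "transaction_id"]))
```

Fields that match no rule, such as `transaction_id` here, are simply left unlabeled.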

## Governance and DLP Setup

> Use Microsoft Purview to create Data Loss Prevention (DLP) policies targeting MongoDB collections. The following examples illustrate common scenarios:

<details>
<summary><b>E.g., DLP Policy for User Profiles</b> (Click to expand)</summary>

> Protect customer and employee profile data stored in the `user_data`, `accounts`, and `profiles` collections.

**Steps:**
1. **Define a DLP Policy:** Target collections with sensitive document schemas.
2. **Set Detection Parameters:** Trigger on PII, credentials, and contact information fields.
3. **Policy Actions:**
   - Log access to sensitive fields.
   - Block copy/export operations for untrusted entities.
4. **Monitor Activity:** Use built-in auditing to review scan and access logs.

</details>
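
The two policy actions in the User Profiles example (log access, block copy/export for untrusted entities) can be read as a small decision function. This is a sketch of the logic only, under the assumption that each request carries a principal, a target collection, an operation, and a trust flag; `Request` and `evaluate` are hypothetical names, not Purview constructs.

```python
from dataclasses import dataclass

SENSITIVE_COLLECTIONS = {"user_data", "accounts", "profiles"}


@dataclass(frozen=True)
class Request:
    principal: str
    collection: str
    operation: str  # "read", "copy", or "export"
    trusted: bool


def evaluate(request):
    """Return (allowed, audit_message); audit_message is None when nothing is logged."""
    if request.collection not in SENSITIVE_COLLECTIONS:
        return True, None
    audit = f"{request.principal} {request.operation} on {request.collection}"
    if request.operation in {"copy", "export"} and not request.trusted:
        return False, "BLOCKED: " + audit
    return True, "LOGGED: " + audit


print(evaluate(Request("vendor-app", "profiles", "export", trusted=False)))
# prints (False, 'BLOCKED: vendor-app export on profiles')
```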

<details>
<summary><b>E.g., DLP Policy for Payment Transactions</b> (Click to expand)</summary>

> Safeguard financial data in the `payments`, `invoices`, and `billing_records` collections.

**Steps:**
1. **Define Policy Scope:** Look for fields like `credit_card`, `billing_address`, and `transaction_id`.
2. **Detection:** Use built-in financial data classifiers.
3. **Policy Actions:**
   - Encrypt fields before serving to external users.
   - Generate an alert when more than 100 financial records are queried in under 1 minute.
4. **Audit:** Track query logs through Azure Monitor integration.

</details>
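
The alert rule in the Payment Transactions example (more than 100 records queried in under 1 minute) is a sliding-window threshold. The following sketch shows one way to express that rule, assuming query events arrive in time order with a timestamp in seconds and a record count; the class name and its interface are hypothetical.

```python
from collections import deque


class QueryRateAlert:
    """Track records returned within a sliding time window."""

    def __init__(self, max_records=100, window_seconds=60.0):
        self.max_records = max_records
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, record_count), oldest first
        self._total = 0

    def record(self, timestamp, record_count):
        """Register a query; return True if the window total exceeds the threshold."""
        self._events.append((timestamp, record_count))
        self._total += record_count
        # Drop events that are a full window (or more) older than this one.
        while timestamp - self._events[0][0] >= self.window_seconds:
            _, old_count = self._events.popleft()
            self._total -= old_count
        return self._total > self.max_records


alert = QueryRateAlert()
print(alert.record(0, 60))   # prints False (60 records in window)
print(alert.record(30, 41))  # prints True  (101 records in window)
print(alert.record(61, 10))  # prints False (first query expired, 51 remain)
```

Keeping a running total avoids re-summing the window on every query, which matters when the query volume is high.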

<details>
<summary><b>E.g., DLP Policy for Healthcare Records</b> (Click to expand)</summary>

> Protect personal health information (PHI) within the `patients`, `treatment_history`, and `medications` collections.

**Steps:**
1. **Policy Creation:** Include diagnosis codes and treatment plans.
2. **PHI Detection:** Use custom tags like `diagnosis`, `symptoms`, `prescription_id`.
3. **Actions:**
   - Mask fields for users not in the HealthPractitioner group.
   - Block export to unsupported formats.
4. **Logs:** Enable alerts on anomalous access by country or device type.

</details>
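
The masking action in the Healthcare Records example can be sketched as a function that hides tagged fields unless the caller belongs to the HealthPractitioner group. The tag names and group name come from the example above; the function itself, the mask string, and the sample document are assumptions for illustration.

```python
PHI_FIELDS = {"diagnosis", "symptoms", "prescription_id"}
MASK = "***"  # placeholder mask value


def mask_document(document, user_groups):
    """Return a copy of the document with PHI fields masked for non-practitioners."""
    if "HealthPractitioner" in user_groups:
        return dict(document)
    return {
        key: (MASK if key in PHI_FIELDS else value)
        for key, value in document.items()
    }


record = {"patient_id": "p-1", "diagnosis": "J45", "symptoms": "cough"}
print(mask_document(record, {"Billing"}))
# prints {'patient_id': 'p-1', 'diagnosis': '***', 'symptoms': '***'}
```

The function returns a new dictionary in both branches, so the stored document is never modified by a read.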

<details>
<summary><b>E.g., DLP Policy for HR Records</b> (Click to expand)</summary>

> Secure data in the `hr`, `payroll`, and `performance_reviews` collections.

**Steps:**
1. **Scope:** Apply to fields like `salary`, `review_score`, `benefit_plan`.
2. **Detection:** Match on numerical ranges and string pattern validation (e.g., ID formats).
3. **Actions:**
   - Restrict access to HR-only security groups.
   - Redact data for cross-departmental queries.
4. **Monitoring:** Report monthly activity summaries to HR audit teams.

</details>
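
The detection step in the HR Records example combines two kinds of checks: numerical ranges and string patterns. A minimal sketch of both follows; the salary bounds and the `EMP-` ID format are invented for this example and would need to be replaced with an organization's real conventions.

```python
import re

SALARY_RANGE = (10_000, 1_000_000)      # hypothetical bounds
EMPLOYEE_ID = re.compile(r"EMP-\d{6}")  # hypothetical ID format


def detect_sensitive_fields(document):
    """Return the names of fields whose values look like salaries or employee IDs."""
    hits = []
    for name, value in document.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and SALARY_RANGE[0] <= value <= SALARY_RANGE[1]:
            hits.append(name)
        elif isinstance(value, str) and EMPLOYEE_ID.fullmatch(value):
            hits.append(name)
    return hits


doc = {"salary": 85000, "employee_id": "EMP-004217", "team": "platform", "review_score": 4}
print(detect_sensitive_fields(doc))  # prints ['salary', 'employee_id']
```

Value-based detection like this complements field-name matching, since it still works when a collection uses unexpected field names.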

<details>
<summary><b>E.g., DLP Policy for Legal Case Data</b> (Click to expand)</summary>

> Protect sensitive legal content in the `case_files`, `legal_memos`, and `contracts` collections.

**Steps:**
1. **Classifier Setup:** Identify documents referencing legal codes, client names, or settlement terms.
2. **Actions:**
   - Encrypt entire documents upon detection.
   - Flag and quarantine documents shared externally.
3. **Compliance Logging:** Store evidence trails in Purview for 7 years.

</details>
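
For the 7-year retention requirement in the Legal Case Data example, it is useful to know when a given evidence record becomes eligible for deletion. The helper below is a small sketch of that date arithmetic using only the standard library; the function name is hypothetical and the handling of 29 February is one reasonable convention among several.

```python
from datetime import date


def retention_expiry(logged_on, years=7):
    """Return the date on which a record logged on `logged_on` leaves retention."""
    try:
        return logged_on.replace(year=logged_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year; use 28 February.
        return logged_on.replace(year=logged_on.year + years, day=28)


print(retention_expiry(date(2025, 6, 20)))  # prints 2032-06-20
print(retention_expiry(date(2024, 2, 29)))  # prints 2031-02-28
```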

## Cost Management

> [!NOTE]
> Pricing for both Purview and Cosmos DB is consumption-based.

- **Cosmos DB Billing:** Based on throughput (RU/s), storage size, indexing, and replication.
- **Purview Billing:** Based on scan duration (vCore-hours) and volume (GB scanned).
- **Tips:**
  - Limit scan scope to active collections.
  - Use Purview Data Map capacity planning.
  - Automate cost alerts using Azure Cost Management.
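
To see why limiting scan scope matters, the sketch below takes the two Purview billing dimensions listed above at face value and compares a scan of every collection with a scan of active collections only. The rates and usage figures are placeholders, not real prices; consult the Azure Pricing Calculator for actual numbers.

```python
def estimate_scan_cost(vcore_hours, gb_scanned, rate_per_vcore_hour, rate_per_gb):
    """Duration charge plus volume charge, in whatever currency the rates use."""
    return vcore_hours * rate_per_vcore_hour + gb_scanned * rate_per_gb


# Placeholder rates for illustration only; these are NOT real Azure prices.
RATE_PER_VCORE_HOUR = 0.5
RATE_PER_GB = 0.02

all_collections = estimate_scan_cost(8, 200, RATE_PER_VCORE_HOUR, RATE_PER_GB)
active_only = estimate_scan_cost(2, 50, RATE_PER_VCORE_HOUR, RATE_PER_GB)
print(f"saved by narrowing scope: {all_collections - active_only:.2f}")
# prints saved by narrowing scope: 6.00
```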

## Best Practices

- **Schema Tagging:** Add metadata tags to collections for visibility and classification.
- **Least Privilege:** Grant only the access each role needs, using role-based access control via Azure AD (now Microsoft Entra ID).
- **Regular Scans:** Automate daily or weekly scans so the catalog stays up to date.
- **Cross-Service Monitoring:** Integrate with Azure Monitor and Defender for Cloud for unified security.

## Unity Catalog Integration

> Use Microsoft Purview and Unity Catalog together for end-to-end data lineage and policy-based access.

### Integration Steps

1. **Register the Cosmos DB Source:** Register it in Unity Catalog (via the Purview connector).
2. **Create Data Products:** Classify MongoDB collections into curated data assets.
3. **Set Data Lineage:** Map ingestion workflows from Cosmos DB to analytics/BI tools.
4. **Assign Stewards and Policies:** Define who owns, maintains, and accesses the data.
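
Conceptually, the lineage in step 3 is a directed graph from source collections to the assets that consume them. The sketch below models that graph as a plain dictionary and walks it to list everything downstream of a collection. All asset names are hypothetical, and this is not how Purview or Unity Catalog store lineage internally.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that consume it.
LINEAGE = {
    "cosmosdb.payments": ["pipeline.daily_ingest"],
    "pipeline.daily_ingest": ["warehouse.payments_curated"],
    "warehouse.payments_curated": ["bi.revenue_dashboard"],
}


def downstream(asset, lineage=LINEAGE):
    """Breadth-first list of every asset reachable from `asset`."""
    seen = {asset}
    queue = deque([asset])
    order = []
    while queue:
        current = queue.popleft()
        for consumer in lineage.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
                order.append(consumer)
    return order


print(downstream("cosmosdb.payments"))
# prints ['pipeline.daily_ingest', 'warehouse.payments_curated', 'bi.revenue_dashboard']
```

A traversal like this answers the question a steward typically asks before changing a collection: which reports and pipelines would be affected.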

### Benefits

- Unified governance across structured and unstructured data.
- Compliance-ready posture for PII, PCI, and HIPAA use cases.
- Reduced risk through discoverability and controlled data distribution.