Commit a3e9504

committed
Docs
1 parent 210c544 commit a3e9504

10 files changed

Lines changed: 249 additions & 4 deletions

Readme.md

Lines changed: 7 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,15 @@
11
#
22

3+
* [#08 - Tree Indexes: B+Trees (CMU Intro to Database Systems)](https://www.youtube.com/watch?v=scUtG_6M_lU)
4+
* [SQLite: How it works](https://www.youtube.com/watch?v=ZSKLA81tBis)
35
* [Write a database from scratch](https://www.youtube.com/playlist?list=PLWRwj01AnyEtjaw-ZnnAQWnVYPZF5WayV)
46

5-
##
7+
* [Understanding B-Trees: The Data Structure Behind Modern Databases](https://www.youtube.com/watch?v=K1a2Bk8NrYQ)
8+
9+
* [Build a NoSQL Database From Scratch in 1000 Lines of Code](https://medium.com/better-programming/build-a-nosql-database-from-the-scratch-in-1000-lines-of-code-8ed1c15ed924)
10+
* [Writing a SQL database from scratch in Go: 1. SELECT, INSERT, CREATE and a REPL](https://notes.eatonphil.com/database-basics.html)
611

7-
* Build Your Own Database From Scratch
12+
##
813

914
```
1015
1. Persistence. How not to lose or corrupt your data. Recovering from a crash.

secretary/Todo.md

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,16 @@
1+
* ```go
  allChildren := make([]int32, numNodes*childrenSize) // One big allocation

  for i := range nodes {
  	nodes[i].children = allChildren[i*childrenSize : (i+1)*childrenSize] // No new allocation
  }
```
18
* Delete key: if the delete removes a node, keep the removed node in an array so it can later be removed from disk
2-
*
9+
* Images, binary data visual
10+
* Kademlia
11+
* Persist to storage, with compression
12+
* Buffer pool, time-based min-heap
13+
* Inverted tree, index, n-gram, BM25
14+
* HyperLogLog, Bloom filter
15+
* WAL (write-ahead log)
16+
* Transaction concurrency
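
The single-allocation snippet at the top of this list can be made runnable roughly as follows. This is a minimal sketch: the `node` type, the `newNodes` helper, and the sizes are hypothetical names chosen for illustration and are not taken from the project. It also uses a three-index slice expression, which is an addition to the original snippet, so that an `append` on one node's `children` cannot spill into the next node's region of the shared backing array.

```go
package main

import "fmt"

// node is a hypothetical tree node used only for this illustration.
type node struct {
	children []int32
}

// newNodes carves every node's children slice out of one backing array.
func newNodes(numNodes, childrenSize int) []node {
	allChildren := make([]int32, numNodes*childrenSize) // one big allocation
	nodes := make([]node, numNodes)
	for i := range nodes {
		lo, hi := i*childrenSize, (i+1)*childrenSize
		// The third index caps capacity at hi, so append reallocates
		// instead of overwriting the next node's children.
		nodes[i].children = allChildren[lo:hi:hi]
	}
	return nodes
}

func main() {
	nodes := newNodes(3, 4)
	nodes[0].children[3] = 7
	// This append copies node 0's children to a new array; node 1 is untouched.
	nodes[0].children = append(nodes[0].children, 99)
	fmt.Println(len(nodes[0].children), nodes[1].children[0]) // prints "5 0"
}
```

Without the capacity cap, the same `append` would write 99 into `nodes[1].children[0]`, which is an easy bug to miss when slices share one slab.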

secretary/bustub.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
https://github.com/cmu-db/bustub

secretary/compress.md

Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
🚀 Zstd (Zstandard) in Golang
2+
3+
Zstandard (Zstd) is a fast compression algorithm that typically achieves better compression ratios than gzip at comparable or higher speed. It is well suited to high-performance databases, logs, network transmission, and storage.
4+
5+
📌 Why Use Zstd in Golang?
6+
• 🔥 High compression ratio (better than gzip)
7+
• ⚡ Fast decompression (ideal for real-time applications)
8+
• 📉 Adjustable compression levels (trade-off between speed & compression)
9+
• 🔄 Streaming support (works well for large files)
10+
11+
📦 Install Zstd for Golang
12+
13+
Use the optimized Zstd package by Klaus Post:
14+
15+
go get github.com/klauspost/compress/zstd
16+
17+
18+
19+
20+
21+
✍️ Example: Compress & Decompress Using Zstd
22+
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"

	"github.com/klauspost/compress/zstd"
)

// compress returns data compressed with Zstd at the default level.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	writer, err := zstd.NewWriter(&buf)
	if err != nil {
		return nil, err
	}
	if _, err := writer.Write(data); err != nil {
		writer.Close()
		return nil, err
	}
	// Close flushes the final frame, so its error must be checked.
	if err := writer.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress returns the original bytes of Zstd-compressed data.
func decompress(compressed []byte) ([]byte, error) {
	reader, err := zstd.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer reader.Close() // release decoder resources
	return io.ReadAll(reader)
}

func main() {
	data := []byte("Hello, this is a repeated text. Hello, this is a repeated text. Hello, this is a repeated text.")

	// Compress data
	compressed, err := compress(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Original Size:", len(data))
	fmt.Println("Compressed Size:", len(compressed))

	// Decompress data
	decompressed, err := decompress(compressed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Decompressed:", string(decompressed))
}
62+
63+
64+
65+
66+
67+
📊 How Much Can Zstd Compress?
68+
69+
Zstd compression depends on the data type. Here’s a rough estimate:
70+
71+
| Data Type | Typical Size Reduction (rough) |
| --- | --- |
| Text (logs, JSON) | 70-85% smaller |
| Binary (executables, data) | 15-40% smaller |
| Already Compressed Data (JPEG, MP3, MP4) | ~0% |
75+
76+
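• Level 1 → fastest of the standard levels, lower compression.
77+
• Level 22 → best compression, slowest.
78+
• Default level is 3, which is a good balance. These numbers refer to the reference zstd implementation; the klauspost/compress/zstd package exposes four presets (SpeedFastest, SpeedDefault, SpeedBetterCompression, SpeedBestCompression) and maps numeric levels onto them with zstd.EncoderLevelFromZstd.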
79+
80+
🚀 Streaming Compression (for Large Data)
81+
82+
If you’re dealing with large files, use streaming compression to avoid excessive memory usage:
83+
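func streamCompress(reader io.Reader, writer io.Writer) error {
	zstdWriter, err := zstd.NewWriter(writer)
	if err != nil {
		return err
	}
	if _, err := io.Copy(zstdWriter, reader); err != nil {
		zstdWriter.Close()
		return err
	}
	// Close flushes the final frame, so return its error.
	return zstdWriter.Close()
}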
90+
91+
This is useful for compressing database logs, network data, and large files without loading everything into memory.
92+
93+
94+
95+
✅ Where Can Zstd Be Useful in Your Projects?
96+
• Bitcask: Compress log segments to reduce disk usage.
97+
• B+ Tree: Compress index nodes/pages for better storage efficiency.
98+
• Kademlia (DHT): Compress large values to save bandwidth.
99+
• Gemini Server: Compress responses for faster transmission.
100+
102+

secretary/kademilia.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
* https://github.com/libp2p/specs/blob/master/kad-dht/README.md
2121
* https://github.com/libp2p/go-libp2p/blob/master/examples/libp2p-host/host.go
2222
* https://github.com/libp2p/go-libp2p-kad-dht
23+
* https://github.com/nictuku/dht
2324
* https://github.com/pdelong/Kademlia
2425
* https://github.com/prettymuchbryce/kademlia
2526

secretary/lock.md

Lines changed: 106 additions & 0 deletions
@@ -170,3 +170,109 @@ Tradeoffs
170170
• Harder to debug than traditional locking mechanisms.
171171

172172
Would you like me to refine specific parts, like concurrent range queries or node splits? 🚀
173+
174+
175+
176+
177+
178+
Database Locks: Types and Usage
179+
180+
Database locks are mechanisms used to ensure consistency, integrity, and concurrency control in multi-user environments. They prevent race conditions, dirty reads, and data corruption when multiple transactions access the same data.
181+
182+
183+
184+
Types of Database Locks
185+
186+
1. Pessimistic Locking
187+
• Blocks access to a resource until a transaction is complete.
188+
• Ensures no other transaction modifies the data while a lock is held.
189+
• Typically used in high-contention scenarios (e.g., banking systems).
190+
191+
Example (MySQL FOR UPDATE)
192+
193+
START TRANSACTION;
194+
SELECT * FROM accounts WHERE id = 1 FOR UPDATE; -- Locks row until COMMIT/ROLLBACK
195+
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
196+
COMMIT;
197+
198+
Tradeoffs:
199+
✅ Ensures strong consistency.
200+
❌ Can cause performance issues due to waiting/blocking.
201+
202+
203+
204+
2. Optimistic Locking
205+
• Allows concurrent access but detects conflicts before committing.
206+
• Uses version numbers or timestamps to check if data was modified.
207+
• If a conflict is detected, the transaction is retried.
208+
209+
Example (Using Version Number)
210+
211+
SELECT id, balance, version FROM accounts WHERE id = 1;
212+
UPDATE accounts SET balance = balance - 100, version = version + 1
213+
WHERE id = 1 AND version = 1; -- Updates 0 rows if the version changed; the caller must detect this and retry
214+
215+
Tradeoffs:
216+
✅ Best for low-contention scenarios.
217+
❌ Requires extra logic for retrying transactions.
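
The "extra logic for retrying" can be sketched as a read, compare-version, write loop. The example below is a minimal in-memory illustration, not database code: the `store` type, `updateIfVersion`, and `withdraw` are hypothetical names, and the mutex only models the atomicity that a single `UPDATE ... WHERE version = ?` statement would provide.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// account mirrors the (balance, version) columns from the SQL example.
type account struct {
	balance int
	version int
}

// store is a hypothetical in-memory stand-in for the accounts table.
type store struct {
	mu  sync.Mutex
	row account
}

var errConflict = errors.New("version changed")

// read returns a snapshot of the row, like the SELECT.
func (s *store) read() account {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.row
}

// updateIfVersion writes only when the stored version still matches,
// like UPDATE ... WHERE version = expected.
func (s *store) updateIfVersion(newBalance, expected int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.row.version != expected {
		return errConflict
	}
	s.row = account{balance: newBalance, version: expected + 1}
	return nil
}

// withdraw repeats the read-modify-write cycle until it wins or gives up.
func withdraw(s *store, amount, maxRetries int) error {
	for i := 0; i < maxRetries; i++ {
		snap := s.read()
		if err := s.updateIfVersion(snap.balance-amount, snap.version); err == nil {
			return nil
		}
	}
	return errConflict
}

func main() {
	s := &store{row: account{balance: 1000, version: 1}}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := withdraw(s, 100, 100); err != nil {
				fmt.Println("gave up:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println(s.read())
}
```

No goroutine ever waits on another's lock across the whole read-modify-write cycle; a loser simply re-reads and tries again, which is cheap when conflicts are rare and wasteful when they are frequent.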
218+
219+
220+
221+
3. Table Locks
222+
• Locks the entire table; a write lock prevents other sessions from reading or writing it until the lock is released.
223+
• Used when bulk updates need consistency.
224+
225+
Example (MySQL Table Lock)
226+
227+
LOCK TABLES accounts WRITE;
228+
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
229+
UNLOCK TABLES;
230+
231+
Tradeoffs:
232+
✅ Guarantees full consistency.
233+
❌ Not scalable for multi-user applications.
234+
235+
236+
237+
4. Row-Level Locks
238+
• Locks only specific rows affected by a transaction.
239+
• Allows higher concurrency than table locks.
240+
241+
Example (PostgreSQL SELECT FOR UPDATE)
242+
243+
BEGIN;
244+
SELECT * FROM orders WHERE id = 123 FOR UPDATE; -- Locks row
245+
UPDATE orders SET status = 'shipped' WHERE id = 123;
246+
COMMIT;
247+
248+
Tradeoffs:
249+
✅ Efficient for concurrent updates on different rows.
250+
❌ Can cause deadlocks if transactions lock rows in different orders.
251+
252+
253+
254+
5. Deadlocks and Handling
255+
256+
A deadlock occurs when two transactions hold locks and wait for each other to release them.
257+
258+
Example Deadlock (Two Transactions)
259+
260+
Transaction A: LOCK row 1 → WAIT for row 2
261+
Transaction B: LOCK row 2 → WAIT for row 1
262+
263+
Preventing Deadlocks
264+
• Access resources in a consistent order.
265+
• Use shorter transactions to minimize lock time.
266+
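• Bound lock waits with a timeout (e.g., innodb_lock_wait_timeout in MySQL, lock_timeout in PostgreSQL), or fail fast with SELECT ... FOR UPDATE NOWAIT, which returns an error immediately instead of waiting.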
267+
268+
269+
270+
Which Locking Strategy to Use?
271+
272+
| Scenario | Best Locking Strategy |
| --- | --- |
| High contention on updates | Pessimistic Locking (FOR UPDATE) |
| Low contention, high concurrency | Optimistic Locking (versioning) |
| Bulk operations | Table Locks (LOCK TABLES) |
| Multiple transactions updating different rows | Row-Level Locks |
277+

secretary/lsmtree.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,13 @@
1+
[#04 - Database Storage: Log-Structured Merge Trees & Tuples (CMU Intro to Database Systems)](https://www.youtube.com/watch?v=IHtVWGhG0Xg&t=1372s)
2+
3+
https://github.com/facebook/rocksdb/wiki
4+
5+
https://github.com/krasun/lsmtree
6+
https://github.com/skyzh/mini-lsm
7+
8+
9+
10+
111
Object stores typically do not use B-trees like databases. Instead, they use hash-based indexing or LSM-trees (Log-Structured Merge Trees) depending on the use case. Here’s why:
212

313
1. Hash-Based Indexing (Common for Object Stores)

secretary/olap.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
https://github.com/risinglightdb/risinglight

secretary/raft.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1,5 @@
1-
https://notes.eatonphil.com/2023-05-25-raft.html
1+
* https://notes.eatonphil.com/2023-05-25-raft.html
2+
3+
* https://github.com/otoolep/hraftd
4+
* https://github.com/hashicorp/raft-boltdb
5+
* https://github.com/Jille/raft-grpc-example

secretary/vector.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
https://github.com/skyzh/write-you-a-vector-db
