Commit 5a47978

feat: add 3.14, drop 3.9
1 parent 9735aeb

3 files changed: 7 additions & 3 deletions

.github/workflows/ci.yml (1 addition, 1 deletion)

```diff
@@ -12,7 +12,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
 
     steps:
       - name: Check-out repository
```

CHANGELOG.md (4 additions, 0 deletions)

```diff
@@ -7,6 +7,7 @@ All notable changes to semchunk will be documented here. This project adheres to
 - Made it possible to chunk [Isaacus Legal Graph Schema (ILGS) Documents](https://docs.isaacus.com/ilgs/introduction) instead of just strings.
 - Added a new `tokenizer_kwargs` argument to `chunkerify()` allowing users to specify custom keyword arguments to their tokenizers and token counters. `tokenizer_kwargs` can be used to override the default behavior of treating any encountered special tokens as if they are normal text when using a `tiktoken` or `transformers` tokenizer.
 - Where a `tiktoken` or `transformers` tokenizer is used, started treating special tokens as normal text instead of, in the case of `tiktoken`, raising an error and, in the case of `transformers`, treating them as special tokens.
+- Added support for Python 3.14.
 
 ### Changed
 - Demoted asterisks in the hierarchy of splitters from sentence terminators to clause separators to better reflect their typical syntactic function.
@@ -16,6 +17,9 @@ All notable changes to semchunk will be documented here. This project adheres to
 - Significantly improved performance in cases where `merge_splits()` was the biggest bottleneck by switching from joining splits with splitters to indexing into the original text.
 - Slightly sped up `merge_splits()` by switching to the standard library's `bisect_left()` function, which is now faster than the previous implementation.
 
+### Removed
+- Dropped support for Python 3.9.
+
 ## [3.2.5] - 2025-10-28
 ### Changed
 - Switched to more accurate monthly download counts from [pypistats.org](https://pypistats.org/) rather than the less accurate counts from [pepy.tech](https://pepy.tech/).
```
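The `bisect_left()` entry in the changelog refers to Python's standard `bisect` module, which binary-searches a sorted sequence. As a hedged sketch of the general technique (not semchunk's actual `merge_splits()` code, and using made-up data), binary-searching cumulative token counts to find how many splits fit in a token budget might look like:

```python
from bisect import bisect_left

# Hypothetical illustration only: given cumulative token counts for
# successive splits, find how many whole splits fit within a token
# budget using binary search instead of a linear scan.
cumulative_tokens = [3, 7, 12, 20, 26]  # made-up data, sorted ascending
chunk_size = 13

# bisect_left returns the index of the first cumulative count that is
# >= chunk_size + 1, i.e. the number of splits that fit in the budget.
fits = bisect_left(cumulative_tokens, chunk_size + 1)
print(fits)  # -> 3 (the splits totalling 3, 7, and 12 tokens fit)
```

Because the input is sorted, this runs in O(log n) per lookup, which is why replacing a hand-rolled search with `bisect_left()` can speed up a merge loop.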

pyproject.toml (2 additions, 2 deletions)

```diff
@@ -11,7 +11,7 @@ authors = [
 ]
 description = "A Python library for splitting text into smaller chunks while preserving as much local semantic context as possible."
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 license = {text="MIT"}
 keywords = [
     "chunking",
@@ -33,11 +33,11 @@ classifiers = [
     "Intended Audience :: Science/Research",
     "License :: OSI Approved :: MIT License",
     "Operating System :: OS Independent",
-    "Programming Language :: Python :: 3.9",
     "Programming Language :: Python :: 3.10",
     "Programming Language :: Python :: 3.11",
     "Programming Language :: Python :: 3.12",
     "Programming Language :: Python :: 3.13",
+    "Programming Language :: Python :: 3.14",
     "Programming Language :: Python :: Implementation :: CPython",
     "Topic :: Scientific/Engineering :: Artificial Intelligence",
     "Topic :: Software Development :: Libraries :: Python Modules",
```
