Commit 24c4aec

warsaw and brettcannon authored
gh-148641: Implement PEP 829 - startup configuration files (#149109)
Implement PEP 829 - startup configuration files

Also add `pkgutil.resolve_name(..., strict=True)`

Co-authored-by: Brett Cannon <brett@python.org>
1 parent 0bf6e31 commit 24c4aec

11 files changed

Lines changed: 1151 additions & 198 deletions

Doc/deprecations/pending-removal-in-3.18.rst

Lines changed: 6 additions & 0 deletions
@@ -10,3 +10,9 @@ Pending removal in Python 3.18
     specifier ``'N'``, which is only supported in the :mod:`!decimal` module's
     C implementation, has been deprecated since Python 3.13.
     (Contributed by Serhiy Storchaka in :gh:`89902`.)
+
+* Deprecations defined by :pep:`829`:
+
+  * ``import`` lines in :file:`{name}.pth` files are silently ignored.
+
+  (Contributed by Barry Warsaw in :gh:`148641`.)

Doc/deprecations/pending-removal-in-3.20.rst

Lines changed: 10 additions & 0 deletions
@@ -39,6 +39,16 @@ Pending removal in Python 3.20

   (Contributed by Hugo van Kemenade and Stan Ulbrych in :gh:`76007`.)

+* Deprecations defined by :pep:`829`:
+
+  * Warnings are produced for ``import`` lines found in :file:`{name}.pth`
+    files.
+
+  * :file:`{name}.pth` files are no longer decoded in the locale encoding by
+    default. They **MUST** be encoded in ``utf-8-sig``.
+
+  (Contributed by Barry Warsaw in :gh:`148641`.)
+
 * :mod:`ast`:

   * Creating instances of abstract AST nodes (such as :class:`ast.AST`
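
Reviewer note on the ``utf-8-sig`` requirement above. The snippet below is not part of this commit; it only illustrates how the standard ``utf-8-sig`` codec treats a leading byte order mark. The file contents are made up for the example.

```python
# Hypothetical .pth contents; only the codec behavior is being shown.
with_bom = b"\xef\xbb\xbf/opt/example/lib\n"
without_bom = b"/opt/example/lib\n"

# 'utf-8-sig' drops a leading BOM if there is one and otherwise
# behaves like plain UTF-8.
decoded_with_bom = with_bom.decode("utf-8-sig")
decoded_without_bom = without_bom.decode("utf-8-sig")

# Plain 'utf-8' keeps the BOM as a U+FEFF character at the start.
decoded_plain = with_bom.decode("utf-8")
```

Both decoded strings are the same path line, so a file saved with or without a BOM reads identically under this codec.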

Doc/library/pkgutil.rst

Lines changed: 11 additions & 1 deletion
@@ -194,7 +194,7 @@ support.
    The :mod:`importlib.resources` module provides structured access to
    module resources.

-.. function:: resolve_name(name)
+.. function:: resolve_name(name, *, strict=False)

    Resolve a name to an object.

@@ -208,6 +208,7 @@ support.

    * ``W(.W)*``
    * ``W(.W)*:(W(.W)*)?``
+   * ``W(.W)*:(W(.W)*)``

    The first form is intended for backward compatibility only. It assumes that
    some part of the dotted name is a package, and the rest is an object
@@ -222,6 +223,11 @@ support.
    hierarchy within that package. Only one import is needed in this form. If
    it ends with the colon, then a module object is returned.

+   The first two forms are accepted when ``strict=False`` (the default).
+
+   The third form requires both the module name and callable, separated by
+   a colon. Only this form is accepted when ``strict=True``.
+
    The function will return an object (which might be a module), or raise one
    of the following exceptions:

@@ -233,3 +239,7 @@ support.
    hierarchy within the imported package to get to the desired object.

    .. versionadded:: 3.9
+
+   .. versionchanged:: 3.15
+
+      The optional keyword-only ``strict`` flag was added.
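
Reviewer note: a short usage sketch of the forms accepted in the default (lenient) mode, using only standard library names. It does not pass ``strict``, so it should also run on interpreters that predate this commit. Per the documentation above, the first two calls would be rejected under ``strict=True`` and only the colon-plus-object form would be accepted.

```python
import os.path
import pkgutil

# First form (backward compatibility): no colon, so the function has to
# work out where the module part ends and the attribute part begins.
legacy_join = pkgutil.resolve_name("os.path.join")

# Second form ending with the colon: the module itself is returned.
module = pkgutil.resolve_name("os.path:")

# Module and object separated by a colon: one import, then attribute lookup.
join = pkgutil.resolve_name("os.path:join")
```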

Doc/library/site.rst

Lines changed: 161 additions & 23 deletions
@@ -17,7 +17,7 @@ import can be suppressed using the interpreter's :option:`-S` option.

 Importing this module normally appends site-specific paths to the module search path
 and adds :ref:`callables <site-consts>`, including :func:`help` to the built-in
-namespace. However, Python startup option :option:`-S` blocks this and this module
+namespace. However, Python startup option :option:`-S` blocks this, and this module
 can be safely imported with no automatic modifications to the module search path
 or additions to the builtins. To explicitly trigger the usual site-specific
 additions, call the :func:`main` function.
@@ -71,40 +71,121 @@ the user site prefixes are also implicitly not searched for site-packages.
    single: # (hash); comment
    pair: statement; import

-A path configuration file is a file whose name has the form :file:`{name}.pth`
-and exists in one of the four directories mentioned above; its contents are
-additional items (one per line) to be added to ``sys.path``. Non-existing items
-are never added to ``sys.path``, and no check is made that the item refers to a
-directory rather than a file. No item is added to ``sys.path`` more than
-once. Blank lines and lines beginning with ``#`` are skipped. Lines starting
-with ``import`` (followed by space or tab) are executed.
+The :mod:`!site` module recognizes two startup configuration files of the form
+:file:`{name}.pth` for path configurations, and :file:`{name}.start` for
+pre-first-line code execution. Both files can exist in one of the four
+directories mentioned above. Within each directory, these files are sorted
+alphabetically by filename, then parsed in sorted order.

-.. note::
+.. _site-pth-files:

-   An executable line in a :file:`.pth` file is run at every Python startup,
-   regardless of whether a particular module is actually going to be used.
-   Its impact should thus be kept to a minimum.
-   The primary intended purpose of executable lines is to make the
-   corresponding module(s) importable
-   (load 3rd-party import hooks, adjust :envvar:`PATH` etc).
-   Any other initialization is supposed to be done upon a module's
-   actual import, if and when it happens.
-   Limiting a code chunk to a single line is a deliberate measure
-   to discourage putting anything more complex here.
+Path extensions (:file:`.pth` files)
+------------------------------------
+
+:file:`{name}.pth` contains additional items (one per line) to be appended to
+``sys.path``. Items that name non-existing directories are never added to
+``sys.path``, and no check is made that the item refers to a directory rather
+than a file. No item is added to ``sys.path`` more than once. Blank lines
+and lines beginning with ``#`` are skipped.
+
+For backward compatibility, lines starting with ``import`` (followed by space
+or tab) are executed with :func:`exec`.

 .. versionchanged:: 3.13
+
    The :file:`.pth` files are now decoded by UTF-8 at first and then by the
    :term:`locale encoding` if it fails.

+.. versionchanged:: next
+
+   :file:`.pth` file lines starting with ``import`` are deprecated. During
+   the deprecation period, such lines are still executed (except in the case
+   below), but a diagnostic message is emitted only when the :option:`-v` flag
+   is given.
+
+   ``import`` lines in :file:`{name}.pth` are silently ignored when a
+   :ref:`matching <site-start-files>` :file:`{name}.start` file exists.
+
+   Errors on individual lines no longer abort processing of the rest of the
+   file. Each error is reported and the remaining lines continue to be
+   processed.
+
+.. deprecated-removed:: next 3.20
+
+   Decoding :file:`{name}.pth` files in any encoding other than ``utf-8-sig``
+   is deprecated in Python 3.15, and support for decoding from the locale
+   encoding will be removed in Python 3.20.
+
+   ``import`` lines in :file:`{name}.pth` files are deprecated and will be
+   silently ignored in Python 3.18 and 3.19. In Python 3.20 a warning will be
+   produced for ``import`` lines in :file:`{name}.pth` files.
+
+
+.. _site-start-files:
+
+Startup entry points (:file:`.start` files)
+-------------------------------------------
+
+.. versionadded:: next
+
+A startup entry point file is a file whose name has the form
+:file:`{name}.start` and exists in one of the site-packages directories
+described above. Each file specifies entry points to be called during
+interpreter startup, using the ``pkg.mod:callable`` syntax understood by
+:func:`pkgutil.resolve_name`.
+
+Each non-blank line that does not begin with ``#`` must contain an entry
+point reference in the form ``pkg.mod:callable``. The colon and callable
+portion are mandatory. Each callable is invoked with no arguments, and
+any return value is discarded.
+
+:file:`.start` files are processed after all :file:`.pth` path extensions
+have been applied to :data:`sys.path`, ensuring that paths are available
+before any startup code runs.
+
+Unlike :data:`sys.path` extensions from :file:`.pth` files, duplicate entry
+points are **not** de-duplicated --- if an entry point appears more than once,
+it will be called more than once.
+
+If an exception occurs during resolution or invocation of an entry point,
+a traceback is printed to :data:`sys.stderr` and processing continues with
+the remaining entry points.
+
+:file:`.start` files must be encoded in UTF-8.
+
+:pep:`829` defined the original specification for these features.
+
+.. note::
+
+   If a :file:`{name}.start` file exists alongside a :file:`{name}.pth` file
+   with the same base name, any ``import`` lines in the :file:`.pth` file are
+   ignored in favor of the entry points in the :file:`.start` file.
+
+.. note::
+
+   Executable lines (``import`` lines in :file:`{name}.pth` files and
+   :file:`{name}.start` file entry points) are always run at Python startup
+   (unless :option:`-S` is given to disable the ``site.py`` module entirely),
+   regardless of whether a particular module is actually going to be used.
+
+.. note::
+
+   :file:`{name}.start` files invoke :func:`pkgutil.resolve_name` with
+   ``strict=True``, which requires the full ``pkg.mod:callable`` form.
+
 .. index::
    single: package
    triple: path; configuration; file

+
+Startup file examples
+---------------------
+
 For example, suppose ``sys.prefix`` and ``sys.exec_prefix`` are set to
 :file:`/usr/local`. The Python X.Y library is then installed in
 :file:`/usr/local/lib/python{X.Y}`. Suppose this has
 a subdirectory :file:`/usr/local/lib/python{X.Y}/site-packages` with three
-subsubdirectories, :file:`foo`, :file:`bar` and :file:`spam`, and two path
+sub-subdirectories, :file:`foo`, :file:`bar` and :file:`spam`, and two path
 configuration files, :file:`foo.pth` and :file:`bar.pth`. Assume
 :file:`foo.pth` contains the following::

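
Reviewer note: the line rules for :file:`.pth` files described in this hunk can be sketched as below. ``classify_pth_lines`` is a hypothetical helper written for this note, not the parser in ``Lib/site.py``. It only sorts lines into path items and ``import`` lines. It does not check that paths exist, de-duplicate them, or execute anything.

```python
def classify_pth_lines(text):
    """Split .pth file text into (paths, import_lines).

    Hypothetical helper for illustration; not the site module's parser.
    """
    paths = []
    import_lines = []
    for raw in text.splitlines():
        line = raw.rstrip()
        # Blank lines and lines beginning with "#" are skipped.
        if not line or line.startswith("#"):
            continue
        # "import" followed by a space or tab marks an executable line.
        if line.startswith(("import ", "import\t")):
            import_lines.append(line)
        else:
            paths.append(line)
    return paths, import_lines


# Made-up file contents for the example.
sample = (
    "# foo package configuration\n"
    "\n"
    "foo\n"
    "import foo.hooks; foo.hooks.install()\n"
    "bar\n"
)
paths, import_lines = classify_pth_lines(sample)
```

Keeping the two kinds of line apart mirrors the deprecation plan: the path items stay valid, while the ``import`` lines are the part that moves to a :file:`.start` file.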
@@ -131,6 +212,45 @@ directory precedes the :file:`foo` directory because :file:`bar.pth` comes
 alphabetically before :file:`foo.pth`; and :file:`spam` is omitted because it is
 not mentioned in either path configuration file.

+Let's say that there is also a :file:`foo.start` file containing the
+following::
+
+  # foo package startup code
+
+  foo.submod:initialize
+
+Now, after ``sys.path`` has been extended as above, and before Python turns
+control over to user code, the ``foo.submod`` module is imported and the
+``initialize()`` function from that module is called.
+
+
+.. _site-migration-guide:
+
+Migrating from ``import`` lines in ``.pth`` files to ``.start`` files
+---------------------------------------------------------------------
+
+If your package currently ships a :file:`{name}.pth` file, you can keep all
+``sys.path`` extension lines unchanged. Only ``import`` lines need to be
+migrated.
+
+To migrate, create a callable (taking zero arguments) within an importable
+module in your package. Reference it as a ``pkg.mod:callable`` entry point
+in a matching :file:`{name}.start` file. Move everything on your ``import``
+line after the first semi-colon into the ``callable()`` function.
+
+If your package must straddle older Pythons that do not support :pep:`829`
+and newer Pythons that do, change the ``import`` lines in your
+:file:`{name}.pth` to use the following form:
+
+.. code-block:: python
+
+   import pkg.mod; pkg.mod.callable()
+
+Older Pythons will execute these ``import`` lines, while newer Pythons will
+ignore them in favor of the :file:`{name}.start` file. After the straddling
+period, remove all ``import`` lines from your :file:`.pth` files.
+
+
 :mod:`!sitecustomize`
 ---------------------

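
Reviewer note: a minimal sketch of how entry points from a :file:`.start` file might be processed, following the rules documented in this commit (skip blank and comment lines, require the colon and callable, call with no arguments, report errors and keep going, no de-duplication). ``run_start_entries`` is a hypothetical helper, not the code in ``Lib/site.py``. Because ``strict=True`` is new in this commit, the sketch checks for the colon and callable itself and then calls ``pkgutil.resolve_name`` in its default mode.

```python
import pkgutil
import sys
import traceback


def run_start_entries(text):
    """Call each entry point listed in .start file text.

    Hypothetical helper; returns the entry points that ran without error.
    """
    completed = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        try:
            # Assumed stand-in for strict mode: both sides of the colon
            # must be present.
            module, colon, obj = line.partition(":")
            if not (module and colon and obj):
                raise ValueError(f"invalid entry point: {line!r}")
            func = pkgutil.resolve_name(line)
            func()  # no arguments; the return value is discarded
        except Exception:
            # Report the failure and continue with the remaining lines.
            traceback.print_exc(file=sys.stderr)
        else:
            completed.append(line)
    return completed


# Made-up file contents: one comment, one valid entry listed twice,
# one line without a callable, and one module that does not exist.
sample = (
    "# demo startup code\n"
    "\n"
    "time:time\n"
    "os.path\n"
    "no_such_module_xyz:run\n"
    "time:time\n"
)
completed = run_start_entries(sample)
```

The duplicate ``time:time`` line runs twice, matching the documented behavior that entry points are not de-duplicated.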
@@ -236,10 +356,27 @@ Module contents
       This function used to be called unconditionally.


-.. function:: addsitedir(sitedir, known_paths=None)
+.. function:: addsitedir(sitedir, known_paths=None, *, defer_processing_start_files=False)
+
+   Add a directory to sys.path and parse the :file:`.pth` and :file:`.start`
+   files found in that directory. Typically used in :mod:`sitecustomize` or
+   :mod:`usercustomize` (see above).
+
+   The *known_paths* argument is an optional set of case-normalized paths
+   used to prevent duplicate :data:`sys.path` entries. When ``None`` (the
+   default), the set is built from the current :data:`sys.path`.
+
+   While :file:`.pth` and :file:`.start` files are always parsed, set
+   *defer_processing_start_files* to ``True`` to prevent processing the
+   startup data found in those files, so that you can process them explicitly
+   (this is typically used by the :func:`main` function).
+
+   .. versionchanged:: next

-   Add a directory to sys.path and process its :file:`.pth` files. Typically
-   used in :mod:`sitecustomize` or :mod:`usercustomize` (see above).
+      Also processes :file:`.start` files. See :ref:`site-start-files`.
+      All :file:`.pth` and :file:`.start` files are now read and
+      accumulated before any path extensions, ``import`` line execution,
+      or entry point invocations take place.


 .. function:: getsitepackages()
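
Reviewer note: a small usage sketch of ``site.addsitedir`` with a throwaway directory. It uses only the long-standing positional call, so it does not exercise the new ``defer_processing_start_files`` parameter or :file:`.start` handling. The directory and file names are made up for the example.

```python
import os
import site
import sys
import tempfile

# Build a throwaway site directory with one subdirectory and one .pth file.
site_dir = tempfile.mkdtemp()
extra_dir = os.path.join(site_dir, "extra")
os.mkdir(extra_dir)
with open(os.path.join(site_dir, "demo.pth"), "w", encoding="utf-8") as f:
    f.write("# demo path configuration\n")
    f.write("extra\n")    # names an existing directory
    f.write("missing\n")  # names nothing that exists

# Adds site_dir itself to sys.path, then processes demo.pth.
site.addsitedir(site_dir)

missing_dir = os.path.join(site_dir, "missing")
```

After the call, ``site_dir`` and ``extra_dir`` are on ``sys.path``, while ``missing_dir`` is not, because items that do not exist are never added.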
@@ -308,5 +445,6 @@ value greater than 2 if there is an error.
 .. seealso::

    * :pep:`370` -- Per user site-packages directory
+   * :pep:`829` -- Startup entry points and the deprecation of import lines in ``.pth`` files
    * :ref:`sys-path-init` -- The initialization of :data:`sys.path`.


Doc/whatsnew/3.15.rst

Lines changed: 1 addition & 0 deletions
@@ -91,6 +91,7 @@ Summary -- Release highlights
 * :ref:`Improved error messages <whatsnew315-improved-error-messages>`
 * :ref:`The official Windows 64-bit binaries now use the tail-calling interpreter
   <whatsnew315-windows-tail-calling-interpreter>`
+* :pep:`829`: Package Startup Configuration Files

 New features
 ============

Lib/pkgutil.py

Lines changed: 31 additions & 13 deletions
@@ -9,6 +9,9 @@
 import os.path
 import sys

+lazy import re
+
+
 __all__ = [
     'get_importer', 'iter_importers',
     'walk_packages', 'iter_modules', 'get_data',
@@ -398,9 +401,10 @@ def get_data(package, resource):
     return loader.get_data(resource_name)


-_NAME_PATTERN = None
+_LENIENT_PATTERN = None
+_STRICT_PATTERN = None

-def resolve_name(name):
+def resolve_name(name, *, strict=False):
     """
     Resolve a name to an object.

@@ -410,6 +414,7 @@ def resolve_name(name):

     W(.W)*
     W(.W)*:(W(.W)*)?
+    W(.W)*:(W(.W)*)

     The first form is intended for backward compatibility only. It assumes that
     some part of the dotted name is a package, and the rest is an object
@@ -424,6 +429,11 @@ def resolve_name(name):
     hierarchy within that package. Only one import is needed in this form. If
     it ends with the colon, then a module object is returned.

+    The first two forms are accepted when `strict=False` (the default).
+
+    The third form requires both the module name and callable, separated by
+    a colon. Only this form is accepted when `strict=True`.
+
     The function will return an object (which might be a module), or raise one
     of the following exceptions:

@@ -432,18 +442,26 @@ def resolve_name(name):
     AttributeError - if a failure occurred when traversing the object hierarchy
                      within the imported package to get to the desired object.
     """
-    global _NAME_PATTERN
-    if _NAME_PATTERN is None:
-        # Lazy import to speedup Python startup time
-        import re
-        dotted_words = r'(?!\d)(\w+)(\.(?!\d)(\w+))*'
-        _NAME_PATTERN = re.compile(f'^(?P<pkg>{dotted_words})'
-                                   f'(?P<cln>:(?P<obj>{dotted_words})?)?$',
-                                   re.UNICODE)
-
-    m = _NAME_PATTERN.match(name)
-    if not m:
+    global _LENIENT_PATTERN, _STRICT_PATTERN
+    dotted_words = r'(?!\d)(\w+)(\.(?!\d)(\w+))*'
+    if strict:
+        if _STRICT_PATTERN is None:
+            _STRICT_PATTERN = re.compile(
+                f'^(?P<pkg>{dotted_words})'
+                f'(?P<cln>:(?P<obj>{dotted_words}))$',
+                re.UNICODE)
+        pattern = _STRICT_PATTERN
+    else:
+        if _LENIENT_PATTERN is None:
+            _LENIENT_PATTERN = re.compile(
+                f'^(?P<pkg>{dotted_words})'
+                f'(?P<cln>:(?P<obj>{dotted_words})?)?$',
+                re.UNICODE)
+        pattern = _LENIENT_PATTERN
+
+    if (m := pattern.match(name)) is None:
         raise ValueError(f'invalid format: {name!r}')
+
     gd = m.groupdict()
     if gd.get('cln'):
         # there is a colon - a one-step import is all that's needed
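
Reviewer note: to see what the two patterns in this hunk accept, the sketch below compiles them outside ``pkgutil`` (with an ordinary ``import re``, so it also runs on interpreters without the ``lazy import`` statement) and tries a few names. The candidate names are made up, and the variable names ``lenient`` and ``strict`` are local to this note.

```python
import re

# Same building block as in the diff: dotted words, none starting with a digit.
dotted_words = r'(?!\d)(\w+)(\.(?!\d)(\w+))*'

lenient = re.compile(f'^(?P<pkg>{dotted_words})'
                     f'(?P<cln>:(?P<obj>{dotted_words})?)?$', re.UNICODE)
strict = re.compile(f'^(?P<pkg>{dotted_words})'
                    f'(?P<cln>:(?P<obj>{dotted_words}))$', re.UNICODE)

candidates = [
    "pkg.mod",           # no colon
    "pkg.mod:",          # colon but no object
    "pkg.mod:callable",  # module and object
    "pkg.mod:obj.attr",  # module and dotted object path
    "1pkg:func",         # word starting with a digit
]

# Map each name to (accepted by lenient, accepted by strict).
accepted = {
    name: (lenient.match(name) is not None, strict.match(name) is not None)
    for name in candidates
}
```

The only difference between the patterns is that the strict one drops the two ``?`` quantifiers, so both the colon and the object part become mandatory.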
