Python packaging guide: differences between revisions

From ALT Linux Wiki
Revision as of 17:51, 22 June 2022

Python packaging guide

This guide covers the packaging of Python projects, that is, source trees that contain a legacy setup.py, a modern pyproject.toml, or both.

Packaging a Python distribution implies two mandatory steps:

  • build - produce a build artifact, which can be either bare source files or a binary distribution, also known as a wheel
  • install - install the produced build artifact

There is an optional step as well:

  • check

Build

Python projects come in two types:

  • pure Python distributions, which don't require an actual build process and can simply be copied to the destination as is
  • platform-dependent distributions, which require a special build process and can't simply be copied to the destination as is

Sources

There are two supported options for sources:

  • source tree (snapshot of VCS)
  • source distribution also known as sdist

VCS and gear

A source tree produced by gear is not a valid VCS checkout because the SCM's internal data (e.g. the .git directory and its content) is missing.
Some packaging plugins may rely on the SCM. For example,

setuptools_scm implements a file_finders entry point which returns all files tracked by your SCM. This eliminates the need for a manually constructed MANIFEST.in in most cases where this would be required when not using setuptools_scm, namely:

  • To ensure all relevant files are packaged when running the sdist command.
  • When using include_package_data to include package data as part of the build or bdist_wheel.

Thus, an actual SCM source tree is required in this case, otherwise some expected data (everything besides Python packages or modules) may not be packaged.
For example, a basic git configuration may be done in the RPM specfile with:

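# create a throwaway repository so that SCM-based plugins can list tracked files
git init
git config user.email author@example.com
git config user.name author
git add .
git commit -m 'release'
# tag the commit with the package version
git tag '%version'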


pypi.org

Though it's possible to use wheels from pypi.org as a source for installation (skipping the build step), in general it's a bad idea to just unpack such a wheel, for reasons such as:

  • wheels may be platform-specific
  • wheels are prebuilt distributions whose builders may not follow the distro's build rules and policies, can link to whatever they want, and so on
  • wheels are usually stripped (there are no tests, docs, etc.)

Thus, only a source distribution from pypi.org can be used for packaging, though it's recommended to use the source the sdist was made from, e.g. code fetched from a code hosting service.
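
Whether a wheel is platform-specific can be read from its file name. The following is a minimal sketch, assuming the PEP427 file name layout ({distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl). The helper names parse_wheel_name and is_pure are hypothetical and not part of any packaging tool.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its PEP427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:  # the optional build tag is absent
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel name: {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


def is_pure(filename: str) -> bool:
    """True if the tags declare no ABI ('none') and no platform ('any')."""
    name = parse_wheel_name(filename)
    return name.abi_tag == "none" and name.platform_tag == "any"
```

For instance, is_pure("my_distribution-1.0-py3-none-any.whl") is true, while a name ending in cp310-cp310-linux_x86_64.whl is reported as platform-specific. The tags only state what the wheel's builder declared, so they say nothing about how the content was built.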

Build roles

With PEP517 accepted, the build process is more standardized and is split into two responsibilities:

  • build frontend takes an arbitrary source tree and calls the build backend on it. A user can use one frontend to build any PEP517-aware project.
  • build backend actually builds the sdist or wheel. The backend is defined per project.
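
The split between the two roles can be sketched in a few lines of Python. This is only an illustration of the PEP517 interface, not the way pyproject-installer is implemented: the function names load_backend and build_wheel_with are hypothetical, and a real frontend also handles config settings, backend-path and usually runs the backend in a separate process.

```python
import importlib


def load_backend(build_backend: str):
    """Resolve a PEP517 'build-backend' string such as 'package.module:object'."""
    module_path, _, object_path = build_backend.partition(":")
    backend = importlib.import_module(module_path)
    # The optional part after ':' is a dotted path to an object inside the module.
    for attribute in filter(None, object_path.split(".")):
        backend = getattr(backend, attribute)
    return backend


def build_wheel_with(build_backend: str, wheel_directory: str) -> str:
    """Ask the backend to build a wheel and return the wheel's file name."""
    backend = load_backend(build_backend)
    # build_wheel is a mandatory PEP517 hook. Hooks are expected to run with
    # the root of the source tree as the current working directory.
    return backend.build_wheel(wheel_directory)
```

With the configuration used by most projects, the frontend side would amount to build_wheel_with("setuptools.build_meta", "dist"), run from the root of the source tree with setuptools and wheel installed.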

Build dependencies

With PEP518 accepted, the format of static build dependencies is standardized and can be configured in TOML via the pyproject.toml file.

For the vast majority of projects the PEP517/518 configuration will be:

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

  • the distro packager is responsible for specifying these requirements in the RPM specfile. These dependencies may not be enough, though, since a backend may have dynamic ones. See get-requires-for-build-wheel for details
  • the distro packager may add any other required build dependency but must follow the "necessary and sufficient" rule
  • the presence of %pyproject_build or %pyproject_install in the RPM specfile automatically adds a build dependency on pyproject-installer. But if that distribution is used for something other than the aforementioned macros, the corresponding requirement should be set explicitly
  • the presence of %tox_check or %tox_check_pyproject in the RPM specfile automatically adds a build dependency on tox and its required plugins. But if that distribution is used for something other than the aforementioned macros, the corresponding requirement should be set explicitly
  • dependencies required for check or for building docs must be RPM-conditional, i.e. it must be possible to opt out of them by RPM means
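
To see which dynamic dependencies a backend adds on top of the static ones, the optional PEP517 hook get_requires_for_build_wheel can be queried. The sketch below assumes the backend object is already imported. collect_build_requires is a hypothetical helper, not part of rpm-build-python3 or pyproject-installer.

```python
def collect_build_requires(static_requires, backend, config_settings=None):
    """Combine static (pyproject.toml) and dynamic (backend hook) requirements."""
    requires = list(static_requires)
    # get_requires_for_build_wheel is an optional hook; a backend that does
    # not define it is treated as having no dynamic requirements.
    hook = getattr(backend, "get_requires_for_build_wheel", None)
    dynamic = hook(config_settings) if hook is not None else []
    for requirement in dynamic:
        if requirement not in requires:
            requires.append(requirement)
    return requires
```

The hook itself can only be called once the static requirements are installed, so in a specfile the result is mostly useful as a checklist of what has to be listed as build dependencies.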

Build with RPM

rpm-build-python3 provides the RPM macro %pyproject_build as a CLI interface to the build frontend (see pyproject-installer for details).

  • the build should be done with the help of this macro unless it is absolutely necessary to do otherwise

Implementation details

  • the built wheel will be placed into the {source directory}/dist/ directory


Install

A wheel is a regular zip archive with a special name and the ".whl" suffix. Thus, an installer mostly just verifies the content of a wheel, unpacks it and lays out the unpacked files if necessary.
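
The verification step can be illustrated with the standard library alone. This sketch assumes the RECORD format of the wheel specification (CSV rows of path, sha256=<urlsafe base64 digest without padding>, size). The names record_digest and find_mismatches are hypothetical, and a real installer performs more checks than this.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_digest(data: bytes) -> str:
    """Return a digest in the form used by RECORD entries."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def find_mismatches(wheel: zipfile.ZipFile) -> list:
    """Return the archive members whose content does not match RECORD."""
    record_name = next(
        name for name in wheel.namelist() if name.endswith(".dist-info/RECORD")
    )
    rows = csv.reader(io.StringIO(wheel.read(record_name).decode("utf-8")))
    mismatched = []
    for row in rows:
        if not row:
            continue
        path, expected = row[0], row[1]
        if not expected:  # RECORD lists itself without a hash
            continue
        if record_digest(wheel.read(path)) != expected:
            mismatched.append(path)
    return mismatched
```

Opening a wheel with zipfile.ZipFile and passing it to find_mismatches yields an empty list when every listed file matches its recorded hash.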

Install with RPM

rpm-build-python3 provides the RPM macro %pyproject_install as a CLI interface to the installer (see pyproject-installer for details).

  • installation should be done with the help of this macro unless it is absolutely necessary to do otherwise

Implementation details

  • only the latest wheel built with %pyproject_build will be installed by %pyproject_install

Bytecompilation

PEP427 says that wheel installers should compile any installed .py to .pyc (see compiled-python-files and pyc directories for details). But rpm-build-python3 already does this by default with the help of the python3.compileall.py script, which compiles Python modules with the given optimization level (disabled; the -O switch removes assert statements; the -OO switch removes both assert statements and __doc__ strings).
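
The effect of the three optimization levels can be observed with the built-in compile() function, whose optimize argument corresponds to no switch (0), -O (1) and -OO (2). This is a small illustration only and does not reproduce what python3.compileall.py does. load_check and pyc_name are hypothetical helper names.

```python
import importlib.util

SOURCE = '''
def check(value):
    """Return value if it is positive."""
    assert value > 0, "value must be positive"
    return value
'''


def load_check(optimize: int):
    """Compile SOURCE at the given optimization level and return check()."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["check"]


def pyc_name(source_path: str, optimize: int) -> str:
    """Return the __pycache__ file name used for the given level."""
    # Level 0 has no suffix; levels 1 and 2 get '.opt-1' and '.opt-2'.
    return importlib.util.cache_from_source(
        source_path, optimization=optimize or ""
    )
```

At level 0, load_check(0)(-1) raises AssertionError. At level 1 the assert is gone and the call returns -1. At level 2 the function's __doc__ is additionally None.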


Check

Testing is vital in the Python ecosystem. Never skip or disable tests unless it's absolutely necessary.

For security reasons the hasher in ALT's build farm is a network-isolated environment, so testing in such an environment can be very tricky or not possible at all. However, most unit tests require nothing special to run, so check should be enabled by default. In contrast, integration tests usually require Internet access and therefore should be skipped. Nowadays the vast majority of projects use pytest or the stdlib's unittest as the test runner and tox to automate testing.

Check with RPM

rpm-build-python3 provides several RPM macros as a CLI interface to tox:

  • %tox_check:
    • sets the tox and pip env variables required for running in a network-isolated env
    • passes the NO_INTERNET env variable down to the tox env. This variable can be used by the test runner to make tests requiring Internet conditional
    • runs tox with system site distributions
    • relaxes all of the tox env's dependencies
    • generates console scripts for every system site distribution defining them
  • %tox_check_pyproject:
    • same as %tox_check, but tox will use a wheel built by %pyproject_build
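
On the project side, a test suite can honour NO_INTERNET with a skip condition. The sketch below uses the stdlib's unittest and assumes that any non-empty value of the variable means "no network". The exact value set by %tox_check is not stated here, and the names network_disabled, requires_internet and DownloadTests are made up for the example.

```python
import os
import unittest


def network_disabled(environ=os.environ) -> bool:
    """Assumption: any non-empty NO_INTERNET value disables network tests."""
    return bool(environ.get("NO_INTERNET"))


# Reusable decorator for tests that need Internet access.
requires_internet = unittest.skipIf(
    network_disabled(), "NO_INTERNET is set, skipping network test"
)


class DownloadTests(unittest.TestCase):
    @requires_internet
    def test_fetch_index(self):
        # A real test would contact a remote service here.
        self.assertTrue(True)

    def test_parse_local_file(self):
        # Tests without network access always run.
        self.assertEqual("1.0".split("."), ["1", "0"])
```

With pytest the same condition can be expressed with pytest.mark.skipif.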


Packaging

Subpackages

  • if docs or tests are required by something or somebody else, they must be split out into their own RPM subpackages

Metadata

If a project is meant to be reusable it must publish the corresponding meta information. Generating such data is the job of the build backend. But there are projects intended only for internal use that don't want to produce any meta information (usually such projects are not published on an index like pypi.org).

  • installed metadata (the content of the dist-info directory) shouldn't be stripped any more than it already is. Nowadays, only METADATA and entry_points.txt are actually used by metadata parsers (see importlib.metadata for details). This data is crucial for many aspects of the Python ecosystem.
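
How a metadata parser consumes these two files can be shown with importlib.metadata itself. The sketch reads a dist-info directory directly via Distribution.at; the function name summarize is hypothetical.

```python
import importlib.metadata


def summarize(dist_info_dir):
    """Read the name, version and entry points from a dist-info directory."""
    dist = importlib.metadata.Distribution.at(dist_info_dir)
    return {
        # Name and Version come from the METADATA file.
        "name": dist.metadata["Name"],
        "version": dist.version,
        # Entry points come from entry_points.txt.
        "entry_points": [
            (ep.group, ep.name, ep.value) for ep in dist.entry_points
        ],
    }
```

For an installed distribution the same information is available by name, e.g. importlib.metadata.version("my_distribution"), which is why stripping these files breaks tools that look distributions up at run time.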

Migration to PEP517 in RPM specfile

More and more Python projects are migrating away from legacy setup.py packaging and removing this file from their tree completely. This makes it impossible to package such projects with the old Python RPM macros, which unconditionally rely on an existing setup.py.

New RPM macros

In most cases the legacy RPM macros for building or installing a Python project can (and must) be replaced with the modern ones, e.g.:

legacy macros            modern macros
python3_build            pyproject_build
python3_build_debug      pyproject_build
python3_install          pyproject_install
python3_build_install    pyproject_build and pyproject_install
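
For a large number of specfiles the replacement can be scripted. The following is a rough sketch using only the mapping from the table above. migrate_spec is a hypothetical helper, not an existing ALT tool. It deliberately rewrites only lines that consist of a bare legacy macro and leaves lines with arguments for manual review.

```python
import re

# Mapping from the table: each legacy macro becomes one or two modern lines.
MODERN_MACROS = {
    "python3_build_install": ["%pyproject_build", "%pyproject_install"],
    "python3_build_debug": ["%pyproject_build"],
    "python3_build": ["%pyproject_build"],
    "python3_install": ["%pyproject_install"],
}

# Matches a line that contains nothing but one legacy macro.
LEGACY_LINE = re.compile(r"^(\s*)%(" + "|".join(MODERN_MACROS) + r")\s*$")


def migrate_spec(text: str) -> str:
    """Replace bare legacy macros with their modern counterparts."""
    migrated = []
    for line in text.splitlines():
        match = LEGACY_LINE.match(line)
        if match is None:
            migrated.append(line)  # unrelated line or macro with arguments
            continue
        indent, legacy = match.groups()
        migrated.extend(indent + modern for modern in MODERN_MACROS[legacy])
    trailing_newline = "\n" if text.endswith("\n") else ""
    return "\n".join(migrated) + trailing_newline
```

The result still needs a review by the packager: build dependencies change with the new macros, and a %python3_build_install line expands to both a build and an install step in the same place.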

Legacy setup.py-formatted projects

Even if a project still uses the legacy packaging format, it's possible to build it with the PEP517/518-aware RPM macros.

The following default configuration

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta:__legacy__"

will be used if:

  • pyproject.toml file is missing
  • 'build-system' table is missing


Additionally, the legacy setuptools.build_meta:__legacy__ backend will be used if:

  • 'build-backend' key of 'build-system' table is missing
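
The fallback rules above can be written down as a small function. This is a sketch of the described behaviour, not the actual code of pyproject-installer. effective_build_system is a hypothetical name, and its argument is the already parsed content of pyproject.toml (or None when the file is missing).

```python
LEGACY_BACKEND = "setuptools.build_meta:__legacy__"


def default_build_system():
    """The configuration assumed when nothing is declared."""
    return {"requires": ["setuptools", "wheel"], "build-backend": LEGACY_BACKEND}


def effective_build_system(pyproject):
    """Return the build-system table to use for a source tree."""
    # Missing pyproject.toml or missing 'build-system' table:
    # fall back to the complete default configuration.
    if pyproject is None or "build-system" not in pyproject:
        return default_build_system()
    build_system = dict(pyproject["build-system"])
    # Table present but no 'build-backend' key: only the backend falls back.
    build_system.setdefault("build-backend", LEGACY_BACKEND)
    return build_system
```

The parsed content can be obtained with tomllib.load from the standard library (Python 3.11 and later), opening pyproject.toml in binary mode.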


Installed metadata

The new metadata format (dist-info) will be generated and packaged on migrating to the PEP517-aware RPM macros.
For example, egg-info (the previous format) vs dist-info:

-  my_distribution-1.0-py3.10.egg-info
-  my_distribution-1.0-py3.10.egg-info/dependency_links.txt
-  my_distribution-1.0-py3.10.egg-info/entry_points.txt
-  my_distribution-1.0-py3.10.egg-info/not-zip-safe
-  my_distribution-1.0-py3.10.egg-info/PKG-INFO
-  my_distribution-1.0-py3.10.egg-info/requires.txt
-  my_distribution-1.0-py3.10.egg-info/SOURCES.txt
-  my_distribution-1.0-py3.10.egg-info/top_level.txt
+  my_distribution-1.0.dist-info
+  my_distribution-1.0.dist-info/entry_points.txt
+  my_distribution-1.0.dist-info/METADATA

setuptools

setuptools is a mandatory dependency of the legacy RPM macros, but it's only an optional build backend for the new ones. Thus, setuptools is no longer pulled into the build environment merely by using the modern macros and should be specified explicitly if it's still needed.

Related