Technology

This is my dumping ground for everything technology-related that I find interesting enough to share with the rest of the world.  While my posts here tend to be infrequent, I try to keep them relevant or at least interesting.

My most recent areas of focus have been cloud computing, Python, and the Fedora operating system.

An even smarter way of importing a new package in Fedora

posted Sep 21, 2011 10:26 PM by Garrett Holmstrom   [ updated Sep 21, 2011 11:19 PM ]

The recommended way to import a new package in Fedora is to take the source RPM that was approved in its package review and run fedpkg import foo.src.rpm on it to import the entire thing at once. This loses the history of the package before its approval. But if you keep the package's history in a git repository until it is time to import it into Fedora, then you can simply add that history to Fedora's repository.

Adding pre-existing package history to Fedora

Mathieu Bridon wrote a blog post that walks one through a procedure for making this happen. First we have to add the history from Fedora's git repository to the repository you already have. Since we want to replace our existing master branch with Fedora's, we rename it, add the Fedora repository's data to our repository, and check out Fedora's master branch in its place:
  1. git branch -m master local-master
  2. git remote add origin git+ssh://username@pkgs.fedoraproject.org/pkgname
  3. git fetch origin
  4. git checkout -b master origin/master
Mathieu then walks us through a convoluted sequence of branches, merges, and rebases that combine the two histories. But all we really want to do is replay our local-master branch's commits onto Fedora's master branch, so we can tell git to do this by handing a list of these commits to git cherry-pick, which does just that:
  1. git cherry-pick $(git rev-list --reverse local-master)
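That's it! Now you can upload your sources with fedpkg new-sources, build your package, and continue working with it as with any other package's repository. You can also delete your old local-master branch with git branch -D local-master. The capital -D is needed because the cherry-picked commits have new hashes, so git does not consider local-master merged and a plain -d would refuse.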
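To try the procedure without touching a real Fedora repository, the following is a minimal sketch that replays it on two throwaway repositories. The directory names, file contents, commit messages, and the demo identity are all made up for illustration, and a local path stands in for the git+ssh URL.

```shell
#!/bin/sh
# Throwaway demonstration; nothing here talks to pkgs.fedoraproject.org.
set -eu

# Hypothetical identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the freshly created Fedora repository (contents are assumed).
git init -q "$work/fedora"
cd "$work/fedora"
git symbolic-ref HEAD refs/heads/master
: > sources
git add sources
git commit -q -m "Initial setup of the repo"

# Stand-in for the packager's own pre-review history: two commits.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
echo "Release: 1" > pkgname.spec
git add pkgname.spec
git commit -q -m "Initial package"
echo "Release: 2" > pkgname.spec
git commit -q -a -m "Address review feedback"

# The procedure from the post, with a local path in place of the ssh URL.
git branch -m master local-master
git remote add origin "$work/fedora"
git fetch -q origin
git checkout -q -b master origin/master
git cherry-pick $(git rev-list --reverse local-master) >/dev/null

# master now holds Fedora's commit followed by the replayed local commits.
git log --oneline --reverse master
```

The cherry-pick applies cleanly here because the two histories touch different files. If your own history and Fedora's initial commit both add the same file, expect to resolve a conflict along the way.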

Saving work on other branches

What if you have more branches? Just merge what you did in the master branch!
  1. fedpkg switch-branch f16
  2. git merge master
  3. fedpkg switch-branch f15
  4. git merge master
  5. ...
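If there are many release branches, the same two commands can be wrapped in a loop. This is only a sketch: merge_master_into and the DRY_RUN switch are names invented here, not part of fedpkg or git.

```shell
#!/bin/sh
# merge_master_into BRANCH...: merge master into each named release branch.
# Set DRY_RUN=1 to print the commands instead of running them.
# (Function name and DRY_RUN are hypothetical helpers for this sketch.)
merge_master_into() {
    for branch in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "fedpkg switch-branch $branch"
            echo "git merge master"
        else
            # Stop at the first branch that fails to switch or merge.
            fedpkg switch-branch "$branch" && git merge master || return 1
        fi
    done
}
```

For example, DRY_RUN=1 merge_master_into f16 f15 prints the four commands from the list above without running anything.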

Comparing versions in RPM conditionals

posted Apr 29, 2011 9:21 PM by Garrett Holmstrom

It's easy to check a distribution's version in a spec file since it is usually an integer:

%if 0%{?fedora} > 13

But this scheme doesn't usually work for comparing program versions because they typically contain punctuation, which blows rpmbuild's little mind.  If you have rpm 4.7 or later, however, you can use a bit of inline Lua to do it:

%if %{lua:print(rpm.vercmp('%{version}', '2.0.2'))} > 0
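One way to sanity-check a comparison before putting it in a spec file is to expand the same Lua snippet with rpm --eval. The wrapper below is a small sketch: rpm_vercmp is a name made up here, and it assumes an rpm binary built with Lua support. Note that the Lua code has to print() the result for it to appear in the macro expansion.

```shell
#!/bin/sh
# rpm_vercmp A B: print 1, 0, or -1 depending on whether version A is
# newer than, equal to, or older than version B, using rpm's embedded Lua.
# (Hypothetical helper; the arguments must not contain single quotes.)
rpm_vercmp() {
    rpm --eval "%{lua:print(rpm.vercmp('$1', '$2'))}"
}
```

Calling rpm_vercmp 2.1 2.0.2 then shows what the spec file conditional would compare against 0.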

Multi-stack Packaging Woes

posted Feb 5, 2011 5:42 PM by Garrett Holmstrom

I got a chance to sit down with Dave Malcolm at FUDCon to talk about the state of Python packaging in Fedora.  From the start, Fedora's packages have been designed with only one Python stack in mind.  By now the process for packaging Python modules is streamlined and easy to follow, but shortly after Fedora began shipping more than one Python runtime in parallel it became clear that the solutions we have used in the past are now inadequate.

The problem

Fedora's packages, for the most part, assume that the distribution has just one Python stack.  They rely on RPM macros like __python, python_sitelib, and python_sitearch to define where files built against the current version of Python should go.  When Fedora began shipping Python 2 and Python 3 stacks in parallel, the Python packaging guidelines had to be rewritten, and the rewrite duplicates a number of these macros so that packages can build against Python 3.  So now we have:

  • __python
  • __python3
  • python_sitelib
  • python_sitearch
  • python3_sitelib
  • python3_sitearch

Packagers need to care about this because packaging a Python module that supports both Python 2 and 3 is now a lot harder.  If upstream supports them both with the same tarball then the package-building process has to build the entire module twice - once for each Python stack - and put the resulting files in two different locations.  If upstream supports them both with separate tarballs then the Python 3 version needs to go into a separate package altogether, requiring a new package review for each module as well as a non-trivial amount of work to coordinate bugfixes and updates between both packages.  Both of these methods result in a large amount of copied-and-pasted code in RPMs' spec files.

For a single package this is not a significant problem.  Across the distribution, however, this approach will cause significant problems over time.  As the world increasingly supports Python 3, the number of new package reviews for Python 3-compatible versions of existing modules will increase significantly.  When Python 3 finally becomes the default in Fedora, every package with a Python 2 module will need to be edited to build against what will then be an alternate Python stack, and many of them will need to be renamed (e.g. from python-libfoo to python2-libfoo), and thus re-reviewed, at the same time.

Whether existing Python 3 modules will also need to be renamed remains to be seen.

Packages for other Python runtimes exist as well.  EPEL 5 contains a Python 2.6 package that stands alongside the stock 2.4 package.  Since modules must be explicitly built against this alternative stack to be of any use to 2.6 users, yet another duplicate set of Python module packages has appeared.  What will happen when someone wants to package Python 3 for EPEL 5 or 6?  Still more, completely separate, Python runtimes exist in Fedora, such as PyPy and Jython.  The packager of a single pure-Python module has to build four packages for the module to work on every Python version that Fedora 14 ships.  At worst this means having to maintain and coordinate four independent packages.  At best this means one can use the same package for each, but the amount of duplicated code and the number of RPM macros necessary to do it are even worse.  The list of macros would now be something like this:

  • __python
  • __python3
  • __pypy
  • __jython
  • python_sitelib
  • python_sitearch
  • python3_sitelib
  • python3_sitearch
  • pypy_sitelib
  • jython_sitelib

This method of packaging Python modules in a distribution that ships multiple Python stacks is unsustainable.  Not only does it create a significant amount of additional work and red tape for packagers, but it also adds even more work for Fedora's already-over-burdened package reviewers.

How can we do this better?

Dave came up with a proof-of-concept tool last year that attempts to mitigate the pain of having a single source package build for multiple Python runtimes.  It lets the distribution define which Python runtimes it includes and then fills in what would have otherwise been repeated spec file code when RPMs are built.  While it is arguably over-engineered, I feel that it is the right way to approach this sort of problem.  By giving the distribution a way to define what Python runtimes packages ought to build against, it avoids the need to hard-code a distribution-specific list of them in every package's spec file.  With direct support from RPM or RPM macros, packagers could apply this sort of solution to any number of program stacks where multiple versions are likely to appear, such as PHP or Drupal.

Dave's experiment received precious little feedback on Fedora's Python mailing list.  I encourage Fedora's packagers to take a look (or another look) at it and see what they can learn from it.  Perhaps we can use it as the basis of a way to make multi-stack packaging easier before the Python flood hits.

Disclaimer:  this is my rant, but Dave's solution.  Please direct any and all flames toward me (gholms), not him.  ;-)
