pyFF Documentation

Author:

Leif Johansson <leifj@sunet.se>

Release:

2.1.1

pyFF is a simple but reasonably complete SAML metadata processor. It is intended to be used by anyone who needs to aggregate, validate, combine, transform, sign or publish SAML metadata.

pyFF is used to run infrastructure for several identity federations of significant size, including edugain.org.

pyFF supports producing and validating digital signatures on SAML metadata using the pyXMLSecurity package which in turn supports PKCS#11 and other mechanisms for talking to HSMs and other cryptographic hardware.

pyFF is also a complete implementation of the SAML metadata query protocol as described in draft-young-md-query and draft-young-md-query-saml, and implements extensions to MDQ for searching, which means pyFF can be used as the backend for a discovery service for large-scale identity federations.

Possible usecases include running a federation aggregator, filtering metadata for use by a discovery service, generating reports from metadata (eg certificate expiration reports), and transforming metadata to add custom elements.

Installation

Before you install

Make sure you have a reasonably modern python. pyFF is developed using 3.6 but 3.7 will probably become the norm soon. It is recommended that you install pyFF into a virtualenv.

Start by installing some basic OS packages. For a debian/ubuntu install:

# apt-get install build-essential python-dev libxml2-dev libxslt1-dev libyaml-dev

and if you’re on a centos system (or other yum-based systems):

# yum install python-devel  libxml2-devel libxslt-devel libyaml-devel
# pip install pyyaml
# yum install make gcc kernel-devel kernel-headers glibc-headers

If you want to use OS packages instead of python packages from pypi, consider also installing the corresponding OS-level python packages (eg your distribution's lxml and yaml bindings) before you begin.

With Sitepackages

This method re-uses existing OS-level python packages. This means you’ll have fewer worries keeping your python environment in sync with OS-level libraries.

# apt-get install python-virtualenv
# virtualenv --system-site-packages python-pyff

Choose this method if you want the OS to keep as many of your packages up to date for you.

Without Sitepackages

This method keeps everything inside your virtualenv. Use this method if you are developing pyFF or want to run multiple python-based applications in parallel without having to worry about conflicts between packages.

# cd $HOME
# apt-get install python-virtualenv
# virtualenv -p python3 python-pyff --no-site-packages

Choose this method for maximum control - ideal for development setups.

Verifying

To verify that python 3.6 (or later) is the default python in the pyFF environment, run:

# python --version

The result should be Python 3.6 or later.

To make sure you have the latest version of pip, run:

# pip install --upgrade pip

Installing

Now that you have a virtualenv, it's time to install pyFF into it. Start by activating your virtualenv:

# source python-pyff/bin/activate

Next install pyFF (the commands below assume a source checkout of pyFF in $HOME/pyFF):

# cd $HOME
# cd pyFF
# LANG=en_US.UTF-8 pip install -e .

This will install a bunch of dependencies and compile bindings for lxml and pyyaml as well as pyXMLSecurity. This may take some time to complete. If there are no errors and the pyff binary is in your $PATH, you should be done.
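
A quick sanity check (pyff --help is assumed to print a usage summary; check with your installed version):

# which pyff
# pyff --help

Finally, create a directory to keep your pyFF configuration in: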

# cd $HOME
# mkdir pyff-config
# cd pyff-config

Upgrading

Unless you’ve made modifications, upgrading should be as simple as running

# source python-pyff/bin/activate
# pip install -U pyff

This should bring your virtualenv up to the latest version of pyff and its dependencies. You probably need to restart pyffd manually though.

Next Steps

Now that you hopefully have a working installation of pyFF you are ready to start exploring all the ways pyFF can help you manage metadata. It may be a good idea to read the Quick Start Instructions now. In general, pyFF should be run in the same directory that contains a pipeline in yaml format; depending on the nature of the pipeline, additional files may be needed, for example…

  • A list of metadata URLs.

  • A set of files containing metadata URLs - eg XRD or MDSL files.

  • A signing key and certificate pair, which can be generated with genkey.sh in the scripts directory (or directly with openssl, as in the sketch right after this list).
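
If you prefer not to use genkey.sh, a self-signed signing key and certificate can also be generated directly with openssl; the file names below match the sign.key/sign.crt used in the examples later in this document:

# openssl req -x509 -sha256 -nodes -newkey rsa:4096 -days 3650 -subj "/CN=Metadata Signer" -keyout sign.key -out sign.crt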

Quick Start Instructions

There are a lot of options and knobs in pyFF - in many ways pyFF is a toolchain that can be configured to perform a lot of tasks. To start exploring pyFF it is best to begin with a simple example. Assuming you have read the installation instructions and have created and activated a virtualenv with pyFF installed, do the following:

First create an empty directory and cd into it. In the directory create a file called edugain.fd with the following contents:

- load:
   - http://mds.edugain.org
- select:
- stats:

Now run pyFF like this:

# pyff edugain.fd

After a few seconds (depending on the speed of your Internet connection) you should see output like this:

---
total size:     5568
selected:       5567
          idps: 3079
           sps: 2487
---

Congratulations - you have successfully fetched, parsed, selected and printed stats for the edugain metadata feed. This is probably not a very useful example in itself, but it illustrates a few points about how pyFF works:

  • pyFF configuration is (mostly) in the form of yaml files

  • The yaml file represents a list of instructions which are processed in order

  • The load statement retrieves (and parses) SAML metadata from edugain.org

  • The select statement is used to form an active document on which subsequent instructions operate

  • Finally, the stats statement prints out some information about the current active document.

Next we’ll learn how to do more than print statistics.

Running pyFF

There are two ways to use pyFF:

  • a “batch” command-line tool called pyff

  • a wsgi application you can use with your favorite wsgi server, eg gunicorn

In either case you need to provide some configuration and a pipeline - instructions that tell pyFF what to do - in order for anything interesting to happen. In the Quick Start Instructions guide you saw how pyFF pipelines are constructed by creating yaml files. The full set of built-in pipes is documented in pyff.builtins. When you run pyFF in batch mode you typically want a fairly simple pipeline that loads and transforms metadata and saves the result in some output format.

Batch mode: pyff

The typical way to run pyFF in batch mode is something like this:

# pyff [--loglevel=<DEBUG|INFO|WARN|ERROR>] pipeline.yaml

For various historic reasons the yaml files in the examples directory all have the ‘.fd’ extension but pyFF doesn’t care how you name your pipeline files as long as they contain valid yaml.

This is in many ways the easiest way to run pyFF but it is also somewhat limited - eg it is not possible to produce an MDQ server using this method.

WSGI application: pyffd

Development of pyFF uses gunicorn for testing but other wsgi servers (eg apache mod-wsgi etc) should work equally well. Since all configuration of pyFF can be done using environment variables (cf pyff.constants:Config) it is pretty easy to integrate in most environments.
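
For example, settings can be exported as environment variables before launching gunicorn. PYFF_PIPELINE is used in the gunicorn example below; the other variable names here are assumptions derived from the corresponding settings in pyff.constants:Config and should be verified against that module:

# export PYFF_PIPELINE=pipeline.yaml
# export PYFF_LOGLEVEL=INFO
# export PYFF_UPDATE_FREQUENCY=600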

Running pyFFd using gunicorn goes something like this (incidentally this is also how the standard docker-image launches pyFFd):

# gunicorn --workers=1 --preload --bind 0.0.0.0:8080 -e PYFF_PIPELINE=pipeline.yaml --threads 4 --worker-tmp-dir=/dev/shm  pyff.wsgi:app

The wsgi app is a lot more sophisticated than batch-mode and in particular interaction with workers/threads in gunicorn can be a bit unpredictable depending on which implementation of the various interfaces (metadata stores, schedulers, caches etc) you choose. It is usually easiest to use a single worker and multiple threads - at least until you know what you’re doing.

The example above would launch the pyFF wsgi app on port 8080. However using pyFF in this way requires that you structure your pipeline a bit differently. In the name of flexibility, most of the request processing (with the exception of a few APIs such as webfinger and search which are always available) of the pyFF wsgi app is actually delegated to the pipeline. Let's look at a basic example:

- when update:
  - load:
      - http://mds.edugain.org
- when request:
  - select:
  - pipe:
      - when accept application/samlmetadata+xml application/xml:
          - first
          - finalize:
              cacheDuration: PT12H
              validUntil: P10D
          - sign:
              key: sign.key
              cert: sign.crt
          - emit application/samlmetadata+xml
          - break
      - when accept application/json:
          - discojson
          - emit application/json
          - break

Let's pick this pipeline apart. First notice the two when instructions. The pyff.builtins:when pipe is used to conditionally execute a set of instructions. There is essentially only one type of condition. When processing a pipeline pyFF keeps a state variable (a dict-like object) which changes as the instructions are processed. When the pipeline is launched the state is initialized with a set of key-value pairs used to control execution of the pipeline.

There are a few pre-defined states; in this case we're dealing with two: the execution mode (update or request - we'll get to that later) and the accept state used to implement content negotiation in the pyFF wsgi app. There are two ways to express a condition for when: with one parameter, in which case the condition evaluates to True iff the parameter is present as a key in the state object, or with two parameters, in which case the condition evaluates to True iff the parameter is present and has the prescribed value.
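
For example (a minimal sketch reusing pipes from the example above):

- when update:
    - load:
        - http://mds.edugain.org
- when accept application/json:
    - discojson
    - emit application/json

The first when uses the one-parameter form (the branch runs iff the key update is present in state); the second uses the two-parameter form (the branch runs iff accept is present and equals application/json).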

Looking at our example the first when clause evaluates to True when update is present in state. This happens when pyFF is in an update loop. The other when clause gets triggered when request is present in state which happens when pyFF is processing an incoming HTTP request.

The ‘update’ state name is only slightly “magical” - you could call it “foo” if you like. The way to trigger any branch like this is to POST to the /api/call/{state} endpoint (eg using cURL) like so:

# curl -XPOST -s http://localhost:8080/api/call/update

This will trigger the update state (or foo if you like). You can have any number of entry-points like this in your pipeline and trigger them from external processes using the API. The result of the pipeline is returned to the caller (which means it is probably a good idea to use the -t option to gunicorn to increase the worker timeout a bit).

The request state is triggered when pyFF gets an incoming request on any of the URI contexts other than /api and /.well-known/webfinger, eg the main MDQ context /entities. This is typically where you do most of the work in a pyFF MDQ server.

The example above uses the select pipe (pyff.builtins.select()) to set up an active document. When in request mode, pyFF provides parameters for the select call by parsing the query parameters and URI path of the request according to the MDQ specification. Therefore the call to select in the pipeline above, while it may appear to have no parameters, is actually “fed” from the request processing of pyFF.

The subsequent when clauses implement content negotiation to provide a JSON (discojson) and an XML version of the metadata depending on what the caller is asking for. This is key to using pyFF as a backend for the thiss.io discovery service. More than one content type may be specified to accommodate noncompliant MDQ clients.
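
Assuming a pyFF instance listening on localhost:8080 (as in the gunicorn example above), the content negotiation can be exercised with cURL; the entityID in the second request is just a placeholder:

# curl -s -H 'Accept: application/json' http://localhost:8080/entities/
# curl -s -H 'Accept: application/samlmetadata+xml' 'http://localhost:8080/entities/https%3A%2F%2Fidp.example.org%2Fshibboleth'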

The rest of the XML “branch” of the pipeline should be pretty easy to understand. First we use the pyff.builtins.first() pipe to ensure that we only return a single EntityDescriptor if our select matches a single object. Next we set the cacheDuration and validUntil parameters and sign the XML before returning it.

The rest of the JSON “branch” of the pipeline is even simpler: transform the XML in the active document to discojson format and return with the correct Content-Type.

The structure of a pipeline

Pipeline files are yaml documents representing a list of processing steps:

- step1
- step2
- step3

Each step represents a processing instruction. pyFF has a library of built-in instructions to choose from that include fetching local and remote metadata, xslt transforms, signing, validation and various forms of output and statistics.

Processing steps are called pipes. A pipe can have arguments and options:

- step [option]*:
    - argument1
    - argument2
    ...

- step [option]*:
    key1: value1
    key2: value2
    ...

Typically options are used to modify the behaviour of the pipe itself (think macros), while arguments provide runtime data to operate on. Documentation for each pipe is in the pyff.builtins Module. Also take a look at the Examples.
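
As a concrete illustration using pipes that appear elsewhere in this document: emit takes an option (the content type), load takes a list of arguments (the URLs to fetch) and sign takes key/value arguments:

- emit application/samlmetadata+xml
- load:
    - http://mds.edugain.org
- sign:
    key: sign.key
    cert: sign.crt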

Deploying pyFF

Running pyFF in docker

Building a docker image

There is a build environment for docker available at https://github.com/SUNET/docker-pyff. In order to build your own docker image, clone this repository and use make to build the latest version of pyFF:

# git clone https://github.com/SUNET/docker-pyff
...
# cd docker-pyff
# make

At the end of this you should be able to run pyff:<version> where <version> will depend on what is currently the latest supported version. Sometimes a version of the image is uploaded to dockerhub, but there is no guarantee that those images are current or even made by anyone affiliated with the pyFF project.

Running the docker image

The docker image is based on debian:stable and contains a full install of pyFF along with most of the optional components including PyKCS11. If you start pyFF with no arguments it launches a default pipeline that fetches edugain and exposes it as an MDQ server:

# docker run -ti -p 8080:8080 pyff:<version>

A pyFF MDQ service should now be exposed on port 8080. If you are running the old pyFF 1.x branch you may also have access to the default admin interface. If you are running pyFF 2.x you can now point an MDQ frontend (eg mdq-browser) to port 8080.

Running pyFF in production

There are several aspects to consider when deploying pyFF in production. Sometimes you want to emphasize simplicity, and then you can simply run a pyFF instance and combine it with a management application (eg mdq-browser) and a discovery service to quickly set up a federation hub. This model is suitable if you are setting up a collaboration hub or an SP proxy that needs to keep track of a local metadata set along with a matching discovery service.

Scenario 1: all-in-one

If you are using docker you might deploy something like this using docker-compose (or something similar implemented using k8s etc). Assuming your.public.domain is the public address of the service you wish to deploy, the following compose file would give you a discovery service on port 80 and an admin UI on port 8080.

Take care to check which versions of the software components are the latest and greatest (and/or appropriate for your situation) and modify accordingly.

version: "3"
services:
   mdq-browser:
      image: docker.sunet.se/mdq-browser:1.0.1
      container_name: mdq_browser
      ports:
         - "8080:80"
      environment:
         - MDQ_URL=http://pyff
         - PYFF_APIS=true
   thiss:
      image: docker.sunet.se/thiss-js:1.1.2
      container_name: thiss
      ports:
         - "80:80"
      environment:
         - MDQ_URL=http://pyff/entities/
         - BASE_URL=https://your.public.domain
         - STORAGE_DOMAIN=your.public.domain
         - SEARCH_URL=http://pyff/api/search
   pyff:
      image: docker.sunet.se/pyff:stable
      container_name: pyff-api
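
Assuming the file above is saved as docker-compose.yml, the stack can then be brought up in the usual way:

# docker-compose up -d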

Scenario 2: offline signing

Sometimes security is paramount and it may be prudent to firewall the signing keys for your identity federation but you still want to provide a scalable MDQ service. The MDQ specification doesn’t actually require online access to the signing key. It is possible to create an MDQ service that only consists of static files served from a simple webserver or even from a CDN.

The pyFF wsgi server implements the webfinger protocol as described in RFC 7033 and this endpoint can be used to list all objects in the MDQ server. A simple script provided in the scripts directory of the pyFF distribution uses webfinger and wget to make an isomorphic copy of the pyFF instance.
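
A rough sketch of what such a mirror does, using cURL (the exact webfinger query accepted by pyFF should be checked against the script and RFC 7033 - the resource parameter below is an assumption):

# curl -s 'http://localhost:8080/.well-known/webfinger?resource=http://localhost:8080'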

  • Run an instance of pyFF on a firewalled system with access to the signing keys - eg via an HSM.

  • Use the script to mirror the pyFF instance to a local directory and copy that directory over to the public webserver or CDN.

# docker run -d -p 8080:8080 pyff:1.1.0
# docker run -ti pyff:1.1.0 mirror-mdq.sh -A http://localhost:8080/ /some/dir

This will create an offline copy of http://localhost:8080/ in /some/dir. You can also use rsync+ssh syntax (eg user@host:/some/dir) to copy to a remote host. This way it is possible to have a lot of control over how metadata is generated and published while at the same time providing a scalable public interface to your metadata feed.

Currently the script traverses all objects in the pyFF instance every time it is called, so allow for enough time to sign every object when you set up your mirror cycle.
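
If you automate the mirroring, a cron entry along these lines is usually enough (the schedule, image tag and destination below are placeholders):

0 */6 * * * docker run --rm pyff:1.1.0 mirror-mdq.sh -A http://localhost:8080/ user@webhost:/var/www/mdq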

Examples

Here are some more example pipelines. Most of these are designed for batch-mode pyff but the concepts can be easily included in wsgi-style pipelines with multiple entry-points.

Example 1 - A simple pull

Fetch SWAMID metadata, select all EntityDescriptor elements and publish the result as a single file, /tmp/swamid-2.0.xml.

- load:
     - http://mds.swamid.se/md/swamid-2.0.xml
- select
- publish: "/tmp/swamid-2.0.xml"
- stats

This is a simple example in four steps: load, select, publish and stats. Each of these commands operates on a metadata repository that starts out empty. The first command (load) causes a URL to be downloaded and the SAML metadata found there is stored in the metadata repository. The next command (select) creates an active document (which in this case consists of all EntityDescriptors in the metadata repository). Next (publish) is called, which causes the active document to be stored in an XML file. Finally the stats command prints out some information about the metadata repository.

This is essentially a 1-1 operation: the metadata loaded is stored in a local file. Next we’ll look at a more complex example that involves filtering and transformation.

Example 2 - Grab the IdPs from edugain

Grab edugain metadata, select the IdPs (using an XPath expression), run it through the built-in ‘tidy’ XSL stylesheet (cf below) which cleans up some known problems, sign the result and write the lot to a file.

- load:
   - http://mds.edugain.org
   - edugain-signer.crt
- select:
   - "http://mds.edugain.org!//md:EntityDescriptor[md:IDPSSODescriptor]"
- xslt:
    stylesheet: tidy.xsl
- finalize:
    cacheDuration: PT5H
    validUntil: P10D
- sign:
    key: sign.key
    cert: sign.crt
- publish: /tmp/edugain-idp.xml
- stats

Here the select (which uses an xpath) picks the EntityDescriptors that contain at least one IDPSSODescriptor - in other words, all IdPs. The xslt command transforms the result of this select using an xslt transformation. The finalize command sets cacheDuration and validUntil (to 10 days from the current date and time) on the EntitiesDescriptor element which is the result of calling select. The sign command performs an XML-dsig on the EntitiesDescriptor.

For reference the ‘tidy’ xsl is included with pyFF and looks like this:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
                xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">

  <xsl:template match="@ID"/>
  <xsl:template match="@Id"/>
  <xsl:template match="@xml:id"/>
  <xsl:template match="@validUntil"/>
  <xsl:template match="@cacheDuration"/>
  <xsl:template match="@xml:base"/>
  <xsl:template match="ds:Signature"/>
  <xsl:template match="md:OrganizationName|md:OrganizationURL|md:OrganizationDisplayName">
    <xsl:if test="normalize-space(text(()) != ''">
            <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
    </xsl:if>
  </xsl:template>

  <xsl:template match="text()|comment()|@*">
    <xsl:copy/>
  </xsl:template>

  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

Example 3 - Use an XRD file

Sometimes it is useful to keep metadata URLs and signing certificates used for validation in a separate file and pyFF supports XRD-files for this purpose. Modify the previous example to look like this:

- load:
   - links.xrd
- select: "!//md:EntityDescriptor[md:IDPSSODescriptor]"
- xslt:
    stylesheet: tidy.xsl
- sign:
    key: sign.key
    cert: sign.crt
- publish: /tmp/idp.xml
- stats

Note that in this case the select doesn't include the http://mds.edugain.org prefix before the '!'-sign. This causes the xpath to operate on all source URLs, rather than just the single source http://mds.edugain.org. It would also have been possible to call select with multiple arguments, each using a different URL from the file (see the sketch after the XRD listing below). links.xrd contains the following:

<?xml version="1.0" encoding="UTF-8"?>
<XRDS xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
    <XRD>
        <Subject>http://mds.swamid.se/md/swamid-2.0.xml</Subject>
        <Link rel="urn:oasis:names:tc:SAML:2.0:metadata" href="http://mds.swamid.se/md/swamid-2.0.xml">
            <Title>SWAMID</Title>
            <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
                <ds:X509Data>
                    <ds:X509Certificate>
                    MIIFyzCCA7OgAwIBAgIJAI9LJsUJXDMVMA0GCSqGSIb3DQEBCwUAMHwxCzAJBgNV
                    BAYTAlNFMRIwEAYDVQQIDAlTdG9ja2hvbG0xEjAQBgNVBAcMCVN0b2NraG9sbTEO
                    MAwGA1UECgwFU1VORVQxDzANBgNVBAsMBlNXQU1JRDEkMCIGA1UEAwwbU1dBTUlE
                    IG1ldGFkYXRhIHNpZ25lciB2Mi4wMB4XDTE2MTIwNjA5MjgyMFoXDTM2MTIwNjA5
                    MjgyMFowfDELMAkGA1UEBhMCU0UxEjAQBgNVBAgMCVN0b2NraG9sbTESMBAGA1UE
                    BwwJU3RvY2tob2xtMQ4wDAYDVQQKDAVTVU5FVDEPMA0GA1UECwwGU1dBTUlEMSQw
                    IgYDVQQDDBtTV0FNSUQgbWV0YWRhdGEgc2lnbmVyIHYyLjAwggIiMA0GCSqGSIb3
                    DQEBAQUAA4ICDwAwggIKAoICAQDQVw72PnIo9QIeV439kQnPcxZh/LddKw86eIU+
                    nMfl4TpjSIyqTu4KJSnXbJyqXg+jQj3RzE9BUblpGrR7okmQwOh2nh+5A6SmyTOR
                    p7VEVT/Zw0GNnQi9gAW7J8Cy+Gnok4LeILI5u43hPylNKAnvs1+bo0ZlbHM6U5jm
                    6MlO+lrYA9dZzoPQqoCQbr3OweAaq5g8H54HuZacpYa3Q2GnUa4v+xywjntPdSQU
                    RTAbWWyJl3cHctX5+8UnX8nGCaxoBZqNp9PcEopyYJX8O1nrLumBMqu9Uh6GW1nx
                    OHfKDLvUoykG3Dm704ENVs88KaJXB1qQNsjdlm14UI9XCZbHfnFVnQ53ehsGFMha
                    Bf/Abd6v2wnhBLH/RxEUlw347qSeokw+SdDTSdW8jOEBiSqP/8BUzpCcbGlgAsVO
                    NKUS0K7IB2Bb79YYhyMvmJl24BGtkX+VM/mv47dxOtfzNFCMtUcJ2Dluv0xJG8xI
                    ot7umx/kbMBLuq7WdWELZJrgpt2bb9sXtYBpuxtGCW5g7+U7MNN1aKCiCSfq09YH
                    qu2DsU7HHAxEcGFXBiepBliCwZ24WLQh53bA3rihaln7SjdapT9VuSTpCvytb9RX
                    rq39mVuHMXvWYOG20XTV0+8U2vnsjAwsy28xPAcrLWRWoZbRJ+RoGp6L3GACq+t+
                    HPIukwIDAQABo1AwTjAdBgNVHQ4EFgQUQ2iqKQV/mMZDeJDtLXvy0Bsn/BQwHwYD
                    VR0jBBgwFoAUQ2iqKQV/mMZDeJDtLXvy0Bsn/BQwDAYDVR0TBAUwAwEB/zANBgkq
                    hkiG9w0BAQsFAAOCAgEAHviIAfS8viUN8Qk//U1p6Z1VK5718NeS7uqabug/SwhL
                    Vxtg/0x9FPJYf05HXj4moAf2W1ZLnhr0pnEPGDbdHAgDC672fpaAV7DO95d7xubc
                    rofR7Of2fehYSUZbXBWFiQ+xB5QfRsUFgB/qgHUolgn+4RXniiBYlWe6QJVncHx+
                    FtxD+vh1l5rLNkJgJLw2Lt3pbemSxUvv0CJtnK4jt2y95GsWGu1uSsVLrs0PR1Lj
                    kuxL6zZH4Pp9yjRDOUhbVYAnQ017mdcjvHYtp7c4GIWgyaBkDoMtU6fAt70QpeGj
                    XhecXk7Llx+oYNdZn14ZdFPRGMyAESLrT4Zf9M7QS3ypnWn/Ux0SwKWbnPUeRVbO
                    VZZ+M0jmdYK6o+UU5xH3peRWSJIjjRaKjbVlW5GgHwGFmQc/LN+va2jjThRsQWWt
                    zEwObijedInQ6wfL/VzFAwlWWoDAzKK9qnK4Rf3ORKkvhKrUa//2OYnZD0kHtHiC
                    OL+iFRLtJ/DQP5iZAF+M1Hta7acLmQ8v7Mn1ZR9lyDWzFx57VOKKtJ6RAmBvxOdP
                    8cIgBNvLAEdXh2knOLqYU/CeaGkxTD7Y0SEKx6OxEEdafba//MBkVLt4bRoLXts6
                    6JY25FqFh3eJZjR6h4W1NW8KnBWuy+ITGfXxoJSsX78/pwAY+v32jRxMZGUi1J4=
                    </ds:X509Certificate>
                </ds:X509Data>
          </ds:KeyInfo>
        </Link>
    </XRD>
    <XRD>
        <Subject>https://incommon.org</Subject>
        <Link rel="urn:oasis:names:tc:SAML:2.0:metadata" href="http://md.incommon.org/InCommon/InCommon-metadata.xml">
            <Title>InCommon Metadata (main aggregate)</Title>
            <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
                <ds:X509Data>
                    <ds:X509Certificate>
     MIIDgTCCAmmgAwIBAgIJAJRJzvdpkmNaMA0GCSqGSIb3DQEBCwUAMFcxCzAJBgNV
     BAYTAlVTMRUwEwYDVQQKDAxJbkNvbW1vbiBMTEMxMTAvBgNVBAMMKEluQ29tbW9u
     IEZlZGVyYXRpb24gTWV0YWRhdGEgU2lnbmluZyBLZXkwHhcNMTMxMjE2MTkzNDU1
     WhcNMzcxMjE4MTkzNDU1WjBXMQswCQYDVQQGEwJVUzEVMBMGA1UECgwMSW5Db21t
     b24gTExDMTEwLwYDVQQDDChJbkNvbW1vbiBGZWRlcmF0aW9uIE1ldGFkYXRhIFNp
     Z25pbmcgS2V5MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0Chdkrn+
     dG5Zj5L3UIw+xeWgNzm8ajw7/FyqRQ1SjD4Lfg2WCdlfjOrYGNnVZMCTfItoXTSp
     g4rXxHQsykeNiYRu2+02uMS+1pnBqWjzdPJE0od+q8EbdvE6ShimjyNn0yQfGyQK
     CNdYuc+75MIHsaIOAEtDZUST9Sd4oeU1zRjV2sGvUd+JFHveUAhRc0b+JEZfIEuq
     /LIU9qxm/+gFaawlmojZPyOWZ1JlswbrrJYYyn10qgnJvjh9gZWXKjmPxqvHKJcA
     TPhAh2gWGabWTXBJCckMe1hrHCl/vbDLCmz0/oYuoaSDzP6zE9YSA/xCplaHA0mo
     C1Vs2H5MOQGlewIDAQABo1AwTjAdBgNVHQ4EFgQU5ij9YLU5zQ6K75kPgVpyQ2N/
     lPswHwYDVR0jBBgwFoAU5ij9YLU5zQ6K75kPgVpyQ2N/lPswDAYDVR0TBAUwAwEB
     /zANBgkqhkiG9w0BAQsFAAOCAQEAaQkEx9xvaLUt0PNLvHMtxXQPedCPw5xQBd2V
     WOsWPYspRAOSNbU1VloY+xUkUKorYTogKUY1q+uh2gDIEazW0uZZaQvWPp8xdxWq
     Dh96n5US06lszEc+Lj3dqdxWkXRRqEbjhBFh/utXaeyeSOtaX65GwD5svDHnJBcl
     AGkzeRIXqxmYG+I2zMm/JYGzEnbwToyC7yF6Q8cQxOr37hEpqz+WN/x3qM2qyBLE
     CQFjmlJrvRLkSL15PCZiu+xFNFd/zx6btDun5DBlfDS9DG+SHCNH6Nq+NfP+ZQ8C
     GzP/3TaZPzMlKPDCjp0XOQfyQqFIXdwjPFTWjEusDBlm4qJAlQ==
                    </ds:X509Certificate>
                </ds:X509Data>
          </ds:KeyInfo>
        </Link>
    </XRD>
</XRDS>

The structure of the file should be fairly self-evident. Only links with @rel="urn:oasis:names:tc:SAML:2.0:metadata" will be parsed. If a KeyInfo with an X509Certificate element (in the usual base64-encoded certificate format) is present, that certificate is used to validate the signature on the downloaded SAML metadata. Note that while 'load' supports validation based on certificate fingerprint, the XRD format does not, and you will have to include Base64-encoded certificates if you want validation to work.
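
For completeness, the select-with-multiple-arguments variant mentioned earlier would look something like this (a sketch; the URLs are the two metadata sources listed in links.xrd):

- select:
   - "http://mds.swamid.se/md/swamid-2.0.xml!//md:EntityDescriptor[md:IDPSSODescriptor]"
   - "http://md.incommon.org/InCommon/InCommon-metadata.xml!//md:EntityDescriptor[md:IDPSSODescriptor]"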

Example 4 - Sign using a PKCS#11 module

Fetch SWAMID metadata (and validate the signature using a certificate matching the given SHA256 fingerprint), select the Identity Providers, tidy it up a bit and sign with the key with the label ‘signer’ in the PKCS#11 module /usr/lib/libsofthsm.so. If a certificate is found in the same PKCS#11 object, that certificate is included in the Signature object.

- load:
   - http://mds.swamid.se/md/swamid-2.0.xml A6:78:5A:37:C9:C9:0C:25:AD:5F:1F:69:22:EF:76:7B:C9:78:67:67:3A:AF:4F:8B:EA:A1:A7:6D:A3:A8:E5:85
- select: "!//md:EntityDescriptor[md:IDPSSODescriptor]"
- xslt:
    stylesheet: tidy.xsl
- sign:
    key: pkcs11:///usr/lib/libsofthsm.so/signer
- publish: /tmp/idp.xml
- stats

Running this example requires some preparation. Run the 'p11setup.sh' script in the examples directory. This results in a SoftHSM token being set up with the PIN 'secret1' and SO_PIN 'secret2'. Now run pyFF (assuming you are using a unix-like environment):

# env PYKCS11PIN=secret1 SOFTHSM_CONF=softhsm.conf pyff --loglevel=DEBUG p11.fd
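
For reference, the softhsm.conf referenced above is the SoftHSM v1 configuration file mapping slot numbers to token databases. A minimal example (the database path is an assumption - p11setup.sh may put the token elsewhere) looks like this:

0:./softhsm.db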

Extending pyFF

Not much here yet - come back later or UTSL

Frequently Asked Questions

I get ‘select is empty’ but I know my xpath should match. What is wrong?

You may have forgotten to include namespaces in your xpath expression. For instance //EntityDescriptor won't match anything - //md:EntityDescriptor is what you want. pyFF is not a full XML processor and only supports a set of well-known XML namespace prefixes commonly used in SAML metadata. The full list of prefixes can be found in pyff.constants.
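
For example (a minimal illustration):

- select: "!//EntityDescriptor[IDPSSODescriptor]"
- select: "!//md:EntityDescriptor[md:IDPSSODescriptor]"

The first select matches nothing because the namespace prefix is missing; the second matches all IdPs, since md: is one of the well-known prefixes.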

pyff package

Submodules

pyff.api module

pyff.builtins module

pyff.constants module

pyff.decorators module

pyff.exceptions module

pyff.fetch module

pyff.locks module

pyff.logs module

pyff.md module

pyff.mdq module

pyff.merge_strategies module

pyff.parse module

pyff.pipes module

pyff.repo module

pyff.resource module

pyff.samlmd module

pyff.store module

pyff.tools module

pyff.utils module

pyff.wsgi module

The pyFF logo is the chemical symbol for sublimation - a process by which a substance transitions directly from solid to gas without becoming liquid.