pyPodcastParser¶
Introduction¶
pyPodcastParser is a podcast parser. It should parse any RSS file, but it specializes in parsing podcast RSS feeds. pyPodcastParser is agnostic about the method you use to get a podcast RSS feed. Most users will be most comfortable with the Requests library.
Installation¶
pip install pyPodcastParser
Usage¶
from pyPodcastParser.Podcast import Podcast
import requests
request = requests.get('https://some_rss_feed')
podcast = Podcast(request.content)
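Because the parser only needs the feed content, the Requests call above can be swapped for anything that returns the XML. The following is a minimal sketch that uses only the standard library; `fetch_feed` and its `timeout` parameter are illustrative names and are not part of pyPodcastParser.

```python
from urllib.request import urlopen


def fetch_feed(url, timeout=10):
    """Return the raw bytes of the feed at `url` (hypothetical helper)."""
    # urlopen handles http://, https:// and file:// URLs.
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# The returned bytes can be handed to the parser in place of request.content:
# podcast = Podcast(fetch_feed('https://some_rss_feed'))
```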
Objects and their Useful Attributes¶
Notes:
- Any attribute whose element is empty or missing will have a value of None.
- Attributes are generally strings or lists of strings, because we want to record the literal value of elements.
- The cloud element, aka RSS Cloud, is not supported, as it has been superseded by the superior PubSubHubbub protocol.
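Since values are kept as literal strings and may be None, numeric attributes such as ttl or image_width usually need converting before use. A minimal sketch of one way to do that; `to_int` is a hypothetical helper, not something the library provides.

```python
def to_int(value, default=None):
    """Convert a literal attribute value such as ttl to an int.

    Returns `default` when the value is None or not a whole number.
    """
    if value is None:
        return default
    try:
        return int(str(value).strip())
    except ValueError:
        return default
```

For example, `to_int(podcast.ttl, default=60)` would fall back to 60 minutes when the feed has no ttl element; the fallback value is an arbitrary choice.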
Podcast¶
- categories (list): A list of strings representing the feed categories
- copyright (string): The feed’s copyright
- creative_commons (string): The feed’s creative commons license
- items (list): A list of Item objects
- description (string): The feed’s description
- generator (string): The feed’s generator
- image_title (string): Feed image title
- image_url (string): Feed image url
- image_link (string): Feed image link to homepage
- image_width (string): Feed image width
- image_height (string): Feed image height
- itunes_author_name (string): The podcast’s author name for iTunes
- itunes_block (boolean): Whether the podcast is blocked from iTunes
- itunes_categories (list): List of strings of itunes categories
- itunes_complete (string): Whether this podcast is done and complete
- itunes_explicit (string): Whether the podcast is explicit. Should only be “yes” or “clean”.
- itune_image (string): URL to itunes image
- itunes_keywords (list): List of strings of itunes keywords
- itunes_new_feed_url (string): The new url of this podcast
- language (string): Language of feed
- last_build_date (string): Last build date of this feed
- link (string): URL to homepage
- managing_editor (string): managing editor of feed
- published_date (string): Date feed was published
- pubsubhubbub (string): The URL of the pubsubhubbub service for this feed
- owner_name (string): Name of feed owner
- owner_email (string): Email of feed owner
- subtitle (string): The feed subtitle
- title (string): The feed title
- ttl (string): The time to live or number of minutes to cache feed
- web_master (string): The feed’s webmaster
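The itunes_explicit attribute is stored as the literal string from the feed, so code that needs a boolean has to interpret it. A minimal sketch, assuming only the two documented values are meaningful; `explicit_flag` is a hypothetical helper, and treating every other literal as unknown is a design choice rather than library behavior.

```python
def explicit_flag(value):
    """Map a literal itunes_explicit value to True, False or None."""
    if value is None:
        return None  # element was empty or missing
    text = value.strip().lower()
    if text == "yes":
        return True
    if text == "clean":
        return False
    return None  # unexpected literal; left for the caller to handle
```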
Item¶
- author (string): The author of the item
- comments (string): URL of comments
- creative_commons (string): creative commons license for this item
- description (string): Description of the item.
- enclosure_url (string): URL of enclosure
- enclosure_type (string): File MIME type
- enclosure_length (integer): File size in bytes
- guid (string): globally unique identifier
- itunes_author_name (string): Author name given to iTunes
- itunes_block (boolean): Whether this item is blocked from iTunes
- itunes_closed_captioned (string): Whether this item has closed captions
- itunes_duration (string): Duration of enclosure
- itunes_explicit (string): Whether this item is explicit. Should only be “yes” or “clean”.
- itune_image (string): URL of item cover art
- itunes_order (string): Override published_date order
- itunes_subtitle (string): The item subtitle
- itunes_summary (string): The summary of the item
- link (string): The URL of item.
- published_date (string): Date item was published
- title (string): The title of item.
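Because itunes_duration is kept as a literal string, comparing or summing episode lengths requires converting it first. A minimal sketch, assuming the value is either a plain number of seconds or a colon-separated "HH:MM:SS" or "MM:SS" string; `duration_to_seconds` is a hypothetical helper and not part of the library.

```python
def duration_to_seconds(value):
    """Convert a literal itunes_duration string to a number of seconds.

    Accepts "HH:MM:SS", "MM:SS" or plain seconds; returns None otherwise.
    """
    if value is None:
        return None
    parts = value.strip().split(":")
    if not 1 <= len(parts) <= 3:
        return None
    try:
        numbers = [int(part) for part in parts]
    except ValueError:
        return None
    seconds = 0
    for number in numbers:
        # Each further field is 60 times finer than the one before it.
        seconds = seconds * 60 + number
    return seconds
```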
Bugs & Feature Requests¶
Development¶
License¶
The MIT License (MIT) Copyright (c) 2016 Jason Rigden
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
API¶
These are the details:
class pyPodcastParser.Podcast.Podcast(feed_content)
Parses an xml rss feed
RSS Specs http://cyber.law.harvard.edu/rss/rss.html
More RSS Specs http://www.rssboard.org/rss-specification
iTunes Podcast Specs http://www.apple.com/itunes/podcasts/specs.html
The cloud element, aka RSS Cloud, is not supported, as it has been superseded by the superior PubSubHubbub protocol.
Parameters: feed_content (str) – An rss string
Note
- Any attribute whose element is empty or missing will have a value of None.
- Attributes are generally strings or lists of strings, because we want to record the literal value of elements.
- feed_content (str): The actual xml of the feed
- soup (bs4.BeautifulSoup): A soup of the xml with items and image removed
- image_soup (bs4.BeautifulSoup): A soup of the image
- full_soup (bs4.BeautifulSoup): A soup of the xml with items
- categories (list): A list of strings representing the feed categories
- copyright (str): The feed’s copyright
- creative_commons (str): The feed’s creative commons license
- items (list): A list of Item objects
- description (str): The feed’s description
- generator (str): The feed’s generator
- image_title (str): Feed image title
- image_url (str): Feed image url
- image_link (str): Feed image link to homepage
- image_width (str): Feed image width
- image_height (str): Feed image height
- itunes_author_name (str): The podcast’s author name for iTunes
- itunes_block (bool): Whether the podcast is blocked from iTunes
- itunes_categories (list): List of strings of itunes categories
- itunes_complete (str): Whether this podcast is done and complete
- itunes_explicit (str): Whether the podcast is explicit. Should only be “yes” or “clean”.
- itune_image (str): URL to itunes image
- itunes_keywords (list): List of strings of itunes keywords
- itunes_new_feed_url (str): The new url of this podcast
- language (str): Language of feed
- last_build_date (str): Last build date of this feed
- link (str): URL to homepage
- managing_editor (str): managing editor of feed
- published_date (str): Date feed was published
- pubsubhubbub (str): The URL of the pubsubhubbub service for this feed
- owner_name (str): Name of feed owner
- owner_email (str): Email of feed owner
- subtitle (str): The feed subtitle
- title (str): The feed title
- ttl (str): The time to live or number of minutes to cache feed
- web_master (str): The feed’s webmaster
- is_valid_rss (bool): Whether this is a valid RSS feed
- is_valid_podcast (bool): Whether this is a valid podcast
- date_time (datetime): When published
- count_items(): Counts Items in full_soup and soup. For debugging
- set_categories(): Parses and sets feed categories
- set_copyright(): Parses copyright and sets value
- set_creative_commons(): Parses creative commons license and sets value
- set_description(): Parses description and sets value
- set_extended_elements(): Parses and sets non-required elements
- set_full_soup(): Sets soup and keeps items
- set_generator(): Parses feed generator and sets value
- set_image(): Parses image element and sets values
- set_is_valid_rss(): Checks whether this is actually a valid RSS feed
- set_itune_image(): Parses itunes images and sets url as value
- set_itunes(): Sets elements related to itunes
- set_itunes_author_name(): Parses author name from itunes tags and sets value
- set_itunes_block(): Checks whether the podcast is blocked from iTunes and sets value
- set_itunes_categories(): Parses and sets itunes categories
- set_itunes_complete(): Parses complete from itunes tags and sets value
- set_itunes_explicit(): Parses explicit from itunes tags and sets value
- set_itunes_keywords(): Parses itunes keywords and sets value
- set_itunes_new_feed_url(): Parses new feed url from itunes tags and sets value
- set_language(): Parses feed language and sets value
- set_last_build_date(): Parses last build date and sets value
- set_link(): Parses link to homepage and sets value
- set_managing_editor(): Parses managing editor and sets value
- set_optional_elements(): Sets elements considered optional by the RSS spec
- set_owner(): Parses owner name and email, then sets values
- set_published_date(): Parses published date and sets value
- set_pubsubhubbub(): Parses pubsubhubbub and sets value
- set_required_elements(): Sets elements required by the RSS spec
- set_soup(): Sets soup and strips items
- set_subtitle(): Parses subtitle and sets value
- set_summary(): Parses summary and sets value
- set_title(): Parses title and sets value
- set_ttl(): Parses ttl and sets value
- set_web_master(): Parses the feed’s webmaster and sets value
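The published_date and last_build_date attributes hold literal date strings, while date_time is a datetime. When one of the literal strings needs converting by hand, the standard library can parse the RFC 822 style dates that RSS uses. This is a minimal sketch and not necessarily how the library fills date_time; `parse_rss_date` is a hypothetical helper.

```python
from email.utils import parsedate_to_datetime


def parse_rss_date(value):
    """Parse a literal RFC 822 date string; return None if it cannot be parsed."""
    if value is None:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on bad input, newer ones ValueError.
        return None
```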
class pyPodcastParser.Item.Item(soup)
Parses an xml rss feed
RSS Specs http://cyber.law.harvard.edu/rss/rss.html
iTunes Podcast Specs http://www.apple.com/itunes/podcasts/specs.html
Parameters: soup (bs4.BeautifulSoup) – BeautifulSoup object representing a rss item
Note
- Any attribute whose element is empty or missing will have a value of None.
- author (str): The author of the item
- comments (str): URL of comments
- creative_commons (str): creative commons license for this item
- description (str): Description of the item.
- enclosure_url (str): URL of enclosure
- enclosure_type (str): File MIME type
- enclosure_length (int): File size in bytes
- guid (str): globally unique identifier
- itunes_author_name (str): Author name given to iTunes
- itunes_block (bool): Whether this item is blocked from iTunes
- itunes_closed_captioned (str): Whether this item has closed captions
- itunes_duration (str): Duration of enclosure
- itunes_explicit (str): Whether this item is explicit. Should only be “yes” or “clean”.
- itune_image (str): URL of item cover art
- itunes_order (str): Override published_date order
- itunes_subtitle (str): The item subtitle
- itunes_summary (str): The summary of the item
- link (str): The URL of item.
- published_date (str): Date item was published
- title (str): The title of item.
- date_time (datetime): When published
- set_author(): Parses author and sets value.
- set_categories(): Parses and sets categories
- set_comments(): Parses comments and sets value.
- set_creative_commons(): Parses creative commons for item and sets value
- set_description(): Parses description and sets value.
- set_enclosure(): Parses enclosure_url and enclosure_type, then sets values.
- set_guid(): Parses guid and sets value
- set_itune_image(): Parses itunes item images and sets url as value
- set_itunes_author_name(): Parses author name from itunes tags and sets value
- set_itunes_block(): Checks whether the item is blocked from iTunes and sets value
- set_itunes_closed_captioned(): Parses isClosedCaptioned from itunes tags and sets value
- set_itunes_duration(): Parses duration from itunes tags and sets value
- set_itunes_element(): Sets each of the itunes elements.
- set_itunes_explicit(): Parses explicit from itunes item tags and sets value
- set_itunes_order(): Parses episode order and sets value
- set_itunes_subtitle(): Parses subtitle from itunes tags and sets value
- set_itunes_summary(): Parses summary from itunes tags and sets value
- set_link(): Parses link and sets value.
- set_published_date(): Parses published date and sets value.
- set_rss_element(): Sets each of the basic rss elements.
- set_title(): Parses title and sets value.