Welcome to pycrunchbase’s documentation!
Overview
pycrunchbase
Python bindings to CrunchBase
Starting from v0.3.0, pycrunchbase supports version 3 of the CrunchBase API. Things are still flaky, so bug reports of any kind are greatly appreciated. For details, see the notes on the version 3 changes below.
Note: I currently do not need to use this library, so it’s feature-complete for me. Bug reports are welcome, and pull requests for features are still accepted.
Examples
Initialize the API using your API key. This raises ValueError if the key is missing:
cb = CrunchBase(API_KEY)
Look up an organization by name
github = cb.organization('github')
The response contains summaries of the relationships that the organization has, for example funding_rounds:
funding_rounds_summary = github.funding_rounds
All relationships are paged, and only 8 items are returned initially. To get more data, call more(). It handles paging for you and returns a falsy value if there are no more pages:
more_funding_rounds = cb.more(funding_rounds_summary)
Data in relationships are just summaries, and you probably want more details. For example, each item in funding_rounds has 5 values: type, name, path, created_at, updated_at.
If you actually want to know who invested, you have to make more API calls.
First get the uuid of the round:
round_uuid = funding_rounds_summary[0].uuid
Then use the CrunchBase API to make that call
round = cb.funding_round(round_uuid)
Again, investments is a relationship on a FundingRound, so we can get the first item in that relationship
an_investor = round.investments[0] # an Investment
And printing that gives us the name of the investor, and the amount invested in USD
print(str(an_investor)) # prints: Investment: [Organization: Name]
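The walk-through above can be condensed into one helper. The following is a minimal sketch, assuming only the calls shown above (organization, funding_round, and relationships that can be iterated and indexed). The helper name first_investors and the FakeClient class are hypothetical; the fake exists only so the sketch runs without an API key or network access.

```python
from types import SimpleNamespace


def first_investors(cb, permalink):
    """Return the first investment of every summarized funding round.

    `cb` is expected to behave like the CrunchBase client used above:
    organization(permalink) and funding_round(uuid) each cost one API call.
    """
    org = cb.organization(permalink)
    investors = []
    for summary in org.funding_rounds:
        # The summary only carries a few fields, so fetch the full round.
        full_round = cb.funding_round(summary.uuid)
        if full_round is None:
            continue  # the lookup failed; skip this round
        investors.append(full_round.investments[0])
    return investors


class FakeClient:
    """Hypothetical offline stand-in for the real client."""

    def organization(self, permalink):
        return SimpleNamespace(funding_rounds=[SimpleNamespace(uuid="r1")])

    def funding_round(self, uuid):
        return SimpleNamespace(investments=["Investment for " + uuid])


print(first_investors(FakeClient(), "github"))
```

With the real client, every funding round costs one extra request, so this kind of loop can use up API quota quickly.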
Installation
pip install pycrunchbase
Documentation
Development
To run all the tests, run:
tox
Contributions are always welcome! Visit pycrunchbase’s homepage: https://github.com/ngzhian/pycrunchbase/
Use GitHub issues to report a bug or send feedback.
The best way to send feedback is to file an issue at https://github.com/ngzhian/pycrunchbase/issues.
Goals
- Support all (or almost all) of CrunchBase’s API functionalities
- Speedy updates when CrunchBase’s API changes
- ‘Pythonic’ bindings, so the user doesn’t feel like they are requesting URLs
Notes on CrunchBase version 3 changes
In version 3, CrunchBase changed the names of some endpoints, e.g. person -> people, and now uses the plural form for all entities. pycrunchbase does not adhere strictly to that. For example, there is still a person method, but a people method is also provided. This keeps pycrunchbase backwards compatible while also offering methods that match the names of the entities.
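One common way to keep both names available is a plain class-level alias. The sketch below only illustrates that pattern. The Client class and its return value are hypothetical, and this is not necessarily how pycrunchbase implements person.

```python
class Client:
    """Hypothetical client showing the plural/singular alias pattern."""

    def people(self, permalink):
        # Stand-in for the real lookup against the plural endpoint.
        return "people/" + permalink

    # The old singular name stays available and points at the same function.
    person = people


client = Client()
print(client.person("example-person") == client.people("example-person"))
```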
License
MIT
Usage
To use pycrunchbase in a project:
import pycrunchbase
Instantiate CrunchBase using your API key:
cb = pycrunchbase.CrunchBase(API_KEY)
Get details about an organization:
github = cb.organization('github')
Get properties about the organization:
what_is_github = github.description
where_is_github = github.homepage_url
Get relationships (summarized version) about the organization:
github_team = github.current_team
who_started_github = github.founders
in_the_news = github.news
Get more relationships about the organization:
more_news = cb.more(in_the_news)
all_news_urls = [news.url for news in more_news]
If an item in the relationship has more details, e.g. a Person in the current team, we need to do a bit more to grab that information:
a_founder = who_started_github[0]
cb.person(a_founder.permalink)
permalink is a special field on an item in a relationship; it is the unique CrunchBase identifier of that node.
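To walk through every page of a relationship rather than just the next one, more() can be called in a loop until it returns a falsy value. This is a minimal sketch under the assumptions stated in this documentation: pages are iterable, and more() on a Relationship fetches the first full page. The names iter_all_items and FakePagedClient are hypothetical, and the fake client only makes the sketch runnable offline.

```python
def iter_all_items(cb, relationship):
    """Yield every item of a relationship, following pages via cb.more().

    Starts from cb.more(relationship) instead of the summary itself, so
    the summarized items are not yielded twice.
    """
    page = cb.more(relationship)
    while page:
        for item in page:
            yield item
        page = cb.more(page)


class FakePagedClient:
    """Hypothetical stand-in that hands out prepared pages, then None."""

    def __init__(self, pages):
        self._remaining = list(pages)

    def more(self, page):
        # Ignores `page` and simply returns the next prepared page.
        return self._remaining.pop(0) if self._remaining else None


client = FakePagedClient([["news-1", "news-2"], ["news-3"]])
print(list(iter_all_items(client, None)))
```

Because the helper is a generator, you can stop iterating early and avoid requesting pages you do not need.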
Reference
pycrunchbase
class pycrunchbase.Acquisition(data)
    Represents an Acquisition on CrunchBase
class pycrunchbase.Address(data)
    Represents an Address on CrunchBase
class pycrunchbase.Category(data)
    Represents a Category on CrunchBase
class pycrunchbase.Degree(data)
    Represents a Degree on CrunchBase
class pycrunchbase.FundingRound(data)
    Represents a FundingRound on CrunchBase
class pycrunchbase.Fund(data)
    Represents a Fund on CrunchBase. Previously known as a FundRaise.
class pycrunchbase.Image(data)
    Represents an Image on CrunchBase
class pycrunchbase.Investment(data)
    Represents an Investment (investor-investment) on CrunchBase
class pycrunchbase.IPO(data)
    Represents an IPO on CrunchBase
class pycrunchbase.Job(data)
    Represents a Job on CrunchBase
class pycrunchbase.Location(data)
    Represents a Location on CrunchBase
class pycrunchbase.News(data)
    Represents a News on CrunchBase
class pycrunchbase.Organization(data)
    Represents an Organization on CrunchBase
class pycrunchbase.Page(name, data)
    A Page represents a page of results returned by CrunchBase. A Page contains useful information such as how many items there are in total (total_items), how many items there are per page (items_per_page), etc.
    A page contains information for going to the prev/next page (if available).
    The data that is used to initialize a Page looks like this:
    "data": {
        "paging": {
            "items_per_page": 1000,
            "current_page": 1,
            "number_of_pages": 1,
            "next_page_url": null,
            "prev_page_url": null,
            "total_items": 1,
            "sort_order": "custom"
        },
        "items": [
            {
                "properties": {
                    "path": "organization/example",
                    "name": "Example",
                    "updated_at": 1423666090,
                    "created_at": 1371717055
                },
                "type": "Organization",
                "uuid": "uuid"
            }
        ]
    }
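To make the payload of a Page concrete, the sketch below parses the sample data with the standard json module and reads the paging fields directly. It works on the raw dictionary only and does not use pycrunchbase. The function name summarize_paging is hypothetical.

```python
import json

SAMPLE = json.loads("""
{"data": {"paging": {"items_per_page": 1000, "current_page": 1,
                     "number_of_pages": 1, "next_page_url": null,
                     "prev_page_url": null, "total_items": 1,
                     "sort_order": "custom"},
          "items": [{"properties": {"path": "organization/example",
                                    "name": "Example",
                                    "updated_at": 1423666090,
                                    "created_at": 1371717055},
                     "type": "Organization", "uuid": "uuid"}]}}
""")


def summarize_paging(data):
    """Pull the navigation facts out of a `data` payload."""
    paging = data["paging"]
    return {
        "page": paging["current_page"],
        "of": paging["number_of_pages"],
        # A null next_page_url means this is the last page.
        "has_next": paging["next_page_url"] is not None,
        "item_names": [item["properties"]["name"] for item in data["items"]],
    }


print(summarize_paging(SAMPLE["data"]))
```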
class pycrunchbase.PageItem(data)
    An item within a Page.
    A page is a homogeneous collection of PageItem, and there are many kinds of PageItem.
    build() is a helper class method that builds the correct type of PageItem based on:
    - path, or
    - type
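A factory like build() is typically a lookup from a key to a subclass. The sketch below shows one way such a dispatch could work, assuming the key is the first segment of path with the type field as a fallback. All classes and the _REGISTRY table here are hypothetical stand-ins, not the library's own code.

```python
class PageItem:
    """Hypothetical base class, not the library's own."""

    def __init__(self, data):
        self.data = data

    @classmethod
    def build(cls, data):
        # Prefer the first path segment, e.g. "organization/example";
        # fall back to the "type" field, compared in lowercase.
        path = data.get("properties", {}).get("path", "")
        key = path.split("/")[0] or data.get("type", "")
        return _REGISTRY.get(key.lower(), cls)(data)


class OrganizationPageItem(PageItem):
    pass


class PersonPageItem(PageItem):
    pass


_REGISTRY = {"organization": OrganizationPageItem, "person": PersonPageItem}

item = PageItem.build({"type": "Person", "properties": {}})
print(type(item).__name__)
```

Unknown keys fall back to the class that build() was called on, so callers always get an object with the same basic interface.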
class pycrunchbase.Person(data)
    Represents a Person on CrunchBase
class pycrunchbase.Product(data)
    Represents a Product on CrunchBase
class pycrunchbase.Relationship(name, data)
    A Relationship represents the relationship between a Node and interesting information regarding the Node.
    This is a summary returned alongside the Node details, e.g. a call to /organization/example will return many properties and many relationships.
    To get more details of this relationship, call CrunchBase’s more().
    get(i)
        Gets the i-th element of this page
        Args:
            i (int): 0-based index of the element to retrieve
        Returns:
            PageItem: if a valid item exists at index i
            None: if the index is too small or too large
class pycrunchbase.StockExchange(data)
    Represents a StockExchange on CrunchBase
class pycrunchbase.Video(data)
    Represents a Video on CrunchBase
class pycrunchbase.Website(data)
    Represents a Website on CrunchBase
class pycrunchbase.CrunchBase(api_key=None)
    Class that manages talking to the CrunchBase API
    acquisition(uuid)
        Get the details of an acquisition given a uuid.
        Returns:
            Acquisition or None
    categories()
        Queries for a list of all active Categories, returns the first Page of results.
        Returns:
            Page or None
    fund(permalink)
        Get the details of a fundraise given a fundraise uuid.
        Returns:
            Fund or None
    funding_round(uuid)
        Get the details of a FundingRound given the uuid.
        Returns:
            FundingRound or None
    fundraise(permalink)
        Get the details of a fundraise given a fundraise uuid.
        Returns:
            Fund or None
    get_node(node_type, uuid, params=None)
        Get the details of a Node from CrunchBase. The node_type must match that of CrunchBase’s, and the uuid is either the {uuid} or {permalink} as stated in their docs.
        Returns:
            dict: containing the data describing this node with the keys uuid, type, properties, relationships. Or None if there’s an error.
    ipo(permalink)
        Get the details of an IPO given an IPO uuid.
        Returns:
            IPO or None
    locations()
        Queries for a list of all active Locations, returns the first Page of results.
        Returns:
            Page or None
    more(page)
        Given a Page, tries to get more data using the first_page_url or next_page_url given in the response.
        If page happens to be a Relationship, i.e. page.first_page_url is not None, we just call that url to retrieve the first page.
        Returns:
            None if there is no more page to get, else Relationship with the new data
    organization(permalink)
        Get the details of an organization given an organization’s permalink.
        Returns:
            Organization or None
    organizations(name)
        Search for an organization given a name, returns the first Page of results.
        Returns:
            Page or None
    people(permalink)
        Get the details of a person given a person’s permalink.
        Returns:
            Person or None
    person(permalink)
        Helper to maintain backward compatibility
    product(permalink)
        Get the details of a product given a product permalink.
        Returns:
            Product or None
    products()
        Gets a list of products on CrunchBase
Relationship
class Relationship
    A relationship
class NoneRelationship
    A NoneRelationship is a subclass of Relationship that represents a non-existent relationship. Think of it as a None that is also a Relationship.
NoneRelationship and NonePageItem
A NoneRelationship is returned by _parse_relationship() in Node when there is an expected relationship but it isn’t in the returned data.
For example:
members = organization.team_members # the data we got from CrunchBase was missing the team_members relationship
assert isinstance(members, NoneRelationship)
The benefit of this is that calling the conventional relationship methods, such as get(), will return an object rather than throwing an AttributeError:
members = organization.team_members
first_member = members.get(0) # will not explode
assert bool(first_member) == False
Because a Relationship is made up of PageItem objects, we have a similar None for them as well, called NonePageItem. The idea behind it is the same as discussed above:
# continuing the example from above, we have members
first_member = members.get(0) # will not explode
assert first_member.name is None
assert first_member.whatever is None
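The behaviour described above is an instance of the null object pattern. The sketch below shows the idea with two small stand-in classes. NullItem and NullRelationship are hypothetical names chosen to avoid confusion with the library's own NonePageItem and NoneRelationship, whose actual implementation may differ.

```python
class NullItem:
    """Hypothetical null object: falsy, and every attribute is None."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for every data field.
        return None

    def __bool__(self):
        return False


class NullRelationship:
    """Hypothetical null relationship that never raises on access."""

    def get(self, i):
        return NullItem()

    def __bool__(self):
        return False


members = NullRelationship()
first_member = members.get(0)
print(bool(first_member), first_member.name)
```

Callers can then chain attribute access without guarding every step, and still detect missing data with a plain truthiness check.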
Page
class Page
    A Page presents a page of results as returned by a query on an endpoint of CrunchBase, e.g. the /organizations endpoint.
    A Page contains information on how many items there are per page, the current page number, amongst others.
    items_per_page
        An int representing the maximum number of items there can be in a page
    current_page
        The page number of this current page
    number_of_pages
        Number of pages there are for this query result
    next_page_url
        A url that you can request to get the next page of data
    prev_page_url
        A url that you can request to get the prev page of data
    total_items
        Total number of items that exist for this endpoint/query
    sort_order
        The sorting order of this list of results
    get(i)
        Gets the i-th element in this page, 0-based.
        Parameters: i (int) – Index of PageItem to get, 0-based.
        Returns: the PageItem
        Return type: PageItem
        Raises:
            - IndexError – if i is out of bounds
            - TypeError – if i is not an int
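The get() contract listed above (0-based index, IndexError when out of bounds, TypeError for a non-int) can be illustrated with a small container. SimplePage is a hypothetical class written for this sketch and is not the library's Page.

```python
class SimplePage:
    """Hypothetical container mirroring the documented get() contract."""

    def __init__(self, items):
        self._items = list(items)

    def get(self, i):
        if not isinstance(i, int):
            raise TypeError("index must be an int, got %s" % type(i).__name__)
        if not 0 <= i < len(self._items):
            raise IndexError("index %d is out of bounds" % i)
        return self._items[i]


page = SimplePage(["first", "second"])
print(page.get(1))
try:
    page.get(2)
except IndexError as error:
    print("IndexError:", error)
```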
Adding a node to pycrunchbase
If CrunchBase adds a new Node, we can add it to pycrunchbase as follows:
- Make a new file under resource/ with the name <node>.py, replacing <node> with the name of the node.
- Write a test file for this node in tests/ called test_<node>.py.
- Add a method on CrunchBase with the name <node> as the public api to access this node.
- Ensure that all imports are working fine; this includes adding the node to resource/__init__.py and pycrunchbase/__init__.py.
- Add a test case to test_pycrunchbase.py to test the public api for the new node.
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
Bug reports
When reporting a bug please include:
- Your operating system name and version.
- Any details about your local setup that might be helpful in troubleshooting.
- Detailed steps to reproduce the bug.
Documentation improvements
pycrunchbase could always use more documentation, whether as part of the official pycrunchbase docs, in docstrings, or even on the web in blog posts, articles, and such.
Feature requests and feedback
The best way to send feedback is to file an issue at https://github.com/ngzhian/pycrunchbase/issues.
If you are proposing a feature:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that contributions are welcome :)
Development
To set up pycrunchbase for local development:
Clone your fork locally:
git clone git@github.com:your_name_here/pycrunchbase.git
Create a branch for local development:
git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, run all the checks, the doc builder and the spell checker with one tox command:
tox
Commit your changes and push your branch to GitHub:
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Pull Request Guidelines
If you need some code review or feedback while you’re developing the code, just make the pull request.
For merging, you should:
- Include passing tests (run tox) [1].
- Update documentation when there’s new API, functionality etc.
- Add a note to CHANGELOG.rst about the changes.
- Add yourself to AUTHORS.rst.
[1] If you don’t have all the necessary python versions available locally you can rely on Travis: it will run the tests for each change you add in the pull request. It will be slower though ...
Tips
To run a subset of tests:
tox -e envname -- py.test -k test_myfeature
To run all the test environments in parallel (you need to pip install detox):
detox
New version checklist
- Update CHANGELOG.rst with changes
- Update version in pycrunchbase.__init__
- Update version in setup.py
- Commit with message Bump to vX.X.X
- Tag commit with vX.X.X
- Push to origin master: git push origin master --tags
- Release to PyPI: python setup.py release
Authors
- Ng Zhi An - https://github.com/ngzhian
- David Tran - https://github.com/dtran320
Changelog
0.3.8 (2017-2-9)
- Fix #26 encode url if it has special entities
0.3.7 (2016-1-13)
- Added profile_image_url known property to Organization, Person, and Product per CB-5541 bugfix from 2015-10-21
- Added featured_team relationship for Organization per Crunchbase change on 2016-06-22
- Added known properties is_current for Job and is_lead_investor for Investment per CB-9048 on 2016-10-14
- Fixed typos in addnode.rst
- Added David Tran to AUTHORS.rst
0.3.6 (2015-10-21)
- Alias ‘PressReference’ to news
- Fix checking for the type of a PageItem, use lowercase compare
- Update test data, which was out of sync with what CrunchBase now returns, specifically the test data for Fund and Relationship (Organization.past_team)
0.3.5 (2015-09-28)
- Fixed handling null relationships that the api returns
- Update setup.py release alias
0.3.4 (2015-09-27)
- Fixed instructions in usage.rst (#20)
- Support nested relationships FundingRound -> Investments -> Organization
- Update README
0.3.3 (2015-08-29)
- Added stock_exchange as a known property of Organization, ref #19 (https://github.com/ngzhian/pycrunchbase/issues/19)
0.3.2 (2015-07-25)
- New resource type StockExchange (fixes #18)
- Better __str__ for IPO
0.3.1 (2015-05-25)
- Bug fix when relationship data returned from crunchbase is [null]. Thanks @karlalopez
0.3.0 (2015-05-01)
- Updated to support version 3 of CrunchBase API
- Fix endpoint urls, e.g. ‘funding-round’ -> ‘funding-rounds’
- Internal cleanups, Page now subclasses Relationship
0.2.7 (2015-04-23)
- Fixed: #9 sub_organization and websites relationship of Organization
0.2.6 (2015-04-13)
- Fixed: #8 printing PageItem leads to unbounded recursion (@dustinfarris)
0.2.5 (2015-04-04)
- Added: Locations - get a list of active locations from CrunchBase
- Added: LocationPageItem - each location in the Page of Locations
- Added: Categories - get a list of active categories from CrunchBase
- Added: CategoryPageItem - each category in the Page of Categories
0.2.4 (2015-04-03)
- Added: IPO - you can now use a uuid to grab IPO data
0.2.3 (2015-03-01)
- Fix: Travis builds and tests
0.2.2 (2015-02-25)
- Fix: Unicode output (using UTF-8 encoding)
0.2.1 (2015-02-21)
- Fix __version__
0.2.0 (2015-02-15)
- The API is now considered relatively stable. Updated the classifier to reflect that
- Change to how CrunchBase.more reacts to a Relationship: we no longer optimize when the Relationship has all items, we just call first_page_url
0.1.9 (2015-02-15)
- Add series to the FundingRound node.
0.1.8 (2015-02-15)
- Update __str__ for nodes and relationships
0.1.7 (2015-02-15)
- Relationship is now a subclass of Page, although this strictly isn’t true. The benefit is that this allows us to reuse a lot of logic. Relationship can be thought of as Page 0, which is a summary of potentially multiple pages of PageItem. The only time we get a relationship is when we query for a particular Node, e.g. organization, and we grab the relationships returned by the API. After this, to get more details we call CrunchBase.more, and this returns us a Page.
- Added __repr__ methods to all the Node, Relationship, PageItem. Previously we only defined __str__, but these didn’t show up in places like the REPL. This fixes that. We try to make it obvious what object it is based on what is printed, but also don’t want to be too verbose.
0.1.6 (2015-02-15)
- InvestorInvestmentPageItem now has the possibility of being either an investor, or an invested_in relationship
- Propagates any exception when making the actual HTTP call to CrunchBase
0.1.5 (2015-02-13)
- Add a cb_url attribute for all PageItem. This url is a CrunchBase page (not the API) that holds more information for a particular PageItem. It allows you to make calls like:
company.funding_rounds[0].cb_url
to get the url of the page for the first funding round of company.
- A new page item, InvestorInvestmentPageItem, that is useful for FundingRound info:
round = cb.funding_round('round_uuid')
an_investor = round.investments[0] # an InvestorInvestmentPageItem
print(str(an_investor)) # prints: Investor Name $100000
- Add simplified Contribution guidelines in README
0.1.4 (2015-02-13)
- Relationship retrieval is 0-based now, 1-based just doesn’t fit well with array
- Better __str__ for Node and Relationship
- Relationship.get(i) will return a NonePageItem singleton if i is too large or too small
0.1.3 (2015-02-12)
- Fix Relationship: wasn’t using the right build method of PageItem
- Add test to check for the above
- Remove unused reference to CrunchBase in Relationship
0.1.2 (2015-02-12)
- PageItem and its subclasses to represent an item within a relationship of a Node
- Cleanup of where utility methods live (parse_date)
- More tests as always, overall 98.21% coverage
0.1.0 (2015-02-21)
- First release on PyPI.