Welcome to pycrunchbase’s documentation!

Overview

pycrunchbase

Python bindings to CrunchBase

Starting from v0.3.0, pycrunchbase supports version 3 of the CrunchBase API. Things are still flaky, so bug reports of any kind are greatly appreciated. For details, see the notes below.

Note: I currently do not need to use this library, so it’s feature-complete for me. Bug reports are welcome, and pull requests for features are still accepted.

Examples

Initialize the API using your API key. A ValueError is raised if the key is missing.

cb = CrunchBase(API_KEY)

Look up an organization by name

github = cb.organization('github')

The response contains snippets of data about the relationships that the organization has. One example is funding_rounds.

funding_rounds_summary = github.funding_rounds

All relationships are paged, and only 8 items are returned initially. To get more data, call more(). It handles paging for you and returns a falsy value if there are no more pages.

more_funding_rounds = cb.more(funding_rounds_summary)
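
As a rough sketch of how a full paging loop might look, the snippet below keeps calling more() until it returns a falsy value. FakePage and FakeClient are stand-ins invented here so the example runs without an API key; they are not part of pycrunchbase, and the real Page and Relationship objects carry more fields.

```python
class FakePage(list):
    """Stand-in for a page of results (hypothetical, not pycrunchbase's Page)."""

    def __init__(self, items, next_page=None):
        super().__init__(items)
        self.next_page = next_page  # the following FakePage, or None


class FakeClient:
    """Stand-in for CrunchBase: more() returns the next page or None."""

    def more(self, page):
        return page.next_page


def collect_all(cb, summary):
    """Follow cb.more() until it returns a falsy value, gathering every item."""
    items = []
    page = cb.more(summary)
    while page:
        items.extend(page)
        page = cb.more(page)
    return items


# Build a summary followed by two pages of results.
last = FakePage(["round-3"])
first = FakePage(["round-1", "round-2"], next_page=last)
summary = FakePage(["round-1"], next_page=first)

all_rounds = collect_all(FakeClient(), summary)
print(all_rounds)
```

With a real client, the same collect_all helper would presumably be called as collect_all(cb, github.funding_rounds), at the cost of one API request per page.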

Data in relationships are just summaries, and you probably want more details. For example, funding_rounds returns 5 values: type, name, path, created_at, updated_at.

If you actually want to know who invested, you have to make more API calls.

First get the uuid of the round

round_uuid = funding_rounds_summary[0].uuid

Then use the CrunchBase API to make that call

round = cb.funding_round(round_uuid)

Again, investments is a relationship on a FundingRound, so we can get the first item in that relationship

an_investor = round.investments[0]  # an Investment

And printing that gives us the name of the investor, and the amount invested in USD

print(str(an_investor))  # prints: Investment: [Organization: Name]

Installation

pip install pycrunchbase

Development

To run all the tests, run:

tox

Contributions are always welcome! Visit pycrunchbase’s Homepage <https://github.com/ngzhian/pycrunchbase/>

Use GitHub issues to report a bug or send feedback.

The best way to send feedback is to file an issue at https://github.com/ngzhian/pycrunchbase/issues.

Contributors

Thanks to these contributors:

Goals

  1. Support all (or almost all) of CrunchBase’s API functionalities
  2. Speedy updates when CrunchBase’s API changes
  3. ‘Pythonic’ bindings, user doesn’t feel like we’re requesting URLs

Notes on CrunchBase version 3 changes

In version 3, CrunchBase changed the names of some endpoints, e.g. person -> people, adopting the plural form for all entities. pycrunchbase does not adhere strictly to that. For example, there is still a person method, but a people method is also provided, so the library remains backwards compatible while also offering methods that match the entity names.
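
The aliasing idea can be sketched roughly as follows. The Client class and the returned string are hypothetical and only illustrate how a singular method might defer to its plural counterpart; the library's actual implementation may differ.

```python
class Client:
    """Hypothetical stand-in for the CrunchBase class, to show the aliasing idea."""

    def people(self, permalink):
        # A real method would call the API; here we only echo an illustrative path.
        return "people/" + permalink

    def person(self, permalink):
        # Backward-compatible singular name that defers to the plural method.
        return self.people(permalink)


client = Client()
print(client.person("jane-doe"))
```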

License

MIT

Installation

At the command line:

pip install pycrunchbase

Usage

To use pycrunchbase in a project:

import pycrunchbase

Instantiate your CrunchBase using your API Key:

cb = pycrunchbase.CrunchBase(API_KEY)

Get details about an organization:

github = cb.organization('github')

Get properties about the organization:

what_is_github = github.description
where_is_github = github.homepage_url

Get relationships (summarized version) about the organization:

github_team = github.current_team
who_started_github = github.founders
in_the_news = github.news

Get more relationships about the organization:

more_news = cb.more(in_the_news)
all_news_urls = [news.url for news in more_news]

If the relationship has more details to it, e.g. a Person in the current team, we need to do a bit more to grab that information:

a_founder = who_started_github[0]
cb.person(a_founder.permalink)

permalink is a special field on an item in a relationship. It is the unique CrunchBase identifier of that node.

Reference

pycrunchbase

class pycrunchbase.Acquisition(data)

Represents an Acquisition on CrunchBase

class pycrunchbase.Address(data)

Represents an Address on CrunchBase

class pycrunchbase.Category(data)

Represents a Category on CrunchBase

class pycrunchbase.Degree(data)

Represents a Degree on CrunchBase

class pycrunchbase.FundingRound(data)

Represents a FundingRound on CrunchBase

class pycrunchbase.Fund(data)

Represents a Fund on CrunchBase. Previously known as a FundRaise.

class pycrunchbase.Image(data)

Represents an Image on CrunchBase

class pycrunchbase.Investment(data)

Represents an Investment (investor-investment) on CrunchBase

class pycrunchbase.IPO(data)

Represents an IPO on CrunchBase

class pycrunchbase.Job(data)

Represents a Job on CrunchBase

class pycrunchbase.Location(data)

Represents a Location on CrunchBase

class pycrunchbase.News(data)

Represents a News on CrunchBase

class pycrunchbase.Organization(data)

Represents an Organization on CrunchBase

class pycrunchbase.Page(name, data)

A Page represents a page of results returned by CrunchBase. Page contains useful information regarding how many items there are in total (total_items), items per page (items_per_page), etc.

A page contains information for going to the prev/next page (if available).

The data that is used to initialize a Page looks like this:

"data": {
    "paging": {
        "items_per_page": 1000,
        "current_page": 1,
        "number_of_pages": 1,
        "next_page_url": null,
        "prev_page_url": null,
        "total_items": 1,
        "sort_order": "custom"
    },
    "items": [
     {
        "properties": {
            "path": "organization/example",
            "name": "Example",
            "updated_at": 1423666090,
            "created_at": 1371717055,
         },
         "type": "Organization",
         "uuid": "uuid"
     }
    ]
}
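
To illustrate how this structure can be read, here is a minimal sketch that parses a payload of the same shape with the standard json module. It does not use pycrunchbase at all; the payload is a trimmed, illustrative copy of the structure above (the inner contents of the "data" key), and the variable names are chosen only for this example.

```python
import json

# Sample payload shaped like the structure above (values are illustrative).
raw = """
{
  "paging": {
    "items_per_page": 1000,
    "current_page": 1,
    "number_of_pages": 1,
    "next_page_url": null,
    "prev_page_url": null,
    "total_items": 1,
    "sort_order": "custom"
  },
  "items": [
    {
      "properties": {"path": "organization/example", "name": "Example"},
      "type": "Organization",
      "uuid": "uuid"
    }
  ]
}
"""

data = json.loads(raw)
paging = data["paging"]

# A null next_page_url means there is nothing further to fetch.
has_next = paging["next_page_url"] is not None
names = [item["properties"]["name"] for item in data["items"]]

print(paging["total_items"], has_next, names)
```
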
class pycrunchbase.PageItem(data)

An item within a Page.

A page is a homogeneous collection of PageItem, and there are many kinds of PageItem. build() is a helper class method that builds the correct type of PageItem based on:

  1. path, or
  2. type
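
A dispatch of this kind might look like the sketch below. The classes and the ITEM_TYPES registry are hypothetical stand-ins written for this example, not the library's own code; the only behavior assumed from the description above is that the path is consulted first and the type second.

```python
class PageItem:
    """Generic item; stand-in for illustration, not pycrunchbase's class."""

    def __init__(self, data):
        self.data = data

    @classmethod
    def build(cls, data):
        # 1. Prefer the first segment of the path, e.g. "organization/example".
        path = data.get("properties", {}).get("path", "")
        kind = path.split("/")[0].lower() if path else ""
        # 2. Fall back to the declared type, compared in lowercase.
        if kind not in ITEM_TYPES:
            kind = data.get("type", "").lower()
        # Unknown kinds fall back to the generic PageItem.
        return ITEM_TYPES.get(kind, cls)(data)


class OrganizationPageItem(PageItem):
    pass


class PersonPageItem(PageItem):
    pass


# Hypothetical registry mapping a lowercase kind to its item class.
ITEM_TYPES = {
    "organization": OrganizationPageItem,
    "person": PersonPageItem,
}

item = PageItem.build(
    {"properties": {"path": "organization/example"}, "type": "Organization"}
)
print(type(item).__name__)
```
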
class pycrunchbase.Person(data)

Represents a Person on CrunchBase

class pycrunchbase.Product(data)

Represents a Product on CrunchBase

class pycrunchbase.Relationship(name, data)

A Relationship represents the relationship between a Node and interesting information regarding the Node.

This is a summary returned alongside the Node details, e.g. a call to /organization/example will return many properties and many relationships.

To get more details of this relationship, call CrunchBase's more().

get(i)

Gets the i-th element of this page

Args:
i (int): 0-based index of the element to retrieve
Returns:
PageItem: if a valid item exists at index i. None if the index is too small or too large.
class pycrunchbase.StockExchange(data)

Represents a StockExchange on CrunchBase

class pycrunchbase.Video(data)

Represents a Video on CrunchBase

class pycrunchbase.Website(data)

Represents a Website on CrunchBase

class pycrunchbase.CrunchBase(api_key=None)

Class that manages talking to CrunchBase API

acquisition(uuid)

Get the details of an acquisition given a uuid.

Returns:
Acquisition or None
categories()

Queries for a list of all active Categories, returns the first Page of results.

Returns:
Page or None
fund(permalink)

Get the details of a fundraise given a fundraise uuid.

Returns:
Fund or None
funding_round(uuid)

Get the details of a FundingRound given the uuid.

Returns:
FundingRound or None
fundraise(permalink)

Get the details of a fundraise given a fundraise uuid.

Returns:
Fund or None
get_node(node_type, uuid, params=None)

Get the details of a Node from CrunchBase. The node_type must match that of CrunchBase’s, and the uuid is either the {uuid} or {permalink} as stated on their docs.

Returns:
dict: containing the data describing this node with the keys uuid, type, properties, relationships. Or None if there’s an error.
ipo(permalink)

Get the details of an IPO given an IPO uuid.

Returns:
IPO or None
locations()

Queries for a list of all active Locations, returns the first Page of results.

Returns:
Page or None
more(page)

Given a Page, tries to get more data using the first_page_url or next_page_url given in the response.

If page happens to be a Relationship, i.e. page.first_page_url is not None, we just call that url to retrieve the first page.

Returns:
None if there is no more page to get, else Relationship with the new data
organization(permalink)

Get the details of an organization given an organization’s permalink.

Returns:
Organization or None
organizations(name)

Search for an organization given a name, returns the first Page of results

Returns:
Page or None
people(permalink)

Get the details of a person given a person’s permalink

Returns:
Person or None
person(permalink)

Helper to maintain backward compatibility

product(permalink)

Get the details of a product given a product permalink.

Returns:
Product or None
products()

Gets a list of products on CrunchBase

Relationship

class Relationship
A relationship
class NoneRelationship

A NoneRelationship is a subclass of Relationship that represents a non-existent relationship. Think of it as a None that is also a Relationship.

NoneRelationship and NonePageItem

A NoneRelationship is returned by _parse_relationship() in Node, when there is an expected relationship but it isn’t in the returned data.

For example:

members = organization.team_members  # the data we got from CrunchBase was missing the team_members relationship
assert isinstance(members, NoneRelationship)

The benefit of this is that calling the conventional relationship methods, such as get(), will return an object, rather than throwing an AttributeError:

members = organization.team_members
first_member = members.get(0)  # will not explode
assert bool(first_member) == False

Because a Relationship is made up of PageItem objects, we have a similar None for it as well, called NonePageItem. The idea behind it is the same as discussed above:

# continuing the example from above, we have members
first_member = members.get(0)  # will not explode
assert first_member.name is None
assert first_member.whatever is None
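
The pattern behind both classes is the null object. A minimal sketch of how it could be implemented in Python 3 is shown below; these two classes are written for illustration only and share nothing but their names with the library's own NoneRelationship and NonePageItem, which may be implemented differently.

```python
class NonePageItem:
    """Sketch of a null object: falsy, and every attribute reads as None.

    Illustrative only; the library's own NonePageItem may differ.
    """

    def __getattr__(self, name):
        # Called only when normal lookup fails, so any unknown attribute is None.
        return None

    def __bool__(self):
        return False


class NoneRelationship:
    """Sketch of an empty relationship whose get() never raises."""

    def get(self, i):
        # Any index yields the null item instead of an error.
        return NonePageItem()

    def __bool__(self):
        return False


members = NoneRelationship()
first_member = members.get(0)
print(bool(first_member), first_member.name)
```

The trade-off of this design is that callers can chain lookups without guarding each step, but a misspelled attribute name also silently reads as None.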

Page

class Page
A Page represents a page of results as returned by a query on an endpoint of CrunchBase, e.g. the /organizations endpoint.

A Page contains information on how many items there are per page, the current page number, amongst others.

items_per_page

An int representing the maximum number of items there can be in a page

current_page

The page number of this current page

number_of_pages

Number of pages there are for this query result

next_page_url

A url that you can request to get the next page of data

prev_page_url

A url that you can request to get the prev page of data

total_items

Total number of items that exist for this endpoint/query

sort_order

The sorting order of this list of results

get(i)

Gets the i-th element in this page, 0-based.

Parameters:

i (int) – Index of PageItem to get, 0-based.

Returns:

the PageItem

Return type:

PageItem

Raises:
  • IndexError – if i is out of bounds
  • TypeError – if i is not an int
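
The contract described above can be sketched with a small stand-in container. SimplePage is hypothetical and not part of pycrunchbase; it also assumes that negative indices count as out of bounds and that bool values are rejected, neither of which is stated in the description.

```python
class SimplePage:
    """Hypothetical container showing the documented get() contract."""

    def __init__(self, items):
        self.items = list(items)

    def get(self, i):
        # bool is a subclass of int, so reject it explicitly (an assumption).
        if not isinstance(i, int) or isinstance(i, bool):
            raise TypeError("index must be an int")
        # Assumption: negative indices are treated as out of bounds.
        if i < 0 or i >= len(self.items):
            raise IndexError("index out of bounds")
        return self.items[i]


page = SimplePage(["a", "b"])
print(page.get(1))
```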

Adding a node to pycrunchbase

If CrunchBase adds a new Node, we can add it to pycrunchbase as such:

  1. Make a new file under resource/ with the name <node>.py, replacing <node> with the name of the node.
  2. Write a test file for this node in tests/ called test_<node>.py.
  3. Add a method on CrunchBase with the name <node> as the public api to access this node.
  4. Ensure that all imports are working fine, this includes adding the node to resource/__init__.py, pycrunchbase/__init__.py.
  5. Add a test case to test_pycrunchbase.py to test the public api for the new node.

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

Bug reports

When reporting a bug please include:

  • Your operating system name and version.
  • Any details about your local setup that might be helpful in troubleshooting.
  • Detailed steps to reproduce the bug.

Documentation improvements

pycrunchbase could always use more documentation, whether as part of the official pycrunchbase docs, in docstrings, or even on the web in blog posts, articles, and such.

Feature requests and feedback

The best way to send feedback is to file an issue at https://github.com/ngzhian/pycrunchbase/issues.

If you are proposing a feature:

  • Explain in detail how it would work.
  • Keep the scope as narrow as possible, to make it easier to implement.
  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Development

To set up pycrunchbase for local development:

  1. Fork pycrunchbase on GitHub.

  2. Clone your fork locally:

    git clone git@github.com:your_name_here/pycrunchbase.git
    
  3. Create a branch for local development:

    git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  4. When you’re done making changes, run all the checks, doc builder and spell checker with one tox command:

    tox
    
  5. Commit your changes and push your branch to GitHub:

    git add .
    git commit -m "Your detailed description of your changes."
    git push origin name-of-your-bugfix-or-feature
    
  6. Submit a pull request through the GitHub website.

Pull Request Guidelines

If you need some code review or feedback while you’re developing the code just make the pull request.

For merging, you should:

  1. Include passing tests (run tox) [1].
  2. Update documentation when there’s new API, functionality etc.
  3. Add a note to CHANGELOG.rst about the changes.
  4. Add yourself to AUTHORS.rst.
[1] If you don’t have all the necessary Python versions available locally, you can rely on Travis: it will run the tests for each change you add in the pull request. It will be slower, though.

Tips

To run a subset of tests:

tox -e envname -- py.test -k test_myfeature

To run all the test environments in parallel (you need to pip install detox):

detox

New version checklist

  1. Update CHANGELOG.rst with changes
  2. Update version in pycrunchbase.__init__
  3. Update version in setup.py
  4. Commit with message Bump to vX.X.X
  5. Tag commit with vX.X.X
  6. Push to origin master: git push origin master --tags
  7. Release to PyPI: python setup.py release

Changelog

0.3.8 (2017-2-9)

  • Fix #26 encode url if it has special entities

0.3.7 (2016-1-13)

  • Added profile_image_url known property to Organization, Person, and Product per CB-5541 bugfix from 2015-10-21
  • Added featured_team relationship for Organization per Crunchbase change on 2016-06-22
  • Added known properties is_current for Job and is_lead_investor for Investment per CB-9048 on 2016-10-14
  • Fixed typos in addnode.rst
  • Added David Tran to AUTHORS.rst

0.3.6 (2015-10-21)

  • Alias ‘PressReference’ to news
  • Fix checking for the type of a PageItem, use lowercase compare
  • Update test data, those were out of sync with what CrunchBase now returns. Specifically the test data for Fund and Relationship (Organization.past_team)

0.3.5 (2015-09-28)

  • Fixed handling null relationships that the API returns
  • Update setup.py release alias

0.3.4 (2015-09-27)

  • Fixed instructions in usage.rst (#20)
  • Support nested relationships FundingRound -> Investments -> Organization
  • Update README

0.3.3 (2015-08-29)

  • Added stock_exchange as a known property of Organization, ref #19 <https://github.com/ngzhian/pycrunchbase/issues/19>

0.3.2 (2015-07-25)

  • New resource type StockExchange (fixes #18)
  • Better __str__ for IPO

0.3.1 (2015-05-25)

  • Bug fix when relationship data returned from crunchbase is [null]. Thanks @karlalopez

0.3.0 (2015-05-01)

  • Updated to support version 3 of CrunchBase API
  • Fix endpoint urls, e.g. ‘funding-round’ -> ‘funding-rounds’
  • Internal cleanups, Page now subclass Relationship

0.2.7 (2015-04-23)

  • Fixed: #9 sub_organization and websites relationship of Organization

0.2.6 (2015-04-13)

  • Fixed: #8 printing PageItem leads to unbounded recursion (@dustinfarris)

0.2.5 (2015-04-04)

  • Added: Locations - get a list of active locations from CrunchBase
  • Added: LocationPageItem - each location in the Page of Locations
  • Added: Categories - get a list of active categories from CrunchBase
  • Added: CategoryPageItem - each category in the Page of Categories

0.2.4 (2015-04-03)

  • Added: IPO - you can now use a uuid to grab IPO data

0.2.3 (2015-03-01)

  • Fix: Travis builds and tests

0.2.2 (2015-02-25)

  • Fix: Unicode output (using UTF-8 encoding)

0.2.1 (2015-02-21)

  • Fix __version__

0.2.0 (2015-02-15)

  • The API is now considered relatively stable. Updated the classifier to reflect this
  • Change to how CrunchBase.more reacts to a Relationship, we no longer optimize when the Relationship has all items, just call first_page_url

0.1.9 (2015-02-15)

  • Add series to the FundingRound node.

0.1.8 (2015-02-15)

  • Update __str__ for nodes and relationships

0.1.7 (2015-02-15)

  • Relationship is now a subclass of Page, although this strictly isn’t true. The benefit is that this allows us to reuse a lot of logic. Relationship can be thought of as Page 0, which is a summary of potentially multiple pages of PageItem. The only time we get a relationship is when we query for a particular Node, e.g. organization, and we grab the relationships returned by the API. After this, to get more details we call CrunchBase.more, and this returns us a Page.
  • Added __repr__ methods to all the Node, Relationship, PageItem. Previously we only defined __str__, but these didn’t show up in places like the REPL. This fixes that. We try to make it obvious what object it is based on what is printed, but also don’t want to be too verbose.

0.1.6 (2015-02-15)

  • InvestorInvestmentPageItem now has the possibility of being either an investor, or an invested_in relationship
  • Propagates any exception when making the actual HTTP call to CrunchBase

0.1.5 (2015-02-13)

  • Add a cb_url attribute for all PageItem. This url is a CrunchBase page (not the API) that holds more information for a particular PageItem. It allows you to make calls like:

    company.funding_rounds[0].cb_url
    

    to get the url of the page for the first funding round of company.

  • A new page item, InvestorInvestmentPageItem, that is useful for FundingRound info:

    round = cb.funding_round('round_uuid')
    an_investor = round.investments[0]  # an InvestorInvestmentPageItem
    print(str(an_investor))  # prints: Investor Name $100000
    
  • Add simplified Contribution guidelines in README

0.1.4 (2015-02-13)

  • Relationship retrieval is 0-based now, 1-based just doesn’t fit well with array
  • Better __str__ for Node and Relationship
  • Relationship.get(i) if i is too large or small will return a NonePageItem singleton

0.1.3 (2015-02-12)

  • Fix Relationship: wasn’t using the right build method of PageItem
  • Add test to check for the above
  • remove unused reference to CrunchBase in Relationship

0.1.2 (2015-02-12)

  • PageItem and its subclasses to represent an item within a relationship of a Node
  • Cleanup of where utility methods live (parse_date)
  • More tests as always, overall 98.21% coverage

0.1.0 (2015-02-21)

  • First release on PyPI.
