US Patents

US Patents bulk datasets are made available as XML file from Google

This connector retrieves that data

Since these XML files are not well-formed, this connector provides built-in cleansing and parsing. The resulting dataset contains one "patent" column (JSON) containing the patent metadata, abstract, description and claims

The connector enables a local folder cache to simplify your developments. You can choose to retrieve the whole patent database (beware it's big : 40GB) or any year between 2005 and 2015

Configuration screen.

Plugin information

Last updated2015/11/15
LicenseApache Software License
Source codeGithub
Reporting issuesGithub

How to use

You need to install the dependencies of the plugin. Go to the Administration > Plugins page to get the command-line to install dependencies

The dataset is partitioned by year if you select to.