This repo is archival. For the live repo, please visit https://github.com/diffbot/cdata-diffbot-api-profile
A CData API Driver profile that enables SQL-based access to the Diffbot Knowledge Graph API.
This API Profile allows CData tools (API Driver, Sync, API Server, etc.) to connect to Diffbot's Knowledge Graph, exposing KG data as SQL-queryable tables. Query organizations, people, articles, and places from Diffbot's extensive knowledge graph using standard SQL syntax. No manual data uploads required.
- CData API Driver for JDBC (Get a trial license here)
- Diffbot API Token (Get a free token here)
Skip if you already have the CData API Driver installed.
- Install the appropriate API Driver for JDBC for your operating system.
- Note the install location of your driver.
- Create a new user driver in your SQL client and add <location>/lib/cdata.jdbc.api.jar and <location>/lib/cdata.jdbc.api.lic as the driver files. If the .lic file is missing in your lib directory, generate one by running: java -jar cdata.jdbc.api.jar -license
- Download the latest release.
- Create a new data source in your SQL client with the CData API Driver you installed in the previous step.
- Set authentication to "none".
- Set the URL property to the following, replacing the values enclosed in <> with your own values.
jdbc:api:Profile=/<location>/diffbot-api-profile;ProfileSettings="APIToken=<Your Diffbot API Token>";AuthScheme=None;
IMPORTANT: Do not include the .apip extension in the URL.
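As a minimal sketch of the URL format above, the helper below assembles the connection string and drops a trailing .apip extension if one slips in. The function name `build_jdbc_url`, the example path, and the placeholder token are illustrative assumptions, not part of the profile or of any CData tool.

```python
def build_jdbc_url(profile_path: str, api_token: str) -> str:
    """Assemble a connection URL in the format shown above (hypothetical helper)."""
    # The URL must not include the .apip extension, so strip it if present.
    if profile_path.lower().endswith(".apip"):
        profile_path = profile_path[: -len(".apip")]
    return (
        f"jdbc:api:Profile={profile_path};"
        f'ProfileSettings="APIToken={api_token}";'
        "AuthScheme=None;"
    )


# Example path and token are placeholders; substitute your own values.
url = build_jdbc_url("/opt/profiles/diffbot-api-profile.apip", "YOUR_TOKEN")
print(url)
```

Paste the printed string into the URL property of your data source.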
Returns status and metadata about your Diffbot token. This should be run on first install to confirm you have an active token and the CData/Diffbot connection is working.
SELECT * FROM Account
Query organizations from the Diffbot Knowledge Graph. To filter the data, supply a DQL statement to the dql property in the WHERE clause. (Learn how to write DQL)
-- Get organizations in San Francisco with 100+ employees
SELECT * FROM Organization
WHERE dql = 'type:Organization location.city.name:"San Francisco" nbEmployees>100'
Key Fields Returned:
- name: Organization name
- homepageUri: Organization website
- nbEmployees: Employee count
- categories: Industry classification
- revenue: Publicly known revenue figures for the organization
- investments: Publicly known investment rounds for the organization
- locations: Location fields (address, city, country, coordinates)
- linkedInUri / wikipediaUri: Social and other links on the web
For the full ontology, see https://docs.diffbot.com/docs/ont-organization
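When queries are generated from application code, the DQL clauses and the surrounding SELECT can be composed separately. The sketch below joins space-separated clauses and wraps them in the statement shape used in the Organization example. The helper name `dql_query` is hypothetical, and doubling single quotes is the standard SQL escaping convention, assumed here rather than confirmed for the CData driver.

```python
def dql_query(table: str, *clauses: str) -> str:
    """Wrap space-separated DQL clauses in a SELECT on one table (hypothetical helper)."""
    dql = " ".join(clauses)
    # Assumption: single quotes inside the literal are escaped by doubling them.
    escaped = dql.replace("'", "''")
    return f"SELECT * FROM {table} WHERE dql = '{escaped}'"


sql = dql_query(
    "Organization",
    "type:Organization",
    'location.city.name:"San Francisco"',
    "nbEmployees>100",
)
print(sql)
```

The same helper works for the Person, Article, and Place tables by changing the table name and clauses.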
Query people in the Diffbot Knowledge Graph. To filter the data, supply a DQL statement to the dql property in the WHERE clause. (Learn how to write DQL)
-- List everyone who has ever been a founder
SELECT * FROM Person
WHERE dql = 'type:Person employments.title:"Founder"'
DQL is a highly flexible querying language. Brace yourself.
-- List AI founders who previously worked at FAANG companies
SELECT * FROM Person
WHERE dql = 'type:Person employments.{title:"Founder" employer.categories.name:"Artificial Intelligence Software" isCurrent:true} employments.{employer.name:or("Facebook", "Amazon", "Apple", "Netflix", "Google") isCurrent:false}'
Key Fields:
- name
- educations: Education history in a JSON array
- employments: Work history in a JSON array
- linkedInUri: LinkedIn and other social URLs
For the full ontology, see https://docs.diffbot.com/docs/ont-person
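Because educations and employments come back as JSON arrays, client code usually has to decode them before use. The sketch below parses one employments cell and keeps the titles marked as current. The sample value is made up for illustration, and its keys (title, isCurrent, employer.name) are assumed from the DQL paths used in the Person queries; check the ontology for the actual shape.

```python
import json

# Hypothetical cell value; real rows may carry more or different keys.
employments_cell = (
    '[{"title": "Founder", "isCurrent": true, "employer": {"name": "Example AI"}},'
    ' {"title": "Engineer", "isCurrent": false, "employer": {"name": "Google"}}]'
)


def current_titles(cell: str) -> list[str]:
    """Return the titles of employments flagged as current."""
    # An empty or missing cell is treated as no employment history.
    entries = json.loads(cell) if cell else []
    return [e.get("title", "") for e in entries if e.get("isCurrent")]


print(current_titles(employments_cell))
```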
Query articles from the Diffbot Knowledge Graph. To filter the data, supply a DQL statement to the dql property in the WHERE clause. (Learn how to write DQL)
-- Get M&A articles in English published within the last 7 days
SELECT * FROM Article
WHERE dql = 'type:Article categories.name:"Acquisitions, Mergers and Takeovers" date<7d language:en sortBy:date'
Key Fields Returned:
- title: Article title
- text: Article text
- author: Article author
- pageUrl: Article URL
- date.str: Published date
- publisherCountry: Publishing country
- tags: Tagged entities
For the full ontology, see https://docs.diffbot.com/docs/ont-article
Query places from the Diffbot Knowledge Graph. To filter the data, supply a DQL statement to the dql property in the WHERE clause. (Learn how to write DQL)
-- Get every city in Texas, United States
SELECT * FROM Place
WHERE dql = 'type:Place types:"City" location.country.name:"United States" location.region.name:"Texas"'
Key Fields Returned:
- name: Name of place
- summary: One-liner
- description: Many-liner
- population: Population of a place
- location.latitude / location.longitude: Lat/long coordinates of a place
For the full ontology, see https://docs.diffbot.com/docs/ont-place
├── Account.rsd # Account endpoint schema
├── Article.rsd # Article entity schema
├── Organization.rsd # Organization entity schema
├── Person.rsd # Person entity schema
├── Place.rsd # Place entity schema
├── sys_indexes.rsd # System metadata table
├── ConnectionProperties.json # Connection property definitions
├── META-INF/
│ └── MANIFEST.MF # Profile metadata and checksums
└── README.md
- Read-only: This profile only supports SELECT operations (no INSERT/UPDATE/DELETE). You're querying a database, not writing to one!
- Beta status: Some features may change in future versions.
- Paging disabled: During the beta, each query returns a maximum of 25 records. Reach out to jerome@diffbot.com to have this limit lifted.
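Since the profile is read-only, an application that accepts user-supplied SQL may want to reject anything other than SELECT before sending it to the driver. The check below is a deliberately naive sketch: it skips leading `--` comment lines and looks only at the first keyword. The name `is_select_only` is hypothetical, and this is not a SQL parser or a security boundary.

```python
import re


def is_select_only(sql: str) -> bool:
    """True when the first keyword after leading '--' comments is SELECT."""
    stripped = sql.lstrip()
    # Skip leading "--" comment lines such as those in the examples above.
    while stripped.startswith("--"):
        _, _, stripped = stripped.partition("\n")
        stripped = stripped.lstrip()
    return re.match(r"select\b", stripped, re.IGNORECASE) is not None


print(is_select_only("-- Get every city in Texas\nSELECT * FROM Place"))
print(is_select_only("DELETE FROM Place"))
```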
This profile is provided for use with licensed CData products. Diffbot API usage is subject to Diffbot's terms of service.