API Documentation

The Sindice API provides programmatic access to its search capabilities. Please refer to the API forum for support questions.

Query services (v2)

There are two types of search in the new API: term search and advanced search.
In general these APIs are based on the OpenSearch 1.1 specification.

  • the q parameter specifies the query
  • the page parameter (mandatory) specifies the result page. Pages are 1-indexed, so the first page is 1, the second is 2 and so on.
  • the qt parameter must be either "term" or "advanced" to select between Term Search and Advanced Search.

Example:

http://api.sindice.com/v2/search?q=Rome&qt=term&page=1
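
The parameters above can be assembled with a standard URL encoder. The following is a minimal Python sketch, not an official client: the helper name build_search_url is hypothetical, and only the endpoint and the q, qt and page parameters come from the text above.

```python
from urllib.parse import urlencode

# Endpoint taken from the example above.
SEARCH_ENDPOINT = "http://api.sindice.com/v2/search"


def build_search_url(query, query_type="term", page=1):
    """Return a v2 search URL (hypothetical helper, not part of the API)."""
    if query_type not in ("term", "advanced"):
        raise ValueError('qt must be "term" or "advanced"')
    if page < 1:
        raise ValueError("pages are 1-indexed")
    params = {"q": query, "qt": query_type, "page": page}
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return SEARCH_ENDPOINT + "?" + urlencode(params)


print(build_search_url("Rome"))
```

Passing a query that contains a URI, such as "Giovanni Tummarello http://richard.cyganiak.de/foaf.rdf#cygri", goes through the same encoder, so the URI is percent-encoded without extra work.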

Term Search

Term Search allows you to retrieve documents that are related to keywords and/or URIs.
To activate Term Search, use qt=term in the query parameters. Example:

http://api.sindice.com/v2/search?q=Rome&qt=term

Currently, Term Search offers better ranking and is in general more suitable when searching for user-provided strings.
Term Search automatically recognizes URIs in the query and matches them against URIs inside the RDF. Example:

http://api.sindice.com/v2/search?q=Giovanni+Tummarello+http%3A%2F%2Frichard.cyganiak.de%2Ffoaf.rdf%23cygri&qt=term&page=1

For the complete documentation of the Term Search query language see http://sindice.com/developers/querylanguages.

Advanced Search

Advanced Search allows the use of triple-level expressions in the query. Example:

http://api.sindice.com/v2/search?q=*+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fname%3E+%22Renaud+Delbru%22&qt=advanced&page=1

This query locates RDF documents that contain resources whose "foaf:name" is "Renaud Delbru".
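
The encoded URL above is easier to read when it is rebuilt from the unencoded triple pattern. This is a small Python sketch under the assumption that the pattern is simply the decoded value of q; keeping the wildcard unescaped is done only to reproduce the documented URL, and the source does not say whether the server requires it.

```python
from urllib.parse import urlencode

# Triple pattern from the example above: any subject (*), the foaf:name
# predicate, and the literal "Renaud Delbru".
pattern = '* <http://xmlns.com/foaf/0.1/name> "Renaud Delbru"'

# safe="*" leaves the wildcard unescaped, as in the documented URL;
# other reserved characters are percent-encoded and spaces become "+".
query_string = urlencode({"q": pattern, "qt": "advanced", "page": 1}, safe="*")
advanced_url = "http://api.sindice.com/v2/search?" + query_string
print(advanced_url)
```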

For the complete documentation of the Advanced Search query language see http://sindice.com/developers/querylanguages.

Result formats

You can negotiate the content and retrieve three different formats:

  • json: curl -H "Accept: application/x-json" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
  • rdf: curl -H "Accept: application/rdf+xml" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
  • atom: curl -H "Accept: application/atom+xml" "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1"
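
The same content negotiation can be done from a script. The sketch below uses Python's standard urllib and only builds the request, so it runs without network access; the helper make_request and the ACCEPT_TYPES table are hypothetical names, while the media types are the ones listed above.

```python
from urllib.request import Request

# Media types listed above, keyed by a short format name.
ACCEPT_TYPES = {
    "json": "application/x-json",
    "rdf": "application/rdf+xml",
    "atom": "application/atom+xml",
}


def make_request(url, result_format="json"):
    """Build a request whose Accept header selects the result format."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[result_format]})


request = make_request(
    "http://api.sindice.com/v2/search?q=gabriele&qt=term&page=1", "atom"
)
print(request.get_header("Accept"))
# To actually send it: urllib.request.urlopen(request).read()
```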

The basic format first carries a set of top-level fields:

  • generation time of this search
  • base url, without the specific page
  • number of total results
  • url of this result page
  • url of previous, next, first and last page of results
  • link to the alternate HTML representation of this page on the regular Sindice website
  • author field, Sindice.com
  • number of items per page
  • starting index in this page
  • a Query object with fields that allow this query to be replayed (search terms, page, role)

Then there is a list of entries, each of which has:

  • title, a list of the document labels in JSON and RDF, and a single field with comma-separated strings in Atom (the Atom specification allows only one title per entry)
  • formats, a list, for example RDFa and Microformat
  • content, a simple string such as: "13 triples in 1000 bytes"
  • link, the document URI
  • updated, the document modification date

Specifically, a JSON-encoded object looks like this:

{
 "updated": "2008/06/03 18:27:29 +0100",
 "base": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term",
 "totalResults": 211,
 "search": "http://www.sindice.com/opensearch.xml",
 "self": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
 "previous": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=",
 "title": "Sindice search: gabriele",
 "last": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=22",
 "alternate": "http://sindice.com/v2/search?q=gabriele\u0026qt=term",
 "author": "Sindice.com",
 "first": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
 "itemsPerPage": 10,
 "startIndex": 1,
 "next": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=2",
 "query":
  {
   "role": "request",
   "startPage": 1,
   "searchTerms": "gabriele"
  },
 "link": "http://api.sindice.com/v2/search?q=gabriele\u0026qt=term\u0026page=1",
 "entries":
  [
   {
    "title": ["Gabriele Albertini"],
    "formats": ["RDF"],
    "content": "183 triples in 32484 bytes",
    "link": "http://dbpedia.org/resource/Gabriele_Albertini",
    "updated": "2008/05/23"
   },
   {
    "title": ["Gabriele Paonessa"],
    "formats": ["RDF"],
    "content": "111 triples in 16153 bytes",
    "link": "http://dbpedia.org/resource/Gabriele_Paonessa",
    "updated": "2008/05/23"
   },
  ...
  ]
}
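
Once decoded, the response is an ordinary dictionary. The sketch below parses a trimmed copy of the object above with Python's json module and pulls the triple count out of the content string; the regular expression assumes the "N triples in M bytes" wording seen in the examples, which the source does not guarantee, and summarize is a hypothetical helper.

```python
import json
import re

# Trimmed copy of the response shown above, with the elided entries removed.
sample = """
{
 "totalResults": 211,
 "itemsPerPage": 10,
 "startIndex": 1,
 "entries": [
  {"title": ["Gabriele Albertini"], "formats": ["RDF"],
   "content": "183 triples in 32484 bytes",
   "link": "http://dbpedia.org/resource/Gabriele_Albertini",
   "updated": "2008/05/23"},
  {"title": ["Gabriele Paonessa"], "formats": ["RDF"],
   "content": "111 triples in 16153 bytes",
   "link": "http://dbpedia.org/resource/Gabriele_Paonessa",
   "updated": "2008/05/23"}
 ]
}
"""

# Assumed wording of the content field, based on the examples only.
CONTENT_PATTERN = re.compile(r"(\d+) triples in (\d+) bytes")


def summarize(entry):
    """Return (first label, link, triple count) for one entry."""
    match = CONTENT_PATTERN.fullmatch(entry["content"])
    triples = int(match.group(1)) if match else None
    return entry["title"][0], entry["link"], triples


results = json.loads(sample)
summaries = [summarize(entry) for entry in results["entries"]]
for label, link, triples in summaries:
    print(label, link, triples)
```

Because title is a list, the sketch takes only the first label; a real client may want to show all of them.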

The format closely matches the OpenSearch format, so refer to that specification for further details. There are only two differences: the title field in each entry is a list (a document can have different labels), and the formats field is a list of the formats found in one page (for example, RDFa and microformats).

Example ATOM format:

<?xml version="1.0" encoding="iso-8859-1"?>
<feed xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
      xmlns:sindice="http://sindice.com/vocab/fields#"
      xmlns="http://www.w3.org/2005/Atom">
  <title>Sindice search: gabriele</title>
  <link href="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term"/>
  <updated>2008-06-03T19:50:39+01:00</updated>
  <author>
    <name>Sindice.com</name>
  </author>
  <id>http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term</id>
  <opensearch:totalResults>211</opensearch:totalResults>
  <opensearch:startIndex>1</opensearch:startIndex>
  <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
  <opensearch:Query role="request" startPage="1" searchTerms="gabriele"/>
  <link href="http://sindice.com/search?page=1&amp;q=gabriele&amp;qt=term"
        rel="alternate" type="text/html"/>
  <link href="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term"
        rel="first" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?q=gabriele&amp;qt=term"
        rel="previous" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=2&amp;q=gabriele&amp;qt=term"
        rel="next" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=22&amp;q=gabriele&amp;qt=term"
        rel="last" type="application/atom+xml"/>
  <link href="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term"
        rel="self" type="application/atom+xml"/>
  <link href="http://www.sindice.com/opensearch-term.xml"
        rel="search" type="application/opensearchdescription+xml"/>
  <entry>
    <title>Gabriele Albertini</title>
    <link href="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <id>http://dbpedia.org/resource/Gabriele_Albertini</id>
    <updated>2008-05-23T00:00:00+01:00</updated>
    <sindice:format>RDF</sindice:format>
    <content>183 triples in 32484 bytes</content>
  </entry>
  <entry>
    <title>Gabriele Paonessa</title>
    <link href="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <id>http://dbpedia.org/resource/Gabriele_Paonessa</id>
    <updated>2008-05-23T00:00:00+01:00</updated>
    <sindice:format>RDF</sindice:format>
    <content>111 triples in 16153 bytes</content>
  </entry>
</feed>

It is a simple ATOM file, plus the OpenSearch schema, plus a single additional tag that carries information about the document format. You should be able to parse this easily with any XML parser.
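
As an illustration of the parsing step, here is a sketch using Python's standard xml.etree.ElementTree on a trimmed copy of the feed above. The namespace URIs are copied from the example; the prefix names in the NS mapping are local choices and need not match the prefixes used in the feed.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the feed above.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
    "sindice": "http://sindice.com/vocab/fields#",
}

# Trimmed copy of the example feed.
feed_xml = """<feed xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
      xmlns:sindice="http://sindice.com/vocab/fields#"
      xmlns="http://www.w3.org/2005/Atom">
  <opensearch:totalResults>211</opensearch:totalResults>
  <entry>
    <title>Gabriele Albertini</title>
    <link href="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <sindice:format>RDF</sindice:format>
    <content>183 triples in 32484 bytes</content>
  </entry>
</feed>"""

feed = ET.fromstring(feed_xml)
total = int(feed.findtext("opensearch:totalResults", namespaces=NS))
entries = [
    {
        "title": entry.findtext("atom:title", namespaces=NS),
        "link": entry.find("atom:link", NS).get("href"),
        # An entry may carry several sindice:format elements.
        "formats": [f.text for f in entry.findall("sindice:format", NS)],
    }
    for entry in feed.findall("atom:entry", NS)
]
print(total, entries)
```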

The RDF representation defines the base search URI as a search:Result object, which has many search:resultPage objects, each one having many search:Entry objects. The other fields should be self-explanatory and mirror those of the other formats.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:fields="http://sindice.com/vocab/fields#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns="http://sindice.com/vocab/search#">
  <Results rdf:about="http://api.sindice.com/v2/search?q=gabriele&amp;qt=term">
    <dc:title>Sindice search: gabriele</dc:title>
    <dc:date>2008-06-03T19:54:11+01:00</dc:date>
    <dc:creator>Sindice.com</dc:creator>
    <totalResults>211</totalResults>
    <itemsPerPage>10</itemsPerPage>
    <terms>gabriele</terms>
    <firstPage rdf:resource="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term"/>
    <lastPage rdf:resource="http://api.sindice.com/v2/search?page=22&amp;q=gabriele&amp;qt=term"/>
    <page rdf:resource="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term"/>
    <opensearchDescription rdf:resource="http://www.sindice.com/opensearch.xml"/>
  </Results>
  <ResultPage rdf:about="http://api.sindice.com/v2/search?page=1&amp;q=gabriele&amp;qt=term">
    <startIndex>1</startIndex>
    <previousPage rdf:resource="http://api.sindice.com/v2/search?q=gabriele&amp;qt=term"/>
    <nextPage rdf:resource="http://api.sindice.com/v2/search?page=2&amp;q=gabriele&amp;qt=term"/>
    <htmlPage rdf:resource="http://sindice.com/search?page=1&amp;q=gabriele&amp;qt=term"/>
    <entry rdf:resource="#result1"/>
    <entry rdf:resource="#result2"/>
    ...
  </ResultPage>
  <Entry rdf:about="#result1">
    <dc:title>Gabriele Albertini</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <dc:created>2008-05-23T00:00:00+01:00</dc:created>
    <fields:format>RDF</fields:format>
    <content>183 triples in 32484 bytes</content>
    <rank>1</rank>
  </Entry>
  <Entry rdf:about="#result2">
    <dc:title>Gabriele Paonessa</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <dc:created>2008-05-23T00:00:00+01:00</dc:created>
    <fields:format>RDF</fields:format>
    <content>111 triples in 16153 bytes</content>
    <rank>2</rank>
  </Entry>
 ...
</rdf:RDF>
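
For quick scripts, the serialization above can be read as plain XML. The sketch below does that with Python's standard library on a trimmed copy of the example; it relies on this particular element layout, so a general RDF/XML parser would be the more robust choice for anything beyond a quick look.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {
    "search": "http://sindice.com/vocab/search#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed copy of the example document.
rdf_xml = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://sindice.com/vocab/search#">
  <Entry rdf:about="#result1">
    <dc:title>Gabriele Albertini</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Albertini"/>
    <rank>1</rank>
  </Entry>
  <Entry rdf:about="#result2">
    <dc:title>Gabriele Paonessa</dc:title>
    <link rdf:resource="http://dbpedia.org/resource/Gabriele_Paonessa"/>
    <rank>2</rank>
  </Entry>
</rdf:RDF>"""

root = ET.fromstring(rdf_xml)
# Collect (rank, title, document URI) tuples and order them by rank.
ranked = sorted(
    (
        int(entry.findtext("search:rank", namespaces=NS)),
        entry.findtext("dc:title", namespaces=NS),
        entry.find("search:link", NS).get("{%s}resource" % RDF_NS),
    )
    for entry in root.findall("search:Entry", NS)
)
for rank, title, uri in ranked:
    print(rank, title, uri)
```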

Integrating JSON in your script

If you want, you can add an additional argument called callback to the request, which causes the JSON output to be wrapped in a call to a function with the name you choose.
This allows clean integration of the Sindice results in your webpage, for example:

<script type="text/javascript"
        src="http://api.sindice.com/v2/search?q=mike&qt=term&format=json&callback=showSindiceResults"></script>

Notice that, to force the rendering of JSON output, we added an additional parameter, format. It also accepts the values atom and rdfxml.
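
Outside a browser, a callback-wrapped response has to be unwrapped before it can be decoded. The Python sketch below shows one way to do that; the wrapped string is an assumed shape (the function name followed by the JSON in parentheses), since the source does not show an actual wrapped response, and unwrap_jsonp is a hypothetical helper.

```python
import json
import re

# Assumed shape of a wrapped response; the source does not show one.
wrapped = 'showSindiceResults({"totalResults": 211, "entries": []});'

# A function name, an opening parenthesis, the payload, a closing
# parenthesis and an optional semicolon.
JSONP_PATTERN = re.compile(
    r"^\s*([A-Za-z_$][\w$]*)\s*\((.*)\)\s*;?\s*$", re.DOTALL
)


def unwrap_jsonp(text):
    """Split 'name(<json>)' into the callback name and the decoded payload."""
    match = JSONP_PATTERN.match(text)
    if match is None:
        raise ValueError("not a callback-wrapped response")
    return match.group(1), json.loads(match.group(2))


callback_name, payload = unwrap_jsonp(wrapped)
print(callback_name, payload["totalResults"])
```

In most non-browser clients it is simpler to omit callback and request plain JSON.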

Other API versions

Currently, our API version is 2, with base address http://api.sindice.com/v2/
As new API versions are released, the old ones will be kept at their existing locations.

API v1

The previous version of the Sindice API is still available. It offers 3 query types, which mimic the old Sindice search queries.

V1 Result Formats

The result format can be selected in two ways: by HTTP content negotiation or by an optional format query parameter. The default format is HTML.

Content negotiation examples:

  • To get results in RDF:
    curl -H "Accept: application/rdf+xml" http://api.sindice.com/v1/lookup?keyword=berlin
  • To get results in JSON:
    curl -H "Accept: application/json" http://api.sindice.com/v1/lookup?keyword=berlin
  • To get results in Plain text:
    curl -H "Accept: text/plain" http://api.sindice.com/v1/lookup?keyword=berlin
  • To get results in XOXO:
    curl -H "Accept: text/html" http://api.sindice.com/v1/lookup?keyword=berlin

V1 Query Parameters

Instead of using a single query type parameter, the V1 API uses multiple parameters. This means that you can specify more than one argument, and they are tried in order. Thus, specifying both keyword and url means that you will get results only for the former.
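
The precedence rule can be pictured as a first-match lookup. The Python sketch below only models the behavior described above and is not the server's implementation; PARAMETER_ORDER contains just the two parameters named in the text, and effective_parameter is a hypothetical helper.

```python
# Only the relative order of keyword and url is stated in the text above;
# any further parameters and their positions are not covered here.
PARAMETER_ORDER = ("keyword", "url")


def effective_parameter(params):
    """Return the (name, value) pair that would be used, or None."""
    for name in PARAMETER_ORDER:
        if params.get(name):
            return name, params[name]
    return None


print(effective_parameter({"url": "http://example.org/", "keyword": "berlin"}))
```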

Query Limits

Sindice currently limits the number of result pages for each query to 100. For special needs you can refer to our developer forum or contact us directly.
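
A client that walks through result pages can compute in advance how many pages it may request. The sketch below assumes the totalResults and itemsPerPage values from a v2 response and the 100-page limit stated above; reachable_pages is a hypothetical helper name.

```python
import math

MAX_RESULT_PAGES = 100  # limit stated above


def reachable_pages(total_results, items_per_page):
    """Number of result pages a client can request under the page limit."""
    pages = math.ceil(total_results / items_per_page)
    return min(pages, MAX_RESULT_PAGES)


print(reachable_pages(211, 10))
```

With 211 results and 10 items per page this gives 22 pages, which agrees with the page=22 link in the "last" field of the JSON example.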