Any23 v0.2 Released

We are proud to announce a new release of Any23 - Anything to Triples
http://developers.any23.org/
Any23 is a Java library that parses RDF from a variety of Web document formats.
The currently supported input formats are RDFa, RDF/XML, Turtle, N3, N-Triples,
and a number of Microformats.
Any23 is an Open Source project that originated from code written within the Sindice project
and is now used both inside Sindice and in related projects such as Sig.Ma.
Any23 comes with a handy command-line tool for parsing RDF and converting between formats.
We have also set up a demo service where you can try Any23 online and use a REST API to convert
between different RDF formats, similar in spirit to triplr.org:
http://any23.org/
The major new features in this release are:
  • Redesigned Java API
      - Input from string, stream, file, or URI
      - Allow choosing which extractors to use
      - Report origin of triples (document/extractor) to client processors
      - Various processors/serializers for extracted triples
  • Added flexible command-line tool for easy testing
  • Vastly improved website and documentation
  • Media type and encoding detection via Apache Tika
  • Switched RDF library from Jena to Sesame
  • Added Maven build
  • Better RDF extraction from Microformats
  • Extractors come with example files that document typical input and output
  • Major refactoring
  • Lots and lots of bugfixes
The following people have contributed to this release:
Michele Mostarda and Davide Palmisano (FBK, Trento, Italy, Web of Data Unit (WED));
Richard Cyganiak and Jurgen Umbrich (DERI, NUI Galway, Ireland);
Michele Catasta (EPFL, Lausanne, Switzerland), Giovanni Tummarello.
This release is the first result of the joint effort between Fondazione Bruno Kessler and DERI,
which recently started working together on Sindice. We strongly believe that Any23 can benefit from the wider
Open Source community, especially considering the license under which it has been released.
We expect the new Any23 v0.2, now integrated into the Sindice ingestion pipeline,
to improve the quality of the indexed data.
This release adds to the other Open Source components of Sindice, notably the Semantic Information Retrieval Index (SIREN, http://siren.sindice.com).
Post filed under Sindice.

