Short info about the trip
- Venue title: Semantic Technology and Business Conference
- Related project: SFB
- Venue location: San Jose, California, USA
- Venue date: 2014-08-20
- Website: http://semtechbizsj2014.semanticweb.com/index.cfm
- Short description: 10th edition of the SemTech conference. Participants, mainly from industry but also from research, meet to exchange experiences with Semantic Web technologies. Around 150 participants (SemTech only, not counting the co-located NoSQL conference).
- Outcome: Main take-home messages: Apache Spark for fast, in-memory, cluster-based analytics is gaining momentum as an alternative to Hadoop and MapReduce. The main graph-based data models discussed seem to be RDF (surprise!) and the property-graph model as used by Neo4j and SAP (relations as first-class citizens to which arbitrary properties such as weights can be attached); Oracle showed in a presentation how to map between the two models, (partly) quite trivially. Judging from the workshop "RDF as a Universal Healthcare Exchange Language - 2nd Annual", other people also recommend using RDF for representing medical information, as we do in the SFB project.
- Other participants: Phil Archer (W3C), Jeff Z. Pan (Aberdeen), Dan Brickley (Google), Thanh (San Jose, currently on leave), Sudhir (Stanford), Sandro Hawke (W3C), Lukasz Porwol (Insight, DERI), Peter Mika (Yahoo), Peter Haase (FluidOps), Bryan Thompson (Systap)...
- Good experience: Great overview of companies and topics in the field of Semantic Technologies. Great for networking.
- Bad experience: Unfortunately, most talks did not go into much technical detail; a lot of buzzword dropping.
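The mapping between the property-graph model and RDF mentioned under "Outcome" can be illustrated with a minimal sketch. It is not necessarily the mapping Oracle presented; it assumes standard RDF reification for edges that carry properties, and the namespace http://example.org/ as well as the function name edge_to_triples are made up for this example.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def edge_to_triples(edge_id, source, label, target, properties):
    """Map one property-graph edge to a list of RDF-style triples.

    The edge itself becomes a plain triple. If the edge carries
    properties (e.g. a weight), it is additionally described as an
    rdf:Statement so that the properties have a subject to attach to.
    """
    s, p, o = EX + source, EX + label, EX + target
    triples = [(s, p, o)]
    if properties:
        e = EX + edge_id
        triples += [
            (e, RDF_NS + "type", RDF_NS + "Statement"),
            (e, RDF_NS + "subject", s),
            (e, RDF_NS + "predicate", p),
            (e, RDF_NS + "object", o),
        ]
        # Property values are kept as plain Python literals here.
        for key, value in sorted(properties.items()):
            triples.append((e, EX + key, value))
    return triples


triples = edge_to_triples("e1", "alice", "knows", "bob", {"weight": 0.8})
for triple in triples:
    print(triple)
```

An edge without properties maps to a single triple, which is the trivial part of the mapping; the reification overhead only appears once properties are attached to a relation.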
Phil Archer gave a keynote about "Semantic Web - 10 years of Achievement". He argued against the claim some people make that "RDF is an academic solution looking for a problem". As examples, he mentioned the elaborate use of Linked Data by the BBC and Sarven Capadisli's Linked Statistical Data Analysis tool. Another example is JSON-LD, a lightweight Linked Data format based on JSON that can therefore be consumed by many applications (not necessarily leveraging Semantic Web features). Phil also highlighted HTML5 Web Components for seamlessly embedding components into websites. He also recommended http://semanticweb.com/ as a source for Semantic Web-related information, e.g., jobs. Side note: SemanticWeb.com seems to be becoming a more active SW reference than http://semanticweb.org/, which (as far as I know) is hosted by AIFB and unfortunately has suffered some severe spam attacks.
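To illustrate the point that JSON-LD can be consumed by applications that know nothing about the Semantic Web, here is a small sketch. The document is a made-up example describing this conference with schema.org terms; it was not shown in the keynote. It is parsed with Python's standard json module only, without any JSON-LD or RDF library.

```python
import json

# A hypothetical JSON-LD document; the "@context" and "@type" keys carry
# the Linked Data semantics, everything else is ordinary JSON.
doc_text = """
{
  "@context": "http://schema.org",
  "@type": "Event",
  "name": "Semantic Technology and Business Conference",
  "startDate": "2014-08-20",
  "location": {"@type": "Place", "name": "San Jose, California, USA"}
}
"""

doc = json.loads(doc_text)

# A plain JSON consumer simply reads the fields it cares about.
print(doc["name"])
print(doc["location"]["name"])

# The Linked Data keywords are just additional keys it can ignore.
linked_data_keys = sorted(key for key in doc if key.startswith("@"))
print(linked_data_keys)
```

A Linked Data-aware consumer would additionally resolve the context to turn keys such as "name" into full IRIs; the plain consumer above never needs to.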
Dan Brickley gave a status report on schema.org. For instance, schema.org includes owl:sameAs and the GoodRelations ontology, and schema.org annotations can be added to emails. They plan to cooperate more with Wikidata: Wikidata is seen as a great source for lists of entities, whereas schema.org does not intend to be a source of lists but rather a dictionary of the most important, concrete terms. The main design goal remains to make it easier for web developers to publish schema.org data. The killer app remains search engines.
Ramanathan V. Guha, from Google, Inc., gave a follow-up keynote on schema.org. Three years after its launch, over 5 million Internet domains and over 20% of the pages on the web are using schema.org markup; thus the Semantic Web has finally reached web scale, with structured data directly available on the web. Next big thing: Scientific Data Publishing with schema.org. His final remark: this new medium of sharing structured data over the web needs time to evolve. The same happened to other media: the book, where covers and page numbers were only added after people tried to carry books around on horses; and TV, where broadcasting companies at first would simply have scripts read out loud.
The conference was co-located with the NoSQL conference. Both conferences shared an exhibition. Example companies represented: FluidOps, Cambridge Semantics, MarkLogic (quite prominent at the conference; mentioned in several presentations as backend provider), MapR (current employer of Michael Hausenblas, formerly from DERI), TopQuadrant, XSB, Postgres, PoolParty (represented by an American partner of the Semantic Web Company), Couchbase, Neo4j, Oracle, MongoDB, 3 Round Stones...
It was not always easy to clearly assign a company to either SemTech or NoSQL. Their pitch typically would be: "We can handle all kinds of data, be it documents, key-values, graphs or RDF; we allow flexible visualisations; we scale ETL and analytics using Hadoop or Spark."
The two sessions I liked the most were:
"Big Data Modeling", which talked about using OWL ontologies as the canonical data model within companies, to be mapped on demand to various data representations such as attribute calls in Ruby and key-value store data. Unfortunately, most talks, including this one, did not go into much technical detail.
"RDF as a Universal Healthcare Exchange Language - 2nd Annual", with a talk by Michel Dumontier from Stanford about using RDF as a basis for translational science in healthcare, e.g., for "Identifying human drug targets with animal model phenotypes".
The Yosemite Manifesto, presented by David Booth, argues for RDF as a Universal Healthcare Exchange Language and has been signed by several researchers.
Other interesting talks:
- "Bringing Coherence to Cognition: Flexible Semantics, Deep Reasoning, and Explanations" by Benjamin Grosof, Coherent Knowledge Systems, LLC
- "Cognitive Distance based on Kolmogorov Complexity". Example: Relatedness of saddle - cowboy - movie. According to Google, cowboy - movie are more related than cowboy - saddle. According to Kolmogorov complexity, cowboy and saddle are more related.
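How a distance based on Kolmogorov complexity can be computed in practice may be sketched as follows. Kolmogorov complexity itself is not computable, so a common approximation replaces it with the length of a compressed string, which gives the normalized compression distance (NCD). This is an assumption about the general technique, not a reconstruction of what the speaker did; the zlib compressor and the two sample texts are chosen purely for illustration, and the sketch does not reproduce the saddle/cowboy/movie result.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Approximate the Kolmogorov complexity of data by its zlib-compressed length."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower values mean 'more related'."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Made-up sample texts; real use would compare larger, natural documents.
x = b"the cowboy put the saddle on the horse " * 20
y = b"stock markets closed higher on monday " * 20

print(ncd(x, x))  # a text compared with itself: small distance
print(ncd(x, y))  # two unrelated texts: larger distance
```

The intuition is that if two texts share structure, compressing them together costs little more than compressing one of them alone.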
This event was attended by: Benedikt Kämpgen