VoCampSunnyvale2009

What

The sixth ever VoCamp, to be held in Sunnyvale and organized with the help of Yahoo. See WhatIsVoCamp for a basic intro to the concept.

In addition, it's good to know that VoCamps are intended for hands-on developer work on vocabularies and other issues related to interoperability of semantic applications. A minimal level of understanding of semantic technologies is required because the program will not feature tutorials on ontology engineering or semantic web standards.

When

Thu 18th - Fri 19th, June, 2009

The first day is a half day (afternoon only); the second day is a full day.

This VoCamp is just after SemTech 2009, the Semantic Technology Conference taking place in San Jose.

Programme

Detailed programme to be determined later. The first half-day will be used for introductions using lightning talks and determining the topics to be worked on during the second day. The second day will be spent working in small groups (3-6 people).

For introductions you are welcome to use at most two slides BUT only if you send them in advance. (We are a fairly large group so we have to get a bit more organized.) Tell us who you are but also what you would like to achieve! Email your slides to pmika(at)yahoo-inc.com

Question re. slides: What format? Jpg? Pdf? Link to slideshare? Powerpoint? OpenOffice Presentation?

Thursday 18th June 2009

13:00 Arrivals
13:15 Introduction by participants
15:00 Snacks
15:30 Formation of Working Groups
18:00 Close

Friday 19th June 2009

8:30 Breakfast
9:00 Overview of the first day and planning for the second
9:00-12:00 Working Groups
12:00-13:00 Lunch
17:00 Reporting from Working Groups and discussion
18:00 Close

Where

Yahoo! Sunnyvale campus
700 First Avenue Sunnyvale, CA 94089
Building E, Classrooms 9/10 (1st Floor)
****Please note that VoCamp will be held in Building E, which is across the street (Mathilda) from the main campus (Building D).****

PARKING:
You should not have to swipe a card key to enter the parking lot. Park in any available spot that is not marked handicapped, carpool, or otherwise reserved.

REGISTRATION:
Please tell the receptionist you are here for VoCamp. If you registered prior to 9am on Wed June 17, your name should be with the receptionist in the lobby.

Once you enter through the lobby, walk past the elevators, and turn left. Classrooms 9/10 are at the end of the hall.

CONTACT INFORMATION:

If you have any difficulties, please contact:
Melinda Chung
Evan Goer

Accommodation

Hotels

Mention that you are with Yahoo! to obtain the discounted rate.

Best Western Silicon Valley
600 N Mathilda Ave Sunnyvale CA
(408) 735-7800
Yahoo! Rate: $90/night
Included: Breakfast, High Speed Internet Access

Larkspur Landing Sunnyvale
748 N Mathilda Ave, Sunnyvale, CA
(408) 733-1212
Yahoo! Rate: $124/night
Included: Breakfast, High Speed Internet Access

Sheraton Sunnyvale Hotel
1100 N. Mathilda Ave, Sunnyvale, CA
(408) 542-8207
Yahoo! Rate: $179/night
Included: High Speed Internet Access

Motels

No discounted Yahoo! rate

Travel Inn
590 North Mathilda Ave.
Sunnyvale, CA 94086
(408) 737-1177

Vagabond Inn
816 W. Ahwanee Ave
Sunnyvale, CA
(408) 734-4607

Quality Inn
940 Weddell Drive
Sunnyvale, CA 94089
(408) 734-3742

Travel

The nearest airport is San Jose International (SJC), followed by the even-more-international San Francisco Airport (SFO) and Oakland International Airport (OAK).

Carpooling

Looking for a car

AlexandrePassant looking for a car for 2 people
- from and to SanJosé (SemTech) on the 18th morning and from SanJosé on the 19th morning
- to SF on the 19th evening

Offerings

How Much

The VoCamp event itself is free, although participants will need to pay for their own travel, accommodation and food. The venue is sponsored by Yahoo. See also VoCampSupporters...

Who

Organisers

Peter Mika
Melinda Chung
Evan Goer

and many other friendly folks at Yahoo!.

Participants

Participants, please list your name and any areas of interest related to vocabularies and interoperability of Semantic Web applications (note: this is non-binding; it's also fine to list none). NB Only 40 places available this time, first come first served! If you sign up, also subscribe to the VoCamp mailing list so that we can reach you in case of changes. Please sign up by June 16 so we can ensure that everyone is able to get past security and into parking (if needed).

PeterMika is thrilled to be again co-organizing a VoCamp after the success of Ibiza. I would like to discuss SearchMonkey and the new SearchMonkey Objects start page and in general our role in facilitating convergence around vocabularies on the Semantic Web.
NovaSpivack is looking forward to this event. I would like to discuss Twine and where we are heading with the service, the Semantic Web in general, and how to move things forward as a community.
TomGruber is interested in fostering an Internet ecosystem of structured data and services, and bringing the value of intelligent services to the public.
Jamie Taylor: I work on Freebase and I'm interested in community generated data/models. I would be excited to collaborate with people on creating vocabulary to vocabulary mappings in Freebase (using something like: http://vocabulary.freebase.com/view/base/vocabulary/views/ontology)
NickCox is looking forward to his first VoCamp. I'm the Product Lead for SearchMonkey but have a wider interest in Vocabulary definition and standardisation. The Semantic Web is gaining in popularity all the time. Once hooked, the 'how do I get started' question should be a simple one to answer. That's not always the case today, and it certainly isn't a quick answer. I'd like to discuss how we can break down these entry barriers. Do you feel our SearchMonkey Objects move helps?
Alexandre Passant from DERI Galway. Interested in Social Web and Semantic Web convergence (co-author of the SIOC and CommonTag vocabularies). Would like to discuss interoperability between social data as well as developing mash-ups / demos using this data. Also interested in helping existing applications to enable such vocabularies in their pages (w/ RDFa) and identifying what's missing regarding their needs in terms of vocabularies.
Julie Letierce, M.Sc. student at DERI Galway, interested in outreaching Semantic Web technologies.
Alex Cozzi with Yahoo! research. I am involved with SearchMonkey and Semantic Web initiatives at Yahoo!
Ben Ward works at the Yahoo Developer Network and is an administrator at microformats.org, where he helps manage the community, writes microformat specifications and parsing patterns, and co-organises the weekly Microformats Dinners events in San Francisco.
Paul Tarjan is excited about working on the "code documentation" vocabulary that we used for YUI [1]. I want to integrate this into javadoc, doxygen, etc. I'm also the Tech Lead for SearchMonkey so I'm happy to help there as well, as long as I still get to do my code vocab stuff :)
Nicolas Torzec, Scientist at Yahoo!, is working on acquiring and combining different forms of knowledge (entities, facts...) from several types of sources (structured, semi-structured, unstructured), and integrating them centrally in a large-scale knowledge base providing a single unified representation convenient for querying and consumption.
Marco Neumann KONA
Micah Dubinko, Lead Engineer at Mark Logic. I'm still building search applications, which increasingly depend on (semi-)structured data.
Ian Davis, CTO at Talis. I'd like to discuss Open Vocab, Data Incubator, Talis Connected Commons and the Talis Platform
Leigh Dodds, Programme Manager at Talis. I'd like to discuss the Talis Connected Commons, the Talis Platform and some personal projects I'm working on, e.g. modelling NASA launch data
Paul Miller, Founder at the Cloud of Data.
Mike Dean is only available on Thursday.
Felix Van de Maele, CEO at Collibra. I can't make it unfortunately. Damien Trog will come in my place.
Carl Hewitt, Emeritus, MIT EECS
Steve Williams is a DataPortability Project steering group member, BayCHI volunteer, contractor to Digg.com, and independent web developer: We're interested in developing a "social voting" vocabulary for RDFa and/or microformats, so digg counts and other social gestures can be marked up in a way that's not specific to Digg.
Tom Wilson, Web Software Architect, Lucile Packard Children's Hospital
Hwee Song, interested in bridging semantic web with data integration
Karen Lopez, Project Manager at InfoAdvisors and Standards Architect for ARTS. Beginner at ontologies, but long time data person. Interested in hands on experience leveraging traditional data models and industry standard data models for the semantic web.
Linnea Shieh, MLIS Candidate at UW, student of Semantic Web-based ontologies and thesauri.
Jill McRae, Data Architect, Service Architectures, SharePoint MOSS, bridging Semantic Web, dbiai LLC
Aju Badardeen, EAI Consultant. I have been involved in the ConceptVISTA project at Penn State, where I did some work on basic concept comparison using RDF ontologies.
Peter Offringa, VP Engineering, CBS Interactive - CNET. At CNET, we plan to publish our technology product, news and software catalog data to the linked data web via RDF. Would love to collaborate with other organizations on vocabularies/ontologies pertinent to these domain spaces. (UPDATE: I may have a conflict, but Adam Goldband, #37, is on my team)
Andraz Tori, CTO at Zemanta. I am interested in vocabularies for interoperability of smart services and ecosystem of structured data.
Michael Erdmann, Chief Architect, ontoprise GmbH, Karlsruhe, Germany, believes that semantic technology will prevail. As a result of this VoCamp I would like to see a best practice for representing units of measurement in RDF.
Reiner Kraft, Technical Yahoo at Yahoo Search, would like to learn more about Semantic Web technologies and how they can be applied within the Web search domain.
C Lee, community contributor
Todd Pehle, interested in design of the neoGeoSemanticWeb, lying at the intersection of the SocialWeb, the GeoWeb & the Semantic Web. Such a design requires not only location vocabularies but linking with the myriad of ontologies that utilize location. Seems like VoCamp would be a great place to collaborate. Looking forward to it!
James Isaacs, trained in model-theoretic semantics. Very excited to be part of this VoCamp.
Adam Goldband, Director of Systems Engineering, CBS Interactive - CNET. Working on the creation of CNET's presence on the Semantic Web. Developing CNET's addition to the overall vocabulary of the SW, and exposing that via RDF.
Ritesh Agrawal, Research Engineer, AT&T Interactive. I am interested in personalization of the Internet and aim to design rich user models that explicitly capture an individual's subjective view of the world, and to use them to design personalized search engines and recommendation systems.
Juan Sequeda, PhD Student at UT Austin, Co-Founder of Semantic Web Austin. Working on Migrant and Displaced Population Ontology (MIPO)
Uldis and ericP, I'm a PhD student at DERI in Ireland working on SIOC, and I'm a Sanitation Engineer at MIT cleaning the web for W3C. - (will only be at VoCamp on the 1st day)
Newton Chan, CS Faculty at Foothill College and Technology Evangelist of Silicon Valley Web Builder, has been interested in applying SemTech in goal-oriented contextual software system design.
Oswald Campesato (product development)
Bobbin Teegarden, Evangelist for agents who think with ontologies. My interest is in entropic intelligence: building thoughtful ontologies and software that makes the world more intelligent.

Standby List

Sorry - we are at capacity and will not be able to accommodate Standby List participants.

When the list above gets full, add your name here if you want to be on the standby list in case places become free or we decide to move to a bigger room (It looks like a slightly larger room is needed):

1. Craig A. Cook. I'm especially interested in literature (fiction in particular) and anthropology. I'm a web developer (Ruby) and have a CS degree, so I'd like to see more work done for writers and readers.

2. NicoAdams

3. Irene Gabashvili, using semantic technologies for data integration

4. Christian Grant. I'm interested in entity extraction for creating a structured data model from unstructured data. I'm also interested in populating the data model using unstructured data.

5. Daniela Barbosa. Dow Jones, DataPortability project, Librarian www.danielabarbosa.com

6. Havi Hoffman. Yahoo! Developer Network. developer.yahoo.com/blog

7. Mark Carranza. Cooperative Mind Cooperative, SF. I want to explore how semantic vocabularies link with natural language expressions, i.e., tag each other, ala "Everything is Miscellaneous." Interested in how semantic technologies are used/useful in human creative thinking, research, work. Also in education.

8. Lakshmi Reddy - using semantic technologies for bio tech area ... (increase as needed)

Would like to, but can't

Mani Kumar, Software Engineer, SlideShare Inc. I am based out of New Delhi, India. I hope you guys set up video conferencing or at least capture a video.
David Morris: My social networking project went away and unfortunately food on the table must take priority. Perhaps in the future.
Wen Ruan, CSO at TextWise. I would like to explore how unstructured data works within the structured data environment.

Outcomes

To be filled during and after the event.

Finding vocabularies

Finding vocabularies is an important first step in using vocabularies, as well as in developing new ones (in the latter case, to check whether vocabularies with a similar purpose already exist).

There is no central repository for vocabularies, but there are a number of sites that might be worthwhile to look at for finding vocabularies that contain particular terms.

Open Ontology Repository is the design document behind the OOR system, which is currently used at BioPortal. The same technology could be used to set up other ontology repositories in different domains.
Swoogle is a Semantic Web search engine that might be useful to find ontologies. It's not clear how the ontologies are ranked, but there seems to be some ranking. It's not clear how often Swoogle is updated.
Watson is a Semantic Web search engine that claims to specialize in finding ontologies. However, the ontologies don't seem to be ranked in any particular order. It's not clear how often Watson is updated.
SchemaWeb is an old site with vocabularies, by now outdated.
The semanticweb.org wiki has a list of ontologies sorted by their counts in Swoogle. Table might be out of date.
Sindice is the most frequently updated and the most comprehensive Semantic Web search engine. The interface itself doesn't allow querying for an ontology by the name of a particular class, but the API does.

Mapping vocabularies

The issues are related to how we can express mappings, how we can publish them and find mappings created by others.

Expressing mappings

The OWL 1 and OWL 2 ontology languages allow the equivalence of classes and properties to be expressed explicitly or inferred.
Semantic Web Rule Language (SWRL) is a W3C submission. It gives more expressivity (based on rules) than OWL 1. It is unclear how many implementations exist.
Rule Interchange Format (RIF) is currently being developed at the W3C.
The SPARQL query language contains a data transformation operator (CONSTRUCT). The problem is that there is no standard way of 'publishing' SPARQL queries and thus sharing them.

However, the current options are inadequate for expressing data transformations that are commonly required for mapping literals such as birthday in FOAF (month and day) and birthday in VCard (year, month and day).
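
To make the gap concrete, here is a minimal Python sketch that contrasts a predicate-level mapping (the kind an equivalence statement can express) with a value-level mapping of the birthday literal. Everything in it is an assumption made for illustration: triples are plain tuples, the function names are hypothetical, the foaf:name / vcard:fn pairing is an example rather than an agreed mapping, and the value formats (MM-DD for FOAF, YYYY-MM-DD for vCard) follow the description above.

```python
from datetime import date

FOAF = "http://xmlns.com/foaf/0.1/"
VCARD = "http://www.w3.org/2006/vcard/ns#"


def rename_property(triple, source, target):
    """Predicate-level mapping: swap the predicate, leave the value untouched."""
    subject, predicate, value = triple
    return (subject, target, value) if predicate == source else triple


def vcard_bday_to_foaf_birthday(triple):
    """Value-level mapping: reduce a full date (YYYY-MM-DD) to month and day (MM-DD)."""
    subject, predicate, value = triple
    if predicate != VCARD + "bday":
        return triple
    parsed = date.fromisoformat(value)  # raises ValueError on malformed dates
    return (subject, FOAF + "birthday", parsed.strftime("%m-%d"))


# Hypothetical sample data; example.org URIs are placeholders.
person = "http://example.org/people/alice"
name_triple = rename_property(
    (person, FOAF + "name", "Alice Example"), FOAF + "name", VCARD + "fn"
)
birthday_triple = vcard_bday_to_foaf_birthday((person, VCARD + "bday", "1980-06-18"))
```

The rename needs only a pair of property URIs, while the birthday mapping needs code that understands the literal. The reverse direction cannot be computed at all, because the year is not present in the month-and-day value.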

Finding mappings

There are no specific search engines for finding mappings, but some of the resources at Where_to_find_vocabularies might be useful.

Executing mappings

A transformation service might be useful, but we don't know of any.

RDB to RDF mapping

The leading tool for RDB-to-RDF mapping is D2R Server.

Mapping and RDB-to-RDF

Related topics: you can publish your data using a fake namespace (mydb) but that will not help others to reuse your data. Most of the effort thus goes into the mapping from the relational schema to existing ontologies.
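
The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how D2R Server or its mapping language works: the table, the column-to-property choices, the example.org URIs and the function name are all hypothetical. Columns with an entry in the mapping are published with FOAF properties; the rest fall back to a local "mydb" namespace that nobody else is likely to reuse.

```python
import sqlite3

FOAF = "http://xmlns.com/foaf/0.1/"
BASE = "http://example.org/person/"   # hypothetical URI prefix for row identifiers
LOCAL_NS = "http://example.org/mydb#"  # hypothetical local fallback namespace

# Hand-written mapping from relational columns to existing vocabulary terms.
COLUMN_TO_PROPERTY = {
    "full_name": FOAF + "name",
    "homepage": FOAF + "homepage",
}


def rows_to_triples(connection, table, key_column):
    """Yield (subject, predicate, object) tuples for every non-null cell in the table."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = BASE + str(record[key_column])
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            predicate = COLUMN_TO_PROPERTY.get(column, LOCAL_NS + column)
            yield (subject, predicate, value)


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT, homepage TEXT, shoe_size INTEGER)"
)
connection.execute("INSERT INTO person VALUES (1, 'Alice Example', 'http://example.org/', 38)")
triples = list(rows_to_triples(connection, "person", "id"))
```

In this sketch the real effort sits in `COLUMN_TO_PROPERTY`: every column left out of it ends up in the local namespace, which is easy to generate but does little for reuse.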

Publishing a SPARQL endpoint vs. other forms of publishing RDF

SPARQL vs. RDFa vs. API vs. download

SPARQL vs. REST-style publishing of canned queries

Do we need to convert the world to SPARQL for it to be useful?

How to find SPARQL endpoints

Methodologies for developing vocabularies

Attendees: Tantek Çelik, Newton Chan, Alex Cozzi, Karen Lopez, Evan Goer, Marco Neumann, Linnea Shieh, Paul Tarjan, Nicolas Torzec

Start with a use case.
Do your research. What's already been done? Look for examples of content where people have done this already -- on the web or off. Document your research publicly! Think like a scientist. You're abstracting anthropological data.
Determine your scope. Don't model the entire world.
Focus on i18n. Localize your vocabularies, but do it in a way that reflects your world.
Focus on accessibility. Factoring in i18n and accessibility is a good way to take your specific use case and make it "general enough".
Don't rename things. There are lots of examples of existing vocabularies that were migrated over, but renamed for arbitrary reasons.
Object-oriented thinking is bad for vocabulary design. Don't worry about being typesafe! You'll end up putting types on everything, even when it doesn't fit.
Link to public resources rather than just using labels. Labels are fine for making things readable, but every time you move away from linking (URIs) it's much more work to reuse things.
Read existing methodologies. http://microformats.org/wiki/process :)

Learning About Microformats

The goal was to introduce Microformats to people who barely know them.

Tantek Çelik [2] shared his knowledge and thoughts about Microformats, and pointed to:

Vocabulary for Code Documentation

Inventory of Vocabularies for code documentation
Proposal (TBA)