About

Semantic matchmaking is a common business case among our clients. This demo simulates matching the best-qualified consultants (based on their CVs and previous project assignments) to projects, using only a text-based project description. The "Semantic Matchmaker demo" supports this effort with a visual dashboard and sorted lists. It is built on the concept of a Semantic Data Fabric, with PoolParty as the semantic middleware and data.world as the data catalog. The data catalog provides access to all datasets (CVs, timesheets with "assignments", and project descriptions), and PoolParty extracts a "semantic footprint" from structured and unstructured text, based on a knowledge graph of skills and occupations.

By combining a data catalog with a semantic middleware and its knowledge model, the Semantic Matchmaker accesses structured and unstructured data from various data sources and extracts keywords from CVs and projects. From these keywords it generates a semantically enriched footprint, which is used to provide the "best matches".

Knowledge model

The core functionality of the Semantic Matchmaker comes from a powerful knowledge model, containing skills, topics, and occupations. The knowledge model has been built with PoolParty and contains over 21,000 concepts. It uses an adapted and enhanced version of the ESCO classification of the European Commission as a basis for skills and the occupations that are related to them.

Method / Technical Details

The matchmaking process:

  • A taxonomy that includes soft and hard skills, research fields, etc. is created in PoolParty.
  • Project-related data (in CSV format) and people-related data, namely CVs (in PDF format) and work experience (in CSV format), are uploaded to a data.world workspace.
  • The PoolParty Extractor plugin integrates data.world with PoolParty. When a file is selected and the plugin is activated, the content of that file is sent to the extractor, which returns all tagged concepts.
  • The PoolParty Extractor returns a graph that describes the extraction process itself. This graph is transformed into a description of a project (foaf:Project) or a person (foaf:Person), respectively.
  • The project and person descriptions are stored back to data.world as RDF files in the Turtle serialization format (text/turtle).
  • These project and person descriptions are available via data.world's SPARQL endpoint and constitute the data source of the Semantic Matchmaking application.
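
The steps above end with one semantic footprint (a set of tagged concepts) per project and per person. How the application scores a match is not described here. As a minimal sketch, assuming a footprint can be treated as a plain set of concept URIs and that overlap is measured with the Jaccard index (both are assumptions for illustration, not the demo's confirmed method), ranking could look like this. The function names and sample URIs are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Share of concepts two footprints have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(project_tags: set, people: dict) -> list:
    """Return (person, score) pairs, best match first.

    `people` maps a person identifier to that person's footprint.
    Ties are broken alphabetically so the order is deterministic.
    """
    scored = [(name, jaccard(project_tags, tags)) for name, tags in people.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical footprints; real ones would come from the stored RDF descriptions.
project = {"ex:python", "ex:sparql", "ex:taxonomy"}
people = {
    "PersonA": {"ex:python", "ex:sparql"},
    "PersonB": {"ex:java"},
}
ranking = rank_candidates(project, people)
```

A set-overlap score ignores the hierarchy of the knowledge model; a real implementation would probably also credit related or broader concepts.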

Note that because of the way data.world organizes data, we need some naming conventions to distinguish between the various data files. data.world cannot store files in folders or directories; each project's dataset holds a flat list of files. For Semantic Matchmaking, we need to distinguish between a person's CV (e.g. CV_AndreasAndersen.pdf), their work experience (e.g. EXP_AndreasAndersen.csv), and project descriptions (e.g. project_814572.csv).
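
The naming convention can be checked mechanically. The sketch below infers three patterns from the example file names given above (CV_ prefix with .pdf, EXP_ prefix with .csv, project_ prefix with .csv). The actual convention may be broader or stricter, and the `classify` helper and its category names are hypothetical.

```python
import re

# Patterns inferred from the three example file names only (an assumption).
_PATTERNS = {
    "cv": re.compile(r"^CV_(?P<key>.+)\.pdf$"),
    "experience": re.compile(r"^EXP_(?P<key>.+)\.csv$"),
    "project": re.compile(r"^project_(?P<key>.+)\.csv$"),
}


def classify(filename: str):
    """Return (kind, key) for a recognized file name, or None otherwise.

    The key is the part that identifies the person or project,
    e.g. "AndreasAndersen" or "814572".
    """
    for kind, pattern in _PATTERNS.items():
        match = pattern.match(filename)
        if match:
            return kind, match.group("key")
    return None
```

Because the CV and the experience file share the same key, the two files of one person can be joined on it.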

Queries

Several queries stored on data.world serve the frontend application. The queries are parameterized; when the parameter is not set, they return lists of projects or people. The frontend uses them either to list projects and people or to get data for a single project or a single person. Both queries, for projects and for people, return a title, a description, and the concatenated set of tags, i.e. the semantic footprint of the project or person. As an example, the "ListProjects" query stored on data.world is given below.

PREFIX : <https://semanticwebcompany.linked.data.world/d/integrationdevproject/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
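PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# ppx: (used in ppx:tagged below) is not declared in this listing;
# add its PREFIX declaration unless the endpoint predefines it.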
SELECT
    ?project ?created ?title ?description
    (GROUP_CONCAT(DISTINCT ?tagging; separator="::") AS ?taggings)
WHERE {
    ?project
        a foaf:Project;
        dcterms:created ?created;
        dcterms:title ?title;
        dcterms:description ?description;
        ppx:tagged ?tag .
    ?tag skos:prefLabel ?tagLabel .
    # Pair each concept URI with its preferred label as "URI|label"
    BIND(CONCAT(STR(?tag),"|",STR(?tagLabel)) AS ?tagging)
}
GROUP BY ?project ?created ?title ?description
ORDER BY DESC(?created)
OFFSET 0
LIMIT 20
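
The ?taggings value returned by this query packs the whole footprint into one string: entries are separated by "::", and each entry is a concept URI and its label joined by "|". A client has to split it again. The following is a minimal sketch of that step; the `parse_taggings` helper and the example URIs are hypothetical, and it assumes that labels never contain "::".

```python
def parse_taggings(taggings: str) -> list:
    """Split a GROUP_CONCAT result into (concept URI, label) pairs."""
    pairs = []
    for item in taggings.split("::"):
        if not item:
            continue  # an empty result yields no pairs
        # Split on the first "|" only, so a "|" inside a label is preserved.
        uri, _, label = item.partition("|")
        pairs.append((uri, label))
    return pairs


# Hypothetical value of ?taggings for one project.
row = "http://example.org/skill/1|Python::http://example.org/skill/2|Data analysis"
footprint = parse_taggings(row)
```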