Call for Participation – LDK Conference & DBpedia Day

Sunday, March 24, 2019 - 1:40pm

Call for Participation

With the advent of digital technologies, an ever-increasing amount of language data is now available across various application areas and industry sectors, making language data increasingly valuable. In that context, we would like to draw your attention to the 2nd Language, Data and Knowledge (LDK) conference, which will be held in Leipzig from May 20th to 22nd, 2019.

LDK – Conference

This new biennial conference series aims at bringing together researchers from across disciplines concerned with language data in data science and knowledge-based applications.

Keynote Speakers

We are happy that Christian Bizer, a founding member of DBpedia, will be one of the three amazing keynote speakers who will open the event. Apart from Christian, Christiane Fellbaum from Princeton University and Eduart Werner, representing Leipzig University, will share their thoughts on current language data issues to kick off lively discussions revolving around language data.

Be part of this event in Leipzig and catch up with the latest research outcomes in the areas of acquisition, provenance, representation, maintenance, usability, quality as well as legal, organizational and infrastructure aspects of language data.  

DBpedia Community Meeting

To get the full Leipzig experience, we would also like to invite you to our DBpedia Community Meeting, which is co-located with LDK and will be held on May 23rd, 2019. We also offer an interesting side event, the Thinktank and Hackathon “Artificial Intelligence for Smart Agriculture”. Visit our website for further information.

Join LDK 2019 and our DBpedia Community Meeting to catch up with the latest research and developments in the Semantic Web Community. 

 

Yours DBpedia Association

The post Call for Participation – LDK Conference & DBpedia Day appeared first on DBpedia Blog.

One of 206 – GSoC 2019 – Call for students

Wednesday, February 27, 2019 - 10:52am

 

Pinky: Gee, Brain, what are we gonna do this year?
Brain: The same thing we do every year, Pinky. Taking over GSoC.

Exactly what DBpedia plans to do this summer. We have been accepted as one of 206 open source organizations to participate in Google Summer of Code (GSoC) again. Yes, ONE OF 206, let that sink in. The upcoming GSoC marks the 15th consecutive year of the program and is the 8th year in a row for DBpedia.

What is GSoC? 

Google Summer of Code is a global program focused on bringing student developers into open source software development. Funds will be given to students (BSc, MSc, Ph.D.) to work for three months on a specific task. For GSoC newbies, this short video and the information provided on their website will explain all there is to know about GSoC.

Time for a New Narrative

In past years, we mentored many successful projects by female students, but most applicants were male. Now it is time to change this narrative and work towards more diversity in science. This year, we at DBpedia are more determined than ever to encourage female students to apply for our projects. To that end, we have already engaged excellent female mentors to raise the share of women in our mentor team as well. We are proud of all female DBpedians who help to shape the future of DBpedia.

In the following four weeks, we invite all students, female and male, who are interested in Semantic Web and Open Source development to apply for our projects. You can also contribute your own ideas to work on during the summer. We are regularly growing our community through GSoC and can deliver more and more opportunities to you. 

And this is how it works: 3 steps to GSoC stardom

  1. Open source organizations such as DBpedia announce their project ideas.
  2. Students contact the mentor organizations they want to work with and write up a project proposal.
  3. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.

To all the smart brains out there: if you are a student who wants to work with us during summer 2019, check our list of project ideas and warm-up tasks, or come up with your own idea and get in touch with us.

Further information on the application procedure is available via our DBpedia guidelines. There you will find information on how to contact us and how to apply for GSoC appropriately. Please also note the official GSoC 2019 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. The final submission deadline is April 9th, 2019, 8pm CET.

Finally, check our website for information on DBpedia, follow us on Twitter or subscribe to our newsletter.

And in case you still have questions, please do not hesitate to contact us via praetor@infai.org.

We are thrilled to meet you and your ideas.

Your DBpedia GSoC Team


Vítejte v Praze!

Wednesday, February 13, 2019 - 1:18pm

After our meetups in Poland and France last year, we delighted the Czech DBpedia community with a DBpedia meetup. It was co-located with the XML Prague conference on February 7th, 2019.

First and foremost, we would like to thank Jirka Kosek (University of Economics, Prague), Milan Dojchinovski (AKSW/KILT, Czech Technical University in Prague), Tomáš Kliegr (KIZI/University of Economics, Prague) and the XML Prague conference for co-hosting and supporting the event.

Opening the DBpedia community meetup

The Czech DBpedia community and the DBpedia Databus were the focus of this meetup. Therefore, we invited local data scientists as well as DBpedia enthusiasts to discuss the state of the art of the DBpedia Databus. Sebastian Hellmann (AKSW/KILT) opened the meeting with an introduction to DBpedia and the DBpedia Databus. Next, Marvin Hofer explained how to use the DBpedia Databus in combination with Docker, and Johannes Frey (AKSW/KILT) presented the methods behind DBpedia’s Data Fusion and Global ID Management.

Showcase Session

Marek Dudáš (KIZI/UEP) started the DBpedia Showcase Session with a presentation on “Concept Maps with the help of DBpedia”, where he showed the audience how to create a “concept map” with the ContextMinds application. Furthermore, Tomáš Kliegr (KIZI/UEP) presented “Explainable Machine Learning and Knowledge Graphs”, explaining his contribution to a rule-based classifier for business use cases. Two other showcases followed: Václav Zeman (KIZI/UEP) presented “RdfRules: Rule Mining from DBpedia”, and Denis Streitmatter (AKSW/KILT) demonstrated the “DBpedia API”.

Miroslav Blasko presents “Ontology-based Dataset Exploration”

Closing the session, Miroslav Blasko (CTU, Prague) gave a presentation on “Ontology-based Dataset Exploration”. He explained a taxonomy developed for dataset description. Additionally, he presented several use cases whose main goal is to improve content-based descriptors.

Summing up, the DBpedia meetup in Prague brought together more than 50 DBpedia enthusiasts from all over Europe. They engaged in lively discussions about Linked Data, the DBpedia Databus, as well as DBpedia use cases and services.

In case you missed the event, all slides and presentations are available on our website. Further insights, feedback, and photos about the event can be found on Twitter via #DBpediaPrague.

We are currently looking forward to the next DBpedia Community Meeting, on May 23rd, 2019 in Leipzig, Germany. This meeting is co-located with the Language, Data and Knowledge (LDK) conference. Stay tuned and check Twitter, Facebook and the website or subscribe to our newsletter for the latest news and updates.

Your DBpedia Association


Call for Participation: DBpedia meetup @ XML Prague

Wednesday, December 26, 2018 - 7:37pm

We are happy to announce that the upcoming DBpedia meetup will be held in Prague, Czech Republic. During the XML Prague conference, Feb 7-9, the DBpedia Community will get together on February 7, 2019.

Highlights

– Intro: DBpedia: Global and Unified Access to Knowledge (Graphs)

– DBpedia Databus presentation

– DBpedia Showcase Session

Quick Facts

– Web URL: https://wiki.dbpedia.org/meetings/Prague2019

– When: February 7th, 2019

– Where: University of Economics, nam. W. Churchilla 4, 130 67 Prague 3, Czech Republic

Schedule
Tickets

– Attending the DBpedia Community Meetup costs €40. DBpedia members get free admission; please contact your nearest DBpedia chapter or the DBpedia Association for a promotion code.

– You need to buy a ticket. Please check all details here: http://www.xmlprague.cz/conference-registration/

Sponsors and Acknowledgments

– XML conference Prague (http://www.xmlprague.cz/)

– Institute for Applied Informatics (http://infai.org/en/AboutInfAI)

– OpenLink Software (http://www.openlinksw.com/)

Organisation

– Milan Dojčinovski, AKSW/KILT

– Julia Holze, DBpedia Association

– Sebastian Hellmann, AKSW/KILT, DBpedia Association

– Tomáš Kliegr, KIZI/University of Economics, Prague

 

Tell us what cool things you do with DBpedia. If you would like to give a talk at the DBpedia meetup, please get in contact with the DBpedia Association.

We are looking forward to meeting you in Prague!

For the latest news and updates, check Twitter, Facebook and our website or subscribe to our newsletter.

Your DBpedia Association


A year with DBpedia – Retrospective Part 3

Friday, December 21, 2018 - 11:58am

This is the final part of our journey around the world with DBpedia. This time we will take you from Austria, to Mountain View, California and to London, UK.

Come on, let’s do this.

Welcome to Vienna, Austria – SEMANTiCS

More than 110 DBpedia enthusiasts joined our Community Meeting in Vienna on September 10th, 2018. The event was again co-located with SEMANTiCS, a very successful collaboration. Luckily, we got hold of two brilliant keynote speakers to open our meeting. Javier David Fernández García, Vienna University of Economics, opened the meeting with his keynote Linked Open Data cloud – act now before it’s too late. He reflected on challenges towards arriving at a truly machine-readable and decentralized Web of Data. Javier reviewed the current state of affairs, highlighted key technical and non-technical challenges, and outlined potential solution strategies. The second keynote speaker was Mathieu d’Aquin, Professor of Informatics at the Insight Centre for Data Analytics at NUI Galway. Mathieu, who specializes in data analytics, rounded off the keynote session with his keynote Dealing with Open-Domain Data.

The 12th edition of the DBpedia Community Meeting also included a special chapter session, chaired by Enno Meijers from the Dutch DBpedia Language Chapter. The speakers presented the latest technical or organizational developments of their respective chapters. The session primarily served as an exchange platform for the different DBpedia chapters. For the first time, representatives of the European chapters discussed problems and challenges of DBpedia from their point of view. Furthermore, each chapter’s representative presented tools, applications, and projects.

In case you missed the event, a more detailed article can be found here. All slides and presentations are also available on our Website. Further insights, feedback, and photos about the event are available on Twitter via #DBpediaDay.

Welcome to Mountain View – GSoC mentor summit

GSoC was a vital part of DBpedia’s endeavors in 2018. We had three very talented students who, with the help of our great mentors, made it to the finish line of the program. You can read about their projects and success stories in a dedicated post here.

After three successful months of mentoring, two of our mentors had the opportunity to attend the annual Google Summer of Code mentor summit. Mariano Rico and Thiago Galery represented DBpedia at the event this year. They engaged in lively discussions about this year’s program, lessons learned, and the highlights and drawbacks they experienced during the summer. A special focus was put on how to engage potential GSoC students as early as possible to secure as much commitment as possible. The ideas the two mentors brought back in their suitcases will help to improve DBpedia’s part of the program for 2019. And apparently, chocolate was a very big thing there ;).

In case you have a project idea for GSoC 2019 or want to mentor a DBpedia project next year, just drop us a line via dbpedia@infai.org. Also, as we intend to participate in the upcoming edition, please spread the word amongst students, and especially female students, who fancy spending their summer coding on a DBpedia project. Thank you.

 

Welcome to London, England – Connected Data London 2018

In early November, we were invited to Connected Data London again. After 2017, this great event seems to be becoming a regular in our DBpedia schedule.

Sebastian Hellmann, Executive Director of the DBpedia Association, participated as a panelist in the discussion around “Building Knowledge Graphs in the Real World”. Together with speakers from Thomson Reuters, Zalando, and Textkernel, he discussed definitions of knowledge graphs (KGs), best practices for building and using them, as well as the recent hype around them.

Visitors of CNDL2018 had the chance to grab a copy of our brand new flyer and talk with us about the DBpedia Databus. This event gave us the opportunity to meet early adopters of our Databus – a decentralized data publication, integration, and subscription platform. Thank you very much for that opportunity.

A year went by

2018 has gone by so fast and brought so much for DBpedia. The DBpedia Association got the chance to meet more of DBpedia’s language chapters, and we developed the DBpedia Databus to the point that it can finally be launched in spring 2019. DBpedia is a community project relying on people, and with the DBpedia Databus we create a platform that allows publishing data and provides a networked data economy around it. So stay tuned for exciting news coming up next year. Until then, we would like to thank all DBpedia enthusiasts around the world for their research with DBpedia, and for their support of and contributions to DBpedia. Kudos to you.

 

All that remains to say is have yourself a very merry Christmas and a dazzling New Year. May 2019 be peaceful, exciting and prosperous.

 

Yours – being in a cheerful and festive mood –

 

DBpedia Association

 


A year with DBpedia – Retrospective Part Two

Thursday, December 13, 2018 - 4:05pm

Retrospective Part II. Welcome to the second part of our journey around the world with DBpedia. This time we are taking you to Greece, Germany, Australia and finally France.

Let the travels begin.

Welcome to Thessaloniki, Greece & ESWC

DBpedians from the Portuguese Chapter presented their research results during ESWC 2018 in Thessaloniki, Greece. The team around Diego Moussalem developed a demo to extend MAG to support Entity Linking in 40 different languages. A special focus was put on low-resource languages such as Ukrainian, Greek, Hungarian, Croatian, Portuguese, Japanese and Korean. The demo relies on online web services which allow for easy access to (their) entity linking approaches. Furthermore, it can disambiguate against DBpedia and Wikidata. Currently, MAG is used in diverse projects and has been widely used by the Semantic Web community. Check the demo via http://bit.ly/2RWgQ2M. Further information about the development can be found in a research paper, available here.

 

Welcome back to Leipzig, Germany

With our new credo “connecting data is about linking people and organizations”, halfway through 2018, we finalized our concept of the DBpedia Databus. This global DBpedia platform aims at sharing the efforts of OKG governance, collaboration, and curation to maximize societal value and develop a linked data economy.

With this new strategy, we wanted to meet some DBpedia enthusiasts of the German DBpedia Community. Fortunately, the LSWT (Leipzig Semantic Web Tag) 2018, hosted in Leipzig, home to the DBpedia Association, proved to be the right opportunity. It was the perfect platform to exchange ideas with researchers, industry and other organizations about current developments and future applications of the DBpedia Databus. Apart from hosting a hands-on DBpedia workshop for newbies, we also organized a well-received WebID tutorial. Finally, the event gave us the opportunity to position the new DBpedia Databus as a global open knowledge network that aims at providing unified and global access to knowledge (graphs).

Welcome down under – Melbourne, Australia

Further research results that rely on DBpedia were presented during ACL2018 in Melbourne, Australia, July 15th to 20th, 2018. The core of the research was DBpedia data, based on the WebNLG corpus, a challenge where participants automatically converted non-linguistic data from the Semantic Web into a textual format. Later on, the data was used to train a neural network model for generating referring expressions of a given entity. For example, if Jane Doe is a person’s official name, referring expressions for that person could be “Jane”, “Ms Doe”, “J. Doe”, or “the blonde woman from the USA”.

If you want to dig deeper but missed ACL this year, the paper is available here.

 

Welcome to Lyon, France

In July, the DBpedia Association travelled to France. With the organizational support of Thomas Riechert (HTWK, InfAI) and Inria, we finally met the French DBpedia Community in person and presented the DBpedia Databus. Additionally, we got to meet the French DBpedia Chapter, researchers and developers around Oscar Rodríguez Rocha and Catherine Faron Zucker. They presented current research results revolving around an approach to automate the generation of educational quizzes from DBpedia. They wanted to provide a useful tool for the French educational system that:

  • helps to test and evaluate the knowledge acquired by learners and…
  • supports lifelong learning on various topics or subjects. 

The French DBpedia team followed a 4-step approach:

  1. Quizzes are first formalized with Semantic Web standards: questions are represented as SPARQL queries and answers as RDF graphs.
  2. Natural language questions, answers and distractors are generated from this formalization.
  3. Different strategies are defined to extract multiple choice questions, correct answers and distractors from DBpedia.
  4. A measure is defined for the information content of the elements of an ontology, and of the set of questions contained in a quiz.
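
To make the first two steps more tangible, here is a minimal sketch of how a single quiz question could be written down as a SPARQL query next to its natural-language form. It is not the French chapter's implementation: the question type (a country's capital), the function names and the way distractors are picked are all assumptions for illustration. Only dbo:capital and dbo:Country are real DBpedia ontology terms.

```python
import re
from string import Template

# Hypothetical question type: "What is the capital of X?".
# dbo:capital and dbo:Country are DBpedia ontology terms; the template,
# the function names and the distractor strategy are illustrative only.
QUERY_TEMPLATE = Template("""\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?answer ?distractor WHERE {
  dbr:$country dbo:capital ?answer .
  ?other a dbo:Country ;
         dbo:capital ?distractor .
  FILTER (?distractor != ?answer)
}
LIMIT $limit
""")


def build_quiz_query(country: str, limit: int = 3) -> str:
    """Return a SPARQL query yielding the correct answer plus distractors."""
    # Accept only simple local names so the value cannot alter the query.
    if not re.fullmatch(r"[A-Za-z0-9_]+", country):
        raise ValueError(f"unsupported resource name: {country!r}")
    return QUERY_TEMPLATE.substitute(country=country, limit=int(limit))


def question_text(country: str) -> str:
    """Return the natural-language form of the same question."""
    return f"What is the capital of {country.replace('_', ' ')}?"
```

Calling `build_quiz_query("France")` returns a query string that could be sent to a SPARQL endpoint; each result row would pair the correct answer with one candidate distractor. Choosing good distractors is the hard part and is exactly what step 3 is about, so the simple "any other capital" rule here is only a placeholder.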

Oscar R. Rocha and Catherine F. Zucker also published a paper explaining the detailed approach to automatically generate quizzes from DBpedia according to official French educational standards. 

 

 

Thank you to all DBpedia enthusiasts that we met during our journey. A big thanks to

With this journey from Europe to Australia and back, we provided you with insights into research based on DBpedia as well as a glimpse into the French DBpedia Chapter. In the final part of our journey, coming up next week, we will take you to Vienna, San Francisco and London. In the meantime, stay tuned and visit our Twitter channel or subscribe to our DBpedia Newsletter.

 

Have a great week.

Yours DBpedia Association


A year with DBpedia – A Retrospective Part One

Wednesday, December 5, 2018 - 11:34am

Looking back, 2018 was a very successful year for DBpedia. First and foremost, we refined our strategy and developed our concept of the DBpedia Databus, a central communication system that allows exchanging, curating and accessing data between multiple stakeholders. The Databus simplifies working with data and will be launched in early 2019. 

Moreover, we travelled many miles in 2018, not only to visit our language chapters and talk about DBpedia, but also to meet enthusiasts from our community at workshops and conferences worldwide.

In the upcoming blog series, we would like to take you on a retrospective tour around the world, giving you insights into a year with DBpedia. We will start out with stopovers in Japan, Poland and Germany and will continue our journey to other continents in the following two weeks.

Sit back and read on.

Big Spring in Japan – Welcome to Miyazaki

Welcome to Miyazaki, to LREC, the Language Resources and Evaluation Conference 2018, and meet RDF2PT. No idea what that is and what it has to do with DBpedia? Read on!

The generation of natural language from RDF data has recently gained significant attention due to the continuous growth of Linked Data. Proposing the RDF2PT approach, a research team around Diego Moussalem, part of the Portuguese DBpedia Chapter, described how RDF data is verbalized to Brazilian Portuguese texts. They highlighted the steps taken to generate Portuguese texts and addressed challenges with grammatical gender, classes, resources and properties. The results suggest that RDF2PT generates texts that can be easily understood by humans. It also helps to identify some of the challenges related to the automatic generation of Brazilian Portuguese (especially from RDF).

The full paper is available via https://arxiv.org/pdf/1802.08150.pdf 

 

Welcome to Poznan, Poland

Our community is our asset. In order to grow it and encourage contributions, the DBpedia Association continuously organizes community meetups to address the interests of our multi-faceted community. In late May, we travelled to Poland to meet Polish DBpedia enthusiasts at our meetup in Poznań. The idea was to find out what the Polish DBpedia community uses DBpedia for, what applications and tools they have and what they are currently developing. Members of the chapter presented, amongst others, results of the primary research project “Quality of Data in DBpedia”. Attendees engaged in lively discussions about uses of DBpedia applications and tools and listened to a presentation by Professor Witold Abramowicz, chair of the Department of Information Systems at Poznan University of Economics and head of SmartBrain. He talked about opportunities and challenges of data science.

Further information on the Polish DBpedia Chapter can be found on their website.

Welcome to Leipzig, home to the DBpedia Association

For the first time ever, DBpedia was part of the German culture hackathon Coding da Vinci, held at Bibliotheca Albertina, University Library of Leipzig University, in June 2018. In this year’s edition, we not only offered a hands-on workshop but also provided our DBpedia datasets. This data supported more than 30 cultural institutions in enriching their own datasets. In turn, hackathon participants could creatively develop new tools, apps, games, quizzes, etc. out of the data.

One of the projects that used DBpedia as a source was Birdory. It is a memory game using bird voices and pictures. The goal is, much like in regular memory games, to match the correct picture to the bird sound that is played. The data used for the game was taken from Museum für Naturkunde Berlin (bird voices) as well as from DBpedia (pictures). So in case you need some me-time during Christmas gatherings, you might want to check it out via: https://birdory.firebaseapp.com/.

 

In our upcoming blog post next week, we will take you to Thessaloniki, Greece, to Australia and, again, to Leipzig. In the meantime, stay tuned and visit our Twitter channel or subscribe to our DBpedia Newsletter.

 

Have a great week,

 

Yours DBpedia Association

 


Chaudron, chawdron, cauldron and DBpedia

Tuesday, October 30, 2018 - 10:29am

Meet Chaudron

Before getting into the technical details, did you know the term Chaudron derives from Old French and denotes a large metal cooking pot? The word was used as an alternative form of chawdron, which means entrails. Entrails and cauldron – a combo that seems quite fitting with Halloween coming along.

And now for something completely different

To begin with, Chaudron is a dataset of more than two million triples. It complements DBpedia with physical measures. The triples are automatically extracted from Wikipedia infoboxes using pattern-matching and formal-grammar approaches. This dataset adds triples to the existing DBpedia resources. Additionally, it includes measures on various resources such as chemical elements, railways, people, places, aircraft, dams and many other types of resources.
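
For readers wondering what pattern-matching on infobox values can look like, here is a toy sketch that turns a raw value such as "8,848 m" into a number and a unit. The regular expression and the function name are assumptions made for illustration; this is not the pattern set or the grammar actually used to build Chaudron, which has to cope with far messier input.

```python
import re
from typing import Optional, Tuple

# Toy pattern: a number (optionally signed, with thousands separators and
# decimals) followed by a unit token. Illustrative only, not Chaudron's own.
MEASURE = re.compile(
    r"^\s*(?P<value>[-+]?\d[\d,]*(?:\.\d+)?)\s*(?P<unit>[A-Za-z°/]+)\s*$"
)


def parse_measure(text: str) -> Optional[Tuple[float, str]]:
    """Return (value, unit) for strings like '8,848 m', or None if no match."""
    match = MEASURE.match(text)
    if match is None:
        return None
    # Drop thousands separators before converting to a float.
    value = float(match.group("value").replace(",", ""))
    return value, match.group("unit")
```

With this sketch, `parse_measure("12.5 km")` yields a value and a unit that could then be attached to a DBpedia resource as a triple, while free text without a leading number is simply rejected.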

Chaudron was published on wiki.dbpedia.org and is one of many other projects and applications featuring DBpedia.

Want to find out more about our DBpedia Applications? Why not read about the DBpedia Chatbot, DBpedia Entity or the NLI-Go DBpedia Demo?

Happy reading & happy Halloween!

Yours DBpedia Association

 

PS: In case you want your DBpedia tool, demo or any kind of application published on our Website and the DBpedia Blog, fill out this form and submit your information.

 


Who are these DBpedia users? … (and why?)

Wednesday, October 24, 2018 - 11:14am

Guest article by Victor de Boer, Vrije Universiteit Amsterdam, NL, member of NL-DBpedia

Who uses DBpedia anyway?…

This question started a research project for Frank Walraven, an Information Sciences Master student at Vrije Universiteit Amsterdam (VUA). The question came up during one of the meetings of the Dutch DBpedia chapter, of which VUA is a member.

If DBpedia users and their usage are better understood, this can lead to better servicing of those DBpedia users by, for example, prioritizing the enrichment or improvement of specific sections of DBpedia. Characterizing use(r)s of a Linked Open Dataset is an inherently challenging task because in an open web world it is difficult to tell who is accessing your digital resources.

Frank conducted his MSc project research at the Dutch National Library and used a hybrid approach combining a data-driven method based on user log analysis with a short survey to get to know the users of the dataset.

As a scope, Frank selected just the Dutch DBpedia dataset. For the data-driven part of the method, Frank used a complete user log of HTTP requests on the Dutch DBpedia. This log file consisted of over 4.5 million entries and logged both URI lookups and SPARQL endpoint requests. For this research, he only included a subset of the URI lookups.

Analysis of IP Addresses of DBpedia Users

As a first analysis step, the IP addresses from which the requests originated were categorized. Five classes can be identified (A-E), with the vast majority of IP addresses being in class “A”: very large networks and bots. Most of the IP addresses in these lists could be traced back to search engine indexing bots such as those from Yahoo or Google. In classes B-E, Frank manually traced the 30 most frequently encountered IP addresses. He concluded that even there 60% of the requests came from bots and 10% definitely did not, with 30% remaining unclear.
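
As a rough illustration of this first step, the sketch below assigns IPv4 addresses to classes A to E and counts requests per class. It assumes that the classes refer to the traditional classful IPv4 ranges, determined by the first octet of the address; the post does not spell this out, so treat it as one plausible reading rather than a description of Frank's tooling. The function names are made up for this example.

```python
import ipaddress
from collections import Counter
from typing import Iterable


def ip_class(address: str) -> str:
    """Return the classful IPv4 class (A-E) based on the first octet."""
    # IPv4Address validates the string; the top 8 bits are the first octet.
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "A"  # 0.0.0.0   - 127.255.255.255
    if first_octet < 192:
        return "B"  # 128.0.0.0 - 191.255.255.255
    if first_octet < 224:
        return "C"  # 192.0.0.0 - 223.255.255.255
    if first_octet < 240:
        return "D"  # 224.0.0.0 - 239.255.255.255 (multicast)
    return "E"      # 240.0.0.0 - 255.255.255.255 (reserved)


def class_histogram(addresses: Iterable[str]) -> Counter:
    """Count how many logged requests fall into each class."""
    return Counter(ip_class(address) for address in addresses)
```

Feeding the client addresses of a request log into `class_histogram` gives the per-class totals; telling bots from human users within a class still needs the kind of manual tracing described above.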

 

 

 

Step II – Identification of Page Requests

The second analysis step in the data-driven method consisted of identifying what types of pages were most requested. To cluster the thousands of DBpedia URI requests, Frank retrieved the ‘categories’ of the pages. These categories are extracted from Wikipedia category links. An example is the “Android_TV” resource, which has two categories: “Google” and “Android_(operating_system)”. Following skos:broader links, a ‘level 2 category’ could also be found to aggregate to an even higher level of abstraction. As not all resources have such categories, this does not give a complete image, but it does provide some ideas on the most popular categories of items requested. After normalizing for categories with large amounts of incoming links, for example, the category “non-endangered animal”, the most popular categories were:

  1. Domestic & International movies,
  2. Music,
  3. Sports,
  4. Dutch & International municipality information and
  5. Books.
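
The aggregation behind such a ranking can be sketched in a few lines. The example below counts category hits for requested resources and then lifts them one level up along a skos:broader-style mapping. It is a simplified assumption of how the clustering might work: the normalization for heavily linked categories is left out, and apart from the “Android_TV” example all data is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_categories(requests: Iterable[str],
                     categories: Dict[str, List[str]]) -> Counter:
    """Count one hit per category for every requested resource."""
    counts: Counter = Counter()
    for resource in requests:
        # Resources without categories are skipped, so the picture is partial.
        for category in categories.get(resource, []):
            counts[category] += 1
    return counts


def lift_to_broader(counts: Counter, broader: Dict[str, str]) -> Counter:
    """Aggregate counts one level up, following a skos:broader-style mapping."""
    lifted: Counter = Counter()
    for category, hits in counts.items():
        lifted[broader.get(category, category)] += hits
    return lifted


# "Android_TV" and its two categories come from the article; the
# "Chromecast" entry and the parent category are hypothetical.
categories = {
    "Android_TV": ["Google", "Android_(operating_system)"],
    "Chromecast": ["Google"],
}
broader = {
    "Google": "Example_parent",
    "Android_(operating_system)": "Example_parent",
}
```

Running `count_categories` over the requested resource names and passing the result to `lift_to_broader` gives the ‘level 2’ totals, and `most_common()` on the resulting counter produces a ranking like the one listed above.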

Survey

Additionally, Frank set up a user survey to corroborate this evidence. The survey contained questions about how and why respondents use the Dutch DBpedia, including the categories they were most interested in.

The survey was distributed using the Dutch DBpedia website and via Twitter. However, the endeavour only attracted 5 respondents. This illustrates the difficulty of the problem: users of the DBpedia resource are not necessarily easy to reach through communication channels. The five respondents were all quite closely related to the chapter, but the results were interesting nonetheless. Most of the respondents used the DBpedia SPARQL endpoint. The full results of the survey can be found in Frank’s thesis, but in terms of corroboration, the survey revealed that four out of the five categories found in the data-driven method were also identified in the top five results from the survey. The fifth one identified in the survey was ‘geography’, which could be matched to the fifth from the data-driven method.

Conclusion

Frank’s research shows that characterizing the users of a dataset remains a challenging problem. Yet, using a combination of data-driven and user-driven methods, it is indeed possible to get an indication of the most-used categories on DBpedia. Within the Dutch DBpedia Chapter, we are currently considering follow-up research questions based on Frank’s research. For further information about the work of the Dutch DBpedia chapter, please visit their website.

A big thanks to the Dutch DBpedia Chapter for supervising this research and providing insights via this post.

Yours

DBpedia Association

