SB-CH: Swiss German Sentiment Corpus

SB-CH is a publicly available corpus of 165’916 Swiss German sentences, of which 2’799 are labeled by 5 annotators as “positive”, “negative”, “neutral”, “mixed”, or “unknown”. It was created by SpinningBytes in collaboration with the Zurich University of Applied Sciences (ZHAW).

Licence

All data is provided under Creative Commons License CC BY 4.0.

This means that the data is free to use and distribute, even commercially, as long as appropriate credit is given by citing the reference below.

Human-readable format: Link

Licence Contract: Link

Reference

If you use the corpus, please cite the following publication:

  • Towards a Corpus of Swiss German Annotated with Sentiment, by Ralf Grubenmann, Don Tuggener, Pius von Däniken, Jan Deriu, Mark Cieliebak. In Proceedings of the 11th Language Resources and Evaluation Conference (LREC), 2018 (to appear).

Description

A detailed description of the corpus and how it was constructed can be found in the reference above, as well as in the README file contained in the corpus.

Instructions

To use the corpus, download the annotations below. Since Facebook does not allow the content of posts to be distributed, the dataset contains only comment IDs and the corresponding annotations for Facebook posts. A download script is provided; simply follow the README on the linked page.
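Once the annotations are available, the labels from the individual annotators usually need to be reduced to a single label per sentence. The following is a minimal sketch of one common way to do that (majority vote), not the method used by the corpus authors. It assumes that the labels for one sentence have already been read into a list of strings; the function name `majority_label` and the `min_agreement` threshold are hypothetical, and the actual file layout is described in the README and may differ.

```python
from collections import Counter

# The five labels named in the corpus description.
LABELS = {"positive", "negative", "neutral", "mixed", "unknown"}


def majority_label(votes, min_agreement=3):
    """Return the label chosen by at least `min_agreement` annotators.

    `votes` is a list of label strings for a single sentence (assumed
    input format, not the corpus file format). Returns None when no
    label reaches the threshold.
    """
    invalid = set(votes) - LABELS
    if invalid:
        raise ValueError(f"unexpected labels: {sorted(invalid)}")
    if not votes:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


if __name__ == "__main__":
    example = ["positive", "positive", "neutral", "positive", "mixed"]
    print(majority_label(example))
```

With five votes and a threshold of three, at most one label can reach the threshold, so the result is unambiguous. Sentences without a majority return `None` and can be dropped or handled separately.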

Download

SB-CH
