Shared Tasks
A shared task is an international competition among researchers on a pre-defined dataset. We participate regularly to keep up with state-of-the-art technologies.

We are proud winners of SemEval 2016 and EVALITA 2016, and placed 2nd in VarDial 2018 and GermEval 2018.

Sentiment Analysis
SemEval 2017 – Task 4B
Goal: Topic-Based Sentiment Classification
Technology: Distant-trained Convolutional Neural Network (CNN)
Result: Recall 84.6 (best score: 88.2)
Rank: 4th place out of 23
Paper:
TopicThunder at SemEval-2017 Task 4: Sentiment Classification Using a Convolutional Neural Network with Distant Supervision
Simon Müller, Tobias Huonder, Jan Deriu, and Mark Cieliebak
SemEval 2016 – Task 4
Goal: Message-level sentiment classification of English micro-blog messages from Twitter
Technology: Distant-trained, 2-layer Convolutional Neural Network (CNN)
Result: F1-Score 63.3 (was best score)
Rank: 1st place out of 34
Paper: SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision
Jan Deriu, Maurice Gonzenbach, Fatih Uzdilli, Aurelien Lucchi, Valeria De Luca, and Martin Jaggi
EVALITA 2016 – SENTIPOLC – Task 2
Goal: Message Level Sentiment Classification for Italian Tweets
Technology: Multi-task trained Convolutional Neural Network (CNN) with weakly labelled distant learning phase
Result: F1-Score 68.28 (was best score)
Rank: 1st place out of 26
Paper:
Sentiment Detection using Convolutional Neural Networks with Multi-Task Training and Distant Supervision on Italian Tweets
Jan Deriu and Mark Cieliebak
SemEval 2015 – Task 10
Goal: Message Level Sentiment Classification for Tweets
Technology: Meta-Classifier on several flipout-regularized Support Vector Machines
Result: F1-Score 62.61 (best score: 64.84)
Rank: 8th place out of 40
Paper:
Swiss-Chocolate: Combining Flipout Regularization and Random Forests with Artificially Built Subsystems to Boost Text-Classification for Sentiment
Fatih Uzdilli, Martin Jaggi, Dominic Egger, Pascal Julmy, Leon Derczynski and Mark Cieliebak
SemEval 2014 – Task 9B
Goal: Message Level Sentiment Classification for Tweets
Technology: Regularized Support Vector Machine (SVM) with hand-crafted features.
Result: F1-Score 67.54 (best score: 70.96)
Rank: 8th place out of 50
Paper:
Swiss-Chocolate: Sentiment Detection using Sparse SVMs and Part-Of-Speech n-Grams
Martin Jaggi, Fatih Uzdilli and Mark Cieliebak
SemEval 2014 – Task 9B
Goal: Message Level Sentiment Classification for Tweets
Technology: Combining 12 sentiment classification systems with a meta-classifier
Result: F1-Score 66.79 (best score: 70.96)
Rank: 12th place out of 50
Paper:
JOINT_FORCES: Unite Competing Sentiment Classifiers with Random Forest
Oliver Dürr, Fatih Uzdilli, and Mark Cieliebak
Hate Speech Detection
GermEval 2018 – Task 1
Goal: Offensive Speech Classification of German Tweets
Technology: Ensemble of Convolutional Neural Networks (CNN)
Result: F1 75.52 (best score: 76.77)
Rank: 2nd and 3rd place out of 15
Paper:
spMMMP at GermEval 2018 Shared Task: Classification of Offensive Content in Tweets using Convolutional Neural Networks and Gated Recurrent Units
Dirk von Grünigen, Ralf Grubenmann, Fernando Benites, Pius von Däniken and Mark Cieliebak
Author Profiling
Coling 2018 – VarDial
Goal: German Dialect Identification
Technology: Meta SVM with multiple Feature Extraction Methods
Result: F1 macro 64.6% (best score: 68.6%)
Rank: 2nd out of 8
Paper:
Twist Bytes – German Dialect Identification with Data Mining Optimization
Fernando Benites, Ralf Grubenmann, Pius von Däniken, Dirk von Grünigen, Jan Deriu and Mark Cieliebak
CLEF 2017 – PAN
Goal: Gender and Language Variety Detection from Tweets
Technology: Bi-directional Recurrent Neural Network (RNN) with attention mechanism
Results:
Gender Classification: Accuracy 75.31% (best score: 82.53%)
Language Variety: Accuracy 85.22% (best score: 91.84%)
Rank: 12th place out of 22
Paper:
Author Profiling with Bidirectional RNNs using Attention with GRUs – Notebook for PAN at CLEF 2017
Don Kodiyan, Florin Hardegger, Stephan Neuhaus, and Mark Cieliebak
Named Entity Recognition
CAp 2017
Goal: Named Entity Recognition on French Tweets
Technology: Deep learning with partially annotated data
Result: F-score 50.05 (best score: 58.89)
Rank: 5th place out of 8
Paper:
Swiss Chocolate at CAp 2017 NER Challenge: Partially Annotated Data and Transfer Learning
Nicole Falkner, Stefano Dolce, Pius von Däniken, and Mark Cieliebak
WNUT 2017
Goal: Named Entity Recognition on Tweets
Technology: Transfer learning with sentence-level features
Result: F1-Score 40.78 (best score: 41.86)
Rank: 2nd place out of 7
Paper:
Transfer Learning and Sentence Level Features for Named Entity Recognition on Tweets
Pius von Däniken and Mark Cieliebak
Language Classification
VarDial 2018 – GDI Task
Goal: Swiss-German Dialect Identification
Technology: Meta-classifier of Support Vector Machines (SVM)
Result: F1 64.63 (best score: 68.57)
Rank: 2nd place out of 8
Paper:
Twist Bytes – German Dialect Identification with Data Mining Optimization
Fernando Benites, Ralf Grubenmann, Pius von Däniken, Dirk von Grünigen, Jan Deriu and Mark Cieliebak
Adverse Drug Reactions
PSB 2016 – Task 1
Goal: Detect Mentions of Adverse Drug Reactions (ADR) in Tweets
Technology: Adaptation of a feature-based sentiment classifier to ADR
Result: F1-Score 31.74 (best score: 41.95)
Rank: 5th place out of 8
Paper:
Adverse Drug Reaction Detection using an Adapted Sentiment Classifier
Dominic Egger, Fatih Uzdilli, and Mark Cieliebak
Question Answering
SemEval 2017 – Task 3A
Goal: Finding Relevant Responses to Never-Before-Seen Questions
Technology: Siamese Convolutional Neural Network (CNN) with attention mechanism
Result: MAP-Score 86.24 (best score: 88.43)
Rank: 7th place out of 13
Paper:
SwissAlps at SemEval-2017 Task 3: Attention-based Convolutional Neural Network for Community Question Answering
Jan Deriu and Mark Cieliebak
Natural Language Generation
E2E NLG Challenge 2017
Goal: Generate Restaurant Reviews from Structured Data
Technology: Character-based Semantically Controlled Long Short-term Memory Network (SC-LSTM) with first-word control
Rank: 2nd out of 4 clusters
Paper:
End-to-End Trainable System for Enhancing Diversity in Natural Language Generation
Jan Deriu and Mark Cieliebak
Resources and Datasets
We provide datasets, such as textual corpora and word embeddings for sentiment analysis, to give back to the research community, as well as for commercial usage.

Our Publications
Together with our research partners, we regularly publish our achievements in distinguished conferences and journals:
- spMMMP at GermEval 2018 Shared Task: Classification of Offensive Content in Tweets using Convolutional Neural Networks and Gated Recurrent Units
Dirk von Grünigen, Ralf Grubenmann, Fernando Benites, Pius von Däniken and Mark Cieliebak. Proceedings of the GermEval 2018 Workshop: 14th Conference on Natural Language Processing – KONVENS 2018.
- Twist Bytes – German Dialect Identification with Data Mining Optimization
Fernando Benites, Ralf Grubenmann, Pius von Däniken, Dirk von Grünigen, Jan Deriu and Mark Cieliebak. Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018).
- Towards a Corpus of Swiss German Annotated with Sentiment
Ralf Grubenmann, Don Tuggener, Pius von Däniken, Jan Deriu, Mark Cieliebak. Proceedings of the 11th Language Resources and Evaluation Conference (LREC), 2018.
- EmbedRank: Unsupervised Keyphrase Extraction using Sentence Embeddings
K Bennani-Smires, C Musat, A Hossmann, M Baeriswyl, M Jaggi. arXiv preprint arXiv:1801.0447, 2018.
- Transfer Learning and Sentence Level Features for Named Entity Recognition on Tweets
Pius von Däniken and Mark Cieliebak. WNUT, 2017.
- Author Profiling with Bidirectional RNNs using Attention with GRUs
Don Kodiyan, Florin Hardegger, Stephan Neuhaus, and Mark Cieliebak. PAN at CLEF, 2017.
- Swiss Chocolate at CAp 2017 NER Challenge: Partially Annotated Data and Transfer Learning
Nicole Falkner, Stefano Dolce, Pius von Däniken, and Mark Cieliebak. Conférence sur l’Apprentissage Automatique CAp, 2017.
- Fully Convolutional Neural Networks for Newspaper Article Segmentation
Benjamin Meier, Thilo Stadelmann, Jan Stampfli, Marek Arnold, and Mark Cieliebak. ICDAR, 2017.
- Potential and Limitations of Cross-Domain Sentiment Classification
Dirk von Grünigen, Martin Weilenmann, Jan Deriu, and Mark Cieliebak. SocialNLP, 2017.
- A Twitter Corpus and Benchmark Resources for German Sentiment Analysis
Mark Cieliebak, Jan Deriu, Dominic Egger and Fatih Uzdilli. SocialNLP, 2017.
- SwissAlps at SemEval-2017 Task 3: Attention-based Convolutional Neural Network for Community Question Answering
Jan Deriu and Mark Cieliebak. SemEval, 2017.
- TopicThunder at SemEval-2017 Task 4: Sentiment Classification Using a Convolutional Neural Network with Distant Supervision
Simon Müller, Tobias Huonder, Jan Deriu, and Mark Cieliebak. SemEval, 2017.
- Leveraging large amounts of weakly supervised data for multi-language sentiment classification
Jan Deriu, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simon Müller, Mark Cieliebak, Thomas Hofmann, Martin Jaggi. Accepted at 26th International World Wide Web Conference (WWW), 2017.
- Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems
Celestine Dünner, Thomas Parnell, Martin Jaggi. NIPS 2017 – Advances in Neural Information Processing Systems.
- Sentiment Detection using Convolutional Neural Networks with Multi-Task Training and Distant Supervision on Italian Tweets
Jan Deriu and Mark Cieliebak. Evaluation of NLP and Speech Tools for Italian (EVALITA), 2016.
- Adverse Drug Reaction Detection using an Adapted Sentiment Classifier
Dominic Egger, Fatih Uzdilli, and Mark Cieliebak. Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing (PSB), 2016.
- SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision
Jan Deriu, Maurice Gonzenbach, Fatih Uzdilli, Aurelien Lucchi, Valeria De Luca, Martin Jaggi. Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval 2016).
- Swiss-Chocolate: Combining Flipout Regularization and Random Forests with Artificially Built Subsystems to Boost Text-Classification for Sentiment
Fatih Uzdilli, Martin Jaggi, Dominic Egger, Pascal Julmy, Leon Derczynski, and Mark Cieliebak. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval-2015), 2015.
- JOINT_FORCES: Unite Competing Sentiment Classifiers with Random Forest
Oliver Dürr, Fatih Uzdilli and Mark Cieliebak. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014), 2014.
- Swiss-Chocolate: Sentiment Detection using Sparse SVMs and Part-Of-Speech n-Grams
Martin Jaggi, Fatih Uzdilli, and Mark Cieliebak. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014), 2014.
- Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools
Mark Cieliebak, Oliver Dürr, and Fatih Uzdilli. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), 2014.
- Potential and Limitations of Commercial Sentiment Detection Tools
Mark Cieliebak, Oliver Dürr, and Fatih Uzdilli. Proceedings of the First International Workshop on Emotion and Sentiment in Social and Expressive Media: approaches and perspectives from AI (ESSEM 2013), 2013.
Our resources are free and open to the public!