All data and code can be accessed through the IndoLEM GitHub account 😊
♦ Morpho-syntax and Sequence Labelling
We use the Indonesian POS tagging dataset of Dinakaramani et al. (2014) with the 5-fold partitioning of Kurniawan and Aji (2018). The Train/Dev/Test distribution is 7,222/802/2,006.
♦ Semantic Task
This dataset is based on binary classification (positive and negative), with distribution:
The data is sourced from 1) Twitter (Koto and Rahmaningtyas, 2017) and 2) hotel reviews.
♦ Discourse Coherence
To evaluate model coherence, we design a next tweet prediction (NTP) task that is similar to the next sentence prediction (NSP) task used to train BERT (Devlin et al., 2019). In NTP, each instance consists of a Twitter thread (2–4 tweets) that we call the premise, and four possible options for the next tweet, one of which is the actual response from the original thread.
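As a rough illustration of the instance format described above, the sketch below models one NTP example and scores predictions. The class name `NTPInstance`, its field names, and the use of plain accuracy as the metric are assumptions made for this example; they are not taken from the IndoLEM code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class NTPInstance:
    """One next tweet prediction example (hypothetical layout)."""
    premise: List[str]   # the thread so far, 2-4 tweets
    options: List[str]   # four candidate next tweets
    label: int           # index of the actual next tweet in `options`

    def __post_init__(self) -> None:
        # Enforce the constraints stated in the task description.
        if not 2 <= len(self.premise) <= 4:
            raise ValueError("premise must contain 2-4 tweets")
        if len(self.options) != 4:
            raise ValueError("exactly four options are required")
        if not 0 <= self.label < len(self.options):
            raise ValueError("label must index one of the options")


def accuracy(instances: Sequence[NTPInstance], predictions: Sequence[int]) -> float:
    """Fraction of instances where the predicted option index equals the label."""
    if len(instances) != len(predictions):
        raise ValueError("one prediction per instance is required")
    if not instances:
        raise ValueError("no instances to score")
    correct = sum(inst.label == pred for inst, pred in zip(instances, predictions))
    return correct / len(instances)
```

A model would typically score each of the four (premise, option) pairs and predict the highest-scoring index; with four options, random guessing yields an expected accuracy of 0.25.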
A second task, tweet ordering, is based on the sentence ordering task of Barzilay and Lapata (2008), which assesses text relatedness. We construct the data by shuffling Twitter threads (containing 3–5 tweets) and assess the predicted ordering in terms of its rank correlation (ρ) with the original order.
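The ordering evaluation can be sketched as follows, assuming ρ denotes Spearman's rank correlation and that a prediction is given as a list of original tweet positions. The function names `shuffle_thread` and `spearman_rho` are hypothetical and chosen for this example only.

```python
import random
from typing import List, Sequence, Tuple


def shuffle_thread(thread: Sequence[str], seed: int = 0) -> Tuple[List[str], List[int]]:
    """Shuffle a thread; return the shuffled tweets and their original positions."""
    positions = list(range(len(thread)))
    random.Random(seed).shuffle(positions)
    return [thread[i] for i in positions], positions


def spearman_rho(predicted: Sequence[int]) -> float:
    """Spearman's rho between a predicted ordering and the original order.

    `predicted[k]` is the original position of the tweet placed at slot k.
    Uses the no-ties formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    n = len(predicted)
    if n < 2 or sorted(predicted) != list(range(n)):
        raise ValueError("predicted must be a permutation of 0..n-1 with n >= 2")
    d_squared = sum((slot - original) ** 2 for slot, original in enumerate(predicted))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A perfectly recovered thread gives ρ = 1, a fully reversed one gives ρ = -1, and swapping only the first two tweets of a four-tweet thread (`[1, 0, 2, 3]`) gives ρ = 0.8.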