AsoSoft is a research and business group working on natural language processing (NLP) technologies for the Kurdish language. Kurdish is a member of the Indo-Iranian branch of the Indo-European languages and is spoken by more than 40 million people in western Asia, mainly in Iraq, Turkey, Iran, Syria, Armenia, and Azerbaijan. It also has a variety of dialects. Despite this diversity, Kurdish remains a low-resource language, especially for computational linguistics. AsoSoft conducts research and develops computational linguistics resources for Kurdish, with the aim of eventually building NLP technologies and applications. To that end, AsoSoft's activities focus on, but are not limited to, the following:

1- Linguistic resources and data

  • Text corpora
  • Speech corpora
  • Computational grammars
  • Lexicons and dictionaries
  • Parallel corpora
  • WordNet and treebanks

2- Basic tools

  • Tokenizer
  • POS Tagger
  • Parsers

3- Products

  • Kurdish Automatic Speech Recognition
  • Kurdish Text-to-Speech
  • Spell checker