Below is information about some recent work by Ali Nirheche (with short descriptions by Ali).
Momayiz, Imane, Outchakoucht, Aissam, Choukrani, Omar, & Nirheche, Ali. (2024). TerjamaBench: A culturally specific dataset for evaluating translation models for Moroccan Darija. AtlasIA. Published on Hugging Face (a machine learning and data science platform).
- “What we did is we compared prominent AI models (GPT-4o, Claude 3.5 Sonnet, Gemini) as well as a few models that my colleagues have worked on for Moroccan Arabic (e.g., Terjman-Large-v1.2). We compared the performance of these models with respect to Moroccan Arabic (how well their output resembles how Moroccan Arabic is actually used).”
Nirheche, Ali. (2025). Moroccan Arabic Plurals Corpus [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14642330
- “This is a corpus of Moroccan Arabic plurals that I published back in January. The corpus contains 1,166 singular-plural noun pairs in Moroccan Arabic.”
Nirheche, Ali. (2024). Shiny App for MaxEnt with Hidden Structure. Shiny Application. Amherst, MA: University of Massachusetts Amherst. https://alingwist.shinyapps.io/HGR_app/
- “This is a user-friendly Shiny application, MaxEnt with Hidden Structure in R, that I developed. It’s designed to assist linguists in generating phonological grammars (weights) using a Maximum Entropy model.” (More info about the project here; it is funded by an NSF grant with PI Joe Pater.)





