🌍 LahjatBERT: Multi-Label Arabic Dialect Classifier

This demo predicts which country-level Arabic dialects a sentence sounds natural in. Unlike classic “pick one dialect” systems, a single sentence can be acceptable in multiple dialects.

How to use

  1. Paste an Arabic sentence
  2. Adjust the Confidence Threshold (higher = fewer highlights)
  3. Click Predict Dialects

How to interpret the results

  • Highlighted countries = dialects predicted as valid/acceptable for the sentence

Model

Select which LahjatBERT variant to use for prediction.


Tip: If you’re testing a sentence that’s close to Modern Standard Arabic (MSA), you may see many countries highlighted. That is expected, because MSA-like text can be acceptable across dialects.


🗺️ Dialect Map (Zoomed to the Arab World)

The map updates after each prediction.
Green countries indicate dialects predicted as valid at your selected threshold.

Map: Simple World Map. Author: Al MacDonald. Editor: Fritz Lekschas. License: CC BY-SA 3.0.
Highlighted = confidence ≥ threshold
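
The highlighting rule above can be sketched in a few lines of Python. This is only an illustration of the "confidence ≥ threshold" comparison, not the demo's actual code: the function name, the use of ISO country codes as keys, and the scores themselves are made up for the example.

```python
from typing import Dict, List


def highlighted_countries(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the countries whose confidence is at or above the threshold."""
    return sorted(code for code, score in scores.items() if score >= threshold)


# Made-up confidences for illustration only; these are not real model outputs.
example_scores = {"EG": 0.92, "SA": 0.55, "MA": 0.08, "LB": 0.61}

low = highlighted_countries(example_scores, 0.5)
high = highlighted_countries(example_scores, 0.9)

print(low)   # ['EG', 'LB', 'SA']
print(high)  # ['EG']
```

Raising the threshold from 0.5 to 0.9 drops two of the three countries, which is the "higher = fewer highlights" behavior of the slider.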

✨ Try these examples

These examples are meant to show dialect overlap:

  • Some expressions are widely shared and may light up multiple regions
  • Others contain strong local signals (e.g., Egyptian, Gulf/Khaleeji, Levantine, Maghrebi)

Click an example to auto-fill the input.

Notes

  • The model outputs multi-label predictions: more than one dialect can be valid at once.
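
One common way to obtain multi-label predictions is to score each label independently with a sigmoid, rather than making labels compete through a softmax. The sketch below assumes that setup; whether LahjatBERT works exactly this way is not stated here, and the logits, labels, and the 0.5 cutoff are hypothetical.

```python
import math
from typing import Dict


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_confidences(logits: Dict[str, float]) -> Dict[str, float]:
    """Score every label on its own; the values need not sum to 1."""
    return {label: sigmoid(z) for label, z in logits.items()}


# Hypothetical logits for illustration only; not real model output.
logits = {"EG": 2.0, "SA": 1.5, "MA": -3.0}
conf = multilabel_confidences(logits)

# Each label is checked against the cutoff separately, so several can pass.
valid = [label for label, p in conf.items() if p >= 0.5]
print(valid)  # ['EG', 'SA']
```

Because each label is judged separately, two dialects pass the cutoff at once here, and the confidences add up to more than 1, which a single-label softmax would not allow.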

If you use this demo in research, please cite the accompanying paper.